924 results for RNA secondary structure
Abstract:
Mutations at position 912 of Escherichia coli 16S rRNA result in two notable phenotypes. The C-->U transition confers resistance to streptomycin, a translational-error-inducing antibiotic, while a C-->G transversion causes marked retardation of cell growth rate. Starting with the slow-growing G912 mutant, random mutagenesis was used to isolate a second site mutation that restored growth nearly to the wild-type rate. The second site mutation was identified as a G-->C transversion at position 885 in 16S rRNA. Cells containing the G912 mutation had an increased doubling time, abnormal sucrose gradient ribosome/subunit profile, increased sensitivity to spectinomycin, dependence upon streptomycin for growth in the presence of spectinomycin, and slower translation rate, whereas cells with the G912/C885 double mutation were similar to wild type in these assays. Comparative analysis showed there was significant covariation between positions 912 and 885. Thus the second-site suppressor analysis, the functional assays, and the comparative data suggest that the interaction between nt 912 and nt 885 is conserved and necessary for normal ribosome function. Furthermore, the comparative data suggest that the interaction extends to include G885-G886-G887 pairing with C912-U911-C910. An alternative secondary structure element for the central domain of 16S rRNA is proposed.
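The compensatory logic behind the G912/C885 suppressor can be illustrated with a short Python check. This is only a sketch, not the authors' analysis. It assumes an antiparallel helix in which 885 faces 912, 886 faces 911 and 887 faces 910, and it treats G-U as an allowed wobble pair. The helper names `can_pair` and `helix_pairs` are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch).
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),
    ("G", "U"), ("U", "G"),
}


def can_pair(a, b):
    """Return True if bases a and b can form one of the allowed pairs."""
    return (a.upper(), b.upper()) in ALLOWED_PAIRS


def helix_pairs(strand_up, strand_down):
    """Pair strand_up (positions 885, 886, 887) with strand_down
    (positions 912, 911, 910), base by base."""
    return [(a, b, can_pair(a, b)) for a, b in zip(strand_up, strand_down)]


# Wild type: G885-G886-G887 facing C912-U911-C910.
wild_type = helix_pairs("GGG", "CUC")
# Single mutant: C912 -> G, so position 885 (G) now faces G.
single_mutant = helix_pairs("GGG", "GUC")
# Double mutant: G885 -> C restores pairing with G912.
double_mutant = helix_pairs("CGG", "GUC")

for name, helix in [("wild type", wild_type),
                    ("G912", single_mutant),
                    ("G912/C885", double_mutant)]:
    print(name, helix)
```

Under these assumptions the 885/912 pair is allowed in the wild type, lost in the G912 single mutant, and regained in the G912/C885 double mutant, which mirrors the phenotypes the abstract reports.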
Abstract:
Linkage disequilibrium between polymorphisms in a natural population may result from various evolutionary forces, including random genetic drift due to sampling of gametes during reproduction, restricted migration between subpopulations in a subdivided population, or epistatic selection. In this report, we present evidence that the majority of significant linkage disequilibria observed in introns of the alcohol dehydrogenase locus (Adh) of Drosophila pseudoobscura are due to epistatic selection maintaining secondary structure of precursor mRNA (pre-mRNA). Based on phylogenetic-comparative analysis and a likelihood approach, we propose secondary structure models of Adh pre-mRNA for the regions of the adult intron and intron 2 where clustering of linkage disequilibria has been observed. Furthermore, we applied the likelihood ratio test to the phylogenetically predicted secondary structure in intron 1. In contrast to the other two structures, polymorphisms associated with the more conserved stem-loop structure of intron 1 are in low frequency, and linkage disequilibria have not been observed. These findings are qualitatively consistent with a model of compensatory fitness interactions. This model assumes that mutations disrupting pairing in a secondary structural element are individually deleterious if they destabilize a functionally important structure; a second "compensatory" mutation, however, may restabilize the structure and restore fitness.
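For readers who want the quantity behind "linkage disequilibrium" made concrete, the sketch below computes the standard coefficient D and the squared correlation r² for two biallelic sites. It is not the likelihood approach used in the report, and the haplotype sample is invented purely for illustration (two sites on opposite sides of a hypothetical stem, with complementary combinations common and other combinations rare).

```python
def linkage_disequilibrium(haplotypes, allele_a, allele_b):
    """Return (D, r2) for two sites.

    haplotypes: list of (allele_at_site_1, allele_at_site_2) tuples.
    allele_a, allele_b: the reference alleles at site 1 and site 2.
    """
    n = len(haplotypes)
    p_a = sum(1 for x, _ in haplotypes if x == allele_a) / n
    p_b = sum(1 for _, y in haplotypes if y == allele_b) / n
    p_ab = sum(1 for x, y in haplotypes
               if x == allele_a and y == allele_b) / n
    d = p_ab - p_a * p_b  # deviation from independent association
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Hypothetical sample of 100 haplotypes (not data from the study).
sample = ([("G", "C")] * 40 + [("A", "U")] * 40 +
          [("G", "U")] * 10 + [("A", "C")] * 10)

d, r2 = linkage_disequilibrium(sample, "G", "C")
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

A positive D here means the G and C alleles occur together more often than expected from their individual frequencies, the kind of association that epistatic selection on a stem would tend to maintain.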
Abstract:
The tendency of a polypeptide chain to form alpha-helical or beta-strand secondary structure depends upon local and nonlocal effects. Local effects reflect the intrinsic propensities of the amino acid residues for particular secondary structures, while nonlocal effects reflect the positioning of the individual residues in the context of the entire amino acid sequence. In particular, the periodicity of polar and nonpolar residues specifies whether a given sequence is consistent with amphiphilic alpha-helices or beta-strands. The importance of intrinsic propensities was compared to that of polar/nonpolar periodicity by a direct competition. Synthetic peptides were designed using residues with intrinsic propensities that favored one or the other type of secondary structure. The polar/nonpolar periodicities of the peptides were designed either to be consistent with the secondary structure favored by the intrinsic propensities of the component residues or in other cases to oppose these intrinsic propensities. Characterization of the synthetic peptides demonstrated that in all cases the observed secondary structure correlates with the periodicity of the peptide sequence--even when this secondary structure differs from that predicted from the intrinsic propensities of the component amino acids. The observed secondary structures are concentration dependent, indicating that oligomerization of the amphiphilic peptides is responsible for the observed secondary structures. Thus, for self-assembling oligomeric peptides, the polar/nonpolar periodicity can overwhelm the intrinsic propensities of the amino acid residues and serves as the major determinant of peptide secondary structure.
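One common way to quantify polar/nonpolar periodicity is a hydrophobic-moment style score. The sketch below is not the method used in the study. It assumes a binary pattern of `H` (nonpolar) and `P` (polar) residues and compares the signal at about 100° per residue (roughly 3.6 residues per turn, as in an alpha-helix) with the signal at 180° per residue (strict alternation, as in a beta-strand). The function name and both example patterns are hypothetical.

```python
import cmath
import math


def periodicity_score(pattern, degrees_per_residue):
    """Normalised magnitude of the periodic component of a binary pattern.

    'H' counts as +1; any other character is treated as polar (-1).
    """
    values = [1.0 if c == "H" else -1.0 for c in pattern]
    step = math.radians(degrees_per_residue)
    total = sum(v * cmath.exp(1j * step * n) for n, v in enumerate(values))
    return abs(total) / len(values)


helix_like = "PHPPHHPPHPPHHP"   # hypothetical helix-compatible pattern
strand_like = "HPHPHPHP"        # strictly alternating pattern

for name, pattern in [("helix-like", helix_like),
                      ("strand-like", strand_like)]:
    helix_score = periodicity_score(pattern, 100)
    strand_score = periodicity_score(pattern, 180)
    print(f"{name}: 100 deg = {helix_score:.2f}, 180 deg = {strand_score:.2f}")
```

A pattern scores higher at the angle that matches its repeat, which is the sense in which a sequence is "consistent with" an amphiphilic helix or strand.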
Abstract:
Motivation: Conformational flexibility is essential to the function of many proteins, e.g. catalytic activity. To assist efforts in determining and exploring the functional properties of a protein, it is desirable to automatically identify regions that are prone to undergo conformational changes. It was recently shown that a probabilistic predictor of continuum secondary structure is more accurate than categorical predictors for structurally ambivalent sequence regions, suggesting that such models are suited to characterize protein flexibility. Results: We develop a computational method for identifying regions that are prone to conformational change directly from the amino acid sequence. The method uses the entropy of the probabilistic output of an 8-class continuum secondary structure predictor. Results for 171 unique amino acid sequences with well-characterized variable structure (identified in the 'Macromolecular movements database') indicate that the method is highly sensitive at identifying flexible protein regions, but false positives remain a problem. The method can be used to explore conformational flexibility of proteins (including hypothetical or synthetic ones) whose structure is yet to be determined experimentally.
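The entropy criterion described above can be sketched in a few lines of Python. The abstract does not give the exact formula, threshold, or any smoothing, so the sketch assumes Shannon entropy in bits over the eight class probabilities of each residue and a hypothetical cut-off `threshold_bits`.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of one residue's class probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def flag_flexible(per_residue_probs, threshold_bits=1.5):
    """Return indices of residues whose entropy meets the threshold.

    threshold_bits is a hypothetical cut-off chosen for illustration.
    """
    return [i for i, probs in enumerate(per_residue_probs)
            if shannon_entropy(probs) >= threshold_bits]


# Two made-up residues: one confidently predicted, one fully ambivalent.
confident = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ambivalent = [1 / 8] * 8

print(shannon_entropy(confident), shannon_entropy(ambivalent))
print(flag_flexible([confident, ambivalent]))
```

With eight classes the entropy ranges from 0 bits (one state certain) to 3 bits (all states equally likely), so high values mark residues where the predictor is structurally ambivalent.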
Abstract:
Background: The structure of proteins may change as a result of the inherent flexibility of some protein regions. We develop and explore probabilistic machine learning methods for predicting a continuum secondary structure, i.e. assigning probabilities to the conformational states of a residue. We train our methods using data derived from high-quality NMR models. Results: Several probabilistic models not only successfully estimate the continuum secondary structure, but also provide a categorical output on par with models directly trained on categorical data. Importantly, models trained on the continuum secondary structure are also better than their categorical counterparts at identifying the conformational state for structurally ambivalent residues. Conclusion: Cascaded probabilistic neural networks trained on the continuum secondary structure exhibit better accuracy in structurally ambivalent regions of proteins, while sustaining an overall classification accuracy on par with standard, categorical prediction methods.
Abstract:
Kunjin virus is a member of the Flavivirus genus and is an Australian variant of West Nile virus. The C-terminal domain of the Kunjin virus NS3 protein displays helicase activity. The protein is thought to separate daughter and template RNA strands, assisting the initiation of replication by unwinding RNA secondary structure in the 3' nontranslated region. Expression, purification and preliminary crystallographic characterization of the NS3 helicase domain are reported. It is shown that Kunjin virus helicase may adopt a dimeric assembly in the absence of nucleic acids, oligomerization being a means of providing the helicase with multiple nucleic acid-binding sites and thereby facilitating translocation along the RNA strands. The Kunjin virus NS3 helicase domain is an attractive model for studying the molecular mechanisms of flavivirus replication, while simultaneously providing a new basis for the rational development of anti-flaviviral compounds.
Abstract:
Coleoptera is the most diverse group of insects with over 360,000 described species divided into four suborders: Adephaga, Archostemata, Myxophaga, and Polyphaga. In this study, we present six new complete mitochondrial genome (mtgenome) descriptions, including a representative of each suborder, and analyze the evolution of mtgenomes from a comparative framework using all available coleopteran mtgenomes. We propose a modification of atypical cox1 start codons based on sequence alignment to better reflect the conservation observed across species as well as findings of TTG start codons in other genes. We also analyze tRNA-Ser(AGN) anticodons, usually GCU in arthropods, and report a conserved UCU anticodon as a possible synapomorphy across Polyphaga. We further analyze the secondary structure of tRNA-Ser(AGN) and present a consensus structure and an updated covariance model that allows tRNAscan-SE (via the COVE software package) to locate and fold these atypical tRNAs with much greater consistency. We also report secondary structure predictions for both rRNA genes based on conserved stems. All six species of beetle have the same gene order as the ancestral insect. We report noncoding DNA regions, including a small gap region of about 20 bp between tRNA-Ser(UCN) and nad1 that is present in all six genomes, and present results of a base composition analysis.
Abstract:
Background: Strand-specific RNAseq data is now more common in RNAseq projects, and visualizing RNAseq data has become an important part of the analysis of sequencing data. The most widely used visualization tool is the UCSC genome browser, which introduced the custom track concept that enables researchers to simultaneously visualize gene expression at a particular locus from multiple experiments. The objective of our software tool is to provide a friendly interface for visualization of RNAseq datasets. Results: This paper introduces a visualization tool (RNASeqBrowser) that incorporates and extends the functionality of the UCSC genome browser. For example, RNASeqBrowser simultaneously displays read coverage, SNPs, InDels and raw read tracks with other BED and wiggle tracks, all dynamically built from the BAM file. Paired reads are also connected in the browser to enable easier identification of novel exon/intron borders and chimaeric transcripts. Strand-specific RNAseq data is also supported by RNASeqBrowser, which displays reads above (positive strand transcripts) or below (negative strand transcripts) a central line. Finally, RNASeqBrowser was designed for ease of use for users with few bioinformatic skills, and incorporates the features of many genome browsers into one platform. Conclusions: The features of RNASeqBrowser: (1) RNASeqBrowser integrates the UCSC genome browser and NGS visualization tools such as IGV. It extends the functionality of the UCSC genome browser by adding several new types of tracks to show NGS data such as individual raw reads, SNPs and InDels. (2) RNASeqBrowser can dynamically generate RNA secondary structure. It is useful for identifying non-coding RNA such as miRNA. (3) Overlaying NGS wiggle data is helpful in displaying differential expression and is simple to implement in RNASeqBrowser. (4) NGS data accumulates a lot of raw reads. Thus, RNASeqBrowser collapses exact duplicate reads to reduce visualization space. Normal PCs can show many windows of NGS individual raw reads without much delay. (5) Multiple popup windows of individual raw reads provide users with more viewing space. This avoids the approach of existing tools (such as IGV), which squeeze all raw reads into one window. This will be helpful for visualizing multiple datasets simultaneously. RNASeqBrowser and its manual are freely available at http://www.australianprostatecentre.org/research/software/rnaseqbrowser or http://sourceforge.net/projects/rnaseqbrowser/
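The duplicate-collapsing idea in item (4) can be sketched as follows. This is not RNASeqBrowser's code. It assumes that an "exact duplicate" means the same chromosome, start position, strand and sequence, and it represents each read as a plain tuple rather than a BAM record. The function name is hypothetical.

```python
from collections import Counter


def collapse_duplicates(reads):
    """Collapse exact duplicate reads into (read, count) pairs.

    reads: iterable of (chrom, start, strand, sequence) tuples.
    The first-seen order of distinct reads is preserved.
    """
    counts = Counter(reads)
    return list(counts.items())


# Made-up reads for illustration only.
reads = [
    ("chr1", 100, "+", "ACGTACGT"),
    ("chr1", 100, "+", "ACGTACGT"),   # exact duplicate of the first read
    ("chr1", 100, "-", "ACGTACGT"),   # same position, opposite strand
    ("chr1", 104, "+", "ACGTTCGT"),
    ("chr1", 100, "+", "ACGTACGT"),   # another duplicate of the first read
]

collapsed = collapse_duplicates(reads)
for read, count in collapsed:
    print(read, "x", count)
```

A viewer can then draw one glyph per distinct read and annotate it with its count, which is what saves display space.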
Abstract:
We study the secondary structure of RNA determined by Watson-Crick pairing without pseudo-knots using Milnor invariants of links. We focus on the first non-trivial invariant, which we call the Heisenberg invariant. The Heisenberg invariant, which is an integer, can be interpreted in terms of the Heisenberg group as well as in terms of lattice paths. We show that the Heisenberg invariant gives a lower bound on the number of unpaired bases in an RNA secondary structure. We also show that the Heisenberg invariant can predict allosteric structures for RNA. Namely, if the Heisenberg invariant is large, then there are widely separated local maxima (i.e., allosteric structures) for the number of Watson-Crick pairs found.
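The abstract does not define the Heisenberg invariant, so no attempt is made to compute it here. As background for the quantities it bounds, the sketch below uses a standard Nussinov-style dynamic program (a different, well-known technique) to find the maximum number of Watson-Crick pairs in a pseudoknot-free structure, from which the minimum number of unpaired bases follows. The `min_loop` parameter (minimum unpaired bases enclosed by a pair) is an assumption of this sketch.

```python
# Only Watson-Crick pairs are allowed, as in the abstract.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def max_watson_crick_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) Watson-Crick pairs."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = max pairs within seq[i..j]; zero for short spans.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]          # case: base j is unpaired
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in WATSON_CRICK:
                    left = best[i][k - 1] if k > i else 0
                    # case: k pairs with j, splitting the interval in two
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


hairpin = "GGGAAACCC"
pairs = max_watson_crick_pairs(hairpin)
print(pairs, "pairs,", len(hairpin) - 2 * pairs, "unpaired bases")
```

Any lower bound on unpaired bases, such as the one the abstract attributes to the Heisenberg invariant, can be compared against the value `len(seq) - 2 * pairs` obtained this way.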
Abstract:
Rrp1B (ribosomal RNA processing 1 homolog B) is a novel candidate metastasis modifier gene in breast cancer. Functional gene assays demonstrated that a physical and functional interaction between Rrp1B and the metastasis modifier gene SIPA1 reduces tumor growth and metastatic potential. Ectopic expression of Rrp1B modulates various metastasis-predictive extracellular matrix (ECM) genes associated with tumor suppression. The aim of this study is to determine the functional significance of a single nucleotide polymorphism (SNP) in the human Rrp1B gene (1307 T > C; rs9306160) in breast cancer development and progression. The study consists of 493 breast cancer cases recruited from Nizam's Institute of Medical Sciences, Hyderabad, and 558 age-matched healthy female controls from rural and urban areas. Genomic DNA was isolated by a non-enzymatic method. Genotyping was done by the amplification refractory mutation system (ARMS-PCR) method. Genotypes were reconfirmed by sequencing and results were analyzed statistically. We performed in silico analysis of the RNA secondary structure using the online tool mfold. The TT genotype and T allele frequencies of the Rrp1B 1307 T > C polymorphism were significantly elevated in breast cancer cases (χ²; p < 0.008) compared to controls under different genetic models. The presence of the T allele conferred a 1.75-fold risk for breast cancer development (OR = 1.75; 95% CI = 1.15-2.67). The frequency of the TT genotype of the Rrp1B 1307 T > C polymorphism was significantly elevated in obese patients (χ²; p = 0.008), in patients with advanced disease (χ²; p = 0.01) and in patients with increased tumor size (χ²; p = 0.01). Moreover, an elevated frequency of the T allele was also associated with positive lymph node status (χ²; p = 0.04) and Her2-negative receptor status (χ²; p = 0.006). The presence of the Rrp1B 1307 TT genotype and T allele confers a strong risk for breast cancer development and progression.
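An odds ratio with a 95% confidence interval, as reported above, is commonly computed from a 2x2 table with the log (Woolf) method. The abstract gives no genotype or allele counts and does not state which method was used, so the sketch below uses invented counts and should be read as an illustration of the calculation only. The function and parameter names are hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and confidence interval from a 2x2 table (log method).

    z=1.96 corresponds to a two-sided 95% interval.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Invented counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

Intervals built this way are symmetric around the odds ratio on a log scale. The reported interval has that property, since the geometric mean of 1.15 and 2.67 is about 1.75.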
Abstract:
The mechanism by which the three-dimensional structures of proteins form, i.e. the mechanism of protein folding, is a basic problem in molecular biology that remains unsolved. A core question is whether there is three-dimensional genetic information that determines the three-dimensional structures of proteins. However, research in this field has not yet been reported. Recently, we made a comparative study of the folded structures of more than 70 mature messenger RNAs (mRNAs) and the three-dimensional structures of the proteins they encode, and found marked correspondences between their characteristic structures in the following aspects. 1. The number of structural units. An RNA molecule can form a secondary structure (stem and loop structure) by folding and base pairing with itself. The elementary structural unit of an RNA secondary structure is the hairpin (or compound hairpin). The regular structural unit in the secondary structure of a protein is the α-helix or β-sheet. We have found that the number of hairpins in the secondary structure of each mature mRNA is equal or approximately equal to the number of regular secondary structural units of the encoded protein. 2. Turning regions. The turn is a main structural element in the secondary structure of a protein, and it determines the backbone orientation of a protein molecule to some extent. Our analysis shows that the nucleotide sequence segments of an mRNA that encode the turns of the corresponding protein are generally situated in the turning regions of the mRNA secondary structure, such as hairpins, bulge loops or multibranch loops. 3. The arrangement of structural elements in space. In order to understand the backbone orientation of an RNA molecule and the arrangement of its structural elements in space, we modeled the three-dimensional structure of the mRNA molecule on an SGI workstation based on its secondary structure. The results show that the spatial arrangement of most of the nucleotide sequence segments encoding the structural elements of a protein is consistent with that of these structural elements in the protein. For instance, the nucleotide sequences corresponding to each pleated sheet of a β-sheet structure are close to each other in the mRNA secondary structure and in the three-dimensional structure, although some of the nucleotide segments are far apart from each other in the one-dimensional sequence. As another example, the two triplet codons of cysteines that form a disulphide bridge are generally very close to each other in the mRNA folded structure. In addition, we analyzed the locations of the proline-coding codons and the distribution of the α-helix-coding nucleotide sequences in the folded structures of mRNAs. Some distribution laws have been found. All of these results suggest that the transfer of genetic information from mRNA to protein is not only one-dimensional but also three-dimensional. That is, there exists genetic information that determines the three-dimensional structures of proteins. To a certain extent, we could say that mRNA folding determines protein folding. Based on these results, it would be possible to predict the three-dimensional structures of proteins from the primary, secondary and tertiary structures of the mRNAs with higher accuracy. More importantly, a new clue has been provided for uncovering the "spatial coding" of genetic information.
Abstract:
Purpose: Current understanding of the genetic risk factors for age-related macular degeneration (AMD) is not sufficiently predictive of the clinical course. The VEGF pathway is a key therapeutic target for treatment of neovascular AMD; however, risk attributable to genetic variation within pathway genes is unclear. We sought to identify single nucleotide polymorphisms (SNPs) associated with AMD within the VEGF pathway.
Methods: Using a tagSNP, direct sequencing and meta-analysis approach within four ethnically diverse cohorts, we identified genetic risk present in FLT1, though not within other VEGF pathway genes KDR, VEGFA, or VASH1. We used ChIP and ELISA in functional analysis.
Results: The FLT1 SNPs rs9943922, rs9508034, rs2281827, rs7324510, and rs9513115 were significantly associated with increased risk of neovascular AMD. Each association was more significant after meta-analysis than in any one of the four cohorts. All associations were novel, within noncoding regions of FLT1 that do not tag for coding variants in linkage disequilibrium. Analysis of soluble FLT1 demonstrated higher expression in unaffected individuals homozygous for the FLT1 risk alleles rs9943922 (P = 0.0086) and rs7324510 (P = 0.0057). In silico analysis suggests that these variants change predicted splice sites and RNA secondary structure, and have been identified in other neovascular pathologies. These data were supported further by murine chromatin immunoprecipitation demonstrating that FLT1 is a target of Nr2e3, a nuclear receptor gene implicated in regulating an AMD pathway.
Conclusions: Although exact variant functions are not known, these data demonstrate relevancy across ethnically diverse genetic backgrounds within our study and, therefore, hold potential for global efficacy.
Abstract:
Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are nucleotide polymers essential to the cell. Unlike DNA, which serves mainly to store genetic information, RNAs are involved in several metabolic processes. For example, they transmit the genetic information encoded in DNA. They are essential for the maturation of other RNAs, the regulation of gene expression, the prevention of chromosome degradation and the targeting of proteins within the cell. The functional versatility of RNA results from its greater structural diversity. Our laboratory developed MC-Fold, an algorithm for predicting the structure of RNAs, which are represented with graphs of inter-nucleotide interactions. The vertices of these graphs represent the nucleotides and the edges their interactions. Our laboratory also observed that a small set of interaction cycles is sufficient on its own to define the structure of any RNA motif. The formation of these cycles depends on the nucleotide sequence, and MC-Fold determines the most probable cycles given that sequence. My master's project was, first, to define a database of structural and functional RNA motifs, bdMotifs, in terms of these cycles. I then implemented an algorithm, MC-Motifs, that searches for these motifs in interaction graphs, including those generated by MC-Fold. Finally, I validated my algorithm on RNAs of known structure, such as the 5S, 16S and 23S ribosomal RNAs (rRNAs), and used it to predict the structure of riboswitches. The thesis is divided into five chapters. The first chapter presents the chemical structure and cellular functions of RNA and the structural folding of the polymer. In the second chapter, I describe the bdMotifs database. The third chapter introduces the MC-Motifs search algorithm. The fourth chapter presents the results of the validation and the predictions. Finally, the last chapter discusses the results, followed by a conclusion on the work.