1000 results for NONCODING DNA


Relevance:

70.00%

Publisher:

Abstract:

The correspondence between the transversion/transition ratio and the neighboring base composition in chloroplast DNA is examined. For 18 noncoding regions of the chloroplast genome, alignments between rice (Oryza sativa) and maize (Zea mays) were generated by two different methods. Difficulties of aligning noncoding DNA are discussed, and the alignments are analyzed in a manner that reduces alignment artifacts. Sequence divergence is < 10%, so multiple substitutions at a site are assumed to be rare. Observed substitutions were analyzed with respect to the A+T content of the two immediately flanking bases. It is shown that as this content increases, the proportion of transversions also increases. When both the 5'- and 3'-flanking nucleotides are G or C (A+T content of 0), only 25% of the observed substitutions are transversions. However, when both the 5'- and 3'-flanking nucleotides are A or T (A+T content of 2), 57% of the observed substitutions are transversions. Therefore, the influence of flanking base composition on substitutions, previously reported for a single noncoding region, is a general feature of the chloroplast genome.
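The tabulation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the rule that both flanking bases must be ungapped and identical in the two aligned sequences is an assumption made here to keep the context unambiguous.

```python
PURINES = {"A", "G"}
NUCLEOTIDES = set("ACGT")


def is_transversion(a, b):
    """A substitution is a transversion when it swaps a purine and a pyrimidine."""
    return (a in PURINES) != (b in PURINES)


def transversion_fraction_by_context(seq1, seq2):
    """Return {A+T content of the two flanks: fraction of transversions}.

    seq1 and seq2 are aligned sequences of equal length ('-' marks a gap).
    Assumption: a site is counted only if both flanking bases are ungapped
    and identical in the two sequences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    counts = {0: [0, 0], 1: [0, 0], 2: [0, 0]}  # [transversions, substitutions]
    for i in range(1, len(seq1) - 1):
        a, b = seq1[i], seq2[i]
        if a == b or a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        left, right = seq1[i - 1], seq1[i + 1]
        if left != seq2[i - 1] or right != seq2[i + 1]:
            continue
        if left not in NUCLEOTIDES or right not in NUCLEOTIDES:
            continue
        at_content = (left in "AT") + (right in "AT")
        counts[at_content][1] += 1
        if is_transversion(a, b):
            counts[at_content][0] += 1
    return {k: (tv / total if total else None) for k, (tv, total) in counts.items()}
```

Applied to a real alignment, the returned fractions for contexts 0 and 2 would be the quantities the abstract reports as 25% and 57%.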

Relevance:

60.00%

Publisher:

Abstract:

An important topic in genomic sequence analysis is the identification of protein coding regions. In this context, several coding DNA model-independent methods based on the occurrence of specific patterns of nucleotides at coding regions have been proposed. Nonetheless, these methods have not been completely suitable due to their dependence on an empirically predefined window length required for a local analysis of a DNA region. We introduce a method based on a modified Gabor-wavelet transform (MGWT) for the identification of protein coding regions. This novel transform is tuned to analyze periodic signal components and presents the advantage of being independent of the window length. We compared the performance of the MGWT with other methods by using eukaryote data sets. The results show that MGWT outperforms all assessed model-independent methods with respect to identification accuracy. These results indicate that the source of at least part of the identification errors produced by the previous methods is the fixed working scale. The new method not only avoids this source of errors but also makes a tool available for detailed exploration of the nucleotide occurrence.
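The periodic signal that such methods look for is the period-3 component typical of coding regions. The sketch below is not the MGWT of this abstract, whose exact definition is not given here. It is a generic Gaussian-windowed (Gabor-style) measure of spectral power at frequency 1/3, evaluated at several scales instead of one fixed window length. The function names, the base-indicator encoding, and the default scales are all assumptions.

```python
import cmath
import math

BASES = "ACGT"


def period3_power(seq, center, scale):
    """Gaussian-windowed spectral power at frequency 1/3 around `center`.

    Each base is treated as a 0/1 indicator sequence; the power is summed
    over the four bases. `scale` is the standard deviation of the window.
    """
    seq = seq.upper()
    power = 0.0
    for base in BASES:
        acc = 0j
        for n, nt in enumerate(seq):
            if nt == base:
                window = math.exp(-((n - center) ** 2) / (2.0 * scale ** 2))
                acc += window * cmath.exp(-2j * math.pi * n / 3.0)
        power += abs(acc) ** 2
    return power


def multiscale_period3(seq, center, scales=(5, 10, 20)):
    """Evaluate the same position at several scales (hypothetical defaults)."""
    return {s: period3_power(seq, center, s) for s in scales}
```

A sequence with a strict three-base repeat gives a much larger value than one with a four-base repeat, which is the contrast a coding-region detector exploits.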

Relevance:

60.00%

Publisher:

Abstract:

The main focus of the human genome sequencing project has been gene discovery, but a great additional benefit is that it offers the chance to examine the large proportion of the genome that does not contain human genes. The nature of this ‘noncoding’ DNA is poorly understood, both as an evolutionary question (how did it get there?) and in the functional sense (what is it doing now?). Much of the noncoding DNA is derived from retroviruses that have inserted their DNA into the genome. The availability of complete genomic sequences will revolutionize studies of the number and location of endogenous retroviruses, their role in genome evolution, and their contribution to human disease.

Relevance:

60.00%

Publisher:

Abstract:

Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
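GC3, the statistic used throughout this abstract, is the G+C fraction at third-codon positions. A minimal sketch follows, assuming an in-frame coding sequence that starts at the first codon position. The function name is hypothetical, and skipping ambiguous bases is a choice made here, not something stated in the abstract.

```python
def gc3(cds):
    """Fraction of G or C among third-codon positions of an in-frame CDS.

    Assumption: `cds` starts at codon position 1; ambiguous bases
    (anything other than A, C, G, T) are skipped.
    """
    thirds = [b for b in cds.upper()[2::3] if b in "ACGT"]
    if not thirds:
        raise ValueError("no unambiguous third-codon positions")
    return sum(b in "GC" for b in thirds) / len(thirds)
```

For example, the codons ATG GCC AAA TAA have third positions G, C, A, A, so `gc3("ATGGCCAAATAA")` is 0.5.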

Relevance:

60.00%

Publisher:

Abstract:

The distribution of transposable elements (TEs) in a genome reflects a balance between insertion rate and selection against new insertions. Understanding the distribution of TEs therefore provides insights into the forces shaping the organization of genomes. Past research has shown that TEs tend to accumulate in genomic regions with low gene density and low recombination rate. However, little is known about the factors modulating insertion rates across the genome and their evolutionary significance. One candidate factor is gene expression, which has been suggested to increase local insertion rate by rendering DNA more accessible. We test this hypothesis by comparing the TE density around germline- and soma-expressed genes in the euchromatin of Drosophila melanogaster. Because only insertions that occur in the germline are transmitted to the next generation, we predicted a higher density of TEs around germline-expressed genes than soma-expressed genes. We show that the rate of TE insertions is greater near germline- than soma-expressed genes. However, this effect is partly offset by stronger selection for genome compactness (against excess noncoding DNA) on germline-expressed genes. We also demonstrate that the local genome organization in clusters of coexpressed genes plays a fundamental role in the genomic distribution of TEs. Our analysis shows that, in addition to recombination rate, the distribution of TEs is shaped by the interaction of gene expression and genome organization. The important role of selection for compactness sheds new light on the role of TEs in genome evolution. Instead of making genomes grow passively, TEs are controlled by the forces shaping genome compactness, most likely linked to the efficiency of gene expression or its complexity and possibly their interaction with mechanisms of TE silencing.

Relevance:

60.00%

Publisher:

Abstract:

The plant-beneficial bacterium Pseudomonas brassicacearum forms phenotypic variants in vitro as well as in planta during root colonization under natural conditions. Transcriptome analysis of typical phenotypic variants using microarrays containing coding as well as noncoding DNA fragments showed differential expression of several genes relevant to secondary metabolism and of the small RNA (sRNA) genes rsmX, rsmY, and rsmZ. Naturally occurring mutations in the gacS-gacA system accounted for phenotypic switching, which was characterized by downregulation of antifungal secondary metabolites (2,4-diacetylphloroglucinol and cyanide), indoleacetate, exoenzymes (lipase and protease), and three different N-acyl-homoserine lactone molecules. Moreover, in addition to abrogating these biocontrol traits, gacS and gacA mutations resulted in reduced expression of the type VI secretion machinery, alginate biosynthesis, and biofilm formation. In a gacA mutant, the expression of rsmX was completely abolished, unlike that of rsmY and rsmZ. Overexpression of any of the three sRNAs in the gacA mutant overruled the pleiotropic changes and restored the wild-type phenotypes, suggesting functional redundancy of these sRNAs. In conclusion, our data show that phenotypic switching in P. brassicacearum results from mutations in the gacS-gacA system.

Relevance:

60.00%

Publisher:

Abstract:

In eukaryotes, small RNAs (sRNAs) have key roles in development, gene expression regulation, and genome integrity maintenance. In ciliates, such as Paramecium, sRNAs form the heart of an epigenetic system that has evolved from core eukaryotic gene silencing components to selectively target DNA for deletion. In Paramecium, somatic genome development from the germline genome accurately eliminates the bulk of typically gene-interrupting, noncoding DNA. We have discovered an sRNA class (internal eliminated sequence [IES] sRNAs [iesRNAs]), arising later during Paramecium development, which originates from and precisely delineates germline DNA (IESs) and complements the initial sRNAs ("scan" RNAs [scnRNAs]) in targeting DNA for elimination. We show that whole-genome duplications have facilitated successive differentiations of Paramecium Dicer-like proteins, leading to cooperation between Dcl2 and Dcl3 to produce scnRNAs and to the production of iesRNAs by Dcl5. These innovations highlight the ability of sRNA systems to acquire capabilities, including those in genome development and integrity.

Relevance:

60.00%

Publisher:

Abstract:

Stylonychia lemnae is a classical model single-celled eukaryote, and a quintessential ciliate typified by dimorphic nuclei: a small, germline micronucleus and a massive, vegetative macronucleus. The genome within Stylonychia's macronucleus has a very unusual architecture, comprising variably and highly amplified "nanochromosomes," each usually encoding a single gene with a minimal amount of surrounding noncoding DNA. As only a tiny fraction of the Stylonychia genes has been sequenced, and to promote research using this organism, we sequenced its macronuclear genome. We report the analysis of the 50.2-Mb draft S. lemnae macronuclear genome assembly, containing in excess of 16,000 complete nanochromosomes, assembled as fewer than 20,000 contigs. We found considerable conservation of fundamental genomic properties between S. lemnae and its close relative, Oxytricha trifallax, including nanochromosomal gene synteny, alternative fragmentation, and copy number. Protein domain searches in Stylonychia revealed two new telomere-binding protein homologs and the presence of linker histones. Among the diverse histone variants of S. lemnae and O. trifallax, we found divergent, coexpressed variants corresponding to four of the five core nucleosomal proteins (H1.2, H2A.6, H2B.4, and H3.7), suggesting that these ciliates may possess specialized nucleosomes involved in genome processing during nuclear differentiation. The assembly of the S. lemnae macronuclear genome demonstrates that largely complete, well-assembled, highly fragmented genomes of similar size and complexity may be produced from one library and lane of Illumina HiSeq 2000 shotgun sequencing. The provision of the S. lemnae macronuclear genome sets the stage for future detailed experimental studies of chromatin-mediated, RNA-guided developmental genome rearrangements.

Relevance:

60.00%

Publisher:

Abstract:

Numerous island-inhabiting species of predominantly herbaceous angiosperm genera are woody shrubs or trees. Such "insular woodiness" is strongly manifested in the genus Echium, in which the continental species of circummediterranean distribution are herbaceous, whereas endemic species of islands along the Atlantic coast of north Africa are woody perennial shrubs. The history of 37 Echium species was traced with 70 kb of noncoding DNA determined from both chloroplast and nuclear genomes. In all, 239 polymorphic positions with 137 informative sites, in addition to 27 informative indels, were found. Island-dwelling Echium species are shown to descend from herbaceous continental ancestors via a single island colonization event that occurred < 20 million years ago. Founding colonization appears to have taken place on the Canary Islands, from which the Madeira and Cape Verde archipelagos were invaded. Colonization of island habitats correlates with a recent origin of perennial woodiness from herbaceous habit and was furthermore accompanied by intense speciation, which brought forth remarkable diversity of forms among contemporary island endemics. We argue that the origin of insular woodiness involved response to counter-selection of inbreeding depression in founding island colonies.

Relevance:

60.00%

Publisher:

Abstract:

Cross-species comparative genomics is a powerful strategy for identifying functional regulatory elements within noncoding DNA. In this paper, comparative analysis of human and mouse intronic sequences in the breast cancer susceptibility gene (BRCA1) revealed two evolutionarily conserved noncoding sequences (CNS) in intron 2, 5 kb downstream of the core BRCA1 promoter. The functionality of these elements was examined using homologous-recombination-based mutagenesis of reporter gene-tagged cosmids incorporating these regions and flanking sequences from the BRCA1 locus. This showed that CNS-1 and CNS-2 have differential transcriptional regulatory activity in epithelial cell lines. Mutation of CNS-1 significantly reduced reporter gene expression to 30% of control levels. Conversely, mutation of CNS-2 increased expression to 200% of control levels. Regulation is at the level of transcription and shows promoter specificity. Both elements also specifically bind nuclear proteins in vitro. These studies demonstrate that the combination of comparative genomics and functional analysis is a successful strategy to identify novel regulatory elements and provide the first direct evidence that conserved noncoding sequences in BRCA1 regulate gene expression. (c) 2005 Elsevier Inc. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

DNA sequence variation is currently a major source of data for studying human origins, evolution, and demographic history, and for detecting linkage association of complex diseases. In this dissertation, I investigated DNA variation in worldwide populations from two ∼10 kb autosomal regions on 22q11.2 (noncoding) and 1q24 (introns). A total of 75 variant sites were found among 128 human sequences in the 22q11.2 region, yielding an estimate of 0.088% for nucleotide diversity (π), and a total of 52 variant sites were found among 122 human sequences in the 1q24 region with an estimated π value of 0.057%. The data from these two regions and a 10 kb noncoding region on Xq13.3 all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The effective population sizes estimated from the three regions were 11,000, 12,700, and 8,600, respectively, which are close to the commonly used value of 10,000. In each of the two autosomal regions, the age of the most recent common ancestor (MRCA) was estimated to be older than 1 million years among all the sequences and ∼600,000 years among non-African sequences, providing first evidence from autosomal noncoding or intronic regions for a genetic history of humans much more ancient than the emergence of modern humans. The ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck. This study strongly suggests that both the “out of Africa” and the multiregional models are too simple for explaining the evolution of modern humans. A compilation of genome-wide data revealed that nucleotide diversity is highest in autosomal regions, intermediate in X-linked regions, and lowest in Y-linked regions. 
The data suggest the existence of background selection or selective sweep on Y-linked loci. In general, the nucleotide diversity in humans is low compared to that in chimpanzee and Drosophila populations.
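Nucleotide diversity (π), reported in this abstract as 0.088% and 0.057%, is the average number of pairwise differences per site among the sampled sequences. The sketch below assumes gap-free aligned sequences of equal length. The function name is hypothetical, and the effective population size estimates are not reproduced because they depend on a mutation rate the abstract does not give.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences.

    Assumption: all sequences have the same length and contain no gaps.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    return total_diffs / (n_pairs * length)
```

For three sequences AAAA, AAAT and AATT, the pairwise differences are 1, 2 and 1, so π = 4 / (3 × 4) ≈ 0.333.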

Relevance:

40.00%

Publisher:

Abstract:

We have analyzed the level of intraindividual sequence variability (heteroplasmy) of mtDNA in human brain by denaturing gradient gel electrophoresis and sequencing. Single base substitutions, as well as insertions or deletions of single bases, were numerous in the noncoding control region (D-loop), and 35-45% of the molecules from a single tissue showed sequence differences. By contrast, heteroplasmy in coding regions was not detected. The lower level of heteroplasmy in the coding regions is indicative of selection against deleterious mutations. Similar levels of heteroplasmy were found in two brain regions from the same individual, while no heteroplasmy was detected in blood. Thus, heteroplasmy seems to be more frequent in nonmitotic tissues. We observed a 7.7-fold increase in the frequency of deletions/insertions and a 2.2-fold increase in the overall frequency of heteroplasmic mutations in two individuals aged 96 and 99, relative to an individual aged 28. Our results show that intraindividual sequence variability occurs at a high frequency in the noncoding regions of normal human brain and indicate that small insertions and deletions might accumulate with age at a lower rate than large rearrangements.

Relevance:

30.00%

Publisher:

Abstract:

Transcripts that lack any protein-coding potential represent at least half of the identified elements of the transcriptome. We review the evidence for the existence of such transcripts in the mammalian transcriptome, and argue that there may be many more noncoding RNAs (ncRNAs) still to be discovered. Relatively few ncRNA “genes” have been ascribed a function based upon mutation analysis. The review discusses possible roles of ncRNAs as cis-acting and trans-acting elements in epigenetic transcriptional control, including monoallelic gene silencing and imprinting. We also consider the evidence that the production of ncRNAs is a common feature of transcriptional enhancers.

Relevance:

30.00%

Publisher:

Abstract:

Eukaryotic phenotypic diversity arises from multitasking of a core proteome of limited size. Multitasking is routine in computers, as well as in other sophisticated information systems, and requires multiple inputs and outputs to control and integrate network activity. Higher eukaryotes have a mosaic gene structure with a dual output, mRNA (protein-coding) sequences and introns, which are released from the pre-mRNA by posttranscriptional processing. Introns have been enormously successful as a class of sequences and comprise up to 95% of the primary transcripts of protein-coding genes in mammals. In addition, many other transcripts (perhaps more than half) do not encode proteins at all, but appear both to be developmentally regulated and to have genetic function. We suggest that these RNAs (eRNAs) have evolved to function as endogenous network control molecules which enable direct gene-gene communication and multitasking of eukaryotic genomes. Analysis of a range of complex genetic phenomena in which RNA is involved or implicated, including co-suppression, transgene silencing, RNA interference, imprinting, methylation, and transvection, suggests that a higher-order regulatory system based on RNA signals operates in the higher eukaryotes and involves chromatin remodeling as well as other RNA-DNA, RNA-RNA, and RNA-protein interactions. The evolution of densely connected gene networks would be expected to result in a relatively stable core proteome due to the multiple reuse of components, implying that cellular differentiation and phenotypic variation in the higher eukaryotes result primarily from variation in the control architecture.
Thus, network integration and multitasking using trans-acting RNA molecules produced in parallel with protein-coding sequences may underpin both the evolution of developmentally sophisticated multicellular organisms and the rapid expansion of phenotypic complexity into uncontested environments such as those initiated in the Cambrian radiation and those seen after major extinction events.