Digital Library

86 results for EUKARYOTIC GENOMES

in National Center for Biotechnology Information - NCBI

Compositional differences within and between eukaryotic genomes

Relevance:

100.00%

Abstract:

Eukaryotic genome similarity relationships are inferred using sequence information derived from large aggregates of genomic sequences. Comparisons within and between species sample sequences are based on the profile of dinucleotide relative abundance values (the profile is ρ*_XY = f*_XY / (f*_X f*_Y) for all XY, where f*_X denotes the frequency of the nucleotide X and f*_XY denotes the frequency of the dinucleotide XY, both computed from the sequence concatenated with its inverted complement). Previous studies with respect to prokaryotes and this study document that profiles of different DNA sequence samples (sample size ≥50 kb) from the same organism are generally much more similar to each other than they are to profiles from other organisms, and that closely related organisms generally have more similar profiles than do distantly related organisms. On this basis we refer to the collection {ρ*_XY} as the genome signature. This paper identifies ρ*_XY extremes and compares genome signature differences for a diverse range of eukaryotic species. Interpretations on the mechanisms maintaining these profile differences center on genome-wide replication, repair, DNA structures, and context-dependent mutational biases. It is also observed that mitochondrial genome signature differences between species parallel the corresponding nuclear genome signature differences despite large differences between corresponding mitochondrial and nuclear signatures. The genome signature differences also have implications for contrasts between rodents and other mammals, and between monocot and dicot plants, as well as providing evidence for similarities among fungi and the diversity of protists.
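
The dinucleotide relative abundance profile defined in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software. It pools counts from the sequence and its inverted complement strand by strand (so the artificial junction dinucleotide created by literal concatenation is not counted, which is an assumption of this sketch). The function names and the mean-absolute-difference distance are also assumptions, since the abstract does not say how two profiles are compared.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the inverted complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def genome_signature(seq: str) -> dict:
    """Return rho*_XY for all 16 dinucleotides, pooled over both strands."""
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(strand)
        di.update(strand[i:i + 2] for i in range(len(strand) - 1))
    # Only unambiguous bases contribute to the frequencies.
    n_mono = sum(mono[b] for b in BASES)
    n_di = sum(di[x + y] for x, y in product(BASES, repeat=2))
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence too short or contains no A/C/G/T")
    signature = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono)
        observed = di[x + y] / n_di
        signature[x + y] = observed / expected if expected else float("nan")
    return signature


def signature_distance(a: dict, b: dict) -> float:
    """Mean absolute difference between two signatures (assumed metric)."""
    return sum(abs(a[k] - b[k]) for k in a) / len(a)
```

Because both strands are pooled, each dinucleotide and its inverted complement (for example AC and GT) receive the same value, so the 16 entries contain only 10 distinct quantities.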

Characteristic enrichment of DNA repeats in different genomes

Relevance:

70.00%

Abstract:

Using computer programs developed for this purpose, we searched for various repeated sequences including inverted, direct tandem, and homopurine–homopyrimidine mirror repeats in various prokaryotes, eukaryotes, and an archaebacterium. Comparison of observed frequencies with expectations revealed that in bacterial genomes and organelles the frequency of different repeats is either random or enriched for inverted and/or direct tandem repeats. By contrast, in all eukaryotic genomes studied, we observed an overrepresentation of all repeats, especially homopurine–homopyrimidine mirror repeats. Analysis of the genomic distribution of all abundant repeats showed that they are virtually excluded from coding sequences. Unexpectedly, the frequencies of abundant repeats normalized for their expectations were almost perfect exponential functions of their size, and for a given repeat this function was indistinguishable between different genomes.
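
The three repeat classes named in this abstract can be made concrete with a short sketch. It is not the authors' program: it only recognizes perfect repeats with no spacer, and the function names and the fixed-size sliding window are assumptions chosen for illustration. A direct tandem repeat is a unit followed by itself, an inverted repeat equals its own inverted complement, and a mirror repeat reads the same in both directions on one strand.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_direct_tandem(s: str) -> bool:
    """True if s is a unit immediately followed by an identical copy."""
    half = len(s) // 2
    return len(s) > 0 and len(s) % 2 == 0 and s[:half] == s[half:]


def is_inverted(s: str) -> bool:
    """True if s equals its inverted complement (e.g. GAATTC)."""
    return len(s) > 0 and len(s) % 2 == 0 and s == s.translate(COMPLEMENT)[::-1]


def is_mirror(s: str) -> bool:
    """True if s reads the same forward and backward on one strand."""
    return len(s) > 1 and s == s[::-1]


def is_hpu_hpy_mirror(s: str) -> bool:
    """True if s is a mirror repeat made only of purines or only of pyrimidines."""
    return is_mirror(s) and (set(s) <= PURINES or set(s) <= PYRIMIDINES)


def count_repeats(seq: str, size: int) -> dict:
    """Count windows of a given size that fall into each repeat class."""
    seq = seq.upper()
    tests = {
        "direct_tandem": is_direct_tandem,
        "inverted": is_inverted,
        "mirror": is_mirror,
        "hpu_hpy_mirror": is_hpu_hpy_mirror,
    }
    counts = dict.fromkeys(tests, 0)
    for i in range(len(seq) - size + 1):
        window = seq[i:i + size]
        for name, test in tests.items():
            counts[name] += test(window)
    return counts
```

Comparing such observed counts with an expectation (for instance from shuffled sequence of the same composition) would be the next step; that part is left out here because the abstract does not describe the null model used.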

Proteome Analysis Database: online application of InterPro and CluSTr for the functional classification of proteins in whole genomes

Relevance:

70.00%

Abstract:

The SWISS-PROT group at EBI has developed the Proteome Analysis Database utilising existing resources and providing comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes (http://www.ebi.ac.uk/proteome/). The two main projects used, InterPro and CluSTr, give a new perspective on families, domains and sites and cover 31–67% (InterPro statistics) of the proteins from each of the complete genomes. CluSTr covers the three complete eukaryotic genomes and the incomplete human genome data. The Proteome Analysis Database is accompanied by a program that has been designed to carry out InterPro proteome comparisons for any one proteome against any other one or more of the proteomes in the database.

Polymorphic simple sequence repeat regions in chloroplast genomes: applications to the population genetics of pines.

Relevance:

70.00%

Abstract:

Simple sequence repeats (SSRs), consisting of tandemly repeated multiple copies of mono-, di-, tri-, or tetranucleotide motifs, are ubiquitous in eukaryotic genomes and are frequently used as genetic markers, taking advantage of their length polymorphism. We have examined the polymorphism of such sequences in the chloroplast genomes of plants, by using a PCR-based assay. GenBank searches identified the presence of several (dA)n.(dT)n mononucleotide stretches in chloroplast genomes. A chloroplast (cp) SSR was identified in three pine species (Pinus contorta, Pinus sylvestris, and Pinus thunbergii) 312 bp upstream of the psbA gene. DNA amplification of this repeated region from 11 pine species identified nine length variants. The polymorphic amplified fragments were isolated and the DNA sequence was determined, confirming that the length polymorphism was caused by variation in the length of the repeated region. In the pines, the chloroplast genome is transmitted through pollen and this PCR assay may be used to monitor gene flow in this genus. Analysis of 305 individuals from seven populations of Pinus leucodermis Ant. revealed the presence of four variants with intrapopulational diversities ranging from 0.000 to 0.629 and an average of 0.320. Restriction fragment length polymorphism analysis of cpDNA on the same populations previously failed to detect any variation. Population subdivision based on cpSSR was higher (Gst = 0.22, where Gst is coefficient of gene differentiation) than that revealed in a previous isozyme study (Gst = 0.05). We anticipate that SSR loci within the chloroplast genome should provide a highly informative assay for the analysis of the genetic structure of plant populations.
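
The intrapopulational diversities and the Gst value quoted in this abstract are statistics of the kind sketched below. This is a minimal illustration assuming the simple forms H = 1 − Σ p_i² and Gst = (Ht − Hs) / Ht with unweighted population means; the study may have used sample-size-corrected estimators, so the sketch is not expected to reproduce the published numbers. Function names and any input data are hypothetical.

```python
from collections import Counter


def allele_frequencies(haplotypes: list) -> dict:
    """Relative frequency of each length variant in one population sample."""
    n = len(haplotypes)
    return {h: c / n for h, c in Counter(haplotypes).items()}


def gene_diversity(freqs: dict) -> float:
    """H = 1 - sum(p_i^2), the probability that two random copies differ."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(populations: list) -> float:
    """Coefficient of gene differentiation, (Ht - Hs) / Ht."""
    freqs = [allele_frequencies(p) for p in populations]
    # Hs: mean within-population diversity (unweighted, an assumption).
    hs = sum(gene_diversity(f) for f in freqs) / len(freqs)
    # Ht: diversity of the pooled population, from mean variant frequencies.
    variants = set().union(*freqs)
    mean_freqs = {v: sum(f.get(v, 0.0) for f in freqs) / len(freqs) for v in variants}
    ht = gene_diversity(mean_freqs)
    return (ht - hs) / ht if ht > 0 else 0.0
```

With two populations fixed for different variants, `gst` returns 1.0 (complete subdivision); with identical variant frequencies in every population it returns 0.0.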

Three novel families of miniature inverted-repeat transposable elements are associated with genes of the yellow fever mosquito, Aedes aegypti

Relevance:

60.00%

Abstract:

Three novel families of transposable elements, Wukong, Wujin, and Wuneng, are described in the yellow fever mosquito, Aedes aegypti. Their copy numbers range from 2,100 to 3,000 per haploid genome. There are high degrees of sequence similarity within each family, and many structural but not sequence similarities between families. The common structural characteristics include small size, no coding potential, terminal inverted repeats, potential to form a stable secondary structure, A+T richness, and putative 2- to 4-bp A+T-biased specific target sites. Evidence of previous mobility is presented for the Wukong elements. Elements of these three families are associated with 7 of 16 fully or partially sequenced Ae. aegypti genes. Characteristics of these mosquito elements indicate strong similarities to the miniature inverted-repeat transposable elements (MITEs) recently found to be associated with plant genes. MITE-like elements have also been reported in two species of Xenopus and in Homo sapiens. This characterization of multiple families of highly repetitive MITE-like elements in an invertebrate extends the range of these elements in eukaryotic genomes. A hypothesis is presented relating genome size and organization to the presence of highly reiterated MITE families. The association of MITE-like elements with Ae. aegypti genes shows the same bias toward noncoding regions as in plants. This association has potentially important implications for the evolution of gene regulation.

Molecular origin of the mosaic sequence arrangements of higher primate α-globin duplication units

Relevance:

60.00%

Abstract:

The human adult α-globin locus consists of three pairs of homology blocks (X, Y, and Z) interspersed with three nonhomology blocks (I, II, and III), and three Alu family repeats, Alu1, Alu2, and Alu3. It has been suggested that an ancient primate α-globin-containing unit was ancestral to the X, Y, and Z and the Alu1/Alu2 repeats. However, the evolutionary origin of the three nonhomologous blocks has remained obscure. We have now analyzed the sequence organization of the entire adult α-globin locus of gibbon (Hylobates lar). DNA segments homologous to human block I occur in both duplication units of the gibbon α-globin locus. Detailed interspecies sequence comparisons suggest that nonhomologous blocks I and II, as well as another sequence, IV, were all part of the ancestral α-globin-containing unit prior to its tandem duplication. However, sometime thereafter, block I was deleted from the human α1-globin-containing unit, and block II was also deleted from the α2-globin-containing unit in both human and gibbon. These were probably independent events both mediated by independent illegitimate recombination processes. Interestingly, the end points of these deletions coincide with potential insertion sites of Alu family repeats. These results suggest that the shaping of DNA segments in eukaryotic genomes involved the retroposition of repetitive DNA elements in conjunction with simple DNA recombination processes.

Generation of longer cDNA fragments from serial analysis of gene expression tags for gene identification

Relevance:

60.00%

Abstract:

We have developed a technique called the generation of longer cDNA fragments from serial analysis of gene expression (SAGE) tags for gene identification (GLGI), to convert SAGE tags of 10 bases into their corresponding 3′ cDNA fragments covering hundreds of bases. A primer containing the 10-base SAGE tag is used as the sense primer, and a single base anchored oligo(dT) primer is used as an antisense primer in PCR, together with Pfu DNA polymerase. By using this approach, a cDNA fragment extending from the SAGE tag toward the 3′ end of the corresponding sequence can be generated. Application of the GLGI technique can solve two critical issues in applying the SAGE technique: one is that a longer fragment corresponding to a SAGE tag, which has no match in databases, can be generated for further studies; the other is that the specific fragment corresponding to a SAGE tag can be identified from multiple sequences that match the same SAGE tag. The development of the GLGI method provides several potential applications. First, it provides a strategy for even wider application of the SAGE technique for quantitative analysis of global gene expression. Second, a combined application of SAGE/GLGI can be used to complete the catalogue of the expressed genes in human and in other eukaryotic species. Third, it can be used to identify the 3′ cDNA sequence from any exon within a gene. It can also be used to confirm the reality of exons predicted by bioinformatic tools in genomic sequences. Fourth, a combined application of SAGE/GLGI can be applied to define the 3′ boundary of expressed genes in the genomic sequences in human and in other eukaryotic genomes.
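
The primer logic described in this abstract can be imitated in silico. The sketch below is only an illustration under stated assumptions: the cDNA is given as a sense-strand string ending in its poly(A) tail, the function names are hypothetical, and the single anchored base of the oligo(dT) primer is modelled as selecting templates by the sense-strand base immediately 5′ of the poly(A) tail. It is not a description of the authors' laboratory protocol.

```python
from typing import Optional


def last_base_before_poly_a(cdna: str) -> str:
    """Return the sense-strand base immediately 5' of the poly(A) tail."""
    body = cdna.upper().rstrip("A")
    return body[-1] if body else ""


def glgi_fragment(cdna: str, tag: str, anchor: str) -> Optional[str]:
    """Return the fragment from the SAGE tag to the 3' end, or None.

    `anchor` is the sense-strand base next to the poly(A) tail that the
    anchored antisense primer is assumed to select for (modelling choice).
    """
    cdna, tag = cdna.upper(), tag.upper()
    start = cdna.rfind(tag)  # use the 3'-most occurrence of the tag
    if start == -1 or last_base_before_poly_a(cdna) != anchor.upper():
        return None
    return cdna[start:]


def resolve_tag(candidates: dict, tag: str, anchor: str) -> dict:
    """Among sequences sharing a tag, keep those amplified with this anchor."""
    hits = {name: glgi_fragment(seq, tag, anchor) for name, seq in candidates.items()}
    return {name: frag for name, frag in hits.items() if frag is not None}
```

`resolve_tag` mirrors the second issue raised in the abstract: when several database sequences match the same 10-base tag, the longer fragment (here, together with the anchored base) narrows the candidates.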

Structure-based assignment of the biochemical function of a hypothetical protein: A test case of structural genomics

Relevance:

60.00%

Abstract:

Many small bacterial, archaebacterial, and eukaryotic genomes have been sequenced, and the larger eukaryotic genomes are predicted to be completely sequenced within the next decade. In all genomes sequenced to date, a large portion of these organisms’ predicted protein coding regions encode polypeptides of unknown biochemical, biophysical, and/or cellular functions. Three-dimensional structures of these proteins may suggest biochemical or biophysical functions. Here we report the crystal structure of one such protein, MJ0577, from a hyperthermophile, Methanococcus jannaschii, at 1.7-Å resolution. The structure contains a bound ATP, suggesting MJ0577 is an ATPase or an ATP-mediated molecular switch, which we confirm by biochemical experiments. Furthermore, the structure reveals different ATP binding motifs that are shared among many homologous hypothetical proteins in this family. This result indicates that structure-based assignment of molecular function is a viable approach for the large-scale biochemical assignment of proteins and for discovering new motifs, a basic premise of structural genomics.

ACTIVITY: a database on DNA/RNA sites activity adapted to apply sequence-activity relationships from one system to another

Relevance:

60.00%

Abstract:

ACTIVITY is a database on DNA/RNA site sequences with known activity magnitudes, measurement systems, sequence-activity relationships under fixed experimental conditions and procedures to adapt these relationships from one measurement system to another. This database deposits information on DNA/RNA affinities to proteins and cell nuclear extracts, cutting efficiencies, gene transcription activity, mRNA translation efficiencies, mutability and other biological activities of natural sites occurring within promoters, mRNA leaders, and other regulatory regions in pro- and eukaryotic genomes, their mutant forms and synthetic analogues. Since activity magnitudes are heavily system-dependent, the current version of ACTIVITY is supplemented by three novel sub-databases: (i) SYSTEM, measurement systems; (ii) KNOWLEDGE, sequence-activity relationships under fixed experimental conditions; and (iii) CROSS_TEST, procedures adapting a relationship from one measurement system to another. These databases are useful in molecular biology, pharmacogenetics, metabolic engineering, drug design and biotechnology. The databases can be queried using SRS and are available through the Web, http://wwwmgs.bionet.nsc.ru/systems/Activity/.

Complex mtDNA constitutes an approximate 620-kb insertion on Arabidopsis thaliana chromosome 2: Implication of potential sequencing errors caused by large-unit repeats

Relevance:

60.00%

Abstract:

Previously conducted sequence analysis of Arabidopsis thaliana (ecotype Columbia-0) reported an insertion of 270-kb mtDNA into the pericentric region on the short arm of chromosome 2. DNA fiber-based fluorescence in situ hybridization analyses reveal that the mtDNA insert is 618 ± 42 kb, ≈2.3 times greater than that determined by contig assembly and sequencing analysis. Portions of the mitochondrial genome previously believed to be absent were identified within the insert. Sections of the mtDNA are repeated throughout the insert. The cytological data illustrate that DNA contig assembly by using bacterial artificial chromosomes tends to produce a minimal clone path by skipping over duplicated regions, thereby resulting in sequencing errors. We demonstrate that fiber-fluorescence in situ hybridization is a powerful technique to analyze large repetitive regions in the higher eukaryotic genomes and is a valuable complement to ongoing large genome sequencing projects.
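
The "≈2.3 times" figure in this abstract follows directly from the two sizes it reports. The short check below reproduces it; treating ± 42 kb as a symmetric bound on the insert size is an assumption of this sketch (the abstract does not say whether it is a standard deviation or another measure).

```python
sequenced_kb = 270                # insert size from contig assembly
fish_kb, fish_err_kb = 618, 42    # insert size from fiber-FISH, with reported +/- value

ratio = fish_kb / sequenced_kb
low = (fish_kb - fish_err_kb) / sequenced_kb
high = (fish_kb + fish_err_kb) / sequenced_kb

print(f"{ratio:.2f} ({low:.2f} to {high:.2f})")  # prints "2.29 (2.13 to 2.44)"
```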

Identification and characterization of a lysosomal transporter for small neutral amino acids

Relevance:

60.00%

Abstract:

In eukaryotic cells, lysosomes represent a major site for macromolecule degradation. Hydrolysis products are eventually exported from this acidic organelle into the cytosol through specific transporters. Impairment of this process at either the hydrolysis or the efflux step is responsible for several lysosomal storage diseases. However, most lysosomal transporters, although biochemically characterized, remain unknown at the molecular level. In this study, we report the molecular and functional characterization of a lysosomal amino acid transporter (LYAAT-1), remotely related to a family of H+-coupled plasma membrane and synaptic vesicle amino acid transporters. LYAAT-1 is expressed in most rat tissues, with highest levels in the brain where it is present in neurons. Upon overexpression in COS-7 cells, the recombinant protein mediates the accumulation of neutral amino acids, such as γ-aminobutyric acid, l-alanine, and l-proline, through an H+/amino acid symport. Confocal microscopy on brain sections revealed that this transporter colocalizes with cathepsin D, an established lysosomal marker. LYAAT-1 thus appears as a lysosomal transporter that actively exports neutral amino acids from lysosomes by chemiosmotic coupling to the H+-ATPase of these organelles. Homology searching in eukaryotic genomes suggests that LYAAT-1 defines a subgroup of lysosomal transporters in the amino acid/auxin permease family.

The physical and genomic organization of microsatellites in sugar beet.

Relevance:

60.00%

Abstract:

Microsatellites, tandem arrays of short (2-5 bp) nucleotide motifs, are present in high numbers in most eukaryotic genomes. We have characterized the physical distribution of microsatellites on chromosomes of sugar beet (Beta vulgaris L.). Each microsatellite sequence shows a characteristic genomic distribution and motif-dependent dispersion, with site-specific amplification on one to seven pairs of centromeres or intercalary chromosomal regions and weaker, dispersed hybridization along chromosomes. Exclusion of some microsatellites from 18S-5.8S-25S rRNA gene sites, centromeres, and intercalary sites was observed. In-gel and in situ hybridization patterns are correlated, with highly repeated restriction fragments indicating major centromeric sites of microsatellite arrays. The results have implications for genome evolution and the suitability of particular microsatellite markers for genetic mapping and genome analysis.

Mutations in the MSH3 gene preferentially lead to deletions within tracts of simple repetitive DNA in Saccharomyces cerevisiae.

Relevance:

60.00%

Abstract:

Eukaryotic genomes contain tracts of DNA in which a single base or a small number of bases are repeated (microsatellites). Mutations in the yeast DNA mismatch repair genes MSH2, PMS1, and MLH1 increase the frequency of mutations for normal DNA sequences and destabilize microsatellites. Mutations of human homologs of MSH2, PMS1, and MLH1 also cause microsatellite instability and result in certain types of cancer. We find that a mutation in the yeast gene MSH3 that does not substantially affect the rate of spontaneous mutations at several loci increases microsatellite instability about 40-fold, preferentially causing deletions. We suggest that MSH3 has different substrate specificities than the other mismatch repair proteins and that the human MSH3 homolog (MRP1) may be mutated in some tumors with microsatellite instability.
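
The tracts described in the first sentence of this abstract (a single base or a small number of bases repeated in tandem) can be located with a regular expression. The sketch below is a minimal illustration; the thresholds `max_unit` and `min_repeats` and the function name are assumptions, not values taken from the study, and only perfect repeats are detected.

```python
import re


def find_microsatellites(seq: str, max_unit: int = 4, min_repeats: int = 5) -> list:
    """Return (start, unit, copies) for each perfect tandem-repeat tract.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so AAAAAA is reported as unit "A" rather than unit "AA".
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        tracts.append((match.start(), unit, copies))
    return tracts
```

Comparing the copy number of a tract before and after propagation in a given genetic background would show whether changes are mainly deletions or insertions, which is the kind of observation the abstract reports for MSH3 mutants.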

Identification of thermophilic species by the amino acid compositions deduced from their genomes

Relevance:

30.00%

Abstract:

The global amino acid compositions as deduced from the complete genomic sequences of six thermophilic archaea, two thermophilic bacteria, 17 mesophilic bacteria and two eukaryotic species were analysed by hierarchical clustering and principal components analysis. Both methods showed an influence of several factors on amino acid composition. Although GC content has a dominant effect, thermophilic species can be identified by their global amino acid compositions alone. This study presents a careful statistical analysis of factors that affect amino acid composition and also yielded specific features of the average amino acid composition of thermophilic species. Moreover, we introduce the first example of a ‘compositional tree’ of species that takes into account not only homologous proteins, but also proteins unique to particular species. We expect this simple yet novel approach to be a useful additional tool for the study of phylogeny at the genome level.
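
The global amino acid composition analysed in this abstract is a 20-component frequency vector per genome. The sketch below shows one way to compute it and to compare two such vectors. The function names and the Euclidean distance are assumptions for illustration; the abstract names hierarchical clustering and principal components analysis but does not specify the distance measure, and those methods would take vectors like these as input.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def global_composition(proteins: list) -> list:
    """Fraction of each of the 20 standard amino acids over all proteins."""
    counts = Counter()
    for protein in proteins:
        counts.update(protein.upper())
    # Ambiguity codes and stop symbols are ignored in the total.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(u: list, v: list) -> float:
    """Euclidean distance between two composition vectors (assumed metric)."""
    return math.dist(u, v)
```

A pairwise matrix of `composition_distance` values over all genomes is the natural input for building a compositional tree of the kind the abstract describes.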

A minimal gene set for cellular life derived by comparison of complete bacterial genomes.

Relevance:

30.00%

Abstract:

The recently sequenced genome of the parasitic bacterium Mycoplasma genitalium contains only 468 identified protein-coding genes that have been dubbed a minimal gene complement [Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., et al. (1995) Science 270, 397-403]. Although the M. genitalium gene complement is indeed the smallest among known cellular life forms, there is no evidence that it is the minimal self-sufficient gene set. To derive such a set, we compared the 468 predicted M. genitalium protein sequences with the 1703 protein sequences encoded by the other completely sequenced small bacterial genome, that of Haemophilus influenzae. M. genitalium and H. influenzae belong to two ancient bacterial lineages, i.e., Gram-positive and Gram-negative bacteria, respectively. Therefore, the genes that are conserved in these two bacteria are almost certainly essential for cellular function. It is this category of genes that is most likely to approximate the minimal gene set. We found that 240 M. genitalium genes have orthologs among the genes of H. influenzae. This collection of genes falls short of comprising the minimal set as some enzymes responsible for intermediate steps in essential pathways are missing. The apparent reason for this is the phenomenon that we call nonorthologous gene displacement when the same function is fulfilled by nonorthologous proteins in two organisms. We identified 22 nonorthologous displacements and supplemented the set of orthologs with the respective M. genitalium genes. After examining the resulting list of 262 genes for possible functional redundancy and for the presence of apparently parasite-specific genes, 6 genes were removed. We suggest that the remaining 256 genes are close to the minimal gene set that is necessary and sufficient to sustain the existence of a modern-type cell. 
Most of the proteins encoded by the genes from the minimal set have eukaryotic or archaeal homologs but seven key proteins of DNA replication do not. We speculate that the last common ancestor of the three primary kingdoms had an RNA genome. Possibilities are explored to further reduce the minimal set to model a primitive cell that might have existed at a very early stage of life evolution.
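
The gene counts in this abstract follow a simple set construction: orthologs, plus nonorthologous displacements, minus genes judged redundant or parasite-specific. The sketch below replays that bookkeeping with placeholder identifiers. The gene names are hypothetical and which 6 genes are removed is arbitrary here; only the counts (240, 22, 6) come from the abstract.

```python
# Placeholder identifiers; only the set sizes are taken from the abstract.
orthologs = {f"ortholog_{i}" for i in range(240)}
displacements = {f"displacement_{i}" for i in range(22)}

# Orthologs supplemented with nonorthologous gene displacements.
candidate_set = orthologs | displacements

# Genes removed as functionally redundant or apparently parasite-specific
# (an arbitrary choice of six placeholders for illustration).
removed = {f"ortholog_{i}" for i in range(6)}
minimal_set = candidate_set - removed

print(len(candidate_set), len(minimal_set))  # prints "262 256"
```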
