969 results for coding sequence


Relevance:

30.00% 30.00%

Publisher:

Abstract:

A 9.9 kb DNA fragment from the right arm of chromosome VII of Saccharomyces cerevisiae has been sequenced and analysed. The sequence contains four open reading frames (ORFs) longer than 100 amino acids. One gene, PFK1, has already been cloned and sequenced, and another is the probable yeast gene coding for the beta-subunit of succinyl-CoA synthetase. The two remaining ORFs share homology with the deduced amino acid sequences of the YHR161c and YHR162w ORFs from chromosome VIII, and their physical arrangement is similar.
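
The ORF criterion mentioned in this abstract (open reading frames longer than 100 amino acids) can be illustrated with a minimal Python sketch. This is not the authors' method: the function names, the ATG-to-stop definition of an ORF, and the standard stop codons are assumptions made for illustration only.

```python
# Minimal six-frame ORF scan (illustrative sketch, not the authors' pipeline).
# Assumption: an ORF runs from an ATG to the first in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, min_aa=100):
    """Return (strand, frame, start, end, n_codons) for each ORF with more
    than min_aa codons (stop codon excluded from the count).
    Coordinates are 0-based and refer to the scanned strand, so minus-strand
    hits are given relative to the reverse complement."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons > min_aa:
                        orfs.append((strand, frame, start, i + 3, n_codons))
                    start = None  # look for the next ORF in this frame
    return orfs
```

For example, `find_orfs("ATG" + "GCT" * 150 + "TAA")` reports a single plus-strand ORF of 151 codons, while the same construct with only 50 internal codons falls below the threshold.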

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We have isolated a clone of Trypanosoma cruzi genomic DNA, lambda 3b2-5, which contains sequences that are reiterated in the genome. Northern blot analysis showed that clone 3b2-5 hybridizes to different mRNA species ranging from 1,200 to 5,000 bases. The number of mRNA species hybridizing to clone 3b2-5 exceeds its coding capacity, showing that this clone carries sequences that are common to several mRNA species and conserved in the poly A(+) RNA. These sequences are not homologous to the T. cruzi spliced leader sequence, since clone 3b2-5 does not hybridize to a synthetic 20-nucleotide oligonucleotide complementary to the spliced leader sequence. Clone 3b2-5 does not hybridize to DNA and RNA from several genera of Trypanosomatidae and other Trypanosoma species, indicating that it carries T. cruzi species-specific sequences.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

BACKGROUND: The comparison of complete genomes has revealed surprisingly large numbers of conserved non-protein-coding (CNC) DNA regions. However, the biological function of CNC remains elusive. CNC differ in two aspects from conserved protein-coding regions. They are not conserved across phylum boundaries, and they do not contain readily detectable sub-domains. Here we characterize the persistence length and time of CNC and conserved protein-coding regions in the vertebrate and insect lineages. RESULTS: The persistence length is the length of a genome region over which a certain level of sequence identity is consistently maintained. The persistence time is the evolutionary period during which a conserved region evolves under the same selective constraints. Our main findings are: (i) Insect genomes contain 1.60 times less conserved information than vertebrates; (ii) Vertebrate CNC have a higher persistence length than conserved coding regions or insect CNC; (iii) CNC have shorter persistence times as compared to conserved coding regions in both lineages. CONCLUSION: Higher persistence length of vertebrate CNC indicates that the conserved information in vertebrates and insects is organized in functional elements of different lengths. These findings might be related to the higher morphological complexity of vertebrates and give clues about the structure of active CNC elements. Shorter persistence time might explain the previously puzzling observations of highly conserved CNC within each phylum, and of a lack of conservation between phyla. It suggests that CNC divergence might be a key factor in vertebrate evolution. Further evolutionary studies will help to relate individual CNC to specific developmental processes.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A total of 880 expressed sequence tags (ESTs) originating from clones randomly selected from a Trypanosoma cruzi amastigote cDNA library have been analyzed. Of these, 40% (355 ESTs) have been identified by similarity to sequences in public databases and classified according to functional categorization of their putative products. About 11% of the mRNAs expressed in amastigotes are related to the translational machinery, and a large number of them (9% of the total number of clones in the library) encode ribosomal proteins. A comparative analysis with a previous study, where clones from the same library were selected using sera from patients with Chagas disease, revealed that ribosomal proteins also represent the largest class of antigen-coding genes expressed in amastigotes (54% of all immunoselected clones). However, although more than thirty classes of ribosomal proteins were identified by EST analysis, the results of the immunoscreening indicated that only a particular subset of them contains major antigenic determinants recognized by antibodies from Chagas disease patients.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In Xenopus laevis, four estrogen-responsive genes are expressed simultaneously to produce vitellogenin, the precursor of the yolk proteins. One of these four genes, the gene A2, was sequenced completely, as well as cDNAs representing 75% of the coding region of the gene. From these data the exon-intron structure of the gene was established, revealing 35 exons that give a transcript of 5,619 bp without the poly A-tail. This A2 transcript encodes a vitellogenin of 1,807 amino acids, whose structure is discussed with respect to its function. At the nucleic acid as well as at the protein level no extensive homologies with any sequences other than vitellogenin were observed. Comparison of the amino acid sequence of the vitellogenin A2 molecule with biochemical data obtained from the different yolk proteins allowed us to localize the cleavage products on the vitellogenin precursor as follows: NH2 - lipovitellin I - phosvitin (or phosvette II - phosvette I) - lipovitellin II - COOH.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Two allelic genomic fragments containing ribosomal protein S4 encoding genes (rpS4) from Trypanosoma cruzi (CL-Brener strain) were isolated and characterized. One allele comprises two complete tandem repeats of a sequence encoding an rpS4 gene. In the other, only one rpS4 gene is found. Sequence comparison with the data available in the genome project database reveals that our two-copy allele corresponds to a variant haplotype. However, the deduced amino acid sequence of all the gene copies is identical. The rpS4 transcript processing sites were determined by comparison of genomic sequences with published cDNA data. The sequence data obtained demonstrate that rpS4 genes are expressed in epimastigotes, amastigotes, and trypomastigotes. A recombinant version of rpS4 was found to be antigenic: it was recognized by 62.5% of the individuals with positive serology for T. cruzi and by 93.3% of patients with proven chronic chagasic disease.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The high occurrence of nosocomial multidrug-resistant (MDR) microorganisms is considered a global health problem. Here, we report the draft genome sequence of an MDR Pseudomonas aeruginosa strain isolated in Brazil that belongs to the endemic clone ST277. The genome encodes important resistance determinant genes and consists of 6.7 Mb with a G+C content of 66.86% and 6,347 predicted coding regions including 60 RNAs.
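
The G+C content quoted in this abstract is the percentage of G and C bases among all called bases. A minimal Python sketch of that calculation follows; the function name and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the paper.

```python
def gc_content(seq):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T).
    Assumption: ambiguous symbols such as N are excluded from the total."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total
```

Applied to an assembled genome, the same ratio would be computed over the concatenated contigs; here a short string such as `"GGCCAT"` gives roughly 66.7%.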

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Bacillus thuringiensis is a ubiquitous Gram-positive and sporulating bacterium. Its crystals and secreted toxins are useful tools against larvae of diverse insect orders and, as a consequence, an alternative to recalcitrant chemical insecticides. We report here the draft genome sequence of B. thuringiensis 147, a strain isolated from Brazil and with high insecticidal activity. The assembled genome contained 6,167,994 bp and was distributed in seven replicons (a chromosome and 6 plasmids). We identified 12 coding regions, located in two plasmids, which encode insecticidal proteins.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This book gives a general view of sequence analysis, the statistical study of successions of states or events. It includes innovative contributions on life course studies, transitions into and out of employment, contemporaneous and historical careers, and political trajectories. The approach presented in this book is now central to the life-course perspective and the study of social processes more generally. This volume promotes the dialogue between approaches to sequence analysis that developed separately, within traditions contrasted in space and disciplines. It includes the latest developments in sequential concepts, coding, atypical datasets and time patterns, optimal matching and alternative algorithms, survey optimization, and visualization. Field studies include original sequential material related to parenting in 19th-century Belgium, higher education and work in Finland and Italy, family formation before and after German reunification, French Jews persecuted in occupied France, long-term trends in electoral participation, and regime democratization. Overall the book reassesses the classical uses of sequences and it promotes new ways of collecting, formatting, representing and processing them. The introduction provides basic sequential concepts and tools, as well as a history of the method. Chapters are presented in a way that is both accessible to the beginner and informative to the expert.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

BACKGROUND: Conserved non-coding sequences in the human genome are approximately tenfold more abundant than known genes, and have been hypothesized to mark the locations of cis-regulatory elements. However, the global contribution of conserved non-coding sequences to the transcriptional regulation of human genes is currently unknown. Deeply conserved elements shared between humans and teleost fish predominantly flank genes active during morphogenesis and are enriched for positive transcriptional regulatory elements. However, such deeply conserved elements account for <1% of the conserved non-coding sequences in the human genome, which are predominantly mammalian. RESULTS: We explored the regulatory potential of a large sample of these 'common' conserved non-coding sequences using a variety of classic assays, including chromatin remodeling, and enhancer/repressor and promoter activity. When tested across diverse human model cell types, we find that the fraction of experimentally active conserved non-coding sequences within any given cell type is low (approximately 5%), and that this proportion increases only modestly when considered collectively across cell types. CONCLUSIONS: The results suggest that classic assays of cis-regulatory potential are unlikely to expose the functional potential of the substantial majority of mammalian conserved non-coding sequences in the human genome.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We have mapped the genes coding for two major structural polypeptides of the vaccinia virus core by hybrid selection and transcriptional mapping. First, RNA was selected by hybridization to restriction fragments of the vaccinia virus genome, translated in vitro and the products were immunoprecipitated with antibodies against the two polypeptides. This approach allowed us to map the genes to the left hand end of the largest Hind III restriction fragment of 50 kilobase pairs. Second, transcriptional mapping of this region of the genome revealed the presence of the two expected RNAs. Both RNAs are transcribed from the leftward reading strand and the 5'-ends of the genes are separated by about 7.5 kilobase pairs of DNA. Thus, two genes encoding structural polypeptides with a similar location in the vaccinia virus particle are clustered at approximately 105 kilobase pairs from the left hand end of the 180 kilobase pair vaccinia virus genome.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We consider adaptive sequential lossy coding of bounded individual sequences when the performance is measured by the sequentially accumulated mean squared distortion. The encoder and the decoder are connected via a noiseless channel of capacity $R$ and both are assumed to have zero delay. No probabilistic assumptions are made on how the sequence to be encoded is generated. For any bounded sequence of length $n$, the distortion redundancy is defined as the normalized cumulative distortion of the sequential scheme minus the normalized cumulative distortion of the best scalar quantizer of rate $R$ which is matched to this particular sequence. We demonstrate the existence of a zero-delay sequential scheme which uses common randomization in the encoder and the decoder such that the normalized maximum distortion redundancy converges to zero at a rate $n^{-1/5}\log n$ as the length of the encoded sequence $n$ increases without bound.
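
The verbal definition of distortion redundancy in this abstract can be written out as a formula. The notation below is an assumption introduced for illustration and is not taken from the paper: $x_i$ are the source symbols, $\hat{x}_i$ the reproductions of the sequential scheme, and $\mathcal{Q}_R$ the class of scalar quantizers of rate $R$. Taking the expectation over the common randomization is likewise an assumption.

$$
\rho_n(x^n) \;=\; \mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i-\hat{x}_i\bigr)^2\right] \;-\; \min_{Q\in\mathcal{Q}_R}\,\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i-Q(x_i)\bigr)^2
$$

Under this reading, the stated result would be that the worst case over all bounded sequences satisfies $\max_{x^n}\rho_n(x^n) = O\!\left(n^{-1/5}\log n\right)$.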