924 resultados para Coding Sequences
Resumo:
The ribonucleotide reductase gene tandem bnrdE/bnrdF in SPbeta-related prophages of different Bacillus spp. isolates presents different configurations of intervening sequences, comprising one to three of six non-homologous splicing elements. Insertion sites of group I introns and intein DNA are clustered in three relatively short segments encoding functionally important domains of the ribonucleotide reductase. Comparison of the bnrdE homologs reveals mutual exclusion of a group I intron and an intein coding sequence flanking the codon that specifies a conserved cysteine. In vivo splicing was demonstrated for all introns. However, for two of them a part of the mRNA precursor molecules remains unspliced. Intergenic bnrdE-bnrdF regions are unexpectedly long, comprising between 238 and 541 nt. The longest encodes a putative polypeptide related to HNH homing endonucleases.
Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates.
Resumo:
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrates species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
Resumo:
BACKGROUND: Conserved non-coding sequences in the human genome are approximately tenfold more abundant than known genes, and have been hypothesized to mark the locations of cis-regulatory elements. However, the global contribution of conserved non-coding sequences to the transcriptional regulation of human genes is currently unknown. Deeply conserved elements shared between humans and teleost fish predominantly flank genes active during morphogenesis and are enriched for positive transcriptional regulatory elements. However, such deeply conserved elements account for <1% of the conserved non-coding sequences in the human genome, which are predominantly mammalian. RESULTS: We explored the regulatory potential of a large sample of these 'common' conserved non-coding sequences using a variety of classic assays, including chromatin remodeling, and enhancer/repressor and promoter activity. When tested across diverse human model cell types, we find that the fraction of experimentally active conserved non-coding sequences within any given cell type is low (approximately 5%), and that this proportion increases only modestly when considered collectively across cell types. CONCLUSIONS: The results suggest that classic assays of cis-regulatory potential are unlikely to expose the functional potential of the substantial majority of mammalian conserved non-coding sequences in the human genome.
Resumo:
During infection of a new host, the first surfaces encountered by herpes simplex viruses are the apical membranes of epithelial cells of mucosal surfaces. These cells are highly polarized, and the protein composition of their apical and basolateral membranes are very different, so that different viral entry pathways have evolved for each surface. To determine whether the viral glycoprotein G (gG) is specifically required for efficient infection of a particular surface of polarized cells, apical and basal surfaces were infected with wild-type virus or a gG deletion mutant. After infection of polarized cells in culture, the gG− virus was deficient in infection of apical surfaces but was able to infect cells through basal membranes, replicate, and spread into surrounding cells. The gG-dependent step in apical infection was a stage beyond attachment. After in vivo infection of apical surfaces of epithelial cells of nonscarified mouse corneas, infection by glycoprotein C− or gG− virus was considerably reduced as compared with that observed after infection with wild-type virus. In contrast, when corneas were scarified, allowing virus access to other cell surfaces, the gG and glycoprotein C deletion mutants infected eyes as efficiently as wild-type viruses. A secondary mutation allowing infection of apical surfaces by gG− virus arose readily during passage of the virus in nonpolarized cells, indicating that either the gG-dependent step of apical infection can be bypassed or that another viral protein can acquire the same function.
Resumo:
Adenoviral vector-mediated gene transfer offers significant potential for gene therapy of many human diseases. However, progress has been slowed by several limitations. First, the insert capacity of currently available adenoviral vectors is limited to 8 kb of foreign DNA. Second, the expression of viral proteins in infected cells is believed to trigger a cellular immune response that results in inflammation and in only transient expression of the transferred gene. We report the development of a new adenoviral vector that has all viral coding sequences removed. Thus, large inserts are accommodated and expression of all viral proteins is eliminated. The first application of this vector system carries a dual expression cassette comprising 28.2 kb of nonviral DNA that includes the full-length murine dystrophin cDNA under control of a large muscle-specific promoter and a lacZ reporter construct. Using this vector, we demonstrate independent expression of both genes in primary mdx (dystrophin-deficient) muscle cells.
Resumo:
Despite the wide distribution of transposable elements (TEs) in mammalian genomes, part of their evolutionary significance remains to be discovered. Today there is a substantial amount of evidence showing that TEs are involved in the generation of new exons in different species. In the present study, we searched 22,805 genes and reported the occurrence of TE-cassettes in coding sequences of 542 cow genes using the RepeatMasker program. Despite the significant number (542) of genes with TE insertions in exons only 14 (2.6%) of them were translated into protein, which we characterized as chimeric genes. From these chimeric genes, only the FAST kinase domains 3 (FASTKD3) gene, present on chromosome BTA 20, is a functional gene and showed evidence of the exaptation event. The genome sequence analysis showed that the last exon coding sequence of bovine FASTKD3 is similar to 85% similar to the ART2A retrotransposon sequence. In addition, comparison among FASTKD3 proteins shows that the last exon is very divergent from those of Homo sapiens, Pan troglodytes and Canis familiares. We suggest that the gene structure of bovine FASTKD3 gene could have originated by several ectopic recombinations between TE copies. Additionally, the absence of TE sequences in all other species analyzed suggests that the TE insertion is clade-specific, mainly in the ruminant lineage.
Resumo:
Three species of flatworms from the genus Echinococcus (E. granulosus, E. multilocularis and E. vogeli) and four strains of E. granulosus (cattle, horse, pig and sheep strains) were analysed by the PCR-SSCP method followed by sequencing, using as targets two non-coding and two coding (one nuclear and one mitochondrial) genomic regions. The sequencing data was used to evaluate hypothesis about the parasite breeding system and the causes of genetic diversification. The calculated recombination parameters suggested that cross-fertilisation was rare in the history of the group. However, the relative rates of substitution in the coding sequences showed that positive selection (instead of purifying selection) drove the evolution of an elastase and neutrophil chemotaxis inhibitor gene (AgB/1). The phylogenetic analyses revealed several ambiguities, indicating that the taxonomic status of the E. granulosus horse strain should be revised
Resumo:
Transcripts similar to those that encode the nonstructural (NS) proteins NS3 and NS5 from flaviviruses were found in a salivary gland (SG) complementary DNA (cDNA) library from the cattle tick Rhipicephalus microplus.Tick extracts were cultured with cells to enable the isolation of viruses capable of replicating in cultured invertebrate and vertebrate cells. Deep sequencing of the viral RNA isolated from culture supernatants provided the complete coding sequences for the NS3 and NS5 proteins and their molecular characterisation confirmed similarity with the NS3 and NS5 sequences from other flaviviruses. Despite this similarity, phylogenetic analyses revealed that this potentially novel virus may be a highly divergent member of the genus Flavivirus. Interestingly, we detected the divergent NS3 and NS5 sequences in ticks collected from several dairy farms widely distributed throughout three regions of Brazil. This is the first report of flavivirus-like transcripts inR. microplus ticks. This novel virus is a potential arbovirus because it replicated in arthropod and mammalian cells; furthermore, it was detected in a cDNA library from tick SGs and therefore may be present in tick saliva. It is important to determine whether and by what means this potential virus is transmissible and to monitor the virus as a potential emerging tick-borne zoonotic pathogen.
Resumo:
We previously introduced two new protein databases (trEST and trGEN) of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Here, we present the updates made on these two databases plus a new database (trome), which uses alignments of EST data to HTG or full genomes to generate virtual transcripts and coding sequences. This new database is of higher quality and since it contains the information in a much denser format it is of much smaller size. These new databases are in a Swiss-Prot-like format and are updated on a weekly basis (trEST and trGEN) or every 3 months (trome). They can be downloaded by anonymous ftp from ftp://ftp.isrec.isb-sib.ch/pub/databases.
Resumo:
Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product produces a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although the presence of non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at DNA level and five alterations on the presumed myostatin protein sequence as compared to non double-muscled bovine sequences.
Resumo:
Despite the wide distribution of transposable elements (TEs) in mammalian genomes, part of their evolutionary significance remains to be discovered. Today there is a substantial amount of evidence showing that TEs are involved in the generation of new exons in different species. In the present study, we searched 22,805 genes and reported the occurrence of TE-cassettes in coding sequences of 542 cow genes using the RepeatMasker program. Despite the significant number (542) of genes with TE insertions in exons only 14 (2.6%) of them were translated into protein, which we characterized as chimeric genes. From these chimeric genes, only the FAST kinase domains 3 (FASTKD3) gene, present on chromosome BTA 20, is a functional gene and showed evidence of the exaptation event. The genome sequence analysis showed that the last exon coding sequence of bovine FASTKD3 is ∼85% similar to the ART2A retrotransposon sequence. In addition, comparison among FASTKD3 proteins shows that the last exon is very divergent from those of Homo sapiens, Pan troglodytes and Canis familiares. We suggest that the gene structure of bovine FASTKD3 gene could have originated by several ectopic recombinations between TE copies. Additionally, the absence of TE sequences in all other species analyzed suggests that the TE insertion is clade-specific, mainly in the ruminant lineage. ©FUNPEC-RP.
Resumo:
Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product produces a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although the presence of non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at DNA level and five alterations on the presumed myostatin protein sequence as compared to non double-muscled bovine sequences.
Resumo:
In silico analyses of Leishmania spp. genome data are a powerful resource to improve the understanding of these pathogens' biology. Trypanosomatids such as Leishmania spp. have their protein-coding genes grouped in long polycistronic units of functionally unrelated genes. The control of gene expression happens by a variety of posttranscriptional mechanisms. The high degree of synteny among Leishmania species is accompanied by highly conserved coding sequences (CDS) and poorly conserved intercoding untranslated sequences. To identify the elements involved in the control of gene expression, we conducted an in silico investigation to find conserved intercoding sequences (CICS) in the genomes of L major, L infantum, and L braziliensis. We used a combination of computational tools, such as Linux-Shell, PERL and R languages, BLAST, MSPcrunch, SSAKE, and Pred-A-Term algorithms to construct a pipeline which was able to: (i) search for conservation in target-regions, (ii) eliminate CICS redundancy and mask repeat elements, (iii) predict the mRNA's extremities, (iv) analyze the distribution of orthologous genes within the generated LeishCICS-clusters, (v) assign GO terms to the LeishCICS-clusters. and (vi) provide statistical support for the gene-enrichment annotation. We associated the LeishCICS-cluster data, generated at the end of the pipeline, with the expression profile oft. donovani genes during promastigote-amastigote differentiation, as previously evaluated by others (GEO accession: GSE21936). A Pearson's correlation coefficient greater than 0.5 was observed for 730 LeishCICS-clusters containing from 2 to 17 genes. The designed computational pipeline is a useful tool and its application identified potential regulatory cis elements and putative regulons in Leishmania. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Five different clones encoding thioredoxin homologues were isolated from Arabidopsis thaliana cDNA libraries. On the basis of the sequences they encode divergent proteins, but all belong to the cytoplasmic thioredoxins h previously described in higher plants. The five proteins obtained by overexpressing the coding sequences in Escherichia coli present typical thioredoxin activities (NADP(+)-malate dehydrogenase activation and reduction by Arabidopsis thioredoxin reductase) despite the presence of a variant active site, Trp-Cys-Pro-Pro-Cys, in three proteins in place of the canonical Trp-Cys-Gly-Pro-Cys sequence described for thioredoxins in prokaryotes and eukaryotes. Southern blots show that each cDNA is encoded by a single gene but suggest the presence of additional related sequences in the Arabidopsis genome. This very complex diversity of thioredoxins h is probably common to all higher plants, since the Arabidopsis sequences appear to have diverged very early, at the beginning of plant speciation. This diversity allows the transduction of a redox signal into multiple pathways.