991 results for Sequence Conservation
Abstract:
BACKGROUND: The availability of the P. falciparum genome has led to novel ways to identify potential vaccine candidates. A new approach for antigen discovery based on the bioinformatic selection of heptad repeat motifs corresponding to alpha-helical coiled coil structures yielded promising results. To clarify the relationship between the coiled coil motifs and their sequence conservation, we have assessed the extent of polymorphism in putative alpha-helical coiled coil domains in culture strains, in natural populations and in the single nucleotide polymorphism data available at PlasmoDB. METHODOLOGY/PRINCIPAL FINDINGS: 14 alpha-helical coiled coil domains were selected based on preclinical experimental evaluation. They were tested by PCR amplification and sequencing of different P. falciparum culture strains and field isolates. We found that only 3 out of 14 alpha-helical coiled coils showed point mutations and/or length polymorphisms. Based on promising immunological results, 5 of these peptides were selected for further analysis. Direct sequencing of field samples from Papua New Guinea and Tanzania showed that 3 out of these 5 peptides were completely conserved. An in silico analysis of polymorphism was performed for all 166 putative alpha-helical coiled coil domains originally identified in the P. falciparum genome. We found that 82% (137/166) of these peptides were conserved, and for only one peptide did the detected SNPs substantially decrease the probability score for alpha-helical coiled coil formation. More SNPs were found in arrays of almost perfect tandem repeats. In summary, the coiled coil structure prediction was rarely modified by SNPs. The analysis revealed a number of peptides with strictly conserved alpha-helical coiled coil motifs. CONCLUSION/SIGNIFICANCE: We conclude that the selection of alpha-helical coiled coil structural motifs is a valuable approach to identify potential vaccine targets showing a high degree of conservation.
Abstract:
Pfs230, a surface protein of the gametocyte/gamete of the human malaria parasite Plasmodium falciparum, is a prime candidate for a malaria transmission-blocking vaccine. Plasmodium vivax has an ortholog of Pfs230 (Pvs230); however, no aspect of Pvs230 has been studied to date. To investigate whether Pvs230 can serve as a vivax malaria transmission-blocking vaccine, we performed evolutionary and population genetic analysis of the Pvs230 gene (pvs230: PVX_003905). Our analysis of Pvs230 and its orthologs in eight Plasmodium species revealed two distinctive parts: an interspecies variable part (IVP) containing species-specific oligopeptide repeats at the N-terminus and a 7.5 kb interspecies conserved part (ICP) containing 14 cysteine-rich domains. Pvs230 was closely related to its orthologs, Pks230 and Pcys230, in monkey malaria parasites. Analysis of 113 pvs230 sequences obtained worldwide showed that nucleotide diversity is remarkably low in the non-repeat 8-kb region of pvs230 (theta pi = 0.00118), with 77 polymorphic nucleotide sites, 40 of which result in amino acid replacements. A signature of purifying selection but not of balancing selection was seen on pvs230. Functional and/or structural constraints may limit the level of polymorphism in pvs230. The observed limited polymorphism in pvs230 should provide grounds for the use of Pvs230 as an effective transmission-blocking vaccine. (C) 2011 Elsevier Ltd. All rights reserved.
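The nucleotide diversity figure quoted above (theta pi) is conventionally the average number of pairwise differences per site among aligned sequences. A minimal sketch of that calculation follows, assuming gap-free aligned sequences of equal length; the function name `nucleotide_diversity` and the toy alignment are illustrative and are not taken from the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    Assumes the sequences are already aligned, equal in length,
    and contain no gaps or ambiguous bases.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment (hypothetical data, not pvs230 sequences).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "ACCTACGT"]
pi = nucleotide_diversity(toy_alignment)
print(pi)
```

Population genetics packages usually handle gaps, missing data and sample-size corrections; this sketch omits all of these.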
Abstract:
It has been suggested that delayed DNA replication underlies fragility at common human fragile sites, but specific sequences responsible for expression of these inducible fragile sites have not been identified. One approach to identify such cis-acting sequences within the large nonexonic regions of fragile sites would be to identify conserved functional elements within orthologous fragile sites by interspecies sequence comparison. This study describes a comparison of orthologous fragile regions, the human FRA3B/FHIT and the murine Fra14A2/Fhit locus. We sequenced over 600 kbp of the mouse Fra14A2, covering the region orthologous to the fragile epicenter of FRA3B, and determined the Fhit deletion break points in a mouse kidney cancer cell line (RENCA). The murine Fra14A2 locus, like the human FRA3B, was characterized by a high AT content. Alignment of the two sequences showed that this fragile region was stable in evolution despite its susceptibility to mitotic recombination on inhibition of DNA replication. There were also several unusual highly conserved regions (HCRs). The positions of predicted matrix attachment regions (MARs), possibly related to replication origins, were not conserved. Of known fragile region landmarks, five cancer cell break points, one viral integration site, and one aphidicolin break cluster were located within or near HCRs. Thus, comparison of orthologous fragile regions has identified highly conserved sequences with possible functional roles in maintenance of fragility.
Abstract:
We sequenced cDNAs coding for chicken cellular nucleic acid binding protein (CNBP). Two slightly different variations of the open reading frame were found, each of which translates into a protein with seven zinc finger domains. The longer transcript contains an in-frame insert of 3 bp. The sequence conservation between chick CNBP cDNAs and human, rat and mouse CNBP cDNAs is extreme, especially in the coding region, where the deduced amino acid sequence identity with human, rat and mouse CNBP is 99%. CNBP-like transcripts were also found in various tissues from insect, shrimp, fish and lizard. Regions with remarkable nucleotide conservation were also found in the 3' untranslated region, indicating important functions for these regions. Quantitative reverse transcription polymerase chain reaction (RT-PCR) indicated that in the chick, CNBP is present in all tissues examined in approximately equal ratios to total RNA. RT-PCR of total RNA isolated from different phyla indicates that CNBP-like proteins are widespread throughout the animal kingdom. The extraordinary level of conservation suggests an important physiological role for CNBP. (C) 1997 Elsevier Science Inc.
Abstract:
BACKGROUND: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes have been identified by their AM-related expression patterns, and their functions have subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species. RESULTS: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of as yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility. CONCLUSIONS: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinct groups that differ in a defined functional trait of interest.
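The cluster-selection step described in the abstract above (keeping protein clusters whose members come only from AM-competent species) can be sketched as a simple set filter. This is only an illustration under stated assumptions: the input layout, the function name `am_specific_clusters`, and the species labels are hypothetical, and the clustering itself is assumed to have been done beforehand by some other tool.

```python
def am_specific_clusters(clusters, am_species):
    """Return clusters whose members all belong to AM-competent species.

    clusters   -- dict mapping a cluster id to a list of
                  (species_label, protein_id) tuples (hypothetical layout)
    am_species -- iterable of species labels treated as AM-competent
    """
    am = set(am_species)
    selected = {}
    for cluster_id, members in clusters.items():
        species = {sp for sp, _protein in members}
        # Keep the cluster only if it is non-empty and every
        # contributing species is AM-competent.
        if species and species <= am:
            selected[cluster_id] = sorted(species)
    return selected


# Hypothetical toy input; labels are placeholders, not real species.
toy_clusters = {
    "c1": [("am_host_1", "p1"), ("am_host_2", "p2")],
    "c2": [("am_host_1", "p3"), ("non_host_1", "p4")],
    "c3": [("am_host_1", "p5"), ("am_host_1", "p6")],
}
result = am_specific_clusters(toy_clusters, ["am_host_1", "am_host_2"])
print(result)
```

The subsequent ranking across 16 species and the integration of expression and promoter data mentioned in the abstract are not modeled here.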
Abstract:
The human androgen receptor (AR) gene promoter lies in a GC-rich region containing two principal sites of transcription initiation and a putative Sp1 protein-binding site, without typical "TATA" and "CAAT" boxes. It has been suggested that mutations within the 5' untranslated region (5'UTR) may contribute to the development of prostate cancer by changing the rates of gene transcription and/or translation. To investigate this question, the present study searched for mutations or polymorphisms at the AR-5'UTR in 92 prostate cancer patients, in whom histological diagnosis of adenocarcinoma was established in specimens obtained from transurethral resection or after prostatectomy. The AR-5'UTR was amplified by PCR from genomic DNA samples of the patients and of 100 healthy male blood donors, included as controls. Conformation-sensitive gel electrophoresis was used for DNA sequence alteration screening. Only one band shift was detected, in one individual from the blood donor group. Sequencing revealed a new single nucleotide deletion (T) in the most conserved portion of the promoter region at position +36 downstream from the transcription initiation site I. Although the effect of this specific mutation remains unknown, its rarity reveals the high degree of sequence conservation of the human androgen promoter region. Moreover, the absence of detectable variation within the critical 5'UTR in prostate cancer patients indicates a low probability of its involvement in prostate cancer etiology.
Abstract:
Modern sugarcane (Saccharum spp.) is the leading sugar crop and a primary energy crop. It has the highest level of 'vertical' redundancy (2n = 12x = 120) of all polyploid plants studied to date. It was produced about a century ago through hybridization between two autopolyploid species, namely S. officinarum and S. spontaneum. In order to investigate the genome dynamics in this highly polyploid context, we sequenced and compared seven hom(oe)ologous haplotypes (bacterial artificial chromosome clones). Our analysis revealed a high level of gene retention and colinearity, as well as high gene structure and sequence conservation, with an average sequence divergence of 4% for exons. Remarkably, all of the hom(oe)ologous genes were predicted as being functional (except for one gene fragment) and showed signs of evolving under purifying selection, with the exception of genes within segmental duplications. By contrast, transposable elements displayed a general absence of colinearity among hom(oe)ologous haplotypes and appeared to have undergone dynamic expansion in Saccharum, compared with sorghum, its close relative in the Andropogoneae tribe. These results reinforce the general trend emerging from recent studies indicating the diverse and nuanced effect of polyploidy on genome dynamics.
Abstract:
Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product produces a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although the presence of non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at DNA level and five alterations on the presumed myostatin protein sequence as compared to non double-muscled bovine sequences.
Large distribution and high sequence identity of a Copia-type retrotransposon in angiosperm families
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
A porcine BAC clone harboring the tightly linked IFNAR1 and IFNGR2 genes was identified by comparative analysis of the publicly available porcine BAC end sequences. The complete 168,835 bp insert sequence of this clone was determined. Sequence comparisons of the genomic sequence with EST sequences from public databases were performed and allowed a detailed annotation of the IFNAR1 and IFNGR2 genes. The analyzed genes showed a conserved genomic organization with their known mammalian orthologs; however, the sequence conservation of these genes across species was relatively low. In addition to the IFNAR1 and IFNGR2 genes, which were completely sequenced, the analyzed BAC clone also contained parts of an orphan gene encoding a putative transmembrane protein (TMEM50B). In contrast to the IFNAR1 and IFNGR2 genes, the sequence conservation of the TMEM50B gene across different mammalian species was extremely high.
Abstract:
The mammalian transcriptome contains many nonprotein-coding RNAs (ncRNAs), but most of these are of unclear significance and lack strong sequence conservation, prompting suggestions that they might be non-functional. However, certain long functional ncRNAs such as Air and Xist are also poorly conserved. In this article, we systematically analyzed the conservation of several groups of functional ncRNAs, including miRNAs, snoRNAs and longer ncRNAs whose function has been either documented or confidently predicted. As expected, miRNAs and snoRNAs were highly conserved. By contrast, the longer functional non-micro, non-sno ncRNAs were much less conserved with many displaying rapid sequence evolution. Our findings suggest that longer ncRNAs are under the influence of different evolutionary constraints and that the lack of conservation displayed by the thousands of candidate ncRNAs does not necessarily signify an absence of function.
Abstract:
The seeds of Theobroma cacao (cacao) are the source of cocoa, the raw material for the multi-billion dollar chocolate industry. Cacao's two most important traits are its unique seed storage triglyceride (cocoa butter) and the flavor of its fermented beans (chocolate). The genome of T. cacao is being sequenced, and to expand the utility of the genome sequence to the improvement of cacao, we are evaluating Theobroma grandiflorum, the closest economically important species of Theobroma, for its potential use in a comparative genomic study. T. grandiflorum differs from cacao in important agronomic traits such as flavor of the fermented beans, disease resistance to witches' broom and abscission of mature fruits. By comparing genomic sequences and analyzing viable inter-specific hybrids, we hope to identify the key genes that regulate cacao's most important traits. We have investigated the utility in T. grandiflorum of three types of markers (microsatellite markers, single-strand conformational polymorphism markers and single nucleotide polymorphism (SNP) markers) developed in cacao. Through sequencing of amplicons of 12 diverse individuals of both cacao and T. grandiflorum, we have identified new intra- and inter-specific SNPs. Two markers which had no overlap of alleles between the species were used to genotype putative inter-specific hybrid seedlings. Sequence conservation was significant and species-specific differences numerous enough to suggest that comparative genomics of T. grandiflorum and T. cacao will be useful in elucidating the genetic differences that lead to a variety of important agronomic trait differences.
Abstract:
In Escherichia coli, the DnaG primase is the RNA polymerase that synthesizes RNA primers at replication forks. It is composed of three domains: a small N-terminal zinc-binding domain, a larger central domain responsible for RNA synthesis, and a C-terminal domain comprising residues 434-581 [DnaG(434-581)] that interacts with the hexameric DnaB helicase. Presumably because of this interaction, it had not been possible previously to express the C-terminal domain in a stably transformed E. coli strain. This problem was overcome by expression of DnaG(434-581) under control of tandem bacteriophage λ promoters, and the protein was purified in yields of 4-6 mg/L of culture and studied by NMR. A TOCSY spectrum of a 2 mM solution of the protein at pH 7.0 indicated that its structured core comprises residues 444-579. This was consistent with sequence conservation among the most closely related primases. Linewidths in a NOESY spectrum of a 0.5 mM sample in 10 mM phosphate, pH 6.05, 0.1 M NaCl, recorded at 36 °C, indicated the protein to be monomeric. Crystals of selenomethionine-substituted DnaG(434-581) obtained by the hanging-drop vapor-diffusion method were body-centered tetragonal, space group I4(1)22, with unit cell parameters a = b = 142.2 Å, c = 192.1 Å, and diffracted beyond 2.7 Å resolution with synchrotron radiation. (C) 2003 Elsevier Inc. All rights reserved.
Abstract:
The membrane-bound ceruloplasmin homolog hephaestin plays a critical role in intestinal iron absorption. The aims of this study were to clone the rat hephaestin gene and to examine its expression in the gastrointestinal tract in relation to other genes encoding iron transport proteins. The rat hephaestin gene was isolated from intestinal mRNA and was found to encode a protein 96% identical to mouse hephaestin. Analysis by ribonuclease protection assay and Western blotting showed that hephaestin was expressed at high levels throughout the small intestine and colon. Immunofluorescence localized the hephaestin protein to the mature villus enterocytes with little or no expression in the crypts. Variations in iron status had a small but nonsignificant effect on hephaestin expression in the duodenum. The high sequence conservation between rat and mouse hephaestin is consistent with this protein playing a central role in intestinal iron absorption, although its precise function remains to be determined.
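Identity figures such as the 96% quoted above are typically computed from a pairwise alignment as the fraction of aligned positions carrying the same residue. A minimal sketch follows, assuming the two sequences are already aligned with "-" marking gaps and that gapped columns are excluded from the denominator (one of several conventions in use); the function name `percent_identity` and the toy sequences are illustrative and are not hephaestin data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep only columns where both sequences have a residue.
    compared = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if x != gap and y != gap
    ]
    if not compared:
        raise ValueError("no ungapped columns to compare")

    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Toy aligned pair (hypothetical): 6 ungapped columns, 5 identical.
identity = percent_identity("MKT-LLA", "MKTALLV")
print(round(identity, 1))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so reported percentages are only comparable when the same denominator is used.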