972 resultados para Biological Sequence Analysis
Resumo:
Bitter taste has been extensively studied in mammalian species and is associated with sensitivity to toxins and with food choices that avoid dangerous substances in the diet. At the molecular level, bitter compounds are sensed by bitter taste receptor proteins (T2R) present at the surface of taste receptor cells in the gustatory papillae. Our work aims at exploring the phylogenetic relationships of T2R gene sequences within different ruminant species. To accomplish this goal, we gathered a collection of ruminant species with different feeding behaviors and for which no genome data is available: American bison, chamois, elk, European bison, fallow deer, goat, moose, mouflon, muskox, red deer, reindeer and white tailed deer. The herbivores chosen for this study belong to different taxonomic families and habitats, and hence, exhibit distinct foraging behaviors and diet preferences. We describe the first partial repertoires of T2R gene sequences for these species obtained by direct sequencing. We then consider the homology and evolutionary history of these receptors within this ruminant group, and whether it relates to feeding type classification, using MEGA software. Our results suggest that phylogenetic proximity of T2R genes corresponds more to the traditional taxonomic groups of the species rather than reflecting a categorization by feeding strategy.
Resumo:
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
Resumo:
The multiple-instance learning (MIL) model has been successful in areas such as drug discovery and content-based image-retrieval. Recently, this model was generalized and a corresponding kernel was introduced to learn generalized MIL concepts with a support vector machine. While this kernel enjoyed empirical success, it has limitations in its representation. We extend this kernel by enriching its representation and empirically evaluate our new kernel on data from content-based image retrieval, biological sequence analysis, and drug discovery. We found that our new kernel generalized noticeably better than the old one in content-based image retrieval and biological sequence analysis and was slightly better or even with the old kernel in the other applications, showing that an SVM using this kernel does not overfit despite its richer representation.
Resumo:
Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes a range of economically important plant diseases. Here we report the complete genome sequence of X. fastidiosa clone 9a5c, which causes citrus variegated chlorosis - a serious disease of orange trees. The genome comprises a 52.7% GC-rich 2,679,305-base-pair (bp) circular chromosome and 'two plasmids of 51,158 bp and 1,285 bp. We can assign putative functions to47% of the 2,904 predicted coding regions. Efficient metabolic functions are predicted, with sugars as the principal energy and carbon source, supporting existence in the nutrient-poor xylem sap. The mechanisms associated with pathogenicity and virulence involve toxins, antibiotics and ion sequestration systems, as well as bacterium-bacterium and bacterium-host interactions mediated by a range of proteins. Orthologues of some of these proteins have only been identified in animal and human pathogens; their presence in X. fastidiosa indicates that the molecular basis for bacterial pathogenicity is both conserved and independent of host. At least 83 genes are bacteriophage-derived and include virulence-associated genes from other bacteria, providing direct evidence of phage-mediated horizontal gene transfer.
Resumo:
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches' broom disease of cocoa, causes countless damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea.
Resumo:
Transposon mutagenesis and complementation studies previously identified a gene (xabB) for a large (526 kDa) polyketide-peptide synthase required for biosynthesis of albicidin antibiotics and phytotoxins in the sugarcane leaf scald pathogen Xanthomonas albilineans. A cistron immediately downstream from xabB encodes a polypeptide of 343 aa containing three conserved motifs characteristic of a family of S-adenosyl-L-methionine (SAM)-dependent O-methyltransferases. Insertional mutagenesis and complementation indicate that the product of this cistron (designated xabC) is essential for albicidin production, and that there is no other required downstream cistron. The xab promoter region is bidirectional, and insertional mutagenesis of the first open reading frame (ORF) in the divergent gene also blocks albicidin biosynthesis. This divergent ORF (designated thp) encodes a protein of 239 aa displaying high similarity to several IS21-like transposition helper proteins. The thp cistron is not located in a recognizable transposon, and is probably a remnant from a past transposition event that may have contributed to the development of the albicidin biosynthetic gene cluster. Failure of 'in trans' complementation of rhp indicates that a downstream cistron transcribed with thp is required for albicidin biosynthesis. (C) 2000 Elsevier Science B.V. All rights reserved.
Resumo:
The analysis of keratin 6 expression is complicated by the presence of multiple isoforms that are expressed constitutively in a number of internal stratified epithelia, in palmoplantar epidermis, and in the companion cell layer of the hair follicle. In addition, keratin 6 expression is inducible in interfollicular epidermis and the outer root sheath of the follicle, in response to wounding stimuli, phorbol esters, or retinoic acid. In order to establish the critical regions involved in the regulation of keratin 6a (the dominant isoform in mice), we generated transgenic mice with two different-sized mouse keratin 6a constructs containing either 1.3 kb or 0.12 kb of 5' flanking sequence linked to the lacZ reporter gene. Both constructs also contained the first intron and the 3' flanking sequence of mouse keratin 6a. Ectopic expression of either transgene was not observed. Double-label immunofluorescence analyses demonstrated expression of the reporter gene in keratin 6 expressing tissues, including the hair follicle, tongue, footpad, and nail bed, showing that both transgenes retained keratinocyte-specific expression. Quantitative analysis of beta -galactosidase activity verified that both the 1.3 and 0.12 kb keratin 6a promoter constructs produced similar levels of the reporter. Notably, both constructs were constitutively expressed in the outer root sheath and interfollicular epidermis in the absence of any activating stimulus, suggesting that they lack the regulatory elements that normally silence transcription in these cells. This study has revealed that a keratin 6a minigene contains critical cis elements that mediate tissue-specific expression and that the elements regulating keratin 6 induction lie distal to the 1.3 kb promoter region.
Resumo:
Dimethyl sulphide dehydrogenase catalyses the oxidation of dimethyl sulphide to dimethyl sulphoxide (DMSO) during photoautotrophic growth of Rhodovulum sulfidophilum . Dimethyl sulphide dehydrogenase was shown to contain bis (molybdopterin guanine dinucleotide)Mo, the form of the pterin molybdenum cofactor unique to enzymes of the DMSO reductase family. Sequence analysis of the ddh gene cluster showed that the ddhA gene encodes a polypeptide with highest sequence similarity to the molybdop-terin-containing subunits of selenate reductase, ethylbenzene dehydrogenase. These polypeptides form a distinct clade within the DMSO reductase family. Further sequence analysis of the ddh gene cluster identified three genes, ddhB , ddhD and ddhC . DdhB showed sequence homology to NarH, suggesting that it contains multiple iron-sulphur clusters. Analysis of the N-terminal signal sequence of DdhA suggests that it is secreted via the Tat secretory system in complex with DdhB, whereas DdhC is probably secreted via a Sec-dependent mechanism. Analysis of a ddhA mutant showed that dimethyl sulphide dehydrogenase was essential for photolithotrophic growth of Rv. sulfidophilum on dimethyl sulphide but not for chemo-trophic growth on the same substrate. Mutational analysis showed that cytochrome c (2) mediated photosynthetic electron transfer from dimethyl sulphide dehydrogenase to the photochemical reaction centre, although this cytochrome was not essential for photoheterotrophic growth of the bacterium.
Resumo:
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Resumo:
Context and objective:The molecular characterization of local isolates of Toxoplasma gondii is considered significant so as to assess the homologous variations between the different loci of various strains of parasites.Design and setting:The present communication deals with the molecular cloning and sequence analysis of the 1158 bp entire open reading frame (ORF) of surface antigen 3 (SAG3) of two Indian T. gondii isolates (Chennai and Izatnagar) being maintained as cryostock at the IVRI.Method:The surface antigen 3 (SAG3) of two local Indian isolates were cloned and sequenced before being compared with the available published sequences.Results:The sequence comparison analysis revealed 99.9% homology with the standard published RH strain sequence of T. gondii. The strains were also compared with other established published sequences and found to be most related to the P-Br strain and CEP strain (both 99.3%), and least with PRU strain (98.4%). However, the two Indian isolates had 100% homology between them.Conclusion:Finally, it was concluded that the Indian isolates were closer to the RH strain than to the P-Br strain (Brazilian strain), the CEP strain and the PRU strains (USA), with respect to nucleotide homology. The two Indian isolates used in the present study are known to vary between themselves, as far as homologies related to other genes are concerned, but they were found to be 100% homologous as far as SAG3 locus is concerned. This could be attributed to the fact that this SAG3 might be a conserved locus and thereby, further detailed studies are thereby warranted to exploit the use of this particular molecule in diagnostics and immunoprophylactics. The findings are important from the point of view of molecular phylogeny.
Resumo:
Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.
Resumo:
BACKGROUND: Epidermal growth factor receptor (EGFR) and its downstream factors KRAS and BRAF are mutated in several types of cancer, affecting the clinical response to EGFR inhibitors. Mutations in the EGFR kinase domain predict sensitivity to the tyrosine kinase inhibitors gefitinib and erlotinib in lung adenocarcinoma, while activating point mutations in KRAS and BRAF confer resistance to the anti-EGFR monoclonal antibody cetuximab in colorectal cancer. The development of new generation methods for systematic mutation screening of these genes will allow more appropriate therapeutic choices. METHODS: We describe a high resolution melting (HRM) assay for mutation detection in EGFR exons 19-21, KRAS codon 12/13 and BRAF V600 using formalin-fixed paraffin-embedded samples. Somatic variation of KRAS exon 2 was also analysed by massively parallel pyrosequencing of amplicons with the GS Junior 454 platform. RESULTS: We tested 120 routine diagnostic specimens from patients with colorectal or lung cancer. Mutations in KRAS, BRAF and EGFR were observed in 41.9%, 13.0% and 11.1% of the overall samples, respectively, being mutually exclusive. For KRAS, six types of substitutions were detected (17 G12D, 9 G13D, 7 G12C, 2 G12A, 2 G12V, 2 G12S), while V600E accounted for all the BRAF activating mutations. Regarding EGFR, two cases showed exon 19 deletions (delE746-A750 and delE746-T751insA) and another two substitutions in exon 21 (one showed L858R with the resistance mutation T590M in exon 20, and the other had P848L mutation). Consistent with earlier reports, our results show that KRAS and BRAF mutation frequencies in colorectal cancer were 44.3% and 13.0%, respectively, while EGFR mutations were detected in 11.1% of the lung cancer specimens. Ultra-deep amplicon pyrosequencing successfully validated the HRM results and allowed detection and quantitation of KRAS somatic mutations. CONCLUSIONS: HRM is a rapid and sensitive method for moderate-throughput cost-effective screening of oncogene mutations in clinical samples. Rather than Sanger sequence validation, next-generation sequencing technology results in more accurate quantitative results in somatic variation and can be achieved at a higher throughput scale.
Resumo:
BACKGROUND: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. RESULTS: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFbeta, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFbeta. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. CONCLUSION: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications
Resumo:
Summary [résumé français voir ci-dessous] From the beginning of the 20th century the world population has been confronted with the human immune deficiency virus 1 (HIV-1). This virus has the particularity to mutate fast, and could thus evade and adapt to the human host. Our closest evolutionary related organisms, the non-human primates, are less susceptible to HIV-1. In a broader sense, primates are differentially susceptible to various retrovirus. Species specificity may be due to genetic differences among primates. In the present study we applied evolutionary and comparative genetic techniques to characterize the evolutionary pattern of host cellular determinants of HIV-1 pathogenesis. The study of the evolution of genes coding for proteins participating to the restriction or pathogenesis of HIV-1 may help understanding the genetic basis of modern human susceptibility to infection. To perform comparative genetics analysis, we constituted a collection of primate DNA and RNA to allow generation of de novo sequence of gene orthologs. More recently, release to the public domain of two new primate complete genomes (bornean orang-utan and common marmoset) in addition of the three previously available genomes (human, chimpanzee and Rhesus monkey) help scaling up the evolutionary and comparative genome analysis. Sequence analysis used phylogenetic and statistical methods for detecting molecular adaptation. We identified different selective pressures acting on host proteins involved in HIV-1 pathogenesis. Proteins with HIV-1 restriction properties in non-human primates were under strong positive selection, in particular in regions of interaction with viral proteins. These regions carried key residues for the antiviral activity. Proteins of the innate immunity presented an evolutionary pattern of conservation (purifying selection) but with signals of relaxed constrain if we compared them to the average profile of purifying selection of the primate genomes. Large scale analysis resulted in patterns of evolutionary pressures according to molecular function, biological process and cellular distribution. The data generated by various analyses served to guide the ancestral reconstruction of TRIM5a a potent antiviral host factor. The resurrected TRIM5a from the common ancestor of Old world monkeys was effective against HIV-1 and the recent resurrected hominoid variants were more effective against other retrovirus. Thus, as the result of trade-offs in the ability to restrict different retrovirus, human might have been exposed to HIV-1 at a time when TRIM5a lacked the appropriate specific restriction activity. The application of evolutionary and comparative genetic tools should be considered for the systematical assessment of host proteins relevant in viral pathogenesis, and to guide biological and functional studies. Résumé La population mondiale est confrontée depuis le début du vingtième siècle au virus de l'immunodéficience humaine 1 (VIH-1). Ce virus a un taux de mutation particulièrement élevé, il peut donc s'évader et s'adapter très efficacement à son hôte. Les organismes évolutivement le plus proches de l'homme les primates nonhumains sont moins susceptibles au VIH-1. De façon générale, les primates répondent différemment aux rétrovirus. Cette spécificité entre espèces doit résider dans les différences génétiques entre primates. Dans cette étude nous avons appliqué des techniques d'évolution et de génétique comparative pour caractériser le modèle évolutif des déterminants cellulaires impliqués dans la pathogenèse du VIH- 1. L'étude de l'évolution des gènes, codant pour des protéines impliquées dans la restriction ou la pathogenèse du VIH-1, aidera à la compréhension des bases génétiques ayant récemment rendu l'homme susceptible. Pour les analyses de génétique comparative, nous avons constitué une collection d'ADN et d'ARN de primates dans le but d'obtenir des nouvelles séquences de gènes orthologues. Récemment deux nouveaux génomes complets ont été publiés (l'orang-outan du Bornéo et Marmoset commun) en plus des trois génomes déjà disponibles (humain, chimpanzé, macaque rhésus). Ceci a permis d'améliorer considérablement l'étendue de l'analyse. Pour détecter l'adaptation moléculaire nous avons analysé les séquences à l'aide de méthodes phylogénétiques et statistiques. Nous avons identifié différentes pressions de sélection agissant sur les protéines impliquées dans la pathogenèse du VIH-1. Des protéines avec des propriétés de restriction du VIH-1 dans les primates non-humains présentent un taux particulièrement haut de remplacement d'acides aminés (sélection positive). En particulier dans les régions d'interaction avec les protéines virales. Ces régions incluent des acides aminés clé pour l'activité de restriction. Les protéines appartenant à l'immunité inné présentent un modèle d'évolution de conservation (sélection purifiante) mais avec des traces de "relaxation" comparé au profil général de sélection purifiante du génome des primates. Une analyse à grande échelle a permis de classifier les modèles de pression évolutive selon leur fonction moléculaire, processus biologique et distribution cellulaire. Les données générées par les différentes analyses ont permis la reconstruction ancestrale de TRIM5a, un puissant facteur antiretroviral. Le TRIM5a ressuscité, correspondant à l'ancêtre commun entre les grands singes et les groupe des catarrhiniens, est efficace contre le VIH-1 moderne. Les TRIM5a ressuscités plus récents, correspondant aux ancêtres des grands singes, sont plus efficaces contre d'autres rétrovirus. Ainsi, trouver un compromis dans la capacité de restreindre différents rétrovirus, l'homme aurait été exposé au VIH-1 à une période où TRIM5a manquait d'activité de restriction spécifique contre celui-ci. L'application de techniques d'évolution et de génétique comparative devraient être considérées pour l'évaluation systématique de protéines impliquées dans la pathogenèse virale, ainsi que pour guider des études biologiques et fonctionnelles
Resumo:
Highly quantitative biomarkers of neurodegenerative disease remain an important need in the urgent quest for disease-modifying therapies. For Huntington's disease (HD), a genetic test is available (trait marker), but necessary state markers are still in development. In this report, we describe a large battery of transcriptomic tests explored as state biomarker candidates. In an attempt to exploit the known neuroinflammatory and transcriptional perturbations of disease, we measured relevant mRNAs in peripheral blood cells. The performance of these potential markers was weak overall, with only one mRNA, immediate early response 3 (IER3), showing a modest but significant increase of 32% in HD samples compared with controls. No statistically significant differences were found for any other mRNAs tested, including a panel of 12 RNA biomarkers identified in a previous report [Borovecki F, Lovrecic L, Zhou J, Jeong H, Then F, Rosas HD, Hersch SM, Hogarth P, Bouzou B, Jensen RV, et al. (2005) Proc Natl Acad Sci USA 102:11023-11028]. The present results may nonetheless inform the future design and testing of HD biomarker strategies.