972 results for Biological Sequence Analysis
Abstract:
A defect in glucose sensing of the pancreatic beta-cells has been observed in several animal models of type II diabetes and has been correlated with a reduced gene expression of the glucose transporter type 2 (Glut2). In a transgenic mouse model, expression of Glut2 antisense RNA in pancreatic beta-cells has recently been shown to be associated with an impaired glucose-induced insulin secretion and the development of diabetes. To identify factors that may be involved in the specific decrease of Glut2 in the beta-cells of the diabetic animal, an attempt was made to localize the cis-elements and trans-acting factors involved in the control of Glut2 expression in the endocrine pancreas. It was demonstrated by transient transfection studies that only 338 base pairs (bp) of the murine Glut2 proximal promoter are needed for reporter gene expression in pancreatic islet-derived cell lines, whereas no activity was detected in nonpancreatic cells. Three cis-elements, GTI, GTII, and GTIII, have been identified by DNAse I footprinting and gel retardation experiments within these 338 bp. GTI and GTIII bind distinct but ubiquitously expressed trans-acting factors. On the other hand, nuclear proteins specifically expressed in pancreatic cell lines interact with GTII, and their relative abundance correlates with endogenous Glut2 expression. These GTII-binding factors correspond to nuclear proteins of 180 and 90 kilodaltons as defined by Southwestern analysis. The 180-kilodalton factor is present in pancreatic beta-cell lines but not in an alpha-cell line. Mutation of the GTI or GTIII cis-elements decreases transcriptional activity directed by the 338-bp promoter, whereas mutation of GTII increases gene transcription. Thus negative and positive regulatory sequences are identified within the proximal 338 bp of the GLUT2 promoter and may participate in the islet-specific expression of the gene by binding beta-cell specific trans-acting factors.
Abstract:
Genomic plasticity of the human chromosome 8p23.1 region is highly influenced by two groups of complex segmental duplications (SDs), termed REPD and REPP, that mediate different kinds of rearrangements. Part of the difficulty in explaining the wide range of phenotypes associated with 8p23.1 rearrangements is that REPP and REPD are not yet well characterized, probably due to their polymorphic status. Here, we describe a novel primate-specific gene family, named FAM90A (family with sequence similarity 90), found within these SDs. According to the current human reference sequence assembly, the FAM90A family includes 24 members along the 8p23.1 region plus a single member on chromosome 12p13.31, showing copy number variation (CNV) between individuals. These genes can be classified into subfamilies I and II, which differ in their upstream and 5′-untranslated region sequences, but both share the same open reading frame and are ubiquitously expressed. Sequence analysis and comparative fluorescence in situ hybridization studies showed that FAM90A subfamily II underwent a large expansion in the hominoid lineage, whereas subfamily I members were likely generated sometime around the divergence of orangutan and African great apes by a fusion process. In addition, the analysis of the Ka/Ks ratios provides evidence of functional constraint of some FAM90A genes in all species. The characterization of the FAM90A gene family contributes to a better understanding of the structural polymorphism of the human 8p23.1 region and constitutes a good example of how SDs, CNVs and rearrangements within themselves can promote the formation of new gene sequences with potential functional consequences.
Abstract:
Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific "AS code" to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part (in human, more than a quarter) of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS.
Abstract:
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics 'In silico Biology: Gene Discovery and Systems Genomics', Atlanta, USA, 15-17 November, 2007.
Abstract:
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50%. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
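To make the nucleotide-level figures concrete, the following is a minimal sketch of how sensitivity and specificity might be computed from exon coordinates. It assumes the convention common in gene-prediction benchmarks, where specificity is the fraction of predicted coding nucleotides that are annotated; the function names and the inclusive `(start, end)` interval format are illustrative assumptions, not the EGASP evaluation code.

```python
def coding_positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(annotated_exons, predicted_exons):
    """Return (sensitivity, specificity) at the coding nucleotide level.

    Assumed definitions (hypothetical helper, not the EGASP software):
      sensitivity = TP / (TP + FN)  -> annotated nucleotides that were predicted
      specificity = TP / (TP + FP)  -> predicted nucleotides that are annotated
    """
    annotated = coding_positions(annotated_exons)
    predicted = coding_positions(predicted_exons)
    true_pos = len(annotated & predicted)
    false_neg = len(annotated - predicted)
    false_pos = len(predicted - annotated)
    sensitivity = true_pos / (true_pos + false_neg) if annotated else 0.0
    specificity = true_pos / (true_pos + false_pos) if predicted else 0.0
    return sensitivity, specificity


# Toy coordinates, invented for illustration only.
reference = [(1, 10), (21, 30)]
prediction = [(1, 10), (26, 30)]
sn, sp = nucleotide_accuracy(reference, prediction)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Working on sets of positions keeps the sketch short; for whole chromosomes an interval-based calculation would be more memory efficient.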
Abstract:
Selenoproteins are a diverse group of proteins usually misidentified and misannotated in sequence databases. The presence of an in-frame UGA (stop) codon in the coding sequence of selenoprotein genes precludes their identification and correct annotation. The in-frame UGA codons are recoded to cotranslationally incorporate selenocysteine, a rare selenium-containing amino acid. The development of ad hoc experimental and, more recently, computational approaches has allowed the efficient identification and characterization of the selenoproteomes of a growing number of species. Today, dozens of selenoprotein families have been described and more are being discovered in recently sequenced species, but the correct genomic annotation is not available for the majority of these genes. SelenoDB is a long-term project that aims to provide, through the collaborative effort of experimental and computational researchers, automatic and manually curated annotations of selenoprotein genes, proteins and SECIS elements. Version 1.0 of the database includes an initial set of eukaryotic genomic annotations, with special emphasis on the human selenoproteome, for immediate inspection by selenium researchers or incorporation into more general databases. SelenoDB is freely available at http://www.selenodb.org.
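The in-frame UGA problem can be illustrated with a small sketch that lists UGA codons occurring in frame before the final codon of a coding sequence. This is only a toy scan under stated assumptions (the input is already in frame and ends with the true stop codon); the function name is hypothetical, and actual selenoprotein prediction also relies on evidence such as SECIS elements, which this sketch does not model.

```python
def inframe_uga_positions(cds):
    """Return 0-based codon indices of in-frame UGA codons before the last codon.

    Assumptions (illustrative only): `cds` starts at the first codon position
    and its final codon is the genuine terminator, so that codon is skipped.
    DNA input is accepted and converted to RNA letters.
    """
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3          # drop any trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    return [index for index, codon in enumerate(codons[:-1]) if codon == "UGA"]


# Invented toy sequences: the first has an internal in-frame TGA,
# the second contains TGA only out of frame.
print(inframe_uga_positions("ATGTGAGGCTAA"))
print(inframe_uga_positions("ATGCTGAGGTAA"))
```

A gene finder that treats every UGA as a terminator would truncate the first sequence at codon index 1, which is the misannotation the abstract describes.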
Abstract:
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
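The Venn diagram step can be sketched as a partition of the predictions into mutually exclusive regions, one per combination of gene finders. The sketch below assumes predictions have already been reduced to comparable identifiers; the gene identifiers and the `venn_regions` helper are invented for illustration and do not reproduce the authors' procedure for deciding when two predictions overlap.

```python
from itertools import combinations


def venn_regions(named_sets):
    """Split several named sets into exclusive Venn regions.

    Returns a dict mapping a tuple of set names to the elements found in
    exactly those sets and in no other set.
    """
    names = sorted(named_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for members in combinations(names, size):
            inside = set.intersection(*(named_sets[n] for n in members))
            others = [named_sets[n] for n in names if n not in members]
            outside = set().union(*others)
            regions[members] = inside - outside
    return regions


# Hypothetical gene identifiers, not real chicken predictions.
predictions = {
    "Ensembl": {"g1", "g2", "g3"},
    "SGP2": {"g2", "g3", "g4"},
    "TWINSCAN": {"g3", "g4", "g5"},
}
for members, genes in venn_regions(predictions).items():
    print(" & ".join(members), sorted(genes))
```

Each region can then be sampled separately for experimental testing, for example genes found only by the de novo comparative programs.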
Abstract:
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.
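The idea of aligning sequences of factor labels rather than nucleotides can be illustrated with a deliberately simplified sketch. It uses a plain Needleman-Wunsch global alignment over labels and ignores site positions and scores, so it is a stand-in for, not a reproduction of, the adapted restriction-map algorithm described above. The scoring parameters and factor labels are arbitrary assumptions.

```python
def align_labels(map_a, map_b, match=2, mismatch=-1, gap=-1):
    """Globally align two lists of binding-factor labels (Needleman-Wunsch).

    Returns (score, pairs), where pairs lists the (index_a, index_b)
    positions whose labels are identical in the optimal alignment.
    Simplification: site coordinates are not used.
    """
    n, m = len(map_a), len(map_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if map_a[i - 1] == map_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover matched labels.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        same = map_a[i - 1] == map_b[j - 1]
        sub = match if same else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if same:
                pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical predicted-site labels for two orthologous promoters.
human = ["SP1", "TBP", "CEBP", "NFKB"]
mouse = ["SP1", "CEBP", "NFKB"]
print(align_labels(human, mouse))
```

Even when the underlying nucleotides differ, the shared order of labels (here SP1, CEBP, NFKB) is recovered, which is the intuition behind comparing TF-maps instead of raw sequence.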
Abstract:
BACKGROUND: Abiotrophia and Granulicatella species, previously referred to as nutritionally variant streptococci (NVS), are significant causative agents of endocarditis and bacteraemia. In this study, we reviewed the clinical manifestations of infections due to A. defectiva and Granulicatella species that occurred at our institution between 1998 and 2004. METHODS: The analysis included all strains of NVS that were isolated from blood cultures or vascular graft specimens. All strains were identified by 16S rRNA sequence analysis. Patients' medical charts were reviewed for each case of infection. RESULTS: Eleven strains of NVS were isolated during the 6-year period. Identification of the strains by 16S rRNA showed 2 genogroups: Abiotrophia defectiva (3) and Granulicatella adiacens (6) or "para-adiacens" (2). The three A. defectiva strains were isolated from immunocompetent patients with endovascular infections, whereas 7 of 8 Granulicatella spp. strains were isolated from immunosuppressed patients, mainly febrile neutropenic patients. We report the first case of "G. para-adiacens" bacteraemia in the setting of febrile neutropenia. CONCLUSION: We propose that Granulicatella spp. be considered as a possible agent of bacteraemia in neutropenic patients.
Abstract:
PHO1 has been recently identified as a protein involved in the loading of inorganic phosphate into the xylem of roots in Arabidopsis. The genome of Arabidopsis contains 11 members of the PHO1 gene family. The cDNAs of all PHO1 homologs have been cloned and sequenced. All proteins have the same topology and harbor a SPX tripartite domain in the N-terminal hydrophilic portion and an EXS domain in the C-terminal hydrophobic portion. The SPX and EXS domains have been identified in yeast (Saccharomyces cerevisiae) proteins involved in either phosphate transport or sensing or in sorting proteins to endomembranes. The Arabidopsis genome contains additional proteins of unknown function containing either a SPX or an EXS domain. Phylogenetic analysis indicated that the PHO1 family is subdivided into at least three clusters. Reverse transcription-PCR revealed a broad pattern of expression in leaves, roots, stems, and flowers for most genes, although two genes are expressed exclusively in flowers. Analysis of the activity of the promoter of all PHO1 homologs using promoter-beta-glucuronidase fusions revealed a predominant expression in the vascular tissues of roots, leaves, stems, or flowers. beta-Glucuronidase expression is also detected for several promoters in nonvascular tissue, including hydathodes, trichomes, root tip, root cortical/epidermal cells, and pollen grains. The expression pattern of PHO1 homologs indicates a likely role of the PHO1 proteins not only in the transfer of phosphate to the vascular cylinder of various tissues but also in the acquisition of phosphate into cells, such as pollen or root epidermal/cortical cells.
Abstract:
We report the draft genome sequence of the red harvester ant, Pogonomyrmex barbatus. The genome was sequenced using 454 pyrosequencing, and the current assembly and annotation were completed in less than 1 y. Analyses of conserved gene groups (more than 1,200 manually annotated genes to date) suggest a high-quality assembly and annotation comparable to recently sequenced insect genomes using Sanger sequencing. The red harvester ant is a model for studying reproductive division of labor, phenotypic plasticity, and sociogenomics. Although the genome of P. barbatus is similar to other sequenced hymenopterans (Apis mellifera and Nasonia vitripennis) in GC content and compositional organization, and possesses a complete CpG methylation toolkit, its predicted genomic CpG content differs markedly from the other hymenopterans. Gene networks involved in generating key differences between the queen and worker castes (e.g., wings and ovaries) show signatures of increased methylation and suggest that ants and bees may have independently co-opted the same gene regulatory mechanisms for reproductive division of labor. Gene family expansions (e.g., 344 functional odorant receptors) and pseudogene accumulation in chemoreception and P450 genes compared with A. mellifera and N. vitripennis are consistent with major life-history changes during the adaptive radiation of Pogonomyrmex spp., perhaps in parallel with the development of the North American deserts.
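One common way to summarize CpG content is the observed/expected (o/e) ratio, which compares the number of CpG dinucleotides with the number expected from the C and G frequencies alone. The sketch below assumes that measure; the abstract does not state which statistic was used, and the function name is illustrative.

```python
def cpg_observed_expected(sequence):
    """Return the CpG observed/expected ratio of a DNA sequence.

    Assumed formula (a widely used convention, not confirmed by the source):
        o/e = (count of CG dinucleotides * length) / (count of C * count of G)
    Returns 0.0 when the sequence has no C or no G.
    """
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # Count CG dinucleotides position by position.
    cpg_count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cpg_count * len(seq) / (c_count * g_count)


# Toy sequences for illustration only.
for toy in ("CCGG", "CGCGCG", "ACTG"):
    print(toy, cpg_observed_expected(toy))
```

Ratios well below 1 are usually read as CpG depletion, which is often associated with germline methylation, so the statistic is a convenient indirect indicator when comparing genomes.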
Abstract:
Urine samples from 20 male volunteers of European Caucasian origin were stored at 4 °C over a 4-month period in order to compare the identification potential of nuclear DNA (nDNA) and mitochondrial DNA (mtDNA) markers. The amount of nDNA recovered from the urine samples declined dramatically over time. Consequently, nDNA likelihood ratios (LRs) greater than 1,000 were obtained for 100, 70 and 55% of the urine samples analysed after 6, 60 and 120 days, respectively. For the mtDNA, HVI and HVII sequences were obtained for all samples tested, whatever the period considered. Nevertheless, the highest mtDNA LR of 435 was relatively low compared to its nDNA equivalent. Indeed, LRs obtained with only three nDNA loci could easily exceed this value and are considerably easier to obtain. Overall, the joint use of nDNA and mtDNA markers enabled the 20 urine samples to be identified, even after the 4-month period.
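The remark that three nDNA loci can exceed an LR of 435 follows from the product rule: for independent loci, per-locus LRs multiply. The sketch below assumes the simplest case, in which the LR for a matching genotype is the reciprocal of its population frequency; the genotype frequencies are invented for illustration and are not taken from the study.

```python
import math


def combined_likelihood_ratio(genotype_frequencies):
    """Combine per-locus likelihood ratios under the product rule.

    Assumptions (illustrative): loci are independent, and the LR for a
    matching genotype at one locus is 1 / (genotype frequency).
    """
    per_locus = [1.0 / frequency for frequency in genotype_frequencies]
    return math.prod(per_locus)


# Hypothetical genotype frequencies at three nuclear loci.
frequencies = [0.10, 0.05, 0.08]
lr = combined_likelihood_ratio(frequencies)
print(f"combined LR = {lr:.0f}")
print("exceeds the highest mtDNA LR of 435:", lr > 435)
```

Because mtDNA is inherited as a single linked unit, its haplotype yields one LR rather than a product over loci, which is one reason the mtDNA value stays comparatively low.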
Abstract:
We evaluated 25 protocol variants of 14 independent computational methods for exon identification, transcript reconstruction and expression-level quantification from RNA-seq data. Our results show that most algorithms are able to identify discrete transcript components with high success rates but that assembly of complete isoform structures poses a major challenge even when all constituent elements are identified. Expression-level estimates also varied widely across methods, even when based on similar transcript models. Consequently, the complexity of higher eukaryotic genomes imposes severe limitations on transcript recall and splice product discrimination that are likely to remain limiting factors for the analysis of current-generation RNA-seq data.
Abstract:
A selection gradient was recently suggested as one possible cause for a clinal distribution of mitochondrial DNA (mtDNA) haplotypes along an altitudinal transect in the greater white-toothed shrew, Crocidura russula (Ehinger et al. 2002). One mtDNA haplotype (H1), rare in the lowlands, became widespread when approaching the altitudinal margin of the distribution. As H1 differs from the main lowland haplotype by several nonsynonymous mutations (including on ATP6), and as mitochondria play a crucial role in metabolism and thermogenesis, distribution patterns might stem from differences in the thermogenic capacity of different mtDNA haplotypes. In order to test this hypothesis, we measured the nonshivering thermogenesis (NST) associated with different mtDNA haplotypes. Sixty-two shrews, half of which had the H1 haplotype, were acclimated in November at semioutdoor conditions and measured for NST throughout winter. Our results showed the crucial role of NST for winter survival in C. russula. The individuals that survived winter displayed a significantly higher increase in NST during acclimation, associated with a significant gain in body mass, presumably from brown fat accumulation. The NST capacity (ratio of NST to basal metabolic rate) was exceptionally high for such a small species. NST was significantly affected by a gender × haplotype interaction after winter acclimation: females bearing the H1 haplotype displayed a better thermogenesis at the onset of the breeding season, while the reverse was true for males. Altogether, our results suggest a sexually antagonistic cyto-nuclear selection on thermogenesis.