902 results for recurrent sequence
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Abstract:
The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.
Abstract:
The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation, and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. We then searched them against a prokaryotic sequence database using the program BLASTN to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data, and the method is applied to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
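The decomposition step described in this abstract can be illustrated with a small sketch. The abstract does not say how the two normal distributions are fitted, so the expectation-maximisation (EM) procedure below is a swapped-in, standard technique, not the authors' method. The window values are simulated, the function names are hypothetical, and treating the higher-mean component as the coding one is an assumption made only for this example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, iterations=100):
    """Fit a two-component normal mixture by EM.

    Returns two (weight, mean, sd) tuples, lower-mean component first.
    """
    n = len(values)
    ordered = sorted(values)
    half = n // 2
    # Crude start: means of the lower and upper halves, shared overall spread.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(values) / n
    spread = max(math.sqrt(sum((v - grand_mean) ** 2 for v in values) / n), 1e-6)
    sds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: probability that each window belongs to component 1.
        resp = []
        for v in values:
            p0 = weights[0] * normal_pdf(v, means[0], sds[0])
            p1 = weights[1] * normal_pdf(v, means[1], sds[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0.0 else 0.5)
        # M-step: re-estimate weights, means and standard deviations.
        n1 = max(sum(resp), 1e-9)
        n0 = max(n - n1, 1e-9)
        means = [
            sum((1.0 - r) * v for r, v in zip(resp, values)) / n0,
            sum(r * v for r, v in zip(resp, values)) / n1,
        ]
        var0 = sum((1.0 - r) * (v - means[0]) ** 2 for r, v in zip(resp, values)) / n0
        var1 = sum(r * (v - means[1]) ** 2 for r, v in zip(resp, values)) / n1
        sds = [max(math.sqrt(var0), 1e-6), max(math.sqrt(var1), 1e-6)]
        weights = [n0 / n, n1 / n]

    return sorted(zip(weights, means, sds), key=lambda c: c[1])


# Simulated coding-statistic values (an assumption, not data from the paper):
# 700 "noncoding" windows and 300 "coding" windows.
random.seed(0)
windows = [random.gauss(0.0, 1.0) for _ in range(700)]
windows += [random.gauss(4.0, 1.0) for _ in range(300)]

noncoding, coding = fit_two_normals(windows)
print(f"estimated fraction of coding windows: {coding[0]:.2f}")
```

Under these assumptions, the mixing weight of the higher-mean component plays the role of the estimated coding fraction of the windows.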
Abstract:
Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs on the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations for the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example of the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways. This is of particular interest when the SNP or mutation is known to be associated with disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.
Abstract:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used for proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.
Abstract:
Background: Single Nucleotide Polymorphisms, among other types of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found in databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in biomedical literature. The identification of the relevant documents and the extraction of the information from them are hampered by the large size of literature databases and the lack of widely accepted standard notation for biomedical entities. Thus, automatic systems for the identification of citations of allelic variants of genes in biomedical texts are required. Results: Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes (http://ibi.imim.es/osirisform.html). Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html) which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB, a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for the identification of variation terms in the texts and their mapping to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented. Conclusion: OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and therefore is suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases. The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.
Abstract:
Our objective was to describe the interventions aimed at preventing a recurrent hip fracture, and other injurious falls, which were provided during hospitalization for a first hip fracture and during the two following years. A secondary objective was to study some potential determinants of these preventive interventions. The design of the study was an observational, two-year follow-up of patients hospitalized for a first hip fracture at the University Hospital of Lausanne, Switzerland. The participants were 163 patients (median age 82 years, 83% women) hospitalized in 1991 for a first hip fracture, among 263 consecutively admitted patients (84 did not meet inclusion criteria, e.g., age>50, no cancer, no high-energy trauma, and 16 refused to participate). Preventive interventions included: medical investigations performed during the first hospitalization and aimed at revealing modifiable pathologies that raise the risk of injurious falls; use of medications acting on the risk of falls and fractures; preventive recommendations given by medical staff; suppression of environmental hazards; and use of home assistance services. The information was obtained from a baseline questionnaire, the medical record completed during the index hospitalization, and an interview conducted 2 years after the fracture. Potential predictors of the use of preventive interventions were: age; gender; destination after discharge from hospital; comorbidity; cognitive functioning; and activities of daily living. Bi- and multivariate associations between the preventive interventions and the potential predictors were measured. In-hospital investigations to rule out medical pathologies raising the risk of fracture were performed in only 20 patients (12%). Drugs raising the risk of falls were reduced in only 17 patients (16%). Preventive procedures not requiring active collaboration by the patient (e.g., modifications of the environment) were applied in 68 patients (42%), and home assistance was provided to 67 patients (85% of the patients living at home). Bivariate analyses indicated that prevention was less often provided to patients in poor general condition, but this association was not confirmed in multivariate analyses. In conclusion, this study indicates that, in the study setting, measures aimed at preventing recurrent falls and injuries were rarely provided to patients hospitalized for a first hip fracture at the time of the study. Tertiary prevention could be improved if a comprehensive geriatric assessment were systematically provided to the elderly patient hospitalized for a first hip fracture, and passive preventive measures implemented.
Abstract:
NovoTTF-100A (TTF) is a portable device delivering low-intensity, intermediate-frequency, alternating electric fields using noninvasive, disposable scalp electrodes. TTF interferes with tumor cell division, and it has been approved by the US Food and Drug Administration (FDA) for the treatment of recurrent glioblastoma (rGBM) based on data from a phase III trial. This presentation describes the updated survival data 2 years after completing recruitment. Adults with rGBM (KPS ≥ 70) were randomized (stratified by surgery and center) to either continuous TTF (20-24 h/day, 7 days/week) or efficacious chemotherapy based on best physician choice (BPC). The primary endpoint was overall survival (OS), and secondary endpoints were PFS6, 1-year survival, and QOL. Patients were randomized (28 US and European centers) to either TTF alone (n = 120) or BPC (n = 117). Patient characteristics were balanced, median age was 54 years (range, 23-80 years), and median KPS was 80 (range, 50-100). One quarter of the patients had debulking surgery, and over half of the patients were at their second or later recurrence. OS in the intent-to-treat (ITT) population was equivalent in TTF versus BPC patients (median OS, 6.6 vs. 6.0 months; n = 237; p = 0.26; HR = 0.86). With a median follow-up of 33.6 months, long-term survival in the TTF group was higher than that in the BPC group at 2, 3, and 4 years of follow-up (9.3% vs. 6.6%; 8.4% vs. 1.4%; 8.4% vs. 0.0%, respectively). Analysis of patients who received at least one treatment course demonstrated a survival benefit for TTF patients compared to BPC patients (median OS, 7.8 vs. 6.0 months; n = 93 vs. n = 117; p = 0.012; HR = 0.69). In this group, 1-year survival was 28% vs. 20%, and PFS6 was 26.2% vs. 15.2% (p = 0.034). TTF, a noninvasive, novel cancer treatment modality, shows significant therapeutic efficacy with promising long-term survival results. The impact of TTF was more pronounced when comparing only patients who received the minimal treatment course. A large-scale phase III trial in newly diagnosed GBM is ongoing.
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
Abstract:
The bacterial insertion sequence IS21 shares with many insertion sequences a two-step, reactive junction transposition pathway, for which a model is presented in this review: a reactive junction with abutted inverted repeats is first formed and subsequently integrated into the target DNA. The reactive junction occurs in IS21-IS21 tandems and IS21 minicircles. In addition, IS21 shows a unique specialization of transposition functions. By alternative translation initiation, the transposase gene codes for two products: the transposase, capable of promoting both steps of the reactive junction pathway, and the cointegrase, which only promotes the integration of reactive junctions but with higher efficiency. This review also includes a survey of the IS21 family and speculates on the possibility that other members present a similar transpositional specialization.
Abstract:
Epidemiological processes leave a fingerprint in the pattern of genetic structure of virus populations. Here, we provide a new method to infer epidemiological parameters directly from viral sequence data. The method is based on phylogenetic analysis using a birth-death model (BDM) rather than the commonly used coalescent as the model for the epidemiological transmission of the pathogen. Using the BDM has the advantage that transmission and death rates are estimated independently and therefore enables for the first time the estimation of the basic reproductive number of the pathogen using only sequence data, without further assumptions like the average duration of infection. We apply the method to genetic data of the HIV-1 epidemic in Switzerland.
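As a hedged illustration of why independent rate estimates matter here: in a simple birth-death model the basic reproductive number is commonly taken as the ratio of the transmission (birth) rate to the death (becoming-noninfectious) rate. The sketch below assumes that simple form; the function name and the numeric rates are hypothetical and are not results from the paper.

```python
def basic_reproductive_number(transmission_rate, death_rate):
    """Ratio of transmission rate to death rate in a simple birth-death model.

    Both rates must be expressed per the same unit of time.
    """
    if transmission_rate < 0 or death_rate <= 0:
        raise ValueError("transmission_rate must be >= 0 and death_rate > 0")
    return transmission_rate / death_rate


# Hypothetical rates per year, chosen only for illustration.
r0 = basic_reproductive_number(transmission_rate=0.5, death_rate=0.25)
print(r0)  # 2.0
```

With these made-up rates, each infection gives rise on average to two new infections, and the implied average infectious period is 1 / 0.25 = 4 years, a quantity that here follows from the estimated death rate and does not need to be assumed separately.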
Abstract:
Microtubule plus-end-tracking proteins (+TIPs) specifically localize to the growing plus-ends of microtubules to regulate microtubule dynamics and functions. A large group of +TIPs contain a short linear motif, SXIP, which is essential for them to bind to end-binding proteins (EBs) and target microtubule ends. The SXIP sequence site thus acts as a widespread microtubule tip localization signal (MtLS). Here we have analyzed the sequence-function relationship of a canonical MtLS. Using synthetic peptide arrays on membrane supports, we identified the residue preferences at each amino acid position of the SXIP motif and its surrounding sequence with respect to EB binding. We further developed an assay based on fluorescence polarization to assess the mechanism of the EB-SXIP interaction and to correlate EB binding and microtubule tip tracking of MtLS sequences from different +TIPs. Finally, we investigated the role of phosphorylation in regulating the EB-SXIP interaction. Together, our results define the sequence determinants of a canonical MtLS and provide the experimental data for bioinformatics approaches to carry out genome-wide predictions of novel +TIPs in multiple organisms.
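A first-pass scan for the SXIP motif mentioned in this abstract could look like the sketch below. It matches only the literal pattern (serine, any residue, isoleucine, proline) and ignores the residue preferences in the surrounding sequence that the abstract reports as relevant to EB binding, so hits are candidates at best. The function name and the example sequence are made up for illustration.

```python
import re

# Lookahead so that overlapping occurrences are also reported.
SXIP_PATTERN = re.compile(r"(?=(S[A-Z]IP))")


def find_sxip(protein_sequence):
    """Return (1-based position, matched motif) for each SXIP occurrence."""
    sequence = protein_sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SXIP_PATTERN.finditer(sequence)]


# Made-up peptide, not a real +TIP sequence.
print(find_sxip("MKSRIPAASKIPT"))  # [(3, 'SRIP'), (9, 'SKIP')]
```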
Abstract:
BACKGROUND AND AIM: Recurrent hepatitis C is a major cause of morbidity and mortality after liver transplantation (LT), and optimal treatment algorithms have yet to be defined. Here, we present our experience of the first 21 patients with recurrent hepatitis C treated in Lausanne. PATIENTS AND METHODS: Twenty-one patients with histology-proven recurrent hepatitis C after LT have been treated since 2003. Treatment was initiated with pegylated interferon-α2a 135 μg per week and ribavirin 400 mg per day in the majority of patients, and subsequent doses were adapted individually based on on-treatment virological responses and clinical and/or biochemical side effects. RESULTS: On an intention-to-treat basis, sustained virological response (SVR) was achieved in 12/21 (57%) patients (5/11 [45%], 2/3 [67%], 4/5 [80%] and 1/2 [50%] of patients infected with genotypes 1, 2, 3 and 4, respectively). Two patients experienced relapse and 6 did not respond to treatment (NR). Treatment duration ranged from 24 to 90 weeks. It was stopped prematurely due to adverse events in 5/21 (24%) patients (with SVR achieved in 2 patients, NR in 2 patients, and death of one patient awaiting re-transplantation). Of note, SVR was achieved in a patient with combined liver and kidney transplantation. Importantly, SVR was achieved in some patients despite the lack of an early virological response or HCV RNA negativity at week 24. Darbepoetin α and filgrastim were used in 33% and 14%, respectively. CONCLUSION: Individually adapted treatment of recurrent hepatitis C can achieve SVR in a substantial proportion of LT patients. Conventional stopping rules do not apply in this setting, so prolonged therapy may be useful in selected patients.