45 results for Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood
at Université de Lausanne, Switzerland
Abstract:
Expressed sequence tags (ESTs), available in large numbers in public and proprietary databases, are among the largest resources of biological sequence data. ESTs provide information on transcripts, but for technical reasons they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error-prone coding regions have shown good performance in detecting and predicting these regions while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with codon-usage-bias-based error correction into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.
Abstract:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
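The decoding step mentioned in this abstract (predicting where a motif occurs in a sequence) is commonly done with the Viterbi algorithm. The following is a minimal Python sketch of that idea, not MAMOT's implementation or file format. The two-state model (`background`, `motif`) and all probabilities are hypothetical values chosen for illustration.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state path for obs and its log-probability."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on the best path at step t + 1
    for symbol in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def logs(probabilities):
    """Convert a dict of probabilities to log space."""
    return {k: math.log(v) for k, v in probabilities.items()}

# Hypothetical toy model: a GC-rich "motif" state against a uniform background
STATES = ["background", "motif"]
LOG_START = logs({"background": 0.9, "motif": 0.1})
LOG_TRANS = {
    "background": logs({"background": 0.9, "motif": 0.1}),
    "motif": logs({"background": 0.2, "motif": 0.8}),
}
LOG_EMIT = {
    "background": logs({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}),
    "motif": logs({"A": 0.05, "C": 0.45, "G": 0.45, "T": 0.05}),
}

if __name__ == "__main__":
    path, score = viterbi("ATATGCGCGCGCATAT", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
    print(path, score)
```

Working in log space avoids numerical underflow on long sequences, which is why the sketch adds log-probabilities instead of multiplying probabilities.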
Abstract:
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1,000,000 hits from 462,500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.
Abstract:
BACKGROUND: In vitro aggregating brain cell cultures containing all types of brain cells have been shown to be useful for neurotoxicological investigations. The cultures are used for the detection of nervous system-specific effects of compounds by measuring multiple endpoints, including changes in enzyme activities. Concentration-dependent neurotoxicity is determined at several time points. METHODS: A Markov model was set up to describe the dynamics of brain cell populations exposed to potentially neurotoxic compounds. Brain cells were assumed to be either in a healthy or stressed state, with only stressed cells being susceptible to cell death. Cells may have switched between these states or died with concentration-dependent transition rates. Since cell numbers were not directly measurable, intracellular lactate dehydrogenase (LDH) activity was used as a surrogate. Assuming that changes in cell numbers are proportional to changes in intracellular LDH activity, stochastic enzyme activity models were derived. Maximum likelihood and least squares regression techniques were applied for estimation of the transition rates. Likelihood ratio tests were performed to test hypotheses about the transition rates. Simulation studies were used to investigate the performance of the transition rate estimators and to analyze the error rates of the likelihood ratio tests. The stochastic time-concentration activity model was applied to intracellular LDH activity measurements after 7 and 14 days of continuous exposure to propofol. The model describes transitions from healthy to stressed cells and from stressed cells to death. RESULTS: The model predicted that propofol would affect stressed cells more than healthy cells. Increasing propofol concentration from 10 to 100 μM reduced the mean waiting time for transition to the stressed state by 50%, from 14 to 7 days, whereas the mean duration to cellular death reduced more dramatically from 2.7 days to 6.5 hours. 
CONCLUSION: The proposed stochastic modeling approach can be used to discriminate between different biological hypotheses regarding the effect of a compound on the transition rates. The effects of different compounds on the transition rate estimates can be quantitatively compared. Data can be extrapolated to late measurement time points to investigate whether costly and time-consuming long-term experiments could possibly be eliminated.
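To make the reported waiting times concrete, here is a simplified Python sketch of a three-state continuous-time Markov chain (healthy, stressed, dead). It is not the authors' model: it assumes exponentially distributed waiting times, so each rate is the reciprocal of the reported mean, and it ignores the stressed-to-healthy transition that the abstract allows. The function name and the evaluation times are illustrative.

```python
import math

def surviving_fraction(t_days, mean_to_stress_days, mean_to_death_days):
    """Fraction of cells still alive (healthy or stressed) at time t_days.

    Assumes an irreversible chain healthy -> stressed -> dead with
    exponential waiting times and two different rates (a != b).
    """
    a = 1.0 / mean_to_stress_days  # healthy -> stressed rate (per day)
    b = 1.0 / mean_to_death_days   # stressed -> dead rate (per day)
    healthy = math.exp(-a * t_days)
    stressed = a / (b - a) * (math.exp(-a * t_days) - math.exp(-b * t_days))
    return healthy + stressed

# Mean waiting times reported in the abstract for 10 and 100 uM propofol
LOW_DOSE = {"mean_to_stress_days": 14.0, "mean_to_death_days": 2.7}
HIGH_DOSE = {"mean_to_stress_days": 7.0, "mean_to_death_days": 6.5 / 24.0}

if __name__ == "__main__":
    for day in (7, 14):
        low = surviving_fraction(day, **LOW_DOSE)
        high = surviving_fraction(day, **HIGH_DOSE)
        print(f"day {day}: 10 uM -> {low:.3f}, 100 uM -> {high:.3f}")
```

Under the abstract's assumption that intracellular LDH activity is proportional to cell number, the surviving fraction would correspond to the expected relative LDH activity.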
Abstract:
BACKGROUND: Membrane-bound organelles are a defining feature of eukaryotic cells, and play a central role in most of their fundamental processes. The Rab G proteins are the single largest family of proteins that participate in the traffic between organelles, with 66 Rabs encoded in the human genome. Rabs direct the organelle-specific recruitment of vesicle tethering factors, motor proteins, and regulators of membrane traffic. Each organelle or vesicle class is typically associated with one or more Rabs, with the Rabs present in a particular cell reflecting that cell's complement of organelles and trafficking routes. RESULTS: Through iterative use of hidden Markov models and tree building, we classified Rabs across the eukaryotic kingdom to provide the most comprehensive view of Rab evolution obtained to date. A strikingly large repertoire of at least 20 Rabs appears to have been present in the last eukaryotic common ancestor (LECA), consistent with the 'complexity early' view of eukaryotic evolution. We were able to place these Rabs into six supergroups, giving a deep view into eukaryotic prehistory. CONCLUSIONS: Tracing the fate of the LECA Rabs revealed extensive losses, with many extant eukaryotes having fewer Rabs and none having the full complement. We found that other Rabs have expanded and diversified, including a large expansion at the dawn of metazoans, which could be followed to provide an account of the evolutionary history of all human Rabs. Some Rab changes could be correlated with differences in cellular organization, and the relative lack of variation in other families of membrane-traffic proteins suggests that it is the changes in Rabs that primarily underlie the variation in organelles between species and cell types.
Abstract:
BACKGROUND: Low-molecular-weight heparin (LMWH) appears to be safe and effective for treating pulmonary embolism (PE), but its cost-effectiveness has not been assessed. METHODS: We built a Markov state-transition model to evaluate the medical and economic outcomes of a 6-day course with fixed-dose LMWH or adjusted-dose unfractionated heparin (UFH) in a hypothetical cohort of 60-year-old patients with acute submassive PE. Probabilities for clinical outcomes were obtained from a meta-analysis of clinical trials. Cost estimates were derived from Medicare reimbursement data and other sources. The base-case analysis used an inpatient setting, whereas secondary analyses examined early discharge and outpatient treatment with LMWH. Using a societal perspective, strategies were compared based on lifetime costs, quality-adjusted life-years (QALYs), and the incremental cost-effectiveness ratio. RESULTS: Inpatient treatment costs were higher for LMWH treatment than for UFH ($13,001 vs $12,780), but LMWH yielded a greater number of QALYs than did UFH (7.677 QALYs vs 7.493 QALYs). The incremental costs of $221 and the corresponding incremental effectiveness of 0.184 QALYs resulted in an incremental cost-effectiveness ratio of $1,209/QALY. Our results were highly robust in sensitivity analyses. LMWH became cost-saving if the daily pharmacy costs for LMWH were < $51, if ≥ 8% of patients were eligible for early discharge, or if ≥ 5% of patients could be treated entirely as outpatients. CONCLUSION: For inpatient treatment of PE, the use of LMWH is cost-effective compared to UFH. Early discharge or outpatient treatment in suitable patients with PE would lead to substantial cost savings.
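The incremental cost-effectiveness ratio (ICER) is the difference in cost divided by the difference in effectiveness. The short Python sketch below applies that definition to the rounded figures quoted in the abstract; the function and variable names are illustrative. With the rounded inputs the ratio comes out near 1,201 per QALY, slightly below the reported 1,209 per QALY, which was presumably computed from unrounded model outputs.

```python
def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("strategies have identical effectiveness; ICER is undefined")
    return (cost_new - cost_ref) / delta_effect

# Rounded figures from the abstract: LMWH (new strategy) vs UFH (reference)
LMWH_COST, UFH_COST = 13001.0, 12780.0   # lifetime costs in US dollars
LMWH_QALY, UFH_QALY = 7.677, 7.493       # quality-adjusted life-years

if __name__ == "__main__":
    ratio = icer(LMWH_COST, UFH_COST, LMWH_QALY, UFH_QALY)
    print(f"ICER from rounded inputs: {ratio:.0f} dollars per QALY")
```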
Abstract:
BACKGROUND: Physician training in smoking cessation counseling has been shown to be effective as a means to increase quit success. We assessed the cost-effectiveness ratio of a smoking cessation counseling training programme. Its effectiveness was previously demonstrated in a cluster-randomized controlled trial performed in two Swiss university outpatient clinics, in which residents were randomized to receive training in smoking interventions or a control educational intervention. DESIGN AND METHODS: We used a Markov simulation model for effectiveness analysis. This model incorporates the intervention efficacy, the natural quit rate, and the lifetime probability of relapse after 1-year abstinence. We used previously published results in addition to hospital service and outpatient clinic cost data. The time horizon was 1 year, and we opted for a third-party payer perspective. RESULTS: The incremental cost of the intervention amounted to US$2.58 per consultation by a smoker, translating into a cost per life-year saved of US$25.4 for men and US$35.2 for women. One-way sensitivity analyses yielded a range of US$4.0-107.1 in men and US$9.7-148.6 in women. Variations in the quit rate of the control intervention, the length of training effectiveness, and the discount rate yielded moderately large effects on the outcome. Variations in the natural cessation rate, the lifetime probability of relapse, the cost of physician training, the counseling time, the cost per hour of physician time, and the cost of the booklets had little effect on the cost-effectiveness ratio. CONCLUSIONS: Training residents in smoking cessation counseling is a very cost-effective intervention and may be more efficient than currently accepted tobacco control interventions.
Abstract:
BACKGROUND/AIMS: While several risk factors for the histological progression of chronic hepatitis C have been identified, the contribution of HCV genotypes to liver fibrosis evolution remains controversial. The aim of this study was to assess independent predictors for fibrosis progression. METHODS: We identified 1189 patients from the Swiss Hepatitis C Cohort database with at least one biopsy prior to antiviral treatment and assessable date of infection. Stage-constant fibrosis progression rate was assessed using the ratio of fibrosis Metavir score to duration of infection. Stage-specific fibrosis progression rates were obtained using a Markov model. Risk factors were assessed by univariate and multivariate regression models. RESULTS: Independent risk factors for accelerated stage-constant fibrosis progression (>0.083 fibrosis units/year) included male sex (OR=1.60, [95% CI 1.21-2.12], P<0.001), age at infection (OR=1.08, [1.06-1.09], P<0.001), histological activity (OR=2.03, [1.54-2.68], P<0.001) and genotype 3 (OR=1.89, [1.37-2.61], P<0.001). Slower progression rates were observed in patients infected by blood transfusion (P=0.02) and invasive procedures or needle stick (P=0.03), compared to those infected by intravenous drug use. Maximum likelihood estimates (95% CI) of stage-specific progression rates (fibrosis units/year) for genotype 3 versus the other genotypes were: F0→F1: 0.126 (0.106-0.145) versus 0.091 (0.083-0.100), F1→F2: 0.099 (0.080-0.117) versus 0.065 (0.058-0.073), F2→F3: 0.077 (0.058-0.096) versus 0.068 (0.057-0.080) and F3→F4: 0.171 (0.106-0.236) versus 0.112 (0.083-0.142) (overall P<0.001). CONCLUSIONS: This study shows a significant association of genotype 3 with accelerated fibrosis using both stage-constant and stage-specific estimates of fibrosis progression rates. This observation may have important consequences for the management of patients infected with this genotype.
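The two kinds of progression rate in this abstract can be illustrated with a short Python sketch. It is an interpretation, not the authors' analysis: it assumes each stage-specific rate acts as the exit rate of a continuous-time Markov chain with one fibrosis unit per stage, so the mean time spent in a stage is the reciprocal of its rate. The point estimates are taken from the abstract; the function names and the example patient are hypothetical.

```python
# Stage-specific point estimates (fibrosis units/year) quoted in the abstract
GENOTYPE_3 = {"F0->F1": 0.126, "F1->F2": 0.099, "F2->F3": 0.077, "F3->F4": 0.171}
OTHER_GENOTYPES = {"F0->F1": 0.091, "F1->F2": 0.065, "F2->F3": 0.068, "F3->F4": 0.112}

ACCELERATED_THRESHOLD = 0.083  # fibrosis units/year, as used in the abstract

def stage_constant_rate(metavir_score, years_infected):
    """Stage-constant rate: Metavir fibrosis score divided by duration of infection."""
    return metavir_score / years_infected

def expected_years_to_f4(stage_rates):
    """Expected time from F0 to F4, assuming exponential sojourn times (mean 1/rate)."""
    return sum(1.0 / rate for rate in stage_rates.values())

if __name__ == "__main__":
    # Hypothetical patient: Metavir F2 after 20 years of infection
    rate = stage_constant_rate(2, 20)
    print(f"stage-constant rate: {rate:.3f}, accelerated: {rate > ACCELERATED_THRESHOLD}")
    print(f"genotype 3: {expected_years_to_f4(GENOTYPE_3):.1f} years to F4")
    print(f"other genotypes: {expected_years_to_f4(OTHER_GENOTYPES):.1f} years to F4")
```

Because the rates differ from stage to stage, the stage-constant ratio is only an average and can hide the faster F0→F1 and F3→F4 steps.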
Abstract:
Models of codon evolution have attracted particular interest because of their unique capabilities to detect selection forces and their high fit when applied to sequence evolution. We describe here a novel approach for modeling codon evolution, which is based on the Kronecker product of matrices. The 61 × 61 codon substitution rate matrix is created using the Kronecker product of three 4 × 4 nucleotide substitution matrices, the equilibrium frequency of codons, and the selection rate parameter. The entries of the nucleotide substitution matrices and the selection rate are considered as parameters of the model, which are optimized by maximum likelihood. Our fully mechanistic model allows the instantaneous substitution matrix between codons to be fully estimated with only 19 parameters instead of 3,721, by using the biological interdependence existing between positions within codons. We illustrate the properties of our model using computer simulations and assess its relevance by comparing the AICc measures of our model and other models of codon evolution on simulations and a large range of empirical data sets. We show that our model fits most biological data better than current codon models. Furthermore, the parameters in our model can be interpreted in a similar way as the exchangeability rates found in empirical codon models.
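The dimension argument in this abstract can be checked with a small Python sketch. It shows only the Kronecker-product step: three 4 × 4 matrices combine into a 64 × 64 matrix, which shrinks to 61 × 61 once the three stop codons of the standard genetic code are removed (61 × 61 = 3,721 entries). The equilibrium frequencies and the selection parameter of the published model are not reproduced here, and the placeholder matrix values are arbitrary.

```python
from itertools import product

NUCLEOTIDES = "TCAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code

def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def codon_matrix(pos1, pos2, pos3):
    """Combine three 4x4 per-position matrices into a sense-codon matrix.

    Entry (i, j) is the product of the three per-position entries for the
    nucleotides of codon i and codon j; stop codons are dropped afterwards.
    """
    full = kron(kron(pos1, pos2), pos3)  # 64 x 64, first codon position varies slowest
    codons = ["".join(c) for c in product(NUCLEOTIDES, repeat=3)]
    keep = [i for i, codon in enumerate(codons) if codon not in STOP_CODONS]
    sense_codons = [codons[i] for i in keep]
    return sense_codons, [[full[i][j] for j in keep] for i in keep]

# Arbitrary placeholder values: 0.7 on the diagonal, 0.1 elsewhere
PLACEHOLDER = [[0.7 if i == j else 0.1 for j in range(4)] for i in range(4)]

if __name__ == "__main__":
    sense, matrix = codon_matrix(PLACEHOLDER, PLACEHOLDER, PLACEHOLDER)
    print(len(sense), "sense codons;", len(matrix) * len(matrix[0]), "matrix entries")
```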
Abstract:
BACKGROUND: Available methods to simulate nucleotide or amino acid data typically use Markov models to simulate each position independently. These approaches are not appropriate to assess the performance of combinatorial and probabilistic methods that look for coevolving positions in nucleotide or amino acid sequences. RESULTS: We have developed a web-based platform that gives user-friendly access to two phylogenetic-based methods implementing the Coev model: the evaluation of coevolving scores and the simulation of coevolving positions. We have also extended the capabilities of the Coev model to allow for the generalization of the alphabet used in the Markov model, which can now analyse both nucleotide and amino acid data sets. The simulation of coevolving positions is novel and builds upon the developments of the Coev model. It allows users to simulate pairs of dependent nucleotide or amino acid positions. CONCLUSIONS: The main focus of our paper is the new simulation method we present for coevolving positions. The implementation of this method is embedded within the web platform Coev-web, which is freely accessible at http://coev.vital-it.ch/ and was tested in most modern web browsers.
Abstract:
Monoclonal IgG are commonly observed in various B cell disorders, of which multiple myeloma is the most clinically relevant. In a series of serum samples, we identified by immunofixation 73 monoclonal IgG, including 63 IgG(1), 4 IgG(2), 5 IgG(3), and 1 IgG(4). The light chains were of kappa type in 45 cases, and of lambda type in 28 cases. These monoclonal IgG were further characterized by high resolution two-dimensional polyacrylamide gel electrophoresis (2-DE) in various isoelectric focusing conditions, as well as by 3-DE (2-DE of the proteins extracted from agarose after serum protein agarose electrophoresis). After 2-DE, 38 out of 73 monoclonal gamma chains (52%) were visualized using immobilized pH 3-10 gradients for isoelectric focusing. In 6 cases (8%), gamma chains were only detected using alkaline immobilized pH 6-11 gradients. In 3 cases (4%), 3-DE revealed monoclonal gamma chains hidden by polyclonal gamma chains. Finally, in 26 cases (36%), no monoclonal gamma chains were clearly visualized. Sixty-one monoclonal light chains (84%) were detected using immobilized pH 3-10 gradients, whereas 12 (16%) were not. Monoclonal gamma chains and light chains were highly heterogeneous in terms of pI and M(r). However, a statistically significant correlation (P<0.05) was observed between the position of the monoclonal IgG in agarose gel and the pI of their heavy and light chains (R=0.733, multiple linear regression). Because of the extreme diversity of their heavy and light chains, it appears that a classification of monoclonal IgG based only on their electrophoretic properties is not possible.
Abstract:
BACKGROUND: Objective measurement of surgical outcomes is inherent to quality control of professional practice; in orthopaedics, this applies especially to joint replacement outcomes. A self-administered questionnaire offers an attractive alternative to the surgeon's judgement but is infrequently used in France for these purposes. The British questionnaire, the 12-item Oxford Hip Score (OHS), was selected for this study because of its ease of use. HYPOTHESIS: The objective of this study was to validate the French translation of the self-assessment 12-item Oxford Hip Score and compare its results with those of the reference functional scores: the Harris Hip Score (HHS) and the Postel-Merle d'Aubigné (PMA) score. MATERIALS AND METHODS: Based on a clinical series of 242 patients who were candidates for total hip arthroplasty, the French translation of this questionnaire was validated. Its coherence was also validated by comparing the preoperative data with the data obtained from the two other reference clinical scores. RESULTS: The translation was validated using the forward-backward translation procedure from French to English, with correction of all differences or mistranslations after systematized comparison with the original questionnaire in English. The mean overall OHS score was 43.8 points (range, 22-60 points) with similarly good distribution of the overall value of the three scores compared. The correlation was excellent between the OHS and the HHS, but an identical correlation between the OHS and the PMA was only obtained for the association of the pain and function parameters, after excluding the mobility criterion, relatively over-represented in the PMA score. DISCUSSION AND CONCLUSION: Subjective questionnaires that contribute a personal appreciation of the results of arthroplasty by the patient can easily be applied on a large scale. This study made a translated and validated version of an internationally recognized, reliable self-assessment score available to French orthopaedic surgeons. The results obtained encourage us to use this questionnaire as a complement to the classical evaluation scores and methods.
Abstract:
Gene duplication and neofunctionalization are known to be important processes in the evolution of phenotypic complexity. They account for important evolutionary novelties that confer ecological adaptation, such as the major histocompatibility complex (MHC), a multigene family crucial to the vertebrate immune system. In birds, two MHC class II β (MHCIIβ) exon 3 lineages have been recently characterized, and two hypotheses for the evolutionary history of MHCIIβ lineages were proposed. These lineages could have arisen either by 1) an ancient duplication and subsequent divergence of one paralog or by 2) recent parallel duplications followed by functional convergence. Here, we compiled a data set consisting of 63 MHCIIβ exon 3 sequences from six avian orders to distinguish between these hypotheses and to understand the role of selection in the divergent evolution of the two avian MHCIIβ lineages. Based on phylogenetic reconstructions and simulations, we show that a unique duplication event preceding the major avian radiations gave rise to two ancestral MHCIIβ lineages that were each likely lost once later during avian evolution. Maximum likelihood estimation shows that following the ancestral duplication, positive selection drove a radical shift from basic to acidic amino acid composition of a protein domain facing the α-chain in the MHCII α β-heterodimer. Structural analyses of the MHCII α β-heterodimer highlight that three of these residues are potentially involved in direct interactions with the α-chain, suggesting that the shift following duplication may have been accompanied by coevolution of the interacting α- and β-chains. These results provide new insights into the long-term evolutionary relationships among avian MHC genes and open interesting perspectives for comparative and population genomic studies of avian MHC evolution.