983 resultados para Statistical Prediction


Relevância: 40.00%

Resumo:

Statistical tests of Load-Unload Response Ratio (LURR) signals are carried out to verify the statistical robustness of previous studies using the Lattice Solid Model (MORA et al., 2002b). In each case, 24 groups of samples with the same macroscopic parameters (tidal perturbation amplitude A, period T and tectonic loading rate k) but different particle arrangements are employed. Results of uni-axial compression experiments show that before the normalized time of catastrophic failure, the ensemble average LURR value rises significantly, in agreement with the observations of high LURR prior to large earthquakes. In shearing tests, two parameters are found to control the correlation between earthquake occurrence and tidal stress. The first, A/(kT), controls the phase shift between the peak seismicity rate and the peak amplitude of the perturbation stress; as this parameter increases, the phase shift decreases. The second, AT/k, controls the height of the probability density function (PDF) of the modeled seismicity; as this parameter increases, the PDF becomes sharper and narrower, indicating strong triggering. Statistical studies of LURR signals in shearing tests also suggest that, except in strong-triggering cases where LURR cannot be calculated due to poor data in unloading cycles, larger events are more likely than smaller ones to occur in high-LURR periods, supporting the LURR hypothesis.
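As a rough illustration of the quantity under test, the sketch below computes a toy LURR value from a catalog of event energies and tidal phases. The phase split, the use of Benioff strain (the square root of event energy) as the response measure, and the function name are illustrative assumptions, not the exact definition used in the study.

```python
import math

def lurr(event_energies, event_phases):
    """Toy Load-Unload Response Ratio: summed Benioff strain (sqrt of
    event energy) released during the loading half of a periodic
    perturbation divided by that of the unloading half.

    Phases are in [0, 2*pi); treating the first half-cycle as loading
    is a simplifying assumption of this sketch."""
    load = sum(math.sqrt(e) for e, p in zip(event_energies, event_phases)
               if p < math.pi)
    unload = sum(math.sqrt(e) for e, p in zip(event_energies, event_phases)
                 if p >= math.pi)
    if unload == 0:            # poor data in unloading cycles (strong triggering)
        return float("inf")    # LURR cannot be calculated, as noted above
    return load / unload
```

A rising ensemble average of this ratio ahead of failure is the signal the abstract describes.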

Relevância: 40.00%

Resumo:

The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek), with peptides of up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms (q2, SEP, and NC) ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and between 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
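The leave-one-out statistics quoted above can be computed from observed affinities and their cross-validation predictions. This sketch assumes the usual definitions (q2 from the PRESS statistic, SEP as the root-mean-square prediction error); the divisor n in SEP is an assumption, since conventions vary between n and n − 1.

```python
import math

def q2_sep(observed, loo_predicted):
    """Leave-one-out cross-validation statistics: q2 (cross-validated r2)
    and SEP (standard error of prediction)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # PRESS: predictive residual sum of squares over the LOO predictions
    press = sum((o - p) ** 2 for o, p in zip(observed, loo_predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    q2 = 1.0 - press / ss_tot
    sep = math.sqrt(press / n)   # divisor n assumed; some authors use n - 1
    return q2, sep
```

Values of q2 in the 0.7-0.9 range, as reported above, indicate models that generalize well under leave-one-out resampling.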

Relevância: 30.00%

Resumo:

The objective of the present work is to propose a numerical and statistical approach, using computational fluid dynamics, for the study of atmospheric pollutant dispersion. Modifications to the standard k-epsilon turbulence model and additional equations for the calculation of the variance of concentration are introduced to enhance the prediction of the flow field and scalar quantities. The flow field, the mean concentration and the variance of a flow over a two-dimensional triangular hill, with a finite-size point pollutant source, are calculated by a finite volume code and compared with published experimental results. A modified low-Reynolds-number k-epsilon turbulence model was employed, with the model constant set to C_mu = 0.03 to take into account the inactive atmospheric turbulence. The numerical results for the velocity profiles and the position of the reattachment point are in good agreement with the experimental results, as are the results for the mean and the variance of the concentration.
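For context, the role of the constant mentioned above can be shown with the standard k-epsilon eddy-viscosity relation, mu_t = rho * C_mu * k^2 / epsilon. The function below is a minimal sketch of that relation; the air density value is an assumption for illustration, not a parameter reported in the paper.

```python
def eddy_viscosity(k, epsilon, c_mu=0.03, rho=1.225):
    """Turbulent (eddy) viscosity mu_t = rho * C_mu * k**2 / epsilon.

    C_mu is lowered from the standard 0.09 to 0.03, as in the abstract,
    to account for inactive atmospheric turbulence. k is the turbulent
    kinetic energy (m^2/s^2), epsilon its dissipation rate (m^2/s^3),
    and rho an assumed air density (kg/m^3); mu_t is in Pa*s."""
    return rho * c_mu * k ** 2 / epsilon
```

Lowering C_mu reduces the modeled eddy viscosity by a factor of three relative to the standard model, which is the intended effect for weakly active atmospheric turbulence.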

Relevância: 30.00%

Resumo:

This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing tradition of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, the last being a relatively new idea that allows individual assessment of predictions. The integrated framework comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, in which non-parametric methods such as decision trees and generalized additive models are used to identify important variables and their relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. The paper is motivated by a medical problem in which interest centres on developing a risk stratification system for morbidity in 1,710 cardiac patients, given a suite of demographic, clinical and preoperative variables. Although the methods are applied specifically to this case study, they can be applied in any field, irrespective of the type of response.

Relevância: 30.00%

Resumo:

Low-noise surfaces have been increasingly considered as a viable and cost-effective alternative to acoustical barriers. However, road planners and administrators frequently lack information on the correlation between the type of road surface and the resulting noise emission profile. To address this problem, a method to identify and classify different types of road pavements was developed, whereby near-field road noise is analyzed using statistical learning methods. The vehicle rolling sound signal near the tires and close to the road surface was acquired by two microphones in a special arrangement implementing the Close-Proximity method. A set of features characterizing the properties of the road pavement was extracted from the corresponding sound profiles. A feature selection method was used to automatically select those most relevant in predicting the type of pavement, while reducing the computational cost. A set of road segments with different pavement types was tested and the performance of the classifier was evaluated. Results of pavement classification performed during a road journey are presented on a map, together with geographical data. This procedure leads to a considerable improvement in the quality of road pavement noise data, thereby increasing the accuracy of road traffic noise prediction models.
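A minimal sketch of the pipeline described above, assuming (for illustration only) two toy acoustic features and a nearest-centroid classifier in place of the richer feature set and statistical learner used in the study:

```python
import math

def features(signal):
    """Toy feature vector for a rolling-noise segment: RMS level and
    zero-crossing rate. The study extracts a much larger feature set
    from Close-Proximity recordings; these two stand in for the idea."""
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    zcr = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0) / (len(signal) - 1)
    return (rms, zcr)

def classify(segment, centroids):
    """Nearest-centroid stand-in for the trained classifier: assign the
    segment to the pavement class whose feature centroid is closest."""
    f = features(segment)
    return min(centroids, key=lambda c: math.dist(f, centroids[c]))
```

In the study, such per-segment predictions are then georeferenced and plotted on a map along the journey.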

Relevância: 30.00%

Resumo:

Data Mining (DM) methods are increasingly being used for prediction with time series data, in addition to traditional statistical approaches. This paper presents a literature review of the use of DM with time series data, focusing on short-term stock prediction, an area that has been attracting a great deal of attention from researchers in the field. The main contribution is an outline of the use of DM with time series data, drawing mainly on examples related to short-term stock prediction, which is important for a better understanding of the field. Some of the main trends and open issues are also introduced.
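A common prerequisite for applying DM methods to stock series is recasting the series as a supervised learning problem via a sliding window: each row holds a few lagged values and the target is the next value. The helper below is a generic sketch of that framing, not code from any reviewed work.

```python
def sliding_windows(series, width):
    """Turn a univariate time series into a supervised learning table:
    each input row is `width` consecutive lagged values and the target
    is the value that follows them."""
    X, y = [], []
    for i in range(len(series) - width):
        X.append(series[i:i + width])   # lagged inputs
        y.append(series[i + width])     # next-step target
    return X, y
```

Once framed this way, any standard regressor or classifier can be trained to predict the next observation.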

Relevância: 30.00%

Resumo:

CONTEXT: Several genetic risk scores to identify asymptomatic subjects at high risk of developing type 2 diabetes mellitus (T2DM) have been proposed, but it is unclear whether they add extra information to risk scores based on clinical and biological data. OBJECTIVE: The objective of the study was to assess the extra clinical value of genetic risk scores in predicting the occurrence of T2DM. DESIGN: This was a prospective study, with a mean follow-up time of 5 yr. SETTING AND SUBJECTS: The study included 2824 nondiabetic participants (1548 women, 52 ± 10 yr). MAIN OUTCOME MEASURE: Six genetic risk scores for T2DM were tested. Four were derived from the literature and two were created combining all (n = 24) or shared (n = 9) single-nucleotide polymorphisms of the previous scores. A previously validated clinic + biological risk score for T2DM was used as reference. RESULTS: Two hundred seven participants (7.3%) developed T2DM during follow-up. On bivariate analysis, no differences were found for all but one genetic score between nondiabetic and diabetic participants. After adjusting for the validated clinic + biological risk score, none of the genetic scores improved discrimination, as assessed by changes in the area under the receiver-operating characteristic curve (range -0.4 to -0.1%), sensitivity (-2.9 to -1.0%), specificity (0.0-0.1%), and positive (-6.6 to +0.7%) and negative (-0.2 to 0.0%) predictive values. Similarly, no improvement in T2DM risk prediction was found: net reclassification index ranging from -5.3 to -1.6% and nonsignificant (P ≥ 0.49) integrated discrimination improvement. CONCLUSIONS: In this study, adding genetic information to a previously validated clinic + biological score does not seem to improve the prediction of T2DM.
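The discrimination comparison described above rests on the area under the receiver-operating characteristic curve. A minimal sketch of its Mann-Whitney formulation, applied to the risk scores of incident cases and non-cases, is given below; the function name and data layout are assumptions.

```python
def auc(scores_cases, scores_controls):
    """Area under the ROC curve via the Mann-Whitney statistic: the
    probability that a randomly chosen case scores higher than a
    randomly chosen control, with ties counting one half. Comparing
    this value with and without the genetic score added is the
    discrimination test the abstract reports as showing no gain."""
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in scores_cases for n in scores_controls)
    return wins / (len(scores_cases) * len(scores_controls))
```

An AUC change near zero (the abstract reports −0.4 to −0.1%) means the genetic scores reorder essentially no case-control pairs.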

Relevância: 30.00%

Resumo:

BACKGROUND AND PURPOSE: The study aims to assess the recanalization rate in acute ischemic stroke patients who received no revascularization therapy, intravenous thrombolysis, or endovascular treatment, respectively, and to identify the best clinical and imaging predictors of recanalization in each treatment group. METHODS: Clinical and imaging data were collected in 103 patients with acute ischemic stroke caused by anterior circulation arterial occlusion. We recorded demographics and vascular risk factors. We reviewed the noncontrast head computed tomographies to assess for a hyperdense middle cerebral artery and its computed tomography density. We reviewed the computed tomography angiograms and the raw images to determine the site and degree of arterial occlusion, collateral score, clot burden score, and the density of the clot. Recanalization status was assessed on recanalization imaging using the Thrombolysis in Myocardial Ischemia (TIMI) scale. Multivariate logistic regressions were used to determine the best predictors of outcome in each treatment group. RESULTS: Among the 103 study patients, 43 (42%) received intravenous thrombolysis, 34 (33%) received endovascular thrombolysis, and 26 (25%) did not receive any revascularization therapy. In the patients with intravenous thrombolysis or no revascularization therapy, recanalization of the vessel was more likely with intravenous thrombolysis (P = 0.046) and when M1/A1 was occluded (P = 0.001). In this subgroup, clot burden score, cervical degree of stenosis (North American Symptomatic Carotid Endarterectomy Trial), and hyperlipidemia status added information to the likelihood of recanalization at the patient level (P < 0.001). In patients with endovascular thrombolysis, recanalization of the vessel was more likely with a higher computed tomography angiogram clot density (P = 0.012), and in this subgroup gender added information to the likelihood of recanalization at the patient level (P = 0.044). CONCLUSION: The overall likelihood of recanalization was highest in the endovascular group, and higher with intravenous thrombolysis than with no revascularization therapy. However, our statistical models of recanalization for each individual patient indicate significant variability between treatment options, suggesting the need to include this prediction in personalized treatment selection.
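The multivariate logistic regressions mentioned above yield patient-level probabilities of recanalization. The sketch below shows the generic form of such a model; all coefficient values in the usage example are illustrative, not the study's estimates.

```python
import math

def recanalization_probability(coefficients, intercept, predictors):
    """Generic multivariate logistic model:
    P = 1 / (1 + exp(-(b0 + sum_i b_i * x_i))).

    `predictors` would hold per-patient values such as clot burden
    score or clot density; coefficients come from the fitted model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-z))
```

With fitted coefficients, such a model provides the per-patient recanalization prediction that the conclusion proposes including in treatment selection.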

Relevância: 30.00%

Resumo:

Metabolic problems lead to numerous failures during clinical trials, and much effort is now devoted to developing in silico models predicting metabolic stability and metabolites. Such models are well known for cytochromes P450 and some transferases, whereas less has been done to predict the activity of human hydrolases. The present study was undertaken to develop a computational approach able to predict the hydrolysis of novel esters by human carboxylesterase hCES2. The study involved first a homology modeling of the hCES2 protein based on the model of hCES1, since the two proteins share a high degree of homology (approximately 73%). A set of 40 known substrates of hCES2 was taken from the literature; the ligands were docked in both their neutral and ionized forms using GriDock, a parallel tool based on the AutoDock4.0 engine which can perform efficient and easy virtual screening analyses of large molecular databases exploiting multi-core architectures. Useful statistical models (e.g., r^2 = 0.91 for substrates in their unprotonated state) were calculated by correlating experimental pKm values with the distance between the carbon atom of the substrate's ester group and the hydroxy function of Ser228. Additional parameters in the equations accounted for hydrophobic and electrostatic interactions between substrates and contributing residues. The negatively charged residues in the hCES2 cavity explained the preference of the enzyme for neutral substrates and, more generally, suggested that ligands which interact too strongly by ionic bonds (e.g., ACE inhibitors) cannot be good CES2 substrates because they are trapped in the cavity in unproductive modes and behave as inhibitors. The effects of protonation on substrate recognition and the contrasting behavior of substrates and products were finally investigated by MD simulations of some CES2 complexes.

Relevância: 30.00%

Resumo:

OBJECTIVE. The main goal of this paper is to obtain a classification model based on feed-forward multilayer perceptrons in order to improve postpartum depression prediction during the 32 weeks after childbirth with high sensitivity and specificity, and to develop a tool to be integrated in a decision support system for clinicians. MATERIALS AND METHODS. Multilayer perceptrons were trained on data from 1397 women who had just given birth at seven Spanish general hospitals, including clinical, environmental and genetic variables. A prospective cohort study was conducted just after delivery, at 8 weeks and at 32 weeks after delivery. The models were evaluated with the geometric mean of accuracies using a hold-out strategy. RESULTS. Multilayer perceptrons showed good performance (high sensitivity and specificity) as predictive models for postpartum depression. CONCLUSIONS. The use of these models in a decision support system can be clinically evaluated in future work. Analysis of the models by pruning leads to a qualitative interpretation of the influence of each variable, which is of interest for clinical protocols.
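The evaluation metric named above, the geometric mean of accuracies (here taken as the per-class accuracies, i.e. sensitivity and specificity), can be computed from a confusion matrix. A minimal sketch, assuming the usual two-class definitions:

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity from confusion-
    matrix counts. It penalizes models that favor the majority class,
    which suits the class imbalance typical of depression screening."""
    sensitivity = tp / (tp + fn)   # recall on depressed cases
    specificity = tn / (tn + fp)   # recall on non-depressed cases
    return math.sqrt(sensitivity * specificity)
```

Unlike raw accuracy, this score collapses to zero if either class is entirely misclassified, which is why it is a sensible hold-out criterion here.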

Relevância: 30.00%

Resumo:

The main objective of the project was to develop conceptual and methodological improvements allowing better prediction of changes in species distributions (at a landscape scale) arising from environmental change in a disturbance-dominated context. In a first study, we compared the performance of different dynamic models for predicting the distribution of the ortolan bunting (Emberiza hortulana). Our results indicate that a hybrid model combining changes in habitat quality, derived from landscape changes, with a spatially explicit population model is a suitable approach for addressing changes in species distributions in contexts of high environmental dynamism and limited dispersal capacity of the target species. In a second study, we addressed the calibration, using monitoring data, of dynamic distribution models for 12 open-habitat species. Among the conclusions drawn, we highlight: (1) the need for monitoring data to cover the areas where quality changes occur; (2) the bias produced in the estimation of the occupancy-model parameters when the landscape-change hypothesis or the habitat-quality model is incorrect. In the last study, we assessed the potential impact on 67 bird species of different fire regimes, defined as combinations of levels of climate change (leading to an expected increase in the size and frequency of forest fires) and of firefighting suppression efficiency. According to our models, anthropogenic components of the fire regime, such as rural abandonment and fire suppression, may be more decisive for distribution changes than the effects derived from climate change. The products generated include three scientific publications, a web page with project results, and a library for the R statistical environment.

Relevância: 30.00%

Resumo:

Objectives The relevance of the SYNTAX score in the particular case of patients with acute ST-segment elevation myocardial infarction (STEMI) undergoing primary percutaneous coronary intervention (PPCI) has previously been studied only in post hoc analyses of large prospective randomized clinical trials; a "real-life" population approach has never been explored before. The aim of this study was to evaluate the SYNTAX score as a predictor of myocardial infarction size, estimated by the creatine kinase (CK) peak value, in patients treated with PPCI for acute STEMI. Methods The primary endpoint of the study was myocardial infarction size as measured by the CK peak value. The SYNTAX score was calculated retrospectively in 253 consecutive STEMI patients undergoing PPCI in a large tertiary referral center in Switzerland between January 2009 and June 2010. Linear regression analysis was performed to compare myocardial infarction size with the SYNTAX score. The endpoint was then stratified according to SYNTAX score tertiles: low <22 (n=178), intermediate 22-32 (n=60), and high >=33 (n=15). Results There were no significant differences in clinical characteristics between the three groups. When stratified according to the SYNTAX score tertiles, average CK peak values of 1985 (low), 3336 (intermediate) and 3684 (high) were obtained (p-value <0.0001). Bartlett's test for equal variances between the three groups was 9.999 (p-value <0.0067). A moderate Pearson product-moment correlation coefficient (r=0.4074) with a high statistical significance level (p-value <0.0001) was found. The coefficient of determination (R^2=0.1660) showed that approximately 17% of the variation in CK peak value (myocardial infarction size) could be explained by the SYNTAX score, i.e. by the coronary disease complexity. Conclusion In an all-comers population, the SYNTAX score is an additional tool for predicting myocardial infarction size in patients treated with PPCI. Stratification of patients into risk groups according to the SYNTAX score makes it possible to identify a high-risk population that may warrant particular patient care.
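The tertile stratification used in the study can be expressed as a small helper, with the cut-points taken directly from the abstract (low <22, intermediate 22-32, high >=33):

```python
def syntax_tertile(score):
    """Risk-group stratification by SYNTAX score, using the study's
    tertile boundaries: low <22, intermediate 22-32, high >=33."""
    if score < 22:
        return "low"
    if score <= 32:
        return "intermediate"
    return "high"
```

Mapping each patient's score through this function reproduces the three groups whose mean CK peak values (1985, 3336 and 3684) differ significantly above.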

Relevância: 30.00%

Resumo:

Nonlinear regression problems can often be reduced to linearity by transforming the response variable (e.g., using the Box-Cox family of transformations). The classic estimates of the parameter defining the transformation as well as of the regression coefficients are based on the maximum likelihood criterion, assuming homoscedastic normal errors for the transformed response. These estimates are nonrobust in the presence of outliers and can be inconsistent when the errors are nonnormal or heteroscedastic. This article proposes new robust estimates that are consistent and asymptotically normal for any unimodal and homoscedastic error distribution. For this purpose, a robust version of conditional expectation is introduced for which the prediction mean squared error is replaced with an M scale. This concept is then used to develop a nonparametric criterion to estimate the transformation parameter as well as the regression coefficients. A finite sample estimate of this criterion based on a robust version of smearing is also proposed. Monte Carlo experiments show that the new estimates compare favorably with respect to the available competitors.
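The Box-Cox family referred to above has the standard form sketched below. The sketch covers the transformation itself; the paper's contribution, choosing the transformation parameter (and the regression coefficients) via an M-scale of prediction errors rather than maximum likelihood, goes beyond a few lines.

```python
import math

def box_cox(y, lam):
    """Box-Cox transformation of a positive response y:
    (y**lam - 1) / lam for lam != 0, and log(y) for lam == 0
    (the lam -> 0 limit). The classic approach picks lam by maximum
    likelihood under homoscedastic normal errors; the robust estimates
    proposed in the paper replace that criterion."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam
```

For example, lam = 1 leaves the response linear up to a shift, while lam = 0 gives the log transform often used to stabilize variance.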