961 resultados para predictive performance


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Predictive performance evaluation is a fundamental issue in design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to its simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. Due to this, various graphical performance evaluation methods are increasingly drawing the attention of machine learning, data mining, and pattern recognition communities. The main advantage of these types of methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Aims [1] To quantify the random and predictable components of variability for aminoglycoside clearance and volume of distribution [2] To investigate models for predicting aminoglycoside clearance in patients with low serum creatinine concentrations [3] To evaluate the predictive performance of initial dosing strategies for achieving an aminoglycoside target concentration. Methods Aminoglycoside demographic, dosing and concentration data were collected from 697 adult patients (>=20 years old) as part of standard clinical care using a target concentration intervention approach for dose individualization. It was assumed that aminoglycoside clearance had a renal and a nonrenal component, with the renal component being linearly related to predicted creatinine clearance. Results A two compartment pharmacokinetic model best described the aminoglycoside data. The addition of weight, age, sex and serum creatinine as covariates reduced the random component of between subject variability (BSVR) in clearance (CL) from 94% to 36% of population parameter variability (PPV). The final pharmacokinetic parameter estimates for the model with the best predictive performance were: CL, 4.7 l h(-1) 70 kg(-1); intercompartmental clearance (CLic), 1 l h(-1) 70 kg(-1); volume of central compartment (V-1), 19.5 l 70 kg(-1); volume of peripheral compartment (V-2) 11.2 l 70 kg(-1). Conclusions Using a fixed dose of aminoglycoside will achieve 35% of typical patients within 80-125% of a required dose. Covariate guided predictions increase this up to 61%. However, because we have shown that random within subject variability (WSVR) in clearance is less than safe and effective variability (SEV), target concentration intervention can potentially achieve safe and effective doses in 90% of patients.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Predictive species distribution modelling (SDM) has become an essential tool in biodiversity conservation and management. The choice of grain size (resolution) of environmental layers used in modelling is one important factor that may affect predictions. We applied 10 distinct modelling techniques to presence-only data for 50 species in five different regions, to test whether: (1) a 10-fold coarsening of resolution affects predictive performance of SDMs, and (2) any observed effects are dependent on the type of region, modelling technique, or species considered. Results show that a 10 times change in grain size does not severely affect predictions from species distribution models. The overall trend is towards degradation of model performance, but improvement can also be observed. Changing grain size does not equally affect models across regions, techniques, and species types. The strongest effect is on regions and species types, with tree species in the data sets (regions) with highest locational accuracy being most affected. Changing grain size had little influence on the ranking of techniques: boosted regression trees remain best at both resolutions. The number of occurrences used for model training had an important effect, with larger sample sizes resulting in better models, which tended to be more sensitive to grain. Effect of grain change was only noticeable for models reaching sufficient performance and/or with initial data that have an intrinsic error smaller than the coarser grain size.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Introduction: The original and modified Wells score are widely used prediction rules for pre-test probability assessment of deep vein thrombosis (DVT). The objective of this study was to compare the predictive performance of both Wells scores in unselected patients with clinical suspicion of DVT.Methods: Consecutive inpatients and outpatients with a clinical suspicion of DVT were prospectively enrolled. Pre-test DVT probability (low/intermediate/high) was determined using both scores. Patients with a non-high probability based on the original Wells score underwent D-dimers measurement. Patients with D-dimers <500 mu g/L did not undergo further testing, and treatment was withheld. All others underwent complete lower limb compression ultrasound, and those diagnosed with DVT were anticoagulated. The primary study outcome was objectively confirmed symptomatic venous thromboembolism within 3 months of enrollment.Results: 298 patients with suspected DVT were included. Of these, 82 (27.5%) had DVT, and 46 of them were proximal. Compared to the modified score, the original Wells score classified a higher proportion of patients as low-risk (53 vs 48%; p<0.01) and a lower proportion as high-risk (17 vs 15%; p=0.02); the prevalence of proximal DVT in each category was similar with both scores (7-8% low, 16-19% intermediate, 36-37% high). The area under the receiver operating characteristic curve regarding proximal DVT detection was similar for both scores, but they both performed poorly in predicting isolated distal DVT and DVT in inpatients.Conclusion: The study demonstrates that both Wells scores perform equally well in proximal DVT pre-test probability prediction. Neither score appears to be particularly useful in hospitalized patients and those with isolated distal DVT. (C) 2011 Elsevier Ltd. All rights reserved.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods have in the recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is from a set of past observations to learn a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings, examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows. First, we develop RankRLS, a computationally efficient kernel method for learning to rank, that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results, Part II consists of the five original research articles that are the main contribution of this thesis.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Our digital universe is rapidly expanding,more and more daily activities are digitally recorded, data arrives in streams, it needs to be analyzed in real time and may evolve over time. In the last decade many adaptive learning algorithms and prediction systems, which can automatically update themselves with the new incoming data, have been developed. The majority of those algorithms focus on improving the predictive performance and assume that model update is always desired as soon as possible and as frequently as possible. In this study we consider potential model update as an investment decision, which, as in the financial markets, should be taken only if a certain return on investment is expected. We introduce and motivate a new research problem for data streams ? cost-sensitive adaptation. We propose a reference framework for analyzing adaptation strategies in terms of costs and benefits. Our framework allows to characterize and decompose the costs of model updates, and to asses and interpret the gains in performance due to model adaptation for a given learning algorithm on a given prediction task. Our proof-of-concept experiment demonstrates how the framework can aid in analyzing and managing adaptation decisions in the chemical industry.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Species distribution models (SDM) are increasingly used to understand the factors that regulate variation in biodiversity patterns and to help plan conservation strategies. However, these models are rarely validated with independently collected data and it is unclear whether SDM performance is maintained across distinct habitats and for species with different functional traits. Highly mobile species, such as bees, can be particularly challenging to model. Here, we use independent sets of occurrence data collected systematically in several agricultural habitats to test how the predictive performance of SDMs for wild bee species depends on species traits, habitat type, and sampling technique. We used a species distribution modeling approach parametrized for the Netherlands, with presence records from 1990 to 2010 for 193 Dutch wild bees. For each species, we built a Maxent model based on 13 climate and landscape variables. We tested the predictive performance of the SDMs with independent datasets collected from orchards and arable fields across the Netherlands from 2010 to 2013, using transect surveys or pan traps. Model predictive performance depended on species traits and habitat type. Occurrence of bee species specialized in habitat and diet was better predicted than generalist bees. Predictions of habitat suitability were also more precise for habitats that are temporally more stable (orchards) than for habitats that suffer regular alterations (arable), particularly for small, solitary bees. As a conservation tool, SDMs are best suited to modeling rarer, specialist species than more generalist and will work best in long-term stable habitats. The variability of complex, short-term habitats is difficult to capture in such models and historical land use generally has low thematic resolution. To improve SDMs’ usefulness, models require explanatory variables and collection data that include detailed landscape characteristics, for example, variability of crops and flower availability. Additionally, testing SDMs with field surveys should involve multiple collection techniques.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The original and modified Wells score are widely used prediction rules for pre-test probability assessment of deep vein thrombosis (DVT). The objective of this study was to compare the predictive performance of both Wells scores in unselected patients with clinical suspicion of DVT.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Background The loose and stringent Asthma Predictive Indices (API), developed in Tucson, are popular rules to predict asthma in preschool children. To be clinically useful, they require validation in different settings. Objective To assess the predictive performance of the API in an independent population and compare it with simpler rules based only on preschool wheeze. Methods We studied 1954 children of the population-based Leicester Respiratory Cohort, followed up from age 1 to 10 years. The API and frequency of wheeze were assessed at age 3 years, and we determined their association with asthma at ages 7 and 10 years by using logistic regression. We computed test characteristics and measures of predictive performance to validate the API and compare it with simpler rules. Results The ability of the API to predict asthma in Leicester was comparable to Tucson: for the loose API, odds ratios for asthma at age 7 years were 5.2 in Leicester (5.5 in Tucson), and positive predictive values were 26% (26%). For the stringent API, these values were 8.2 (9.8) and 40% (48%). For the simpler rule early wheeze, corresponding values were 5.4 and 21%; for early frequent wheeze, 6.7 and 36%. The discriminative ability of all prediction rules was moderate (c statistic ≤ 0.7) and overall predictive performance low (scaled Brier score < 20%). Conclusion Predictive performance of the API in Leicester, although comparable to the original study, was modest and similar to prediction based only on preschool wheeze. This highlights the need for better prediction rules.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Symptoms of primary ciliary dyskinesia (PCD) are nonspecific and guidance on whom to refer for testing is limited. Diagnostic tests for PCD are highly specialised, requiring expensive equipment and experienced PCD scientists. This study aims to develop a practical clinical diagnostic tool to identify patients requiring testing.Patients consecutively referred for testing were studied. Information readily obtained from patient history was correlated with diagnostic outcome. Using logistic regression, the predictive performance of the best model was tested by receiver operating characteristic curve analyses. The model was simplified into a practical tool (PICADAR) and externally validated in a second diagnostic centre.Of 641 referrals with a definitive diagnostic outcome, 75 (12%) were positive. PICADAR applies to patients with persistent wet cough and has seven predictive parameters: full-term gestation, neonatal chest symptoms, neonatal intensive care admittance, chronic rhinitis, ear symptoms, situs inversus and congenital cardiac defect. Sensitivity and specificity of the tool were 0.90 and 0.75 for a cut-off score of 5 points. Area under the curve for the internally and externally validated tool was 0.91 and 0.87, respectively.PICADAR represents a simple diagnostic clinical prediction rule with good accuracy and validity, ready for testing in respiratory centres referring to PCD centres.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of the input space and the sample size n are usually the main ingredients in the characterization of data bigness. The specific statistical machine learning technique used to handle a particular big data set will depend on which category it falls in within the bigness taxonomy. Large p small n data sets for instance require a different set of tools from the large n small p variety. Among other tools, we discuss Preprocessing, Standardization, Imputation, Projection, Regularization, Penalization, Compression, Reduction, Selection, Kernelization, Hybridization, Parallelization, Aggregation, Randomization, Replication, Sequentialization. Indeed, it is important to emphasize right away that the so-called no free lunch theorem applies here, in the sense that there is no universally superior method that outperforms all other methods on all categories of bigness. It is also important to stress the fact that simplicity in the sense of Ockham’s razor non-plurality principle of parsimony tends to reign supreme when it comes to massive data. We conclude with a comparison of the predictive performance of some of the most commonly used methods on a few data sets.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This research evaluates pattern recognition techniques on a subclass of big data where the dimensionality of the input space (p) is much larger than the number of observations (n). Specifically, we evaluate massive gene expression microarray cancer data where the ratio κ is less than one. We explore the statistical and computational challenges inherent in these high dimensional low sample size (HDLSS) problems and present statistical machine learning methods used to tackle and circumvent these difficulties. Regularization and kernel algorithms were explored in this research using seven datasets where κ < 1. These techniques require special attention to tuning necessitating several extensions of cross-validation to be investigated to support better predictive performance. While no single algorithm was universally the best predictor, the regularization technique produced lower test errors in five of the seven datasets studied.