22 results for leave one out cross validation
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
A new, quantitative inference model for environmental reconstruction (transfer function), based for the first time on the simultaneous analysis of multiple species groups, has been developed. Quantitative reconstructions based on palaeoecological transfer functions provide a powerful tool for addressing questions of environmental change in a wide range of environments, from oceans to mountain lakes, and over a range of timescales, from decades to millions of years. Much progress has been made in the development of inferences based on multiple proxies, but usually these have been considered separately, and the different numeric reconstructions compared and reconciled post hoc. This paper presents a new method to combine information from multiple biological groups at the reconstruction stage. The aim of the multigroup work was to test whether this approach can improve on current reconstruction methodologies when inferring past environmental change. The taxonomic groups analysed include diatoms, chironomids and chrysophyte cysts. We test the new methodology using two cold-environment training sets, namely mountain lakes from the Pyrenees and the Alps. The use of multiple groups, as opposed to single groups, was found to increase the reconstruction skill only slightly, as measured by the root mean square error of prediction (leave-one-out cross-validation), in the case of alkalinity, dissolved inorganic carbon and altitude (a surrogate for air temperature), but not for pH or dissolved CO2. Reasons why the improvement was less than might have been anticipated are discussed; these include the different life-forms, environmental responses and reaction times of the groups under study.
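As an illustration of the validation statistic quoted above, the following sketch computes a leave-one-out root mean square error of prediction (RMSEP) for a transfer function. It is not the authors' multigroup model: the species abundances and the environmental variable are simulated, and a generic PLS regression stands in for the reconstruction method.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import LeaveOneOut

    rng = np.random.default_rng(0)
    n_lakes, n_taxa = 60, 40
    X = rng.random((n_lakes, n_taxa))                        # simulated species abundances
    y = X[:, :5].sum(axis=1) + rng.normal(0, 0.2, n_lakes)   # simulated alkalinity

    preds = np.empty(n_lakes)
    for train, test in LeaveOneOut().split(X):               # leave one lake out at a time
        model = PLSRegression(n_components=3).fit(X[train], y[train])
        preds[test] = model.predict(X[test]).ravel()

    rmsep = np.sqrt(np.mean((preds - y) ** 2))               # leave-one-out RMSEP
    print(f"LOO RMSEP: {rmsep:.3f}")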
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for the LR and NN models. Both models were developed with cross-validation, leave-one-out, and three different bootstrap algorithms. The final results of each model were compared in terms of error rate and the area under the receiver operating characteristic curve (Az). The neural network obtained a statistically higher Az than LR with cross-validation. The remaining resampling validation methods did not reveal statistically significant differences between the LR and NN rules. The neural network classifier performs better than the one based on logistic regression; this advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
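The comparison described above can be sketched with scikit-learn on simulated data of the same size (167 cases); the logistic-regression and neural-network classifiers, the three-fold AUC and the leave-one-out error rate are illustrative stand-ins, and the bootstrap variants are omitted.

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neural_network import MLPClassifier

    # Simulated stand-in for the clinical/CT data (167 patients, 8 features).
    X, y = make_classification(n_samples=167, n_features=8, random_state=0)

    models = {
        "LR": LogisticRegression(max_iter=1000),
        "NN": MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=0),
    }
    for name, clf in models.items():
        auc = cross_val_score(clf, X, y, cv=3, scoring="roc_auc").mean()      # 3-fold AUC
        loo_error = 1 - cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()   # LOO error rate
        print(f"{name}: 3-fold AUC = {auc:.3f}, LOO error rate = {loo_error:.3f}")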
Abstract:
Lean meat percentage (LMP) is the criterion for carcass classification and it must be measured objectively on line. The aim of this work was to compare the root mean square error of prediction (RMSEP) of the LMP measured with the following devices: Fat-O-Meat’er (FOM), UltraFOM (UFOM), AUTOFOM and VCS2000. For this reason the same 99 carcasses were measured with all four apparatus and dissected according to the European Reference Method. Moreover, a subsample of the carcasses (n=77) was fully scanned with X-ray Computed Tomography (CT) equipment. The RMSEP calculated with leave-one-out cross-validation was lower for FOM and AUTOFOM (1.8% and 1.9%, respectively) and higher for UFOM and VCS2000 (2.3% for both devices). The error obtained with CT was the lowest (0.96%), in accordance with previous results, but CT cannot be used on line. It can be concluded that FOM and AUTOFOM presented better accuracy than UFOM and VCS2000.
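A minimal sketch of this kind of device comparison: the device readings and the dissected LMP are simulated with roughly the quoted error levels, and the leave-one-out RMSEP of a simple linear calibration is computed per device.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    rng = np.random.default_rng(1)
    n = 99
    lmp = rng.normal(60, 4, n)                 # simulated dissected LMP (reference)
    devices = {                                # simulated readings with device-specific noise
        "FOM":     lmp + rng.normal(0, 1.8, n),
        "AUTOFOM": lmp + rng.normal(0, 1.9, n),
        "UFOM":    lmp + rng.normal(0, 2.3, n),
        "VCS2000": lmp + rng.normal(0, 2.3, n),
    }
    for name, reading in devices.items():
        pred = cross_val_predict(LinearRegression(), reading.reshape(-1, 1),
                                 lmp, cv=LeaveOneOut())
        rmsep = np.sqrt(np.mean((pred - lmp) ** 2))
        print(f"{name}: leave-one-out RMSEP = {rmsep:.2f}")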
Abstract:
Head-space gas chromatography with flame-ionization detection (HS-GC-FID) and purge-and-trap gas chromatography-mass spectrometry (P&T-GC-MS) have been used to determine methyl tert-butyl ether (MTBE) and benzene, toluene, and the xylenes (BTEX) in groundwater. In the work discussed in this paper, measures of quality, e.g. recovery (94-111%), precision (4.6-12.2%), limits of detection (0.3-5.7 µg L⁻¹ for HS and 0.001 µg L⁻¹ for P&T), and robustness, were compared for both methods. In addition, for purposes of comparison, groundwater samples from areas suffering from odor problems because of fuel spillage and tank leakage were analyzed by use of both techniques. For high concentration levels there was good correlation between results from both methods.
Abstract:
Does Independent Component Analysis (ICA) denature EEG signals? We applied ICA to two groups of subjects (mild Alzheimer patients and control subjects). The aim of this study was to examine whether or not the ICA method can reduce both group differences and within-subject variability. We found that ICA diminished the leave-one-out root mean square error (RMSE) of validation (from 0.32 to 0.28), indicative of a reduction of the group difference. More interestingly, ICA reduced the inter-subject variability within each group (σ = 2.54 in the δ range before ICA, σ = 1.56 after; Bartlett p = 0.046 after Bonferroni correction). Additionally, we present a method to limit the impact of human error (≈ 13.8%, with 75.6% inter-cleaner agreement) during ICA cleaning and reduce human bias. These findings suggest a novel usefulness of ICA in clinical EEG in Alzheimer's disease for the reduction of subject variability.
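The basic ICA cleaning workflow referred to above can be sketched as follows; the signals are synthetic rather than EEG, and the component to remove is chosen by hand here, whereas the study relied on human reviewers.

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    t = np.linspace(0, 10, 2000)
    sources = np.c_[np.sin(2 * np.pi * 1.5 * t),         # slow oscillation
                    np.sign(np.sin(2 * np.pi * 3 * t)),  # square-wave "artefact"
                    rng.normal(size=t.size)]             # noise
    mixing = rng.normal(size=(8, 3))                     # 8 hypothetical channels
    recording = sources @ mixing.T

    ica = FastICA(n_components=3, random_state=0)
    components = ica.fit_transform(recording)    # estimated independent components
    components[:, 1] = 0                         # zero out one component (index chosen by hand)
    cleaned = ica.inverse_transform(components)  # back-project the remaining components
    print(cleaned.shape)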
Abstract:
A recent trend in digital mammography is computer-aided diagnosis systems, which are computerised tools designed to assist radiologists. Most of these systems are used for the automatic detection of abnormalities. However, recent studies have shown that their sensitivity is significantly decreased as the density of the breast increases. This dependence is method specific. In this paper we propose a new approach to the classification of mammographic images according to their breast parenchymal density. Our classification uses information extracted from segmentation results and is based on the underlying breast tissue texture. Classification performance was evaluated on a large set of digitised mammograms. The evaluation involves different classifiers and uses a leave-one-out methodology. Results demonstrate the feasibility of estimating breast density using image processing and analysis techniques.
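A sketch of the leave-one-out evaluation described above, with a simulated feature matrix standing in for the segmentation-based texture descriptors and three generic classifiers.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # Simulated stand-in for the texture features, with three density classes.
    X, y = make_classification(n_samples=120, n_features=10, n_informative=6,
                               n_classes=3, random_state=0)

    classifiers = {
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(kernel="rbf"),
        "decision tree": DecisionTreeClassifier(random_state=0),
    }
    for name, clf in classifiers.items():
        accuracy = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
        print(f"{name}: leave-one-out accuracy = {accuracy:.3f}")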
Abstract:
Topological indices have been applied to build QSAR models for a set of 20 antimalarial cyclic peroxy ketals. In order to evaluate the reliability of the proposed linear models, leave-n-out and Internal Test Sets (ITS) approaches have been considered. The proposed procedure resulted in a robust, consensus prediction equation, and it is shown why it is superior to the standard cross-validation algorithms usually employed with multilinear regression models.
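The two validation schemes mentioned above (leave-n-out and internal test sets) can be illustrated for a small multilinear QSAR model; the descriptors and activities are simulated, and n = 2 and a 25% internal test set are arbitrary choices.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import LeavePOut, ShuffleSplit, cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))                                  # 3 simulated topological indices
    y = X @ np.array([1.0, -0.5, 0.8]) + rng.normal(0, 0.3, 20)   # simulated activity

    model = LinearRegression()
    scores_l2o = cross_val_score(model, X, y, cv=LeavePOut(2),    # all leave-2-out splits
                                 scoring="neg_mean_squared_error")
    scores_its = cross_val_score(model, X, y,                     # random internal test sets
                                 scoring="neg_mean_squared_error",
                                 cv=ShuffleSplit(n_splits=50, test_size=0.25,
                                                 random_state=0))
    print(f"leave-2-out RMSE = {np.sqrt(-scores_l2o.mean()):.3f}, "
          f"internal-test-set RMSE = {np.sqrt(-scores_its.mean()):.3f}")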
Abstract:
In this paper we present a Bayesian image reconstruction algorithm with entropy prior (FMAPE) that uses a space-variant hyperparameter. The spatial variation of the hyperparameter allows different degrees of resolution in areas of different statistical characteristics, thus avoiding the large residuals resulting from algorithms that use a constant hyperparameter. In the first implementation of the algorithm, we begin by segmenting a Maximum Likelihood Estimator (MLE) reconstruction. The segmentation method is based on a wavelet decomposition and a self-organizing neural network. The result is a predetermined number of extended regions plus a small region for each star or bright object. To assign a different value of the hyperparameter to each extended region and star, we use either feasibility tests or cross-validation methods. Once the set of hyperparameters is obtained, we carry out the final Bayesian reconstruction, leading to a reconstruction with decreased bias and excellent visual characteristics. The method has been applied to data from the non-refurbished Hubble Space Telescope, and can also be applied to ground-based images.
Abstract:
Objective: Health status measures usually have an asymmetric distribution and present a high percentage of respondents with the best possible score (ceiling effect), especially when they are assessed in the overall population. Different methods that take the ceiling effect into account have been proposed to model this type of variable: tobit models, Censored Least Absolute Deviations (CLAD) models or two-part models, among others. The objective of this work was to describe the tobit model and compare it with the Ordinary Least Squares (OLS) model, which ignores the ceiling effect. Methods: Two different data sets were used to compare both models: a) real data coming from the European Study of Mental Disorders (ESEMeD), in order to model the EQ-5D index, one of the utility measures most commonly used for the evaluation of health status; and b) data obtained from simulation. Cross-validation was used to compare the predicted values of the tobit and OLS models. The following estimators were compared: the percentage of absolute error (R1), the percentage of squared error (R2), the Mean Squared Error (MSE) and the Mean Absolute Prediction Error (MAPE). Different datasets were created for different values of the error variance and different percentages of individuals with ceiling effect. The estimated coefficients, the percentage of explained variance and the plots of residuals versus predicted values obtained under each model were compared. Results: With regard to the results of the ESEMeD study, the predicted values obtained with the OLS model and those obtained with the tobit model were very similar. The regression coefficients of the linear model were consistently smaller than those from the tobit model. In the simulation study, we observed that when the error variance was small (s=1), the tobit model presented unbiased estimates of the coefficients and accurate predicted values, especially when the percentage of individuals with the highest possible score was small. However, when the error variance was greater (s=10 or s=20), the percentage of explained variance for the tobit model and the predicted values were more similar to those obtained with an OLS model. Conclusions: The proportion of variability accounted for by the models and the percentage of individuals with the highest possible score have an important effect on the performance of the tobit model in comparison with the linear model.
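A sketch of the tobit-versus-OLS comparison on simulated data with a ceiling: the tobit likelihood is maximised with scipy, and the attenuation of the OLS slope illustrates why the linear model can be biased. This is a generic illustration, not the ESEMeD analysis.

    import numpy as np
    from scipy import optimize, stats

    rng = np.random.default_rng(0)
    n = 500
    x = rng.normal(size=n)
    y_latent = 0.6 + 0.5 * x + rng.normal(0, 0.3, size=n)   # latent health index
    ceiling = 1.0
    y = np.minimum(y_latent, ceiling)                       # observed score with ceiling effect

    def tobit_negloglik(params):
        """Negative log-likelihood of a tobit model with upper censoring."""
        b0, b1, log_sigma = params
        sigma = np.exp(log_sigma)
        mu = b0 + b1 * x
        cens = y >= ceiling
        ll_obs = stats.norm.logpdf(y[~cens], mu[~cens], sigma)   # uncensored observations
        ll_cen = stats.norm.logsf(ceiling, mu[cens], sigma)      # P(latent >= ceiling)
        return -(ll_obs.sum() + ll_cen.sum())

    fit = optimize.minimize(tobit_negloglik, x0=[0.0, 0.0, 0.0], method="BFGS")
    tobit_slope = fit.x[1]
    ols_slope = np.polyfit(x, y, 1)[0]   # OLS ignores the ceiling
    print(f"tobit slope {tobit_slope:.3f} vs OLS slope {ols_slope:.3f} (true value 0.5)")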
Abstract:
Longline fisheries, oil spills, and offshore wind farms are some of the major threats increasing seabird mortality at sea, but the impact of these threats on specific populations has been difficult to determine so far. We tested the use of molecular markers, morphometric measures, and stable isotope (δ15N and δ13C) and trace element concentrations in the first primary feather (grown at the end of the breeding period) to assign the geographic origin of Calonectris shearwaters. Overall, we sampled birds from three taxa: 13 Mediterranean Cory's Shearwater (Calonectris diomedea diomedea) breeding sites, 10 Atlantic Cory's Shearwater (Calonectris diomedea borealis) breeding sites, and one Cape Verde Shearwater (C. edwardsii) breeding site. Assignment rates were investigated at three spatial scales: breeding colony, breeding archipelago, and taxa levels. Genetic analyses based on the mitochondrial control region (198 birds from 21 breeding colonies) correctly assigned 100% of birds to the three main taxa but failed in detecting geographic structuring at lower scales. Discriminant analyses based on trace elements composition achieved the best rate of correct assignment to colony (77.5%). Body measurements or stable isotopes mainly succeeded in assigning individuals among taxa (87.9% and 89.9%, respectively) but failed at the colony level (27.1% and 38.0%, respectively). Combining all three approaches (morphometrics, isotopes, and trace elements on 186 birds from 15 breeding colonies) substantially improved correct classifications (86.0%, 90.7%, and 100% among colonies, archipelagos, and taxa, respectively). Validations using two independent data sets and jackknife cross-validation confirmed the robustness of the combined approach in the colony assignment (62.5%, 58.8%, and 69.8% for each validation test, respectively). A preliminary application of the discriminant model based on stable isotope δ15N and δ13C values and trace elements (219 birds from 17 breeding sites) showed that 41 Cory's Shearwaters caught by western Mediterranean long-liners came mainly from breeding colonies in Menorca (48.8%), Ibiza (14.6%), and Crete (31.7%). Our findings show that combining analyses of trace elements and stable isotopes on feathers can achieve high rates of correct geographic assignment of birds in the marine environment, opening new prospects for the study of seabird mortality at sea.
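The jackknife (leave-one-out) assignment exercise can be sketched with a linear discriminant classifier; the numbers of birds and colonies match the combined data set described above, but the morphometric, isotopic and trace-element features are simulated.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import LeaveOneOut, cross_val_score

    rng = np.random.default_rng(0)
    n_birds, n_colonies = 186, 15
    colony = np.arange(n_birds) % n_colonies     # colony label for each bird
    # Simulated feature blocks, loosely separated by colony.
    morphometrics = colony[:, None] * 0.05 + rng.normal(size=(n_birds, 4))
    isotopes      = colony[:, None] * 0.10 + rng.normal(size=(n_birds, 2))   # d15N, d13C
    trace_elems   = colony[:, None] * 0.20 + rng.normal(size=(n_birds, 8))

    feature_sets = {
        "morphometrics": morphometrics,
        "stable isotopes": isotopes,
        "trace elements": trace_elems,
        "combined": np.hstack([morphometrics, isotopes, trace_elems]),
    }
    for name, X in feature_sets.items():
        rate = cross_val_score(LinearDiscriminantAnalysis(), X, colony,
                               cv=LeaveOneOut()).mean()      # jackknife assignment rate
        print(f"{name}: correct colony assignment = {rate:.2f}")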
Abstract:
Recently there has been renewed research interest in the properties of non-survey updates of input-output tables and social accounting matrices (SAM). Along with the venerable and well-known RAS scaling method, several alternative new procedures related to entropy minimization and other metrics have been suggested, tested and used in the literature. Whether these procedures will eventually substitute for or merely complement the RAS approach is still an open question without a definite answer. The performance of many of the updating procedures has been tested using some kind of proximity or closeness measure to a reference input-output table or SAM. The first goal of this paper, in contrast, is to propose checking the operational performance of updating mechanisms by comparing the simulation results that ensue from adopting alternative databases for the calibration of a reference applied general equilibrium model. The second goal is to introduce a new updating procedure based on information retrieval principles. The performance of this new procedure is then compared with that of two well-known updating approaches: RAS and cross-entropy. The rationale for the suggested cross-validation is that the driving force for having more up-to-date databases is to be able to conduct more current, and hopefully more credible, policy analyses.
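For reference, the RAS step against which the new procedure is compared is a simple biproportional scaling; the sketch below uses a made-up 3x3 flow matrix and target margins.

    import numpy as np

    A = np.array([[10.0, 5.0, 2.0],
                  [ 3.0, 8.0, 4.0],
                  [ 6.0, 1.0, 9.0]])            # prior (outdated) flow matrix
    row_targets = np.array([20.0, 18.0, 17.0])  # new row totals
    col_targets = np.array([22.0, 15.0, 18.0])  # new column totals (same grand total)

    for _ in range(200):                        # alternate row and column scalings
        A *= (row_targets / A.sum(axis=1))[:, None]
        A *= (col_targets / A.sum(axis=0))[None, :]

    print(np.round(A, 3))
    print(A.sum(axis=1), A.sum(axis=0))         # margins now match the targets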
Abstract:
Lean meat percentage (LMP) is an important carcass quality parameter. The aim of this work is to obtain a calibration equation for Computed Tomography (CT) scans with the Partial Least Squares Regression (PLS) technique in order to predict the LMP of the carcass and the different cuts, and to compare two different methodologies for selecting the variables (Variable Importance for Projection, VIP, and stepwise) to be included in the prediction equation. The error of prediction with cross-validation (RMSEPCV) of the LMP obtained with PLS and VIP-based selection was 0.82%, and for stepwise selection it was 0.83%. The prediction of the LMP scanning only the ham had an RMSEPCV of 0.97%, and if the ham and the loin were scanned the RMSEPCV was 0.90%. Results indicate that for CT data both VIP and stepwise selection are good methods. Moreover, scanning only the ham allowed us to obtain a good prediction of the LMP of the whole carcass.
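A sketch of PLS calibration with VIP-based variable selection and a cross-validated error (RMSEPCV); the CT variables are simulated, the VIP > 1 threshold is a common rule of thumb rather than necessarily the paper's criterion, and, for simplicity, selection is done on the full data before cross-validating the reduced model.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_predict

    def vip_scores(pls):
        """Variable Importance for Projection for a fitted single-response PLS model."""
        t, w, q = pls.x_scores_, pls.x_weights_, pls.y_loadings_
        p = w.shape[0]
        ss = (q[0] ** 2) * np.sum(t ** 2, axis=0)   # response variance captured per component
        w_norm = w / np.linalg.norm(w, axis=0)
        return np.sqrt(p * (w_norm ** 2 @ ss) / ss.sum())

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 30))                                # simulated CT variables
    y = X[:, :6] @ rng.normal(size=6) + rng.normal(0, 0.5, 120)   # simulated LMP

    pls = PLSRegression(n_components=3).fit(X, y)
    keep = vip_scores(pls) > 1.0                                  # VIP > 1 rule of thumb
    n_comp = max(1, min(3, int(keep.sum())))
    pred = cross_val_predict(PLSRegression(n_components=n_comp),
                             X[:, keep], y, cv=LeaveOneOut()).ravel()
    rmsep_cv = np.sqrt(np.mean((pred - y) ** 2))
    print(f"{keep.sum()} variables kept, RMSEP_CV = {rmsep_cv:.2f}")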
Abstract:
Predicting how well people entering prison will adapt is a key element in minimising disciplinary problems and, at the same time, facilitating the rehabilitation process. Predictors can be based on psychometric variables (measured with psychological questionnaires) and also on professional technical judgements, grounded in experience accumulated on the job. In this research, two psychometric questionnaires that may be useful for predicting the degree of adaptation to the prison regime are validated: one of personality, the CPS (Cuestionario de Personalidad Situacional), and one of impulsivity, the BARRAT (BIS-11). The results show that the CPS variables better predict conflictive behaviour, whereas impulsivity as measured by the BARRAT better predicts adapted behaviour. The psychological technical criterion (the technical judgement issued by the psychologist) and the criminological one (the technical judgement issued by the criminologist jurist) are also validated. The next step was a cross-validation consisting of the mutual comparison of the psychometric and the technical criteria, to see which variables are finally selected for predicting the most relevant variable of adaptation to the prison regime: whether or not the inmate has had a regression in prison treatment grade. The results show that the psychometric criteria are not superior to the technical criteria and that the two are not opposed; rather, they reinforce each other to improve the prediction of the degree of adaptation. The predictive power of the selected variables is even higher when predicting the adaptation of first-time inmates (those entering prison for the first time) than that of reoffenders. In addition, the BARRAT impulsivity questionnaire (BIS-11) has been normed on the prison population so that it can be used as a prognostic tool both for behaviour under the prison regime and for criminal recidivism.
Abstract:
We propose a smooth multibidding mechanism for environments where a group of agents has to choose one out of several projects (possibly with the help of a social planner). Our proposal is related to the multibidding mechanism (Pérez-Castrillo and Wettstein, 2002) but is "smoother" in the sense that small variations in an agent's bids do not lead to dramatic changes in the probability of selecting a project. This mechanism is shown to possess several interesting properties. First, unlike in the study by Pérez-Castrillo and Wettstein (2002), the equilibrium outcome is unique. Second, the mechanism ensures an equal sharing of the surplus that it induces. Finally, it enables reaching an outcome as close to efficiency as desired.