956 resultados para random forest regression


Relevância:

90.00% 90.00%

Publicador:

Resumo:

The emissions estimation, both during homologation and standard driving, is one of the new challenges that automotive industries have to face. The new European and American regulation will allow a lower and lower quantity of Carbon Monoxide emission and will require that all the vehicles have to be able to monitor their own pollutants production. Since numerical models are too computationally expensive and approximated, new solutions based on Machine Learning are replacing standard techniques. In this project we considered a real V12 Internal Combustion Engine to propose a novel approach pushing Random Forests to generate meaningful prediction also in extreme cases (extrapolation, very high frequency peaks, noisy instrumentation etc.). The present work proposes also a data preprocessing pipeline for strongly unbalanced datasets and a reinterpretation of the regression problem as a classification problem in a logarithmic quantized domain. Results have been evaluated for two different models representing a pure interpolation scenario (more standard) and an extrapolation scenario, to test the out of bounds robustness of the model. The employed metrics take into account different aspects which can affect the homologation procedure, so the final analysis will focus on combining all the specific performances together to obtain the overall conclusions.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

PURPOSE: To evaluate the sensitivity and specificity of machine learning classifiers (MLCs) for glaucoma diagnosis using Spectral Domain OCT (SD-OCT) and standard automated perimetry (SAP). METHODS: Observational cross-sectional study. Sixty two glaucoma patients and 48 healthy individuals were included. All patients underwent a complete ophthalmologic examination, achromatic standard automated perimetry (SAP) and retinal nerve fiber layer (RNFL) imaging with SD-OCT (Cirrus HD-OCT; Carl Zeiss Meditec Inc., Dublin, California). Receiver operating characteristic (ROC) curves were obtained for all SD-OCT parameters and global indices of SAP. Subsequently, the following MLCs were tested using parameters from the SD-OCT and SAP: Bagging (BAG), Naive-Bayes (NB), Multilayer Perceptron (MLP), Radial Basis Function (RBF), Random Forest (RAN), Ensemble Selection (ENS), Classification Tree (CTREE), Ada Boost M1(ADA),Support Vector Machine Linear (SVML) and Support Vector Machine Gaussian (SVMG). Areas under the receiver operating characteristic curves (aROC) obtained for isolated SAP and OCT parameters were compared with MLCs using OCT+SAP data. RESULTS: Combining OCT and SAP data, MLCs' aROCs varied from 0.777(CTREE) to 0.946 (RAN).The best OCT+SAP aROC obtained with RAN (0.946) was significantly larger the best single OCT parameter (p<0.05), but was not significantly different from the aROC obtained with the best single SAP parameter (p=0.19). CONCLUSION: Machine learning classifiers trained on OCT and SAP data can successfully discriminate between healthy and glaucomatous eyes. The combination of OCT and SAP measurements improved the diagnostic accuracy compared with OCT data alone.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The use of remote sensing is necessary for monitoring forest carbon stocks at large scales. Optical remote sensing, although not the most suitable technique for the direct estimation of stand biomass, offers the advantage of providing large temporal and spatial datasets. In particular, information on canopy structure is encompassed in stand reflectance time series. This study focused on the example of Eucalyptus forest plantations, which have recently attracted much attention as a result of their high expansion rate in many tropical countries. Stand scale time-series of Normalized Difference Vegetation Index (NDVI) were obtained from MODIS satellite data after a procedure involving un-mixing and interpolation, on about 15,000 ha of plantations in southern Brazil. The comparison of the planting date of the current rotation (and therefore the age of the stands) estimated from these time series with real values provided by the company showed that the root mean square error was 35.5 days. Age alone explained more than 82% of stand wood volume variability and 87% of stand dominant height variability. Age variables were combined with other variables derived from the NDVI time series and simple bioclimatic data by means of linear (Stepwise) or nonlinear (Random Forest) regressions. The nonlinear regressions gave r-square values of 0.90 for volume and 0.92 for dominant height, and an accuracy of about 25 m(3)/ha for volume (15% of the volume average value) and about 1.6 m for dominant height (8% of the height average value). The improvement including NDVI and bioclimatic data comes from the fact that the cumulative NDVI since planting date integrates the interannual variability of leaf area index (LAI), light interception by the foliage and growth due for example to variations of seasonal water stress. The accuracy of biomass and height predictions was strongly improved by using the NDVI integrated over the two first years after planting, which are critical for stand establishment. These results open perspectives for cost-effective monitoring of biomass at large scales in intensively-managed plantation forests. (C) 2011 Elsevier Inc. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação de Mestrado, Gestão de Empresa (MBA), 16 de Julho de 2013, Universidade dos Açores.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação para obtenção do grau de Mestre em Engenharia Informática

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação apresentada para a obtenção do Grau de Doutor em Química Especialidade de Química Orgânica Pela Universidade Nova de Lisboa Faculdade de Ciências e Tecnologia

Relevância:

80.00% 80.00%

Publicador:

Resumo:

More than ever, there is an increase of the number of decision support methods and computer aided diagnostic systems applied to various areas of medicine. In breast cancer research, many works have been done in order to reduce false-positives when used as a double reading method. In this study, we aimed to present a set of data mining techniques that were applied to approach a decision support system in the area of breast cancer diagnosis. This method is geared to assist clinical practice in identifying mammographic findings such as microcalcifications, masses and even normal tissues, in order to avoid misdiagnosis. In this work a reliable database was used, with 410 images from about 115 patients, containing previous reviews performed by radiologists as microcalcifications, masses and also normal tissue findings. Throughout this work, two feature extraction techniques were used: the gray level co-occurrence matrix and the gray level run length matrix. For classification purposes, we considered various scenarios according to different distinct patterns of injuries and several classifiers in order to distinguish the best performance in each case described. The many classifiers used were Naïve Bayes, Support Vector Machines, k-nearest Neighbors and Decision Trees (J48 and Random Forests). The results in distinguishing mammographic findings revealed great percentages of PPV and very good accuracy values. Furthermore, it also presented other related results of classification of breast density and BI-RADS® scale. The best predictive method found for all tested groups was the Random Forest classifier, and the best performance has been achieved through the distinction of microcalcifications. The conclusions based on the several tested scenarios represent a new perspective in breast cancer diagnosis using data mining techniques.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

"Lecture notes in computer science series, ISSN 0302-9743, vol. 9273"

Relevância:

80.00% 80.00%

Publicador:

Resumo:

experimental design, mixed model, random coefficient regression model, population pharmacokinetics, approximate design

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background: Therapy of chronic hepatitis C (CHC) with pegIFNa/ribavirin achieves sustained virologic response (SVR) in ~55%. Pre-activation of the endogenous interferon system in the liver is associated non-response (NR). Recently, genome-wide association studies described associations of allelic variants near the IL28B (IFNλ3) gene with treatment response and with spontaneous clearance of the virus. We investigated if the IL28B genotype determines the constitutive expression of IFN stimulated genes (ISGs) in the liver of patients with CHC. Methods: We genotyped 93 patients with CHC for 3 IL28B single nucleotide polymorphisms (SNPs, rs12979860, rs8099917, rs12980275), extracted RNA from their liver biopsies and quantified the expression of IL28B and of 8 previously identified classifier genes which discriminate between SVR and NR (IFI44L, RSAD2, ISG15, IFI22, LAMP3, OAS3, LGALS3BP and HTATIP2). Decision tree ensembles in the form of a random forest classifier were used to calculate the relative predictive power of these different variables in a multivariate analysis. Results: The minor IL28B allele (bad risk for treatment response) was significantly associated with increased expression of ISGs, and, unexpectedly, with decreased expression of IL28B. Stratification of the patients into SVR and NR revealed that ISG expression was conditionally independent from the IL28B genotype, i.e. there was an increased expression of ISGs in NR compared to SVR irrespective of the IL28B genotype. The random forest feature score (RFFS) identified IFI27 (RFFS = 2.93), RSAD2 (1.88) and HTATIP2 (1.50) expression and the HCV genotype (1.62) as the strongest predictors of treatment response. ROC curves of the IL28B SNPs showed an AUC of 0.66 with an error rate (ERR) of 0.38. A classifier with the 3 best classifying genes showed an excellent test performance with an AUC of 0.94 and ERR of 0.15. The addition of IL28B genotype information did not improve the predictive power of the 3-gene classifier. Conclusions: IL28B genotype and hepatic ISG expression are conditionally independent predictors of treatment response in CHC. There is no direct link between altered IFNλ3 expression and pre-activation of the endogenous system in the liver. Hepatic ISG expression is by far the better predictor for treatment response than IL28B genotype.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Introduction. There is some cross-sectional evidence that theory of mind ability is associated with social functioning in those with psychosis but the direction of this relationship is unknown. This study investigates the longitudinal association between both theory of mind and psychotic symptoms and social functioning outcome in first-episode psychosis. Methods. Fifty-four people with first-episode psychosis were followed up at 6 and 12 months. Random effects regression models were used to estimate the stability of theory of mind over time and the association between baseline theory of mind and psychotic symptoms and social functioning outcome. Results. Neither baseline theory of mind ability (regression coefficients: Hinting test 1.07 95% CI 0.74, 2.88; Visual Cartoon test 2.91 95% CI 7.32, 1.51) nor baseline symptoms (regression coefficients: positive symptoms 0.04 95% CI 1.24, 1.16; selected negative symptoms 0.15 95% CI 2.63, 2.32) were associated with social functioning outcome. There was evidence that theory of mind ability was stable over time, (regression coefficients: Hinting test 5.92 95% CI 6.66, 8.92; Visual Cartoon test score 0.13 95% CI 0.17, 0.44). Conclusions. Neither baseline theory of mind ability nor psychotic symptoms are associated with social functioning outcome. Further longitudinal work is needed to understand the origin of social functioning deficits in psychosis.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

BACKGROUND & AIMS: The host immune response during the chronic phase of hepatitis C virus infection varies among individuals; some patients have a no interferon (IFN) response in the liver, whereas others have full activation IFN-stimulated genes (ISGs). Preactivation of this endogenous IFN system is associated with nonresponse to pegylated IFN-α (pegIFN-α) and ribavirin. Genome-wide association studies have associated allelic variants near the IL28B (IFNλ3) gene with treatment response. We investigated whether IL28B genotype determines the constitutive expression of ISGs in the liver and compared the abilities of ISG levels and IL28B genotype to predict treatment outcome. METHODS: We genotyped 109 patients with chronic hepatitis C for IL28B allelic variants and quantified the hepatic expression of ISGs and of IL28B. Decision tree ensembles, in the form of a random forest classifier, were used to calculate the relative predictive power of these different variables in a multivariate analysis. RESULTS: The minor IL28B allele was significantly associated with increased expression of ISG. However, stratification of the patients according to treatment response revealed increased ISG expression in nonresponders, irrespective of IL28B genotype. Multivariate analysis of ISG expression, IL28B genotype, and several other factors associated with response to therapy identified ISG expression as the best predictor of treatment response. CONCLUSIONS: IL28B genotype and hepatic expression of ISGs are independent predictors of response to treatment with pegIFN-α and ribavirin in patients with chronic hepatitis C. The most accurate prediction of response was obtained with a 4-gene classifier comprising IFI27, ISG15, RSAD2, and HTATIP2.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Visceral adiposity is increasingly recognized as a key condition for the development of obesity related disorders, with the ratio between visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) reported as the best correlate of cardiometabolic risk. In this study, using a cohort of 40 obese females (age: 25-45 y, BMI: 28-40 kg/m(2)) under healthy clinical conditions and monitored over a 2 weeks period we examined the relationships between different body composition parameters, estimates of visceral adiposity and blood/urine metabolic profiles. Metabonomics and lipidomics analysis of blood plasma and urine were employed in combination with in vivo quantitation of body composition and abdominal fat distribution using iDXA and computerized tomography. Of the various visceral fat estimates, VAT/SAT and VAT/total abdominal fat ratios exhibited significant associations with regio-specific body lean and fat composition. The integration of these visceral fat estimates with metabolic profiles of blood and urine described a distinct amino acid, diacyl and ether phospholipid phenotype in women with higher visceral fat. Metabolites important in predicting visceral fat adiposity as assessed by Random forest analysis highlighted 7 most robust markers, including tyrosine, glutamine, PC-O 44∶6, PC-O 44∶4, PC-O 42∶4, PC-O 40∶4, and PC-O 40∶3 lipid species. Unexpectedly, the visceral fat associated inflammatory profiles were shown to be highly influenced by inter-days and between-subject variations. Nevertheless, the visceral fat associated amino acid and lipid signature is proposed to be further validated for future patient stratification and cardiometabolic health diagnostics.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This article presents an experimental study about the classification ability of several classifiers for multi-classclassification of cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland lawenforcement authorities regularly ask forensic laboratories to determinate the chemotype of a seized cannabisplant and then to conclude if the plantation is legal or not. This classification is mainly performed when theplant is mature as required by the EU official protocol and then the classification of cannabis seedlings is a timeconsuming and costly procedure. A previous study made by the authors has investigated this problematic [1]and showed that it is possible to differentiate between drug type (illegal) and fibre type (legal) cannabis at anearly stage of growth using gas chromatography interfaced with mass spectrometry (GC-MS) based on therelative proportions of eight major leaf compounds. The aims of the present work are on one hand to continueformer work and to optimize the methodology for the discrimination of drug- and fibre type cannabisdeveloped in the previous study and on the other hand to investigate the possibility to predict illegal cannabisvarieties. Seven classifiers for differentiating between cannabis seedlings are evaluated in this paper, namelyLinear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Nearest NeighbourClassification (NNC), Learning Vector Quantization (LVQ), Radial Basis Function Support Vector Machines(RBF SVMs), Random Forest (RF) and Artificial Neural Networks (ANN). The performance of each method wasassessed using the same analytical dataset that consists of 861 samples split into drug- and fibre type cannabiswith drug type cannabis being made up of 12 varieties (i.e. 12 classes). The results show that linear classifiersare not able to manage the distribution of classes in which some overlap areas exist for both classificationproblems. Unlike linear classifiers, NNC and RBF SVMs best differentiate cannabis samples both for 2-class and12-class classifications with average classification results up to 99% and 98%, respectively. Furthermore, RBFSVMs correctly classified into drug type cannabis the independent validation set, which consists of cannabisplants coming from police seizures. In forensic case work this study shows that the discrimination betweencannabis samples at an early stage of growth is possible with fairly high classification performance fordiscriminating between cannabis chemotypes or between drug type cannabis varieties.