914 resultados para label regression
Resumo:
Aplicación de simulación de Monte Carlo y técnicas de Análisis de la Varianza (ANOVA) a la comparación de modelos estocásticos dinámicos para accidentes de tráfico.
Resumo:
Fractal and multifractal are concepts that have grown increasingly popular in recent years in the soil analysis, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit commonly using least squares method. This shouldn?t be a special problem, however, in many situations using experimental data the researcher has to select the range of scales at which is going to work neglecting the rest of points to achieve the best linearity that in this type of analysis is necessary. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. In this method we don?t have to assume that the outlier point is simply an extreme observation drawn from the tail of a normal distribution not compromising the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points in the experimental data used trying to avoid subjective choices. Based on this analysis we have developed a new work methodology that implies two basic steps: ? Evaluation of the improvement of linear fitting when consecutive points are eliminated based on R pvalue. In this way we consider the implications of reducing the number of points. ? Evaluation of the significance of slope difference between fitting with the two extremes points and fitted with the available points. We compare the results applying this methodology and the common used least squares one. The data selected for these comparisons are coming from experimental soil roughness transect and simulated based on middle point displacement method adding tendencies and noise. The results are discussed indicating the advantages and disadvantages of each methodology.
Resumo:
Label free immunoassay sector is a ferment of activity, experiencing rapid growth as new technologies come forward and achieve acceptance. The landscape is changing in a “bottom up” approach, as individual companies promote individual technologies and find a market for them. Therefore, each of the companies operating in the label-free immunoassay sector offers a technology that is in some way unique and proprietary. However, no many technologies based on Label-free technology are currently in the market for PoC and High Throughput Screening (HTS), where mature labeled technologies have taken the market.
Resumo:
The field of optical label free biosensors has become a topic of interest during past years, with devices based on the detection of angular or wavelength shift of optical modes [1]. Common parameters to characterize their performance are the Limit of Detection (LOD, defined as the minimum change of refractive index upon the sensing surface that the device is able to detect, and also BioLOD, which represents the minimum amount of target analyte accurately resolved by the system; with units of concentration (common un its are p pm, ng/ml, or nM). LOD gives a first value to compare different biosensors, and is obtained both theoretically (using photonic calculation tools), and experimentally,covering the sensing area with fluids of different refractive indexes.
Resumo:
Los sectores de detección biológica demandan continuamente técnicas de análisis y diagnóstico más eficientes y precisas para identificar enfermedades y desarrollar nuevos medicamentos. Actualmente se considera que hay una gran necesidad de desarrollar herramientas de diagnóstico capaces de asegurar sensibilidad, rapidez, sencillez y asequibilidad para aplicaciones en sectores como la salud, la alimentación, el medioambiente o la seguridad. En el ámbito clínico se necesitan profundos avances tecnológicos capaces de ofrecer análisis rápidos, exactos, fiables y asequibles en coste y que tengan como consecuencia la mejora clínica y económica a partir de un diagnóstico eficiente. En concreto, hay un interés creciente por la descentralización del diagnóstico clínico mediante plataformas de detección cercanas al usuario final, denominadas POCs (Point Of Care devices). La utilización de POCs (referidas al diagnóstico cercano al usuario final o fuera del laboratorio de análisis clínico), mediante detección in vitro (IVD), será extremadamente útil en centros de salud, clínicas o unidades hospitalarias, entornos laborales o incluso en el hogar. Por otra parte, el desarrollo de la genómica, proteómica y otras tecnologías conocidas como “omics” (sufijo en inglés para referirse, por ejemplo, a genomics, transcriptomics, proteomics, metabolomics, lipidomics) está incrementando la demanda de nuevas tecnologías mucho más avanzadas con una clara orientación hacia la medicina personalizada y la necesidad de hacer frente a cambios en los tratamientos en el caso de enfermedades complejas. Desde hace poco tiempo se han definido las Celdas Biofónicas (BICELLs) como una metodología novedosa para la detección de agentes biológicos que ofrecen una serie de características que las hacen interesantes como son: Capacidad de multiplexación, alta sensibilidad, posibilidad de medir en gota, compatible con otras tecnologías. En este trabajo se hace un estudio y optimización sobre diferentes tipos de BICELLs y se valoran una serie de figuras de merito a tener en cuenta desde el punto de vista del lector óptico a emplear.
Resumo:
In this paper, multiple regression analysis is used to model the top of descent (TOD) location of user-preferred descent trajectories computed by the flight management system (FMS) on over 1000 commercial flights into Melbourne, Australia. In addition to recording TOD, the cruise altitude, final altitude, cruise Mach, descent speed, wind, and engine type were also identified for use as the independent variables in the regression analysis. Both first-order and second-order models are considered, where cross-validation, hypothesis testing, and additional analysis are used to compare models. This identifies the models that should give the smallest errors if used to predict TOD location for new data in the future. A model that is linear in TOD altitude, final altitude, descent speed, and wind gives an estimated standard deviation of 3.9 nmi for TOD location given the trajectory parame- ters, which means about 80% of predictions would have error less than 5 nmi in absolute value. This accuracy is better than demonstrated by other ground automation predictions using kinetic models. Furthermore, this approach would enable online learning of the model. Additional data or further knowledge of algorithms is necessary to conclude definitively that no second-order terms are appropriate. Possible applications of the linear model are described, including enabling arriving aircraft to fly optimized descents computed by the FMS even in congested airspace.
Resumo:
The use of Biophotonic Sensing Cells (BICELLs) based on micro-nano pattemed photonic architectures has been recently proven as an efficient methodology for label-free biosensing by using Optical Interrogation [1]. According to this, we have studied the different optical response for a specific typology of BICELL, consisting of structures of SU -8. This material is biocompatible with different types of biomolecules and can be immobilized on its sensing surface. In particular, we have measured the optical response for a biomarker in clinic diagnostic of dry eye. Although different proteins can be enstudied such as: PRDX5, ANXA 1, ANXA 11, CST 4, PLAA Y S 1 OOA6 related with ocular surface (dry eye), for this work PLAA (phospholipase A2) is studied by means of label free biosensing based on BICELLs for analyzing the performance and specificity according with means values of concentration in ROC curves.
Resumo:
We present a model of Bayesian network for continuous variables, where densities and conditional densities are estimated with B-spline MoPs. We use a novel approach to directly obtain conditional densities estimation using B-spline properties. In particular we implement naive Bayes and wrapper variables selection. Finally we apply our techniques to the problem of predicting neurons morphological variables from electrophysiological ones.
Resumo:
Multi-label classification (MLC) is the supervised learning problem where an instance may be associated with multiple labels. Modeling dependencies between labels allows MLC methods to improve their performance at the expense of an increased computational cost. In this paper we focus on the classifier chains (CC) approach for modeling dependencies. On the one hand, the original CC algorithm makes a greedy approximation, and is fast but tends to propagate errors down the chain. On the other hand, a recent Bayes-optimal method improves the performance, but is computationally intractable in practice. Here we present a novel double-Monte Carlo scheme (M2CC), both for finding a good chain sequence and performing efficient inference. The M2CC algorithm remains tractable for high-dimensional data sets and obtains the best overall accuracy, as shown on several real data sets with input dimension as high as 1449 and up to 103 labels.
Resumo:
The aim of this paper is to develop a probabilistic modeling framework for the segmentation of structures of interest from a collection of atlases. Given a subset of registered atlases into the target image for a particular Region of Interest (ROI), a statistical model of appearance and shape is computed for fusing the labels. Segmentations are obtained by minimizing an energy function associated with the proposed model, using a graph-cut technique. We test different label fusion methods on publicly available MR images of human brains.
Resumo:
Bayesian network classifiers are widely used in machine learning because they intuitively represent causal relations. Multi-label classification problems require each instance to be assigned a subset of a defined set of h labels. This problem is equivalent to finding a multi-valued decision function that predicts a vector of h binary classes. In this paper we obtain the decision boundaries of two widely used Bayesian network approaches for building multi-label classifiers: Multi-label Bayesian network classifiers built using the binary relevance method and Bayesian network chain classifiers. We extend our previous single-label results to multi-label chain classifiers, and we prove that, as expected, chain classifiers provide a more expressive model than the binary relevance method.
Resumo:
Abstract Interneuron classification is an important and long-debated topic in neuroscience. A recent study provided a data set of digitally reconstructed interneurons classified by 42 leading neuroscientists according to a pragmatic classification scheme composed of five categorical variables, namely, of the interneuron type and four features of axonal morphology. From this data set we now learned a model which can classify interneurons, on the basis of their axonal morphometric parameters, into these five descriptive variables simultaneously. Because of differences in opinion among the neuroscientists, especially regarding neuronal type, for many interneurons we lacked a unique, agreed-upon classification, which we could use to guide model learning. Instead, we guided model learning with a probability distribution over the neuronal type and the axonal features, obtained, for each interneuron, from the neuroscientists’ classification choices. We conveniently encoded such probability distributions with Bayesian networks, calling them label Bayesian networks (LBNs), and developed a method to predict them. This method predicts an LBN by forming a probabilistic consensus among the LBNs of the interneurons most similar to the one being classified. We used 18 axonal morphometric parameters as predictor variables, 13 of which we introduce in this paper as quantitative counterparts to the categorical axonal features. We were able to accurately predict interneuronal LBNs. Furthermore, when extracting crisp (i.e., non-probabilistic) predictions from the predicted LBNs, our method outperformed related work on interneuron classification. Our results indicate that our method is adequate for multi-dimensional classification of interneurons with probabilistic labels. Moreover, the introduced morphometric parameters are good predictors of interneuron type and the four features of axonal morphology and thus may serve as objective counterparts to the subjective, categorical axonal features.
Resumo:
Interneuron classification is an important and long-debated topic in neuroscience. A recent study provided a data set of digitally reconstructed interneurons classified by 42 leading neuroscientists according to a pragmatic classification scheme composed of five categorical variables, namely, of the interneuron type and four features of axonal morphology. From this data set we now learned a model which can classify interneurons, on the basis of their axonal morphometric parameters, into these five descriptive variables simultaneously. Because of differences in opinion among the neuroscientists, especially regarding neuronal type, for many interneurons we lacked a unique, agreed-upon classification, which we could use to guide model learning. Instead, we guided model learning with a probability distribution over the neuronal type and the axonal features, obtained, for each interneuron, from the neuroscientists’ classification choices. We conveniently encoded such probability distributions with Bayesian networks, calling them label Bayesian networks (LBNs), and developed a method to predict them. This method predicts an LBN by forming a probabilistic consensus among the LBNs of the interneurons most similar to the one being classified. We used 18 axonal morphometric parameters as predictor variables, 13 of which we introduce in this paper as quantitative counterparts to the categorical axonal features. We were able to accurately predict interneuronal LBNs. Furthermore, when extracting crisp (i.e., non-probabilistic) predictions from the predicted LBNs, our method outperformed related work on interneuron classification. Our results indicate that our method is adequate for multi-dimensional classification of interneurons with probabilistic labels. Moreover, the introduced morphometric parameters are good predictors of interneuron type and the four features of axonal morphology and thus may serve as objective counterparts to the subjective, categorical axonal features.
Resumo:
This work proposes an automatic methodology for modeling complex systems. Our methodology is based on the combination of Grammatical Evolution and classical regression to obtain an optimal set of features that take part of a linear and convex model. This technique provides both Feature Engineering and Symbolic Regression in order to infer accurate models with no effort or designer's expertise requirements. As advanced Cloud services are becoming mainstream, the contribution of data centers in the overall power consumption of modern cities is growing dramatically. These facilities consume from 10 to 100 times more power per square foot than typical office buildings. Modeling the power consumption for these infrastructures is crucial to anticipate the effects of aggressive optimization policies, but accurate and fast power modeling is a complex challenge for high-end servers not yet satisfied by analytical approaches. For this case study, our methodology minimizes error in power prediction. This work has been tested using real Cloud applications resulting on an average error in power estimation of 3.98%. Our work improves the possibilities of deriving Cloud energy efficient policies in Cloud data centers being applicable to other computing environments with similar characteristics.
Resumo:
Predicting failures in a distributed system based on previous events through logistic regression is a standard approach in literature. This technique is not reliable, though, in two situations: in the prediction of rare events, which do not appear in enough proportion for the algorithm to capture, and in environments where there are too many variables, as logistic regression tends to overfit on this situations; while manually selecting a subset of variables to create the model is error- prone. On this paper, we solve an industrial research case that presented this situation with a combination of elastic net logistic regression, a method that allows us to automatically select useful variables, a process of cross-validation on top of it and the application of a rare events prediction technique to reduce computation time. This process provides two layers of cross- validation that automatically obtain the optimal model complexity and the optimal mode l parameters values, while ensuring even rare events will be correctly predicted with a low amount of training instances. We tested this method against real industrial data, obtaining a total of 60 out of 80 possible models with a 90% average model accuracy.