937 resultados para Bayesian smoothing
Resumo:
Many of the emerging telecom services make use of Outer Edge Networks, in particular Home Area Networks. The configuration and maintenance of such services may not be under full control of the telecom operator which still needs to guarantee the service quality experienced by the consumer. Diagnosing service faults in these scenarios becomes especially difficult since there may be not full visibility between different domains. This paper describes the fault diagnosis solution developed in the MAGNETO project, based on the application of Bayesian Inference to deal with the uncertainty. It also takes advantage of a distributed framework to deploy diagnosis components in the different domains and network elements involved, spanning both the telecom operator and the Outer Edge networks. In addition, MAGNETO features self-learning capabilities to automatically improve diagnosis knowledge over time and a partition mechanism that allows breaking down the overall diagnosis knowledge into smaller subsets. The MAGNETO solution has been prototyped and adapted to a particular outer edge scenario, and has been further validated on a real testbed. Evaluation of the results shows the potential of our approach to deal with fault management of outer edge networks.
Resumo:
Background Malignancies arising in the large bowel cause the second largest number of deaths from cancer in the Western World. Despite progresses made during the last decades, colorectal cancer remains one of the most frequent and deadly neoplasias in the western countries. Methods A genomic study of human colorectal cancer has been carried out on a total of 31 tumoral samples, corresponding to different stages of the disease, and 33 non-tumoral samples. The study was carried out by hybridisation of the tumour samples against a reference pool of non-tumoral samples using Agilent Human 1A 60-mer oligo microarrays. The results obtained were validated by qRT-PCR. In the subsequent bioinformatics analysis, gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling were built. The consensus among all the induced models produced a hierarchy of dependences and, thus, of variables. Results After an exhaustive process of pre-processing to ensure data quality--lost values imputation, probes quality, data smoothing and intraclass variability filtering--the final dataset comprised a total of 8, 104 probes. Next, a supervised classification approach and data analysis was carried out to obtain the most relevant genes. Two of them are directly involved in cancer progression and in particular in colorectal cancer. Finally, a supervised classifier was induced to classify new unseen samples. Conclusions We have developed a tentative model for the diagnosis of colorectal cancer based on a biomarker panel. Our results indicate that the gene profile described herein can discriminate between non-cancerous and cancerous samples with 94.45% accuracy using different supervised classifiers (AUC values in the range of 0.997 and 0.955)
Resumo:
The main purpose of a gene interaction network is to map the relationships of the genes that are out of sight when a genomic study is tackled. DNA microarrays allow the measure of gene expression of thousands of genes at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process removing the false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show how these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypothesis for future studies
Resumo:
In the presence of a river flood, operators in charge of control must take decisions based on imperfect and incomplete sources of information (e.g., data provided by a limited number sensors) and partial knowledge about the structure and behavior of the river basin. This is a case of reasoning about a complex dynamic system with uncertainty and real-time constraints where bayesian networks can be used to provide an effective support. In this paper we describe a solution with spatio-temporal bayesian networks to be used in a context of emergencies produced by river floods. In the paper we describe first a set of types of causal relations for hydrologic processes with spatial and temporal references to represent the dynamics of the river basin. Then we describe how this was included in a computer system called SAIDA to provide assistance to operators in charge of control in a river basin. Finally the paper shows experimental results about the performance of the model.
Resumo:
Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer could be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data or when estimating large covariance matrices. Regularization is usually used, in addition, to improve the bias-variance tradeoff of an estimation. Then, the definition of regularization is quite general, and, although the introduction of a penalty is probably the most popular type, it is just one out of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role for reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to present a compact review of L1-regularization and its applications in statistical and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. Supervised classification advances deal, on the one hand, with the application of regularization for obtaining a na¨ıve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter. El pragmatismo es la principal motivación de la regularización. Podemos entender la regularización como una modificación del estimador de máxima verosimilitud, de tal manera que se pueda dar una respuesta cuando la configuración del problema es inestable. A modo de ejemplo, podemos mencionar el ajuste de modelos paramétricos o no paramétricos cuando hay más parámetros que casos en el conjunto de datos, o la estimación de grandes matrices de covarianzas. Se suele recurrir a la regularización, además, para mejorar el compromiso sesgo-varianza en una estimación. Por tanto, la definición de regularización es muy general y, aunque la introducción de una función de penalización es probablemente el método más popular, éste es sólo uno de entre varias posibilidades. En esta tesis se ha trabajado en aplicaciones de regularización para obtener representaciones dispersas, donde sólo se usa un subconjunto de las entradas. En particular, la regularización L1 juega un papel clave en la búsqueda de dicha dispersión. La mayor parte de las contribuciones presentadas en la tesis giran alrededor de la regularización L1, aunque también se exploran otras formas de regularización (que igualmente persiguen un modelo disperso). Además de presentar una revisión de la regularización L1 y sus aplicaciones en estadística y aprendizaje de máquina, se ha desarrollado metodología para regresión, clasificación supervisada y aprendizaje de estructura en modelos gráficos. Dentro de la regresión, se ha trabajado principalmente en métodos de regresión local, proponiendo técnicas de diseño del kernel que sean adecuadas a configuraciones de alta dimensionalidad y funciones de regresión dispersas. También se presenta una aplicación de las técnicas de regresión regularizada para modelar la respuesta de neuronas reales. Los avances en clasificación supervisada tratan, por una parte, con el uso de regularización para obtener un clasificador naive Bayes y, por otra parte, con el desarrollo de un algoritmo que usa regularización por grupos de una manera eficiente y que se ha aplicado al diseño de interfaces cerebromáquina. Finalmente, se presenta una heurística para inducir la estructura de redes Bayesianas Gaussianas usando regularización L1 a modo de filtro.
Resumo:
Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. Basically, it consists of determining the Markov blanket around each class variable using the HITON algorithm, then specifying the directionality over the MBC subgraphs. Our approach is applied to the prediction problem of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson’s patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, Yeast data set, as well as on a real-world Parkinson’s disease data set containing 488 patients. The experimental study, including comparison with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables.
Resumo:
The quality and the reliability of the power generated by large grid-connected photovoltaic (PV) plants are negatively affected by the source characteristic variability. This paper deals with the smoothing of power fluctuations because of geographical dispersion of PV systems. The fluctuation frequency and the maximum fluctuation registered at a PV plant ensemble are analyzed to study these effects. We propose an empirical expression to compare the fluctuation attenuation because of both the size and the number of PV plants grouped. The convolution of single PV plants frequency distribution functions has turned out to be a successful tool to statistically describe the behavior of an ensemble of PV plants and determine their maximum output fluctuation. Our work is based on experimental 1-s data collected throughout 2009 from seven PV plants, 20 MWp in total, separated between 6 and 360 km.
Resumo:
Background:Malignancies arising in the large bowel cause the second largest number of deaths from cancer in the Western World. Despite progresses made during the last decades, colorectal cancer remains one of the most frequent and deadly neoplasias in the western countries. Methods: A genomic study of human colorectal cancer has been carried out on a total of 31 tumoral samples, corresponding to different stages of the disease, and 33 non-tumoral samples. The study was carried out by hybridisation of the tumour samples against a reference pool of non-tumoral samples using Agilent Human 1A 60-mer oligo microarrays. The results obtained were validated by qRT-PCR. In the subsequent bioinformatics analysis, gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling were built. The consensus among all the induced models produced a hierarchy of dependences and, thus, of variables. Results: After an exhaustive process of pre-processing to ensure data quality--lost values imputation, probes quality, data smoothing and intraclass variability filtering--the final dataset comprised a total of 8, 104 probes. Next, a supervised classification approach and data analysis was carried out to obtain the most relevant genes. Two of them are directly involved in cancer progression and in particular in colorectal cancer. Finally, a supervised classifier was induced to classify new unseen samples. Conclusions: We have developed a tentative model for the diagnosis of colorectal cancer based on a biomarker panel. Our results indicate that the gene profile described herein can discriminate between non-cancerous and cancerous samples with 94.45% accuracy using different supervised classifiers (AUC values in the range of 0.997 and 0.955).
Resumo:
We present a computing model based on the DNA strand displacement technique which performs Bayesian inference. The model will take single stranded DNA as input data, representing the presence or absence of a specific molecular signal (evidence). The program logic encodes the prior probability of a disease and the conditional probability of a signal given the disease playing with a set of different DNA complexes and their ratios. When the input and program molecules interact, they release a different pair of single stranded DNA species whose relative proportion represents the application of Bayes? Law: the conditional probability of the disease given the signal. The models presented in this paper can empower the application of probabilistic reasoning in genetic diagnosis in vitro.
Resumo:
In this paper, an innovative approach to perform distributed Bayesian inference using a multi-agent architecture is presented. The final goal is dealing with uncertainty in network diagnosis, but the solution can be of applied in other fields. The validation testbed has been a P2P streaming video service. An assessment of the work is presented, in order to show its advantages when it is compared with traditional manual processes and other previous systems.
Resumo:
This work evaluates a spline-based smoothing method applied to the output of a glucose predictor. Methods:Our on-line prediction algorithm is based on a neural network model (NNM). We trained/validated the NNM with a prediction horizon of 30 minutes using 39/54 profiles of patients monitored with the Guardian® Real-Time continuous glucose monitoring system The NNM output is smoothed by fitting a causal cubic spline. The assessment parameters are the error (RMSE), mean delay (MD) and the high-frequency noise (HFCrms). The HFCrms is the root-mean-square values of the high-frequency components isolated with a zero-delay non-causal filter. HFCrms is 2.90±1.37 (mg/dl) for the original profiles.
Resumo:
Thermal smoothing in the plasma ablated from a laser target under weakly nonuniform irradiation is analyzed, assuming absorption at nc and a deflagration regime (conduction restricted to a thin quasisteady layer next to the target). Magnetic generation effects are included and found to be weak. Differences from results available in the literature are explained; the importance of the character of the underdense flow at uniform irradiation is emphasized.
Resumo:
Refractive smoothing of weak non-uniformities in the illumination of laser targets is analyzed, assuming absorption at the critical density and restricting conduction to a thin layer, and using results from thermal smoothing, which is uncoupled from the refraction. Magnetic effects are included. Non-uniformity wavelengths comparable to the thickness of the conduction layer are considered; efficient smoothing exists at both short and long wavelengths in this range. Thermal focusing could make the ablated plasma unstable.