946 resultados para BAYESIAN-INFERENCE


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We explore the meaning of information about quantities of interest. Our approach is divided in two scenarios: the analysis of observations and the planning of an experiment. First, we review the Sufficiency, Conditionality and Likelihood principles and how they relate to trivial experiments. Next, we review Blackwell Sufficiency and show that sampling without replacement is Blackwell Sufficient for sampling with replacement. Finally, we unify the two scenarios presenting an extension of the relationship between Blackwell Equivalence and the Likelihood Principle.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In my PhD thesis I propose a Bayesian nonparametric estimation method for structural econometric models where the functional parameter of interest describes the economic agent's behavior. The structural parameter is characterized as the solution of a functional equation, or by using more technical words, as the solution of an inverse problem that can be either ill-posed or well-posed. From a Bayesian point of view, the parameter of interest is a random function and the solution to the inference problem is the posterior distribution of this parameter. A regular version of the posterior distribution in functional spaces is characterized. However, the infinite dimension of the considered spaces causes a problem of non continuity of the solution and then a problem of inconsistency, from a frequentist point of view, of the posterior distribution (i.e. problem of ill-posedness). The contribution of this essay is to propose new methods to deal with this problem of ill-posedness. The first one consists in adopting a Tikhonov regularization scheme in the construction of the posterior distribution so that I end up with a new object that I call regularized posterior distribution and that I guess it is solution of the inverse problem. The second approach consists in specifying a prior distribution on the parameter of interest of the g-prior type. Then, I detect a class of models for which the prior distribution is able to correct for the ill-posedness also in infinite dimensional problems. I study asymptotic properties of these proposed solutions and I prove that, under some regularity condition satisfied by the true value of the parameter of interest, they are consistent in a "frequentist" sense. Once I have set the general theory, I apply my bayesian nonparametric methodology to different estimation problems. First, I apply this estimator to deconvolution and to hazard rate, density and regression estimation. Then, I consider the estimation of an Instrumental Regression that is useful in micro-econometrics when we have to deal with problems of endogeneity. Finally, I develop an application in finance: I get the bayesian estimator for the equilibrium asset pricing functional by using the Euler equation defined in the Lucas'(1978) tree-type models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider inference in randomized studies, in which repeatedly measured outcomes may be informatively missing due to drop out. In this setting, it is well known that full data estimands are not identified unless unverified assumptions are imposed. We assume a non-future dependence model for the drop-out mechanism and posit an exponential tilt model that links non-identifiable and identifiable distributions. This model is indexed by non-identified parameters, which are assumed to have an informative prior distribution, elicited from subject-matter experts. Under this model, full data estimands are shown to be expressed as functionals of the distribution of the observed data. To avoid the curse of dimensionality, we model the distribution of the observed data using a Bayesian shrinkage model. In a simulation study, we compare our approach to a fully parametric and a fully saturated model for the distribution of the observed data. Our methodology is motivated and applied to data from the Breast Cancer Prevention Trial.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a novel approach to the inference of spectral functions from Euclidean time correlator data that makes close contact with modern Bayesian concepts. Our method differs significantly from the maximum entropy method (MEM). A new set of axioms is postulated for the prior probability, leading to an improved expression, which is devoid of the asymptotically flat directions present in the Shanon-Jaynes entropy. Hyperparameters are integrated out explicitly, liberating us from the Gaussian approximations underlying the evidence approach of the maximum entropy method. We present a realistic test of our method in the context of the nonperturbative extraction of the heavy quark potential. Based on hard-thermal-loop correlator mock data, we establish firm requirements in the number of data points and their accuracy for a successful extraction of the potential from lattice QCD. Finally we reinvestigate quenched lattice QCD correlators from a previous study and provide an improved potential estimation at T2.33TC.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The considerable search for synergistic agents in cancer research is motivated by the therapeutic benefits achieved by combining anti-cancer agents. Synergistic agents make it possible to reduce dosage while maintaining or enhancing a desired effect. Other favorable outcomes of synergistic agents include reduction in toxicity and minimizing or delaying drug resistance. Dose-response assessment and drug-drug interaction analysis play an important part in the drug discovery process, however analysis are often poorly done. This dissertation is an effort to notably improve dose-response assessment and drug-drug interaction analysis. The most commonly used method in published analysis is the Median-Effect Principle/Combination Index method (Chou and Talalay, 1984). The Median-Effect Principle/Combination Index method leads to inefficiency by ignoring important sources of variation inherent in dose-response data and discarding data points that do not fit the Median-Effect Principle. Previous work has shown that the conventional method yields a high rate of false positives (Boik, Boik, Newman, 2008; Hennessey, Rosner, Bast, Chen, 2010) and, in some cases, low power to detect synergy. There is a great need for improving the current methodology. We developed a Bayesian framework for dose-response modeling and drug-drug interaction analysis. First, we developed a hierarchical meta-regression dose-response model that accounts for various sources of variation and uncertainty and allows one to incorporate knowledge from prior studies into the current analysis, thus offering a more efficient and reliable inference. Second, in the case that parametric dose-response models do not fit the data, we developed a practical and flexible nonparametric regression method for meta-analysis of independently repeated dose-response experiments. Third, and lastly, we developed a method, based on Loewe additivity that allows one to quantitatively assess interaction between two agents combined at a fixed dose ratio. The proposed method makes a comprehensive and honest account of uncertainty within drug interaction assessment. Extensive simulation studies show that the novel methodology improves the screening process of effective/synergistic agents and reduces the incidence of type I error. We consider an ovarian cancer cell line study that investigates the combined effect of DNA methylation inhibitors and histone deacetylation inhibitors in human ovarian cancer cell lines. The hypothesis is that the combination of DNA methylation inhibitors and histone deacetylation inhibitors will enhance antiproliferative activity in human ovarian cancer cell lines compared to treatment with each inhibitor alone. By applying the proposed Bayesian methodology, in vitro synergy was declared for DNA methylation inhibitor, 5-AZA-2'-deoxycytidine combined with one histone deacetylation inhibitor, suberoylanilide hydroxamic acid or trichostatin A in the cell lines HEY and SKOV3. This suggests potential new epigenetic therapies in cell growth inhibition of ovarian cancer cells.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation: Population allele frequencies are correlated when populations have a shared history or when they exchange genes. Unfortunately, most models for allele frequency and inference about population structure ignore this correlation. Recent analytical results show that among populations, correlations can be very high, which could affect estimates of population genetic structure. In this study, we propose a mixture beta model to characterize the allele frequency distribution among populations. This formulation incorporates the correlation among populations as well as extending the model to data with different clusters of populations. Results: Using simulated data, we show that in general, the mixture model provides a good approximation of the among-population allele frequency distribution and a good estimate of correlation among populations. Results from fitting the mixture model to a dataset of genotypes at 377 autosomal microsatellite loci from human populations indicate high correlation among populations, which may not be appropriate to neglect. Traditional measures of population structure tend to over-estimate the amount of genetic differentiation when correlation is neglected. Inference is performed in a Bayesian framework.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Breast cancer is the most common non-skin cancer and the second leading cause of cancer-related death in women in the United States. Studies on ipsilateral breast tumor relapse (IBTR) status and disease-specific survival will help guide clinic treatment and predict patient prognosis.^ After breast conservation therapy, patients with breast cancer may experience breast tumor relapse. This relapse is classified into two distinct types: true local recurrence (TR) and new ipsilateral primary tumor (NP). However, the methods used to classify the relapse types are imperfect and are prone to misclassification. In addition, some observed survival data (e.g., time to relapse and time from relapse to death)are strongly correlated with relapse types. The first part of this dissertation presents a Bayesian approach to (1) modeling the potentially misclassified relapse status and the correlated survival information, (2) estimating the sensitivity and specificity of the diagnostic methods, and (3) quantify the covariate effects on event probabilities. A shared frailty was used to account for the within-subject correlation between survival times. The inference was conducted using a Bayesian framework via Markov Chain Monte Carlo simulation implemented in softwareWinBUGS. Simulation was used to validate the Bayesian method and assess its frequentist properties. The new model has two important innovations: (1) it utilizes the additional survival times correlated with the relapse status to improve the parameter estimation, and (2) it provides tools to address the correlation between the two diagnostic methods conditional to the true relapse types.^ Prediction of patients at highest risk for IBTR after local excision of ductal carcinoma in situ (DCIS) remains a clinical concern. The goals of the second part of this dissertation were to evaluate a published nomogram from Memorial Sloan-Kettering Cancer Center, to determine the risk of IBTR in patients with DCIS treated with local excision, and to determine whether there is a subset of patients at low risk of IBTR. Patients who had undergone local excision from 1990 through 2007 at MD Anderson Cancer Center with a final diagnosis of DCIS (n=794) were included in this part. Clinicopathologic factors and the performance of the Memorial Sloan-Kettering Cancer Center nomogram for prediction of IBTR were assessed for 734 patients with complete data. Nomogram for prediction of 5- and 10-year IBTR probabilities were found to demonstrate imperfect calibration and discrimination, with an area under the receiver operating characteristic curve of .63 and a concordance index of .63. In conclusion, predictive models for IBTR in DCIS patients treated with local excision are imperfect. Our current ability to accurately predict recurrence based on clinical parameters is limited.^ The American Joint Committee on Cancer (AJCC) staging of breast cancer is widely used to determine prognosis, yet survival within each AJCC stage shows wide variation and remains unpredictable. For the third part of this dissertation, biologic markers were hypothesized to be responsible for some of this variation, and the addition of biologic markers to current AJCC staging were examined for possibly provide improved prognostication. The initial cohort included patients treated with surgery as first intervention at MDACC from 1997 to 2006. Cox proportional hazards models were used to create prognostic scoring systems. AJCC pathologic staging parameters and biologic tumor markers were investigated to devise the scoring systems. Surveillance Epidemiology and End Results (SEER) data was used as the external cohort to validate the scoring systems. Binary indicators for pathologic stage (PS), estrogen receptor status (E), and tumor grade (G) were summed to create PS+EG scoring systems devised to predict 5-year patient outcomes. These scoring systems facilitated separation of the study population into more refined subgroups than the current AJCC staging system. The ability of the PS+EG score to stratify outcomes was confirmed in both internal and external validation cohorts. The current study proposes and validates a new staging system by incorporating tumor grade and ER status into current AJCC staging. We recommend that biologic markers be incorporating into revised versions of the AJCC staging system for patients receiving surgery as the first intervention.^ Chapter 1 focuses on developing a Bayesian method to solve misclassified relapse status and application to breast cancer data. Chapter 2 focuses on evaluation of a breast cancer nomogram for predicting risk of IBTR in patients with DCIS after local excision gives the statement of the problem in the clinical research. Chapter 3 focuses on validation of a novel staging system for disease-specific survival in patients with breast cancer treated with surgery as the first intervention. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Belief propagation (BP) is a technique for distributed inference in wireless networks and is often used even when the underlying graphical model contains cycles. In this paper, we propose a uniformly reweighted BP scheme that reduces the impact of cycles by weighting messages by a constant ?edge appearance probability? rho ? 1. We apply this algorithm to distributed binary hypothesis testing problems (e.g., distributed detection) in wireless networks with Markov random field models. We demonstrate that in the considered setting the proposed method outperforms standard BP, while maintaining similar complexity. We then show that the optimal ? can be approximated as a simple function of the average node degree, and can hence be computed in a distributed fashion through a consensus algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When mapping is formulated in a Bayesian framework, the need of specifying a prior for the environment arises naturally. However, so far, the use of a particular structure prior has been coupled to working with a particular representation. We describe a system that supports inference with multiple priors while keeping the same dense representation. The priors are rigorously described by the user in a domain-specific language. Even though we work very close to the measurement space, we are able to represent structure constraints with the same expressivity as methods based on geometric primitives. This approach allows the intrinsic degrees of freedom of the environment’s shape to be recovered. Experiments with simulated and real data sets will be presented

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixtures of polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems. Resumen La morfología neuronal es una característica clave en el estudio de los circuitos cerebrales, ya que está altamente relacionada con el procesado de información y con los roles funcionales. La morfología neuronal afecta al proceso de integración de las señales de entrada y determina las neuronas que reciben las salidas de otras neuronas. Las diferentes partes de la neurona pueden operar de forma semi-independiente de acuerdo a la localización espacial de las conexiones sinápticas. Por tanto, existe un interés considerable en el análisis de la microanatomía de las células nerviosas, ya que constituye una excelente herramienta para comprender mejor el funcionamiento de la corteza cerebral. Sin embargo, las propiedades morfológicas, moleculares y electrofisiológicas de las células neuronales son extremadamente variables. Excepto en algunos casos especiales, esta variabilidad morfológica dificulta la definición de un conjunto de características que distingan claramente un tipo neuronal. Además, existen diferentes tipos de neuronas en regiones particulares del cerebro. La variabilidad neuronal hace que el análisis y el modelado de la morfología neuronal sean un importante reto científico. La incertidumbre es una propiedad clave en muchos problemas reales. La teoría de la probabilidad proporciona un marco para modelar y razonar bajo incertidumbre. Los modelos gráficos probabilísticos combinan la teoría estadística y la teoría de grafos con el objetivo de proporcionar una herramienta con la que trabajar bajo incertidumbre. En particular, nos centraremos en las redes bayesianas, el modelo más utilizado dentro de los modelos gráficos probabilísticos. En esta tesis hemos diseñado nuevos métodos para aprender redes bayesianas, inspirados por y aplicados al problema del modelado y análisis de datos morfológicos de neuronas. La morfología de una neurona puede ser cuantificada usando una serie de medidas, por ejemplo, la longitud de las dendritas y el axón, el número de bifurcaciones, la dirección de las dendritas y el axón, etc. Estas medidas pueden ser modeladas como datos continuos o discretos. A su vez, los datos continuos pueden ser lineales (por ejemplo, la longitud o la anchura de una dendrita) o direccionales (por ejemplo, la dirección del axón). Estos datos pueden llegar a seguir distribuciones de probabilidad muy complejas y pueden no ajustarse a ninguna distribución paramétrica conocida. El modelado de este tipo de problemas con redes bayesianas híbridas incluyendo variables discretas, lineales y direccionales presenta una serie de retos en relación al aprendizaje a partir de datos, la inferencia, etc. En esta tesis se propone un método para modelar y simular árboles dendríticos basales de neuronas piramidales usando redes bayesianas para capturar las interacciones entre las variables del problema. Para ello, se mide un amplio conjunto de variables de las dendritas y se aplica un algoritmo de aprendizaje con el que se aprende la estructura y se estiman los parámetros de las distribuciones de probabilidad que constituyen las redes bayesianas. Después, se usa un algoritmo de simulación para construir dendritas virtuales mediante el muestreo de valores de las redes bayesianas. Finalmente, se lleva a cabo una profunda evaluaci ón para verificar la capacidad del modelo a la hora de generar dendritas realistas. En esta primera aproximación, las variables fueron discretizadas para poder aprender y muestrear las redes bayesianas. A continuación, se aborda el problema del aprendizaje de redes bayesianas con diferentes tipos de variables. Las mixturas de polinomios constituyen un método para representar densidades de probabilidad en redes bayesianas híbridas. Presentamos un método para aprender aproximaciones de densidades unidimensionales, multidimensionales y condicionales a partir de datos utilizando mixturas de polinomios. El método se basa en interpolación con splines, que aproxima una densidad como una combinación lineal de splines. Los algoritmos propuestos se evalúan utilizando bases de datos artificiales. Además, las mixturas de polinomios son utilizadas como un método no paramétrico de estimación de densidades para clasificadores basados en redes bayesianas. Después, se estudia el problema de incluir información direccional en redes bayesianas. Este tipo de datos presenta una serie de características especiales que impiden el uso de las técnicas estadísticas clásicas. Por ello, para manejar este tipo de información se deben usar estadísticos y distribuciones de probabilidad específicos, como la distribución univariante von Mises y la distribución multivariante von Mises–Fisher. En concreto, en esta tesis extendemos el clasificador naive Bayes al caso en el que las distribuciones de probabilidad condicionada de las variables predictoras dada la clase siguen alguna de estas distribuciones. Se estudia el caso base, en el que sólo se utilizan variables direccionales, y el caso híbrido, en el que variables discretas, lineales y direccionales aparecen mezcladas. También se estudian los clasificadores desde un punto de vista teórico, derivando sus funciones de decisión y las superficies de decisión asociadas. El comportamiento de los clasificadores se ilustra utilizando bases de datos artificiales. Además, los clasificadores son evaluados empíricamente utilizando bases de datos reales. También se estudia el problema de la clasificación de interneuronas. Desarrollamos una aplicación web que permite a un grupo de expertos clasificar un conjunto de neuronas de acuerdo a sus características morfológicas más destacadas. Se utilizan medidas de concordancia para analizar el consenso entre los expertos a la hora de clasificar las neuronas. Se investiga la idoneidad de los términos anatómicos y de los tipos neuronales utilizados frecuentemente en la literatura a través del análisis de redes bayesianas y la aplicación de algoritmos de clustering. Además, se aplican técnicas de aprendizaje supervisado con el objetivo de clasificar de forma automática las interneuronas a partir de sus valores morfológicos. A continuación, se presenta una metodología para construir un modelo que captura las opiniones de todos los expertos. Primero, se genera una red bayesiana para cada experto y se propone un algoritmo para agrupar las redes bayesianas que se corresponden con expertos con comportamientos similares. Después, se induce una red bayesiana que modela la opinión de cada grupo de expertos. Por último, se construye una multired bayesiana que modela las opiniones del conjunto completo de expertos. El análisis del modelo consensuado permite identificar diferentes comportamientos entre los expertos a la hora de clasificar las neuronas. Además, permite extraer un conjunto de características morfológicas relevantes para cada uno de los tipos neuronales mediante inferencia con la multired bayesiana. Estos descubrimientos se utilizan para validar el modelo y constituyen información relevante acerca de la morfología neuronal. Por último, se estudia un problema de clasificación en el que la etiqueta de clase de los datos de entrenamiento es incierta. En cambio, disponemos de un conjunto de etiquetas para cada instancia. Este problema está inspirado en el problema de la clasificación de neuronas, en el que un grupo de expertos proporciona una etiqueta de clase para cada instancia de manera individual. Se propone un método para aprender redes bayesianas utilizando vectores de cuentas, que representan el número de expertos que seleccionan cada etiqueta de clase para cada instancia. Estas redes bayesianas se evalúan utilizando bases de datos artificiales de problemas de aprendizaje supervisado.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Probabilistic graphical models are a huge research field in artificial intelligence nowadays. The scope of this work is the study of directed graphical models for the representation of discrete distributions. Two of the main research topics related to this area focus on performing inference over graphical models and on learning graphical models from data. Traditionally, the inference process and the learning process have been treated separately, but given that the learned models structure marks the inference complexity, this kind of strategies will sometimes produce very inefficient models. With the purpose of learning thinner models, in this master thesis we propose a new model for the representation of network polynomials, which we call polynomial trees. Polynomial trees are a complementary representation for Bayesian networks that allows an efficient evaluation of the inference complexity and provides a framework for exact inference. We also propose a set of methods for the incremental compilation of polynomial trees and an algorithm for learning polynomial trees from data using a greedy score+search method that includes the inference complexity as a penalization in the scoring function.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neuronal morphology is hugely variable across brain regions and species, and their classification strategies are a matter of intense debate in neuroscience. GABAergic cortical interneurons have been a challenge because it is difficult to find a set of morphological properties which clearly define neuronal types. A group of 48 neuroscience experts around the world were asked to classify a set of 320 cortical GABAergic interneurons according to the main features of their three-dimensional morphological reconstructions. A methodology for building a model which captures the opinions of all the experts was proposed. First, one Bayesian network was learned for each expert, and we proposed an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts was induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts was built. A thorough analysis of the consensus model identified different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types was defined by performing inference in the Bayesian multinet. These findings were used to validate the model and to gain some insights into neuron morphology.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Both long-term environmental changes such as those driven by the glacial cycles and more recent anthropogenic impacts have had major effects on the past demography in wild organisms. Within species, these changes are reflected in the amount and distribution of neutral genetic variation. In this thesis, mitochondrial and microsatellite DNA was analysed to investigate how environmental and anthropogenic factors have affected genetic diversity and structure in four ecologically different animal species. Paper I describes the post-glacial recolonisation history of the speckled-wood butterfly (Pararge aegeria) in Northern Europe. A decrease in genetic diversity with latitude and a marked population structure were uncovered, consistent with a hypothesis of repeated founder events during the postglacial recolonisation. Moreover, Approximate Bayesian Computation analyses indicate that the univoltine populations in Scandinavia and Finland originate from recolonisations along two routes, one on each side of the Baltic. Paper II aimed to investigate how past sea-level rises affected the population history of the convict surgeonfish (Acanthurus triostegus) in the Indo-Pacific. Assessment of the species’ demographic history suggested a population expansion that occurred approximately at the end of the last glaciation. Moreover, the results demonstrated an overall lack of phylogeographic structure, probably due to the high dispersal rates associated with the species’ pelagic larval stage. Populations at the species’ eastern range margin were significantly differentiated from other populations, which likely is a consequence of their geographic isolation. In Paper III, we assessed the effect of human impact on the genetic variation of European moose (Alces alces) in Sweden. Genetic analyses revealed a spatial structure with two genetic clusters, one in northern and one in southern Sweden, which were separated by a narrow transition zone. Moreover, demographic inference suggested a recent population bottleneck. The inferred timing of this bottleneck coincided with a known reduction in population size in the 19th and early 20th century due to high hunting pressure. In Paper IV, we examined the effect of an indirect but well-described human impact, via environmental toxic chemicals (PCBs), on the genetic variation of Eurasian otters (Lutra lutra) in Sweden. Genetic clustering assignment revealed differentiation between otters in northern and southern Sweden, but also in the Stockholm region. ABC analyses indicated a decrease in effective population size in both northern and southern Sweden. Moreover, comparative analyses of historical and contemporary samples demonstrated a more severe decline in genetic diversity in southern Sweden compared to northern Sweden, in agreement with the levels of PCBs found.