965 resultados para Probability densities
Resumo:
Most of the common techniques for estimating conditional probability densities are inappropriate for applications involving periodic variables. In this paper we introduce three novel techniques for tackling such problems, and investigate their performance using synthetic data. We then apply these techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.
Resumo:
Compositional data analysis motivated the introduction of a complete Euclidean structure in the simplex of D parts. This was based on the early work of J. Aitchison (1986) and completed recently when Aitchinson distance in the simplex was associated with an inner product and orthonormal bases were identified (Aitchison and others, 2002; Egozcue and others, 2003). A partition of the support of a random variable generates a composition by assigning the probability of each interval to a part of the composition. One can imagine that the partition can be refined and the probability density would represent a kind of continuous composition of probabilities in a simplex of infinitely many parts. This intuitive idea would lead to a Hilbert-space of probability densitiesby generalizing the Aitchison geometry for compositions in the simplex into the set probability densities
Resumo:
Compositional data analysis motivated the introduction of a complete Euclidean structure in the simplex of D parts. This was based on the early work of J. Aitchison (1986) and completed recently when Aitchinson distance in the simplex was associated with an inner product and orthonormal bases were identified (Aitchison and others, 2002; Egozcue and others, 2003). A partition of the support of a random variable generates a composition by assigning the probability of each interval to a part of the composition. One can imagine that the partition can be refined and the probability density would represent a kind of continuous composition of probabilities in a simplex of infinitely many parts. This intuitive idea would lead to a Hilbert-space of probability densities by generalizing the Aitchison geometry for compositions in the simplex into the set probability densities
Resumo:
Most of the common techniques for estimating conditional probability densities are inappropriate for applications involving periodic variables. In this paper we introduce two novel techniques for tackling such problems, and investigate their performance using synthetic data. We then apply these techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.
Resumo:
Most of the common techniques for estimating conditional probability densities are inappropriate for applications involving periodic variables. In this paper we apply two novel techniques to the problem of extracting the distribution of wind vector directions from radar catterometer data gathered by a remote-sensing satellite.
Resumo:
Most conventional techniques for estimating conditional probability densities are inappropriate for applications involving periodic variables. In this paper we introduce three related techniques for tackling such problems, and investigate their performance using synthetic data. We then apply these techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.
Resumo:
We present a new dynamical approach to the Blumberg's equation, a family of unimodal maps. These maps are proportional to Beta(p, q) probability densities functions. Using the symmetry of the Beta(p, q) distribution and symbolic dynamics techniques, a new concept of mirror symmetry is defined for this family of maps. The kneading theory is used to analyze the effect of such symmetry in the presented models. The main result proves that two mirror symmetric unimodal maps have the same topological entropy. Different population dynamics regimes are identified, when the intrinsic growth rate is modified: extinctions, stabilities, bifurcations, chaos and Allee effect. To illustrate our results, we present a numerical analysis, where are demonstrated: monotonicity of the topological entropy with the variation of the intrinsic growth rate, existence of isentropic sets in the parameters space and mirror symmetry.
Resumo:
We apply the formalism of the continuous-time random walk to the study of financial data. The entire distribution of prices can be obtained once two auxiliary densities are known. These are the probability densities for the pausing time between successive jumps and the corresponding probability density for the magnitude of a jump. We have applied the formalism to data on the U.S. dollardeutsche mark future exchange, finding good agreement between theory and the observed data.
Resumo:
In the classical theorems of extreme value theory the limits of suitably rescaled maxima of sequences of independent, identically distributed random variables are studied. The vast majority of the literature on the subject deals with affine normalization. We argue that more general normalizations are natural from a mathematical and physical point of view and work them out. The problem is approached using the language of renormalization-group transformations in the space of probability densities. The limit distributions are fixed points of the transformation and the study of its differential around them allows a local analysis of the domains of attraction and the computation of finite-size corrections.
Resumo:
Dans le domaine des neurosciences computationnelles, l'hypothèse a été émise que le système visuel, depuis la rétine et jusqu'au cortex visuel primaire au moins, ajuste continuellement un modèle probabiliste avec des variables latentes, à son flux de perceptions. Ni le modèle exact, ni la méthode exacte utilisée pour l'ajustement ne sont connus, mais les algorithmes existants qui permettent l'ajustement de tels modèles ont besoin de faire une estimation conditionnelle des variables latentes. Cela nous peut nous aider à comprendre pourquoi le système visuel pourrait ajuster un tel modèle; si le modèle est approprié, ces estimé conditionnels peuvent aussi former une excellente représentation, qui permettent d'analyser le contenu sémantique des images perçues. Le travail présenté ici utilise la performance en classification d'images (discrimination entre des types d'objets communs) comme base pour comparer des modèles du système visuel, et des algorithmes pour ajuster ces modèles (vus comme des densités de probabilité) à des images. Cette thèse (a) montre que des modèles basés sur les cellules complexes de l'aire visuelle V1 généralisent mieux à partir d'exemples d'entraînement étiquetés que les réseaux de neurones conventionnels, dont les unités cachées sont plus semblables aux cellules simples de V1; (b) présente une nouvelle interprétation des modèles du système visuels basés sur des cellules complexes, comme distributions de probabilités, ainsi que de nouveaux algorithmes pour les ajuster à des données; et (c) montre que ces modèles forment des représentations qui sont meilleures pour la classification d'images, après avoir été entraînés comme des modèles de probabilités. Deux innovations techniques additionnelles, qui ont rendu ce travail possible, sont également décrites : un algorithme de recherche aléatoire pour sélectionner des hyper-paramètres, et un compilateur pour des expressions mathématiques matricielles, qui peut optimiser ces expressions pour processeur central (CPU) et graphique (GPU).
Resumo:
A truly variance-minimizing filter is introduced and its per for mance is demonstrated with the Korteweg– DeV ries (KdV) equation and with a multilayer quasigeostrophic model of the ocean area around South Africa. It is recalled that Kalman-like filters are not variance minimizing for nonlinear model dynamics and that four - dimensional variational data assimilation (4DV AR)-like methods relying on per fect model dynamics have dif- ficulty with providing error estimates. The new method does not have these drawbacks. In fact, it combines advantages from both methods in that it does provide error estimates while automatically having balanced states after analysis, without extra computations. It is based on ensemble or Monte Carlo integrations to simulate the probability density of the model evolution. When obser vations are available, the so-called importance resampling algorithm is applied. From Bayes’ s theorem it follows that each ensemble member receives a new weight dependent on its ‘ ‘distance’ ’ t o the obser vations. Because the weights are strongly var ying, a resampling of the ensemble is necessar y. This resampling is done such that members with high weights are duplicated according to their weights, while low-weight members are largely ignored. In passing, it is noted that data assimilation is not an inverse problem by nature, although it can be for mulated that way . Also, it is shown that the posterior variance can be larger than the prior if the usual Gaussian framework is set aside. However , i n the examples presented here, the entropy of the probability densities is decreasing. The application to the ocean area around South Africa, gover ned by strongly nonlinear dynamics, shows that the method is working satisfactorily . The strong and weak points of the method are discussed and possible improvements are proposed.
Resumo:
Structural Health Monitoring (SHM) denotes a system with the ability to detect and interpret adverse changes in structures in order to improve reliability and reduce life-cycle costs. The greatest challenge for designing a SHM system is knowing what changes to look for and how to classify them. Different approaches for SHM have been proposed for damage identification, each one with advantages and drawbacks. This paper presents a methodology for improvement in vibration signal analysis using statistics information involving the probability density. Generally, the presence of noises in input and output signals results in false alarms, then, it is important that the methodology can minimize this problem. In this paper, the proposed approach is experimentally tested in a flexible plate using a piezoelectric (PZT) actuator to provide the disturbance.
Resumo:
This paper discusses the theoretical and experimental results obtained for the excitonic binding energy (Eb) in a set of single and coupled double quantum wells (SQWs and CDQWs) of GaAs/AlGaAs with different Al concentrations (Al%) and inter-well barrier thicknesses. To obtain the theoretical Eb the method proposed by Mathieu, Lefebvre and Christol (MLC) was used, which is based on the idea of fractional-dimension space, together with the approach proposed by Zhao et al., which extends the MLC method for application in CDQWs. Through magnetophotoluminescence (MPL) measurements performed at 4 K with magnetic fields ranging from 0 T to 12 T, the diamagnetic shift curves were plotted and adjusted using two expressions: one appropriate to fit the curve in the range of low intensity fields and another for the range of high intensity fields, providing the experimental Eb values. The effects of increasing the Al% and the inter-well barrier thickness on E b are discussed. The Eb reduction when going from the SQW to the CDQW with 5 Å inter-well barrier is clearly observed experimentally for 35% Al concentration and this trend can be noticed even for concentrations as low as 25% and 15%, although the Eb variations in these latter cases are within the error bars. As the Zhao's approach is unable to describe this effect, the wave functions and the probability densities for electrons and holes were calculated, allowing us to explain this effect as being due to a decrease in the spatial superposition of the wave functions caused by the thin inter-well barrier. © 2013 Elsevier B.V.
Resumo:
This paper discusses the theoretical and experimental results obtained for the excitonic binding energy (Eb) in a set of single and coupled double quantum wells (SQWs and CDQWs) of GaAs/AlGaAs with different Al concentrations (Al%) and inter-well barrier thicknesses. To obtain the theoretical Eb the method proposed by Mathieu, Lefebvre and Christol (MLC) was used, which is based on the idea of fractional-dimension space, together with the approach proposed by Zhao et al., which extends the MLC method for application in CDQWs. Through magnetophotoluminescence (MPL) measurements performed at 4 K with magnetic fields ranging from 0 T to 12 T, the diamagnetic shift curves were plotted and adjusted using two expressions: one appropriate to fit the curve in the range of low intensity fields and another for the range of high intensity fields, providing the experimental Eb values. The effects of increasing the Al% and the inter-well barrier thickness on Eb are discussed. The Eb reduction when going from the SQW to the CDQW with 5 Å inter-well barrier is clearly observed experimentally for 35% Al concentration and this trend can be noticed even for concentrations as low as 25% and 15%, although the Eb variations in these latter cases are within the error bars. As the Zhao's approach is unable to describe this effect, the wave functions and the probability densities for electrons and holes were calculated, allowing us to explain this effect as being due to a decrease in the spatial superposition of the wave functions caused by the thin inter-well barrier.
Resumo:
Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixtures of polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems. Resumen La morfología neuronal es una característica clave en el estudio de los circuitos cerebrales, ya que está altamente relacionada con el procesado de información y con los roles funcionales. La morfología neuronal afecta al proceso de integración de las señales de entrada y determina las neuronas que reciben las salidas de otras neuronas. Las diferentes partes de la neurona pueden operar de forma semi-independiente de acuerdo a la localización espacial de las conexiones sinápticas. Por tanto, existe un interés considerable en el análisis de la microanatomía de las células nerviosas, ya que constituye una excelente herramienta para comprender mejor el funcionamiento de la corteza cerebral. Sin embargo, las propiedades morfológicas, moleculares y electrofisiológicas de las células neuronales son extremadamente variables. Excepto en algunos casos especiales, esta variabilidad morfológica dificulta la definición de un conjunto de características que distingan claramente un tipo neuronal. Además, existen diferentes tipos de neuronas en regiones particulares del cerebro. La variabilidad neuronal hace que el análisis y el modelado de la morfología neuronal sean un importante reto científico. La incertidumbre es una propiedad clave en muchos problemas reales. La teoría de la probabilidad proporciona un marco para modelar y razonar bajo incertidumbre. Los modelos gráficos probabilísticos combinan la teoría estadística y la teoría de grafos con el objetivo de proporcionar una herramienta con la que trabajar bajo incertidumbre. En particular, nos centraremos en las redes bayesianas, el modelo más utilizado dentro de los modelos gráficos probabilísticos. En esta tesis hemos diseñado nuevos métodos para aprender redes bayesianas, inspirados por y aplicados al problema del modelado y análisis de datos morfológicos de neuronas. La morfología de una neurona puede ser cuantificada usando una serie de medidas, por ejemplo, la longitud de las dendritas y el axón, el número de bifurcaciones, la dirección de las dendritas y el axón, etc. Estas medidas pueden ser modeladas como datos continuos o discretos. A su vez, los datos continuos pueden ser lineales (por ejemplo, la longitud o la anchura de una dendrita) o direccionales (por ejemplo, la dirección del axón). Estos datos pueden llegar a seguir distribuciones de probabilidad muy complejas y pueden no ajustarse a ninguna distribución paramétrica conocida. El modelado de este tipo de problemas con redes bayesianas híbridas incluyendo variables discretas, lineales y direccionales presenta una serie de retos en relación al aprendizaje a partir de datos, la inferencia, etc. En esta tesis se propone un método para modelar y simular árboles dendríticos basales de neuronas piramidales usando redes bayesianas para capturar las interacciones entre las variables del problema. Para ello, se mide un amplio conjunto de variables de las dendritas y se aplica un algoritmo de aprendizaje con el que se aprende la estructura y se estiman los parámetros de las distribuciones de probabilidad que constituyen las redes bayesianas. Después, se usa un algoritmo de simulación para construir dendritas virtuales mediante el muestreo de valores de las redes bayesianas. Finalmente, se lleva a cabo una profunda evaluaci ón para verificar la capacidad del modelo a la hora de generar dendritas realistas. En esta primera aproximación, las variables fueron discretizadas para poder aprender y muestrear las redes bayesianas. A continuación, se aborda el problema del aprendizaje de redes bayesianas con diferentes tipos de variables. Las mixturas de polinomios constituyen un método para representar densidades de probabilidad en redes bayesianas híbridas. Presentamos un método para aprender aproximaciones de densidades unidimensionales, multidimensionales y condicionales a partir de datos utilizando mixturas de polinomios. El método se basa en interpolación con splines, que aproxima una densidad como una combinación lineal de splines. Los algoritmos propuestos se evalúan utilizando bases de datos artificiales. Además, las mixturas de polinomios son utilizadas como un método no paramétrico de estimación de densidades para clasificadores basados en redes bayesianas. Después, se estudia el problema de incluir información direccional en redes bayesianas. Este tipo de datos presenta una serie de características especiales que impiden el uso de las técnicas estadísticas clásicas. Por ello, para manejar este tipo de información se deben usar estadísticos y distribuciones de probabilidad específicos, como la distribución univariante von Mises y la distribución multivariante von Mises–Fisher. En concreto, en esta tesis extendemos el clasificador naive Bayes al caso en el que las distribuciones de probabilidad condicionada de las variables predictoras dada la clase siguen alguna de estas distribuciones. Se estudia el caso base, en el que sólo se utilizan variables direccionales, y el caso híbrido, en el que variables discretas, lineales y direccionales aparecen mezcladas. También se estudian los clasificadores desde un punto de vista teórico, derivando sus funciones de decisión y las superficies de decisión asociadas. El comportamiento de los clasificadores se ilustra utilizando bases de datos artificiales. Además, los clasificadores son evaluados empíricamente utilizando bases de datos reales. También se estudia el problema de la clasificación de interneuronas. Desarrollamos una aplicación web que permite a un grupo de expertos clasificar un conjunto de neuronas de acuerdo a sus características morfológicas más destacadas. Se utilizan medidas de concordancia para analizar el consenso entre los expertos a la hora de clasificar las neuronas. Se investiga la idoneidad de los términos anatómicos y de los tipos neuronales utilizados frecuentemente en la literatura a través del análisis de redes bayesianas y la aplicación de algoritmos de clustering. Además, se aplican técnicas de aprendizaje supervisado con el objetivo de clasificar de forma automática las interneuronas a partir de sus valores morfológicos. A continuación, se presenta una metodología para construir un modelo que captura las opiniones de todos los expertos. Primero, se genera una red bayesiana para cada experto y se propone un algoritmo para agrupar las redes bayesianas que se corresponden con expertos con comportamientos similares. Después, se induce una red bayesiana que modela la opinión de cada grupo de expertos. Por último, se construye una multired bayesiana que modela las opiniones del conjunto completo de expertos. El análisis del modelo consensuado permite identificar diferentes comportamientos entre los expertos a la hora de clasificar las neuronas. Además, permite extraer un conjunto de características morfológicas relevantes para cada uno de los tipos neuronales mediante inferencia con la multired bayesiana. Estos descubrimientos se utilizan para validar el modelo y constituyen información relevante acerca de la morfología neuronal. Por último, se estudia un problema de clasificación en el que la etiqueta de clase de los datos de entrenamiento es incierta. En cambio, disponemos de un conjunto de etiquetas para cada instancia. Este problema está inspirado en el problema de la clasificación de neuronas, en el que un grupo de expertos proporciona una etiqueta de clase para cada instancia de manera individual. Se propone un método para aprender redes bayesianas utilizando vectores de cuentas, que representan el número de expertos que seleccionan cada etiqueta de clase para cada instancia. Estas redes bayesianas se evalúan utilizando bases de datos artificiales de problemas de aprendizaje supervisado.