957 results for Probability distributions


Relevance: 60.00%

Abstract:

This thesis studies molecular dynamics simulations on two levels of resolution: the detailed level of atomistic simulations, where the motion of explicit atoms in a many-particle system is considered, and the coarse-grained level, where the motion of superatoms composed of up to 10 atoms is modeled. While atomistic models are capable of describing material-specific effects on small scales, the time and length scales they can cover are limited by their computational cost. Polymer systems are typically characterized by effects on a broad range of length and time scales, so it is often impossible to atomistically simulate the processes that determine macroscopic properties in polymer systems. Coarse-grained (CG) simulations extend the range of accessible time and length scales by three to four orders of magnitude; however, no standardized coarse-graining procedure has been established yet. Following the ideas of structure-based coarse-graining, a coarse-grained model for polystyrene is presented. Structure-based methods parameterize CG models to reproduce static properties of atomistic melts, such as radial distribution functions between superatoms or other probability distributions for coarse-grained degrees of freedom. Two enhancements of the coarse-graining methodology are suggested. First, correlations between local degrees of freedom are implicitly taken into account by additional potentials acting between neighboring superatoms in the polymer chain. This improves the reproduction of local chain conformations and allows the study of different tacticities of polystyrene. It also gives better control of the chain stiffness, which then agrees closely with the atomistic model, and leads to a reproduction of experimental results for overall chain dimensions, such as the characteristic ratio, for all tacticities. The second new aspect is the computationally cheap development of nonbonded CG potentials based on the sampling of pairs of oligomers in vacuum.
Static properties of polymer melts are thus obtained as predictions of the CG model, in contrast to other structure-based CG models, which are iteratively refined to reproduce reference melt structures. The dynamics of simulations at the two levels of resolution are then compared. The time scales of dynamical processes in atomistic and coarse-grained simulations can be connected by a time scaling factor, which depends on several system-specific properties such as molecular weight, density, temperature, and the presence of other components in mixtures. In this thesis, the influence of molecular weight in systems of oligomers and the situation in two-component mixtures are studied. For a system of small additives in a melt of long polymer chains, the temperature dependence of the additive diffusion is predicted and compared to experiments.
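Structure-based coarse-graining of the kind described above is commonly implemented via iterative Boltzmann inversion (IBI), which refines a CG pair potential until a target RDF is reproduced; note that the thesis deliberately avoids this melt-based iteration for its nonbonded potentials. A minimal sketch of the generic IBI update, with a purely illustrative toy RDF:

```python
import numpy as np

def boltzmann_inversion(g_target, kT=1.0):
    """Initial potential guess from a target RDF: V0(r) = -kT ln g_target(r)."""
    g = np.clip(g_target, 1e-10, None)  # avoid log(0)
    return -kT * np.log(g)

def ibi_update(V, g_current, g_target, kT=1.0):
    """One IBI step: V_{n+1}(r) = V_n(r) + kT ln(g_n(r) / g_target(r))."""
    g_c = np.clip(g_current, 1e-10, None)
    g_t = np.clip(g_target, 1e-10, None)
    return V + kT * np.log(g_c / g_t)

# toy target RDF with a single coordination peak (illustrative values only)
r = np.linspace(0.1, 2.0, 50)
g_t = 1.0 + 0.5 * np.exp(-(r - 1.0) ** 2 / 0.02)
V0 = boltzmann_inversion(g_t)
# if the simulated RDF already matches the target, the update leaves V unchanged
V1 = ibi_update(V0, g_t, g_t)
```

In practice each update is followed by a CG simulation that produces the next `g_current`, and the loop runs until the RDFs agree within tolerance.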

Relevance: 60.00%

Abstract:

Amphiphilic peptides, Pro-Glu-(Phe-Glu)n-Pro, Pro-Asp-(Phe-Asp)n-Pro, and Phe-Glu-(Phe-Glu)n-Phe, can be constructed from n alternating sequences of hydrophobic and hydrophilic amino acids such that they assemble into monolayers at the air-water interface. In biological systems, structures at the organic-aqueous interface can serve as a matrix for the crystallization of hydroxyapatite, a process that can be exploited in the treatment of osteoporosis. In the present work, computer simulations were used to investigate the structures and the underlying interactions that govern peptide aggregation on the microscopic level. Atomistic molecular dynamics simulations of single peptide strands show that they readily arrange at the air-water interface and are able to fold into β-turns, even for relatively short peptide lengths (n = 2). Rare events such as these conformational changes require the use of advanced sampling techniques. Here, replica exchange molecular dynamics was used to examine the influence of the peptide sequence. The simulations showed that peptides with shorter acidic side chains (Asp vs. Glu) adopt more extended conformations than those with longer side chains, which are able to reach the proline termini. Furthermore, the proline termini (Pro vs. Phe) proved necessary to obtain 2D ordering within the aggregates. The peptide Pro-Asp-(Phe-Asp)n-Pro, which combines both of these features, shows the most ordered behavior and a small backbone twist, and is able to stabilize the formed aggregates through hydrogen bonds between the acidic side chains. It is therefore best suited for aggregation.
This was further supported by assessing the stability of peptide aggregates set up close to experimental conditions, as well as the propensity of individual peptides to self-assemble from initially disordered configurations. Since atomistic simulations are limited to small system sizes and relatively short time scales, a coarse-grained model was developed so that self-assembly can be studied on a larger scale. Because the self-assembly at the interface is of interest, existing coarse-graining methods were extended to derive nonbonded potentials for inhomogeneous systems. The developed method is analogous to iterative Boltzmann inversion, but builds the update for the interaction potential from the radial distribution function in a slab geometry and from the widths of the slab and of the interface. In this way, a compromise between the local liquid structure and the thermodynamic properties of the interface can be achieved. The new method was demonstrated for a water slab and a methanol slab in vacuum, as well as for a single benzene molecule at the vacuum-water interface, an application of particular relevance in biology, where the thermodynamic/interfacial behavior often has to be captured in addition to the structural properties of the system. Based on this, a coarse-grained model was parameterized via a fragment approach and the affinity of the peptide to the vacuum-water interface was tested. Although the individual fragments reproduced both the structure and the probability distributions at the interface, the peptide as a whole diffused away from the interface. However, reparameterizing the nonbonded interactions for one of the backbone fragments within a trimer led the peptide to remain at the interface.
This suggests that chain connectivity plays an important role in the behavior of the peptide at the interface.
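The replica exchange molecular dynamics mentioned above relies on a standard Metropolis criterion for swapping configurations between neighbouring temperature replicas. A minimal sketch of that criterion (not the authors' code; the function name and reduced units are assumptions):

```python
import math

def exchange_prob(E_i, E_j, T_i, T_j, kB=1.0):
    """Metropolis acceptance probability for swapping configurations between
    replicas at temperatures T_i < T_j with potential energies E_i, E_j:
    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    beta_i, beta_j = 1.0 / (kB * T_i), 1.0 / (kB * T_j)
    return min(1.0, math.exp((beta_i - beta_j) * (E_i - E_j)))

# if the colder replica currently holds the HIGHER-energy configuration,
# the swap is always accepted (exponent is positive)
p_always = exchange_prob(E_i=-40.0, E_j=-50.0, T_i=300.0, T_j=350.0)
# the reverse swap is accepted only with probability < 1
p_rare = exchange_prob(E_i=-50.0, E_j=-40.0, T_i=300.0, T_j=350.0)
```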

Relevance: 60.00%

Abstract:

The present study was conducted to estimate the direct losses due to Neospora caninum in Swiss dairy cattle and to assess the costs and benefits of different potential control strategies. A Monte Carlo simulation spreadsheet module was developed to estimate the direct costs caused by N. caninum, with and without control strategies, and to estimate the costs of these control strategies in a financial analysis. The control strategies considered were "testing and culling of seropositive female cattle", "discontinued breeding with offspring from seropositive cows", "chemotherapeutical treatment of female offspring" and "vaccination of all female cattle". Each parameter in the module that was considered to be uncertain was described using a probability distribution. The simulations were run with 20,000 iterations over a time period of 25 years. The median annual losses due to N. caninum in the Swiss dairy cow population were estimated at 9.7 million euros. All control strategies that required yearly serological testing of all cattle in the population produced high costs and thus were not financially profitable. Among the other control strategies, two showed benefit-cost ratios (BCR) >1 and positive net present values (NPV): "discontinued breeding with offspring from seropositive cows" (BCR=1.29, NPV=25 million euros) and "chemotherapeutical treatment of all female offspring" (BCR=2.95, NPV=59 million euros). In economic terms, the best control strategy currently available would therefore be "discontinued breeding with offspring from seropositive cows".
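A Monte Carlo module of this kind draws each uncertain input from its probability distribution and accumulates discounted benefits and costs over the simulation horizon. A hedged sketch with purely illustrative input distributions and a hypothetical discount rate (not the study's actual parameter values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_iter = 20_000        # the study used 20,000 iterations
years, rate = 25, 0.03  # 25-year horizon; discount rate is an assumption

# hypothetical uncertain inputs (NOT the study's distributions)
annual_loss = rng.triangular(7e6, 9.7e6, 13e6, n_iter)   # EUR/yr without control
loss_reduction = rng.uniform(0.4, 0.8, n_iter)           # fraction of loss avoided
annual_cost = rng.uniform(1e6, 3e6, n_iter)              # EUR/yr control cost

# sum of discount factors over the horizon
discount = sum((1 + rate) ** -t for t in range(1, years + 1))
benefits = annual_loss * loss_reduction * discount
costs = annual_cost * discount

bcr = np.median(benefits / costs)   # benefit-cost ratio
npv = np.median(benefits - costs)   # net present value, EUR
```

A strategy is financially attractive when BCR > 1 and NPV > 0, which is how the abstract ranks the control options.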

Relevance: 60.00%

Abstract:

Similarity measure is one of the main factors that affect the accuracy of intensity-based 2D/3D registration of X-ray fluoroscopy to CT images. Information theory has been used to derive similarity measures for image registration, leading to the introduction of mutual information, an accurate similarity measure for multi-modal and mono-modal image registration tasks. However, it is known that the standard mutual information measure only takes intensity values into account, without considering spatial information, and its robustness is questionable. Previous attempts to incorporate spatial information into mutual information either require computing the entropy of higher-dimensional probability distributions, or are not robust to outliers. In this paper, we show how to incorporate spatial information into mutual information without suffering from these problems. Using a variational approximation derived from the Kullback-Leibler bound, spatial information can be effectively incorporated into mutual information via energy minimization. The resulting similarity measure has a least-squares form and can be effectively minimized by a multi-resolution Levenberg-Marquardt optimizer. Experimental results are presented on datasets from two applications: (a) intra-operative patient pose estimation from a few (e.g. 2) calibrated fluoroscopic images, and (b) post-operative cup alignment estimation from a single X-ray radiograph with gonadal shielding.
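The standard intensity-only mutual information criticized above can be estimated from a joint intensity histogram. A minimal sketch of that baseline (without the paper's spatial extension; bin count and test images are illustrative):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Histogram-based MI between two equally-sized images. Uses intensity
    values only -- the limitation discussed in the abstract."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()                 # joint probability
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of a
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of b
    nz = p_ab > 0                              # skip empty cells
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

rng = np.random.default_rng(1)
img = rng.random((64, 64))
noise = rng.random((64, 64))
mi_self = mutual_information(img, img)      # high: image vs. itself
mi_noise = mutual_information(img, noise)   # near zero: unrelated images
```

MI is the KL divergence between the joint and the product of marginals, so it is always non-negative and is maximized when the images are deterministically related.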

Relevance: 60.00%

Abstract:

We describe several simulation algorithms that yield random probability distributions with given values of risk measures. In the case of vanilla risk measures, the algorithms involve combining and transforming random cumulative distribution functions or random Lorenz curves obtained by simulating rather general random probability distributions on the unit interval. A new algorithm based on the simulation of a weighted barycentres array is suggested to generate random probability distributions with a given value of the spectral risk measure.
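One generic way to simulate a random probability distribution on the unit interval, and to evaluate a vanilla risk measure on it, is to draw random support points with Dirichlet-distributed masses. This is only a background illustration of the objects involved, not the paper's barycentre-based algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

def random_cdf(n_points=100):
    """A random discrete CDF on [0, 1]: sorted uniform support points with
    Dirichlet-distributed probability masses."""
    x = np.sort(rng.random(n_points))
    p = rng.dirichlet(np.ones(n_points))
    return x, np.cumsum(p)

def expected_shortfall(x, cdf, alpha=0.95):
    """Mean of the upper (1 - alpha) tail of the discrete distribution."""
    p = np.diff(np.concatenate(([0.0], cdf)))   # recover point masses
    tail = np.cumsum(p) >= alpha                # top tail of the distribution
    return float((x[tail] * p[tail]).sum() / p[tail].sum())

x, F = random_cdf()
es = expected_shortfall(x, F)
```

The paper's algorithms go the other way: they constrain such random distributions so that a chosen risk measure takes a prescribed value.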

Relevance: 60.00%

Abstract:

For probability distributions on ℝ^q, a detailed study of the breakdown properties of some multivariate M-functionals related to Tyler's [Ann. Statist. 15 (1987) 234] ‘distribution-free’ M-functional of scatter is given. These include a symmetrized version of Tyler's M-functional of scatter, and the multivariate t M-functionals of location and scatter. It is shown that for ‘smooth’ distributions, the (contamination) breakdown points of Tyler's M-functional of scatter and of its symmetrized version are 1/q and 1 − √(1 − 1/q), respectively. For the multivariate t M-functional, which arises from the maximum likelihood estimate for the parameters of an elliptical t distribution with ν ≥ 1 degrees of freedom, the breakdown point at smooth distributions is 1/(q + ν). Breakdown points are also obtained for general distributions, including empirical distributions. Finally, the sources of breakdown are investigated. It turns out that breakdown can only be caused by contaminating distributions that are concentrated near low-dimensional subspaces.
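Tyler's M-functional of scatter is defined by a fixed-point equation that can be iterated directly on data. A minimal sketch (unit-trace normalisation is one common convention, since the functional is defined only up to scale; the test covariance is illustrative):

```python
import numpy as np

def tyler_scatter(X, n_iter=100):
    """Fixed-point iteration for Tyler's 'distribution-free' M-estimator of
    scatter: Sigma <- (q/n) * sum_i x_i x_i^T / (x_i^T Sigma^{-1} x_i),
    normalised to unit trace. Assumes the location (here 0) is known."""
    n, q = X.shape
    S = np.eye(q)
    for _ in range(n_iter):
        Sinv = np.linalg.inv(S)
        d = np.einsum('ij,jk,ik->i', X, Sinv, X)   # x_i^T S^-1 x_i
        S = (q / n) * (X.T @ (X / d[:, None]))
        S /= np.trace(S)
    return S

rng = np.random.default_rng(3)
true = np.array([[2.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], true, size=10_000)
S = tyler_scatter(X)
# the estimate recovers the SHAPE of the true scatter matrix (scale-free)
```

The scale-invariance of this estimator is what makes its breakdown behaviour depend only on the dimension q.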

Relevance: 60.00%

Abstract:

We present studies of 9 modern (up to 400-yr-old) peat sections from Slovenia, Switzerland, Austria, Italy, and Finland. Precise radiocarbon dating of modern samples is possible due to the large bomb peak of atmospheric 14C concentration in 1963 and the following rapid decline in the 14C level. All the analyzed 14C profiles appeared concordant with the shape of the bomb peak of atmospheric 14C concentration, integrated over some time interval with a length specific to the peat section. In the peat layers covered by the bomb peak, calendar ages of individual peat samples could be determined almost immediately, with an accuracy of 2–3 yr. In the pre-bomb sections, the calendar ages of individual dated samples are determined in the form of multi-modal probability distributions about 300 yr wide (about AD 1650–1950). However, simultaneous use of the post-bomb and pre-bomb 14C dates, together with lithological information, enabled the rejection of most modes of the probability distributions in the pre-bomb sections. In effect, precise age-depth models of the post-bomb sections have been extended back in time, into the wiggly part of the 14C calibration curve.
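On the declining limb of the bomb peak, the atmospheric 14C curve is monotonic, so a measured 14C level maps to a unique calendar year by inverse interpolation. A toy sketch of that mapping, with placeholder curve values (real work uses a published calibration dataset and proper error propagation):

```python
import numpy as np

# illustrative post-bomb atmospheric 14C levels (fraction modern, F14C);
# these numbers are placeholders, NOT a real atmospheric record
years = np.array([1963, 1970, 1980, 1990, 2000, 2010], dtype=float)
f14c = np.array([1.90, 1.55, 1.28, 1.15, 1.08, 1.04])  # monotonically declining

def calendar_age(sample_f14c):
    """On the declining limb the curve is monotonic, so a measured F14C value
    maps to a unique calendar year by inverse linear interpolation."""
    # np.interp needs increasing x, so reverse both arrays
    return float(np.interp(sample_f14c, f14c[::-1], years[::-1]))

yr = calendar_age(1.20)   # falls between the 1980 and 1990 curve points
```

In the pre-bomb part of the record the calibration curve wiggles, which is exactly why a single measurement there yields a multi-modal probability distribution instead of a unique year.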

Relevance: 60.00%

Abstract:

The focus of this study was to generalize the theory of runs to multinomial outcomes using the generating function approach. Detailed discussion is provided for determining the probability distributions of all runs of length i in a sequence of n trials for the binomial and trinomial cases. The generalization to the multinomial case is also presented. An application to data for patients from a long-term disability care facility is presented to illustrate the use of run theory in determining the probability of a dominant state of treatment associated with a patient during his/her hospitalization.
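Run probabilities obtained from generating functions can be cross-checked numerically. A sketch for the binomial case, computing the probability of at least one success run of length i in n trials by dynamic programming (names and test cases are illustrative):

```python
def prob_run_at_least(n, i, p):
    """P(at least one run of >= i successes in n independent Bernoulli(p)
    trials), via DP over the length of the current trailing success run."""
    # state[k] = probability that the trailing run has length k, no run >= i yet
    state = [1.0] + [0.0] * (i - 1)
    hit = 0.0
    for _ in range(n):
        new = [0.0] * i
        for k, prob in enumerate(state):
            if prob == 0.0:
                continue
            new[0] += prob * (1 - p)      # a failure resets the run
            if k + 1 >= i:
                hit += prob * p           # the run reaches the target length
            else:
                new[k + 1] += prob * p
        state = new
    return hit

# hand-checkable cases with a fair coin
p_22 = prob_run_at_least(2, 2, 0.5)   # only SS out of 4 outcomes -> 1/4
p_32 = prob_run_at_least(3, 2, 0.5)   # SSF, FSS, SSS -> 3/8
```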

Relevance: 60.00%

Abstract:

The selection of metrics for ecosystem restoration programs is critical for improving the quality of monitoring programs and characterizing project success. Moreover, it is often very difficult to balance the importance of multiple ecological, social, and economic metrics. The metric selection process is complex and must simultaneously take into account monitoring data, environmental models, socio-economic considerations, and stakeholder interests. We propose multicriteria decision analysis (MCDA) methods, broadly defined, for the selection of optimal sets of metrics to enhance the evaluation of ecosystem restoration alternatives. Two MCDA methods, multiattribute utility analysis (MAUT) and probabilistic multicriteria acceptability analysis (ProMAA), are applied and compared for a hypothetical case study of a river restoration involving multiple stakeholders. Overall, MCDA results in a systematic, unbiased, and transparent solution, informing the evaluation of restoration alternatives. The two methods provide comparable results in terms of selected metrics. However, because ProMAA can consider probability distributions for the weights and utility values of metrics for each criterion, it is suggested as the best option if data uncertainty is high. Despite the added complexity of the metric selection process, MCDA improves upon the current ad hoc decision practice based on consultations with stakeholders and experts, and encourages transparent and quantitative aggregation of data and judgement, increasing the transparency of decision making in restoration projects. We believe that MCDA can enhance the overall sustainability of ecosystems by addressing both ecological and societal needs.
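In its simplest additive form, the MAUT component reduces to a weighted sum of normalised single-metric utilities. A toy sketch with illustrative numbers (not the case-study data):

```python
import numpy as np

# hypothetical restoration alternatives scored on three metrics, each utility
# already normalised to [0, 1] (illustrative values only)
utilities = np.array([
    [0.9, 0.4, 0.6],   # alternative A: ecological, social, economic
    [0.5, 0.8, 0.7],   # alternative B
    [0.3, 0.6, 0.9],   # alternative C
])
weights = np.array([0.5, 0.3, 0.2])   # stakeholder-elicited importance, sums to 1

# additive multiattribute utility: U(a) = sum_j w_j * u_j(a)
overall = utilities @ weights
best = int(overall.argmax())
```

ProMAA differs precisely in replacing the fixed `weights` and `utilities` with probability distributions and reporting how often each alternative comes out on top.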

Relevance: 60.00%

Abstract:

The naïve Bayes approach is a simple but often satisfactory method for supervised classification. In this paper, we focus on the naïve Bayes model and propose the application of regularization techniques to learn a naïve Bayes classifier. The main contribution of the paper is a stagewise version of the selective naïve Bayes, which can be considered a regularized version of the naïve Bayes model. We call it forward stagewise naïve Bayes. For comparison’s sake, we also introduce an explicitly regularized formulation of the naïve Bayes model, where conditional independence (absence of arcs) is promoted via an L1/L2-group penalty on the parameters that define the conditional probability distributions. Although already published in the literature, this idea has previously only been applied to continuous predictors. We extend this formulation to discrete predictors and propose a modification that yields an adaptive penalization. We show that, whereas the L1/L2-group penalty formulation only discards irrelevant predictors, the forward stagewise naïve Bayes can discard both irrelevant and redundant predictors, which are known to be harmful for the naïve Bayes classifier. Both approaches, however, usually improve on the classical naïve Bayes model’s accuracy.
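The selective naïve Bayes idea can be illustrated, in a much simplified form, by greedy forward feature selection wrapped around a Gaussian naïve Bayes model. This is only a sketch of predictor selection; the paper's stagewise algorithm works with gradual parameter updates rather than whole-feature additions:

```python
import numpy as np

def gnb_fit(X, y):
    """Per-class means, variances and priors for Gaussian naive Bayes."""
    classes = np.unique(y)
    stats = {c: (X[y == c].mean(0), X[y == c].var(0) + 1e-9, (y == c).mean())
             for c in classes}
    return classes, stats

def gnb_predict(X, classes, stats, features):
    """Log-posterior classification using only the selected feature subset."""
    scores = []
    for c in classes:
        mu, var, prior = stats[c]
        ll = -0.5 * (np.log(2 * np.pi * var[features])
                     + (X[:, features] - mu[features]) ** 2 / var[features])
        scores.append(ll.sum(1) + np.log(prior))
    return classes[np.argmax(scores, axis=0)]

def forward_selective_nb(X, y):
    """Greedy forward selection: add the predictor that most improves training
    accuracy; stop when no predictor helps."""
    classes, stats = gnb_fit(X, y)
    selected, best_acc = [], 0.0
    while True:
        gains = [(np.mean(gnb_predict(X, classes, stats, selected + [j]) == y), j)
                 for j in range(X.shape[1]) if j not in selected]
        if not gains:
            break
        acc, j = max(gains)
        if acc <= best_acc:
            break
        selected.append(j)
        best_acc = acc
    return selected

rng = np.random.default_rng(4)
y = rng.integers(0, 2, 300)
informative = y + 0.3 * rng.standard_normal(300)   # strongly class-dependent
junk = rng.standard_normal((300, 3))               # irrelevant predictors
X = np.column_stack([junk[:, 0], informative, junk[:, 1:]])
selected = forward_selective_nb(X, y)              # should include feature 1
```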

Relevance: 60.00%

Abstract:

In multi-attribute utility theory, it is often not easy to elicit precise values for the scaling weights representing the relative importance of criteria. A very widespread approach is to gather incomplete information. A recent approach for dealing with such situations is to use information about each alternative's intensity of dominance, known as dominance measuring methods. Different dominance measuring methods have been proposed, and simulation studies have been carried out to compare these methods with each other and with other approaches, but only when ordinal information about weights is available. In this paper, we use Monte Carlo simulation techniques to analyse the performance of such methods and to adapt them to deal with weight intervals, weights fitting independent normal probability distributions, or weights represented by fuzzy numbers. Moreover, the performance of dominance measuring methods is also compared with a widely used methodology for dealing with incomplete information on weights, stochastic multicriteria acceptability analysis (SMAA). SMAA is based on exploring the weight space to describe the evaluations that would make each alternative the preferred one.
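SMAA's exploration of the weight space can be approximated by Monte Carlo: sample weight vectors uniformly from the simplex and record how often each alternative attains the best overall value. A minimal sketch with an illustrative utility matrix:

```python
import numpy as np

rng = np.random.default_rng(5)

# hypothetical alternatives x criteria utility matrix (illustrative only)
U = np.array([[0.9, 0.2, 0.5],
              [0.6, 0.7, 0.6],
              [0.2, 0.9, 0.8]])

def acceptability_indices(U, n_samples=10_000):
    """Rank-1 acceptability index: the fraction of random weight vectors under
    which each alternative has the highest weighted utility. Dirichlet(1,..,1)
    is the uniform distribution on the weight simplex."""
    n_alt, n_crit = U.shape
    w = rng.dirichlet(np.ones(n_crit), size=n_samples)  # (n_samples, n_crit)
    winners = (w @ U.T).argmax(axis=1)                  # best alternative per draw
    return np.bincount(winners, minlength=n_alt) / n_samples

acc = acceptability_indices(U)
```

Replacing the uniform simplex sampling with truncated intervals, independent normals, or fuzzy-number cuts yields exactly the weight-information variants discussed in the abstract.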

Relevance: 60.00%

Abstract:

The new Spanish Regulation on Building Acoustics establishes values and limits for the different acoustic magnitudes, whose fulfillment can be verified by means of field measurements. In this sense, an essential aspect of a field measurement is to report the measured magnitude together with its associated uncertainty. In calculating the uncertainty, it is very usual to follow the uncertainty propagation method described in the Guide to the Expression of Uncertainty in Measurement (GUM). Another option is numerical calculation based on the distribution propagation method, by means of Monte Carlo simulation; indeed, several publications have developed this latter method using different software programs. In the present work, we used Excel to run Monte Carlo simulations for calculating the uncertainty associated with the different magnitudes derived from field measurements following ISO 140-4, 140-5 and 140-7. We compare the results with those obtained by the uncertainty propagation method. Although both methods give similar values, some small differences have been observed. Possible explanations for such differences are the asymmetry of the probability distributions associated with the input magnitudes, and the overestimation of the uncertainty by the GUM method.
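The two approaches can be contrasted on a toy acoustic model. The sketch below propagates uncertainty through a standardized level difference D = L1 − L2 + 10·log10(T/T0) both ways; the values and uncertainties are illustrative, not the paper's measurements:

```python
import numpy as np

rng = np.random.default_rng(6)
N = 200_000

# illustrative measured inputs and standard uncertainties
L1, u_L1 = 95.0, 0.8    # source-room level, dB
L2, u_L2 = 60.0, 0.8    # receiving-room level, dB
T, u_T = 0.60, 0.06     # reverberation time, s (reference T0 = 0.5 s)

def level_difference(l1, l2, t):
    return l1 - l2 + 10 * np.log10(t / 0.5)

# --- GUM uncertainty propagation (first-order Taylor expansion) ---
dD_dT = 10 / (np.log(10) * T)   # sensitivity coefficient dD/dT
u_gum = np.sqrt(u_L1**2 + u_L2**2 + (dD_dT * u_T)**2)

# --- distribution propagation via Monte Carlo (GUM Supplement 1 style) ---
samples = level_difference(rng.normal(L1, u_L1, N),
                           rng.normal(L2, u_L2, N),
                           rng.normal(T, u_T, N))
u_mc = samples.std()
```

For this mildly non-linear model the two uncertainties roughly agree; the discrepancies grow with the non-linearity and the asymmetry of the input distributions, which is the effect the abstract discusses.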

Relevance: 60.00%

Abstract:

We propose distributed algorithms for sampling networks based on a new class of random walks that we call Centrifugal Random Walks (CRW). A CRW is a random walk that starts at a source and always moves away from it. We propose CRW algorithms for connected networks with arbitrary probability distributions, and for grids and networks with regular concentric connectivity with distance-based distributions. All CRW sampling algorithms select a node with the exact probability distribution, do not need a warm-up period, and end in a number of hops bounded by the network diameter.
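The defining property of a CRW, that every hop strictly increases the distance from the source, is what bounds the walk length by the network diameter. A toy sketch of that property (without the hop weighting the paper uses to achieve an exact target sampling distribution):

```python
import random

def centrifugal_walk(adj, dist, source, rng=random.Random(7)):
    """A centrifugal random walk: from the current node, hop uniformly at
    random to a neighbour strictly farther from the source; stop when no
    neighbour increases the distance. Since the distance grows by at least 1
    per hop, the walk ends within the network diameter."""
    node, hops = source, 0
    while True:
        farther = [v for v in adj[node] if dist[v] > dist[node]]
        if not farther:
            return node, hops
        node = rng.choice(farther)
        hops += 1

# toy network: a 5-node path 0-1-2-3-4, source at node 0
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
dist = {v: v for v in adj}   # hop distance from the source
end, hops = centrifugal_walk(adj, dist, 0)
```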

Relevance: 60.00%

Abstract:

Este trabajo aborda el problema de modelizar sistemas dinámicos reales a partir del estudio de sus series temporales, usando una formulación estándar que pretende ser una abstracción universal de los sistemas dinámicos, independientemente de su naturaleza determinista, estocástica o híbrida. Se parte de modelizaciones separadas de sistemas deterministas por un lado y estocásticos por otro, para converger finalmente en un modelo híbrido que permite estudiar sistemas genéricos mixtos, esto es, que presentan una combinación de comportamiento determinista y aleatorio. Este modelo consta de dos componentes, uno determinista consistente en una ecuación en diferencias, obtenida a partir de un estudio de autocorrelación, y otro estocástico que modeliza el error cometido por el primero. El componente estocástico es un generador universal de distribuciones de probabilidad, basado en un proceso compuesto de variables aleatorias, uniformemente distribuidas en un intervalo variable en el tiempo. Este generador universal es deducido en la tesis a partir de una nueva teoría sobre la oferta y la demanda de un recurso genérico. El modelo resultante puede formularse conceptualmente como una entidad con tres elementos fundamentales: un motor generador de dinámica determinista, una fuente interna de ruido generadora de incertidumbre y una exposición al entorno que representa las interacciones del sistema real con el mundo exterior. En las aplicaciones estos tres elementos se ajustan en base al histórico de las series temporales del sistema dinámico. Una vez ajustados sus componentes, el modelo se comporta de una forma adaptativa tomando como inputs los nuevos valores de las series temporales del sistema y calculando predicciones sobre su comportamiento futuro. Cada predicción se presenta como un intervalo dentro del cual cualquier valor es equiprobable, teniendo probabilidad nula cualquier valor externo al intervalo.
De esta forma el modelo computa el comportamiento futuro y su nivel de incertidumbre en base al estado actual del sistema. Se ha aplicado el modelo en esta tesis a sistemas muy diferentes, mostrando ser muy flexible para afrontar el estudio de campos de naturaleza dispar. El intercambio de tráfico telefónico entre operadores de telefonía, la evolución de mercados financieros y el flujo de información entre servidores de Internet son estudiados en profundidad en la tesis. Todos estos sistemas son modelizados de forma exitosa con un mismo lenguaje, a pesar de tratarse de sistemas físicos totalmente distintos. El estudio de las redes de telefonía muestra que los patrones de tráfico telefónico presentan una fuerte pseudo-periodicidad semanal contaminada con una gran cantidad de ruido, sobre todo en el caso de llamadas internacionales. El estudio de los mercados financieros muestra por su parte que la naturaleza fundamental de éstos es aleatoria con un rango de comportamiento relativamente acotado. Una parte de la tesis se dedica a explicar algunas de las manifestaciones empíricas más importantes en los mercados financieros como son los “fat tails”, “power laws” y “volatility clustering”. Por último se demuestra que la comunicación entre servidores de Internet tiene, al igual que los mercados financieros, una componente subyacente totalmente estocástica pero de comportamiento bastante “dócil”, siendo esta docilidad más acusada a medida que aumenta la distancia entre servidores. Dos aspectos son destacables en el modelo: su adaptabilidad y su universalidad. El primero es debido a que, una vez ajustados los parámetros generales, el modelo se “alimenta” de los valores observables del sistema y es capaz de calcular con ellos comportamientos futuros. A pesar de tener unos parámetros fijos, la variabilidad en los observables que sirven de input al modelo lleva a una gran riqueza de outputs posibles.
El segundo aspecto se debe a la formulación genérica del modelo híbrido y a que sus parámetros se ajustan en base a manifestaciones externas del sistema en estudio, y no en base a sus características físicas. Estos factores hacen que el modelo pueda utilizarse en gran variedad de campos. Por último, la tesis propone en su parte final otros campos donde se han obtenido éxitos preliminares muy prometedores, como son la modelización del riesgo financiero, los algoritmos de routing en redes de telecomunicación y el cambio climático. Abstract: This work faces the problem of modeling real dynamical systems based on the study of their time series, using a standard formulation that aims to be a universal abstraction of dynamical systems, irrespective of their deterministic, stochastic or hybrid nature. Deterministic and stochastic models are developed separately and subsequently merged into a hybrid model, which allows the study of generic mixed systems, that is to say, those having both deterministic and random behavior. This model is a combination of two components. One of them is deterministic, consisting of an equation in differences derived from an autocorrelation study, and the other is stochastic and models the errors made by the deterministic one. The stochastic component is a universal generator of probability distributions based on a process of random variables distributed uniformly within an interval that varies in time. This universal generator is derived in the thesis from a new theory of supply and demand for a generic resource. The resulting model can be visualized as an entity with three fundamental elements: an engine generating deterministic dynamics, an internal source of noise generating uncertainty, and an exposure to the environment which represents the interactions between the real system and the external world. In the applications, these three elements are fitted to the history of the time series of the dynamical system.
Once its components have been adjusted, the model behaves in an adaptive way, using the new time series values from the system as inputs and calculating predictions about its future behavior. Every prediction is provided as an interval, within which any value is equally probable while all outer values have null probability. Thus, the model computes the future behavior and its level of uncertainty based on the current state of the system. The model is applied to quite different systems in this thesis, proving very flexible in facing the study of fields of diverse nature. The exchange of traffic between telephony operators, the evolution of financial markets and the flow of information between servers on the Internet are studied in depth. All these systems are successfully modeled using the same “language”, despite being physically very different systems. The study of telephony networks shows that the traffic patterns exhibit a strong weekly pseudo-periodicity mixed with a great amount of noise, especially in the case of international calls. It is shown that the underlying nature of financial markets is random, with a moderate range of variability. A part of this thesis is devoted to explaining some of the most important empirical observations in financial markets, such as “fat tails”, “power laws” and “volatility clustering”. Finally, it is shown that the communication between two servers on the Internet has, as in the case of financial markets, an underlying random dynamics with a narrow range of variability, this tameness becoming more marked as the distance between servers increases. Two aspects of the model stand out as the most important: its adaptability and its universality. The first is due to the fact that, once the general parameters have been adjusted, the model is “fed” the observable manifestations of the system in order to calculate its future behavior.
Despite the fact that the model has fixed parameters, the variability in the observable manifestations of the system, which serve as inputs to the model, leads to a great variety of possible outputs. The second aspect is due to the generic formulation of the hybrid model and to the fact that its parameters are adjusted based on external manifestations of the system under study, rather than on its physical characteristics. These factors make the model suitable for use in a great variety of fields. Lastly, this thesis proposes other fields in which promising preliminary results have been obtained, such as the modeling of financial risk, routing algorithms for telecommunication networks, and the assessment of climate change.
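The hybrid model described above can be caricatured in a few lines: fit a deterministic difference equation by least squares and model the residuals as a uniform interval, which yields equiprobable prediction intervals. All numbers below are synthetic; this illustrates the idea, not the thesis's estimation procedure:

```python
import numpy as np

rng = np.random.default_rng(8)

# synthetic time series: linear difference-equation dynamics plus uniform noise
x = [1.0]
for _ in range(300):
    x.append(0.8 * x[-1] + 1.0 + rng.uniform(-0.3, 0.3))
x = np.array(x)

# deterministic component: x_{t+1} = a*x_t + b, fitted by least squares
# (a stand-in for the autocorrelation study mentioned in the abstract)
A = np.column_stack([x[:-1], np.ones(len(x) - 1)])
(a, b), *_ = np.linalg.lstsq(A, x[1:], rcond=None)

# stochastic component: residuals modelled as a uniform interval
resid = x[1:] - (a * x[:-1] + b)
lo, hi = resid.min(), resid.max()

def predict_interval(x_t):
    """Prediction as an interval within which every value is equiprobable
    and everything outside has null probability."""
    det = a * x_t + b
    return det + lo, det + hi

low, high = predict_interval(x[-1])
```

The interval width is the model's own measure of uncertainty: a wider residual spread in the history directly widens every future prediction.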

Relevance: 60.00%

Abstract:

Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integrating inputs from other neurons and determines which neurons receive the neuron's output. Different parts of the neuron can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells, since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems, and probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon).
These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixtures of polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. 
Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers, and the proposed classifiers are empirically evaluated on real datasets.

We also study the problem of interneuron classification. A large group of experts is asked to classify a set of neurons according to their most prominent anatomical features, and a web application is developed to collect the experts' classifications. We compute agreement measures to analyze the consensus among the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements.

Then, a methodology for building a model that captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering the Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network representing the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet modeling the opinions of the whole group of experts is built.
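A minimal sketch of the purely directional scenario with a single von Mises predictor, using maximum-likelihood-style fitting with the standard approximation for the concentration parameter; the toy data, class labels and estimators are illustrative assumptions, not the dissertation's exact formulation:

```python
import math

def bessel_i0(x, terms=40):
    # Modified Bessel function I0 via its power series (fine for moderate x).
    return sum((x / 2) ** (2 * m) / math.factorial(m) ** 2 for m in range(terms))

def von_mises_pdf(x, mu, kappa):
    return math.exp(kappa * math.cos(x - mu)) / (2 * math.pi * bessel_i0(kappa))

def fit_von_mises(angles):
    # Mean direction from the resultant vector; kappa from the common
    # approximation kappa ~ r(2 - r^2)/(1 - r^2), which assumes r < 1.
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    mu = math.atan2(s, c)
    r = math.hypot(s, c)
    return mu, r * (2 - r * r) / (1 - r * r)

class DirectionalNB:
    """Naive Bayes with a single von Mises-distributed angular predictor."""
    def fit(self, angles, labels):
        self.params, self.priors = {}, {}
        for y in set(labels):
            group = [a for a, l in zip(angles, labels) if l == y]
            self.params[y] = fit_von_mises(group)
            self.priors[y] = len(group) / len(labels)
        return self

    def predict(self, a):
        return max(self.params,
                   key=lambda y: self.priors[y] * von_mises_pdf(a, *self.params[y]))

# Toy axon directions (radians): class 0 clusters near 0, class 1 near pi.
angles = [0.1, -0.2, 0.3, 6.2, 3.0, 3.3, 2.9, 3.2]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
clf = DirectionalNB().fit(angles, labels)
```

Note that the angle 6.2 (just below 2π) is correctly pooled with the angles near 0, which is exactly what a Gaussian conditional would get wrong.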
A thorough analysis of the consensus model identifies different behaviors among the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology.

Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors that represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.
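One simple way to exploit such count vectors is to let each training instance contribute fractional counts proportional to the expert votes. The sketch below, over a discrete naive Bayes with hypothetical data, is an illustrative scheme of this idea, not the dissertation's algorithm:

```python
from collections import defaultdict

def fit_weighted_nb(X, count_vectors, classes, domains):
    """Discrete naive Bayes where each training instance contributes to
    every class with weight equal to the fraction of expert votes for
    that class (illustrative fractional-count scheme)."""
    prior = {c: 1.0 for c in classes}            # Laplace-style smoothing
    cond = defaultdict(lambda: 1.0)              # (class, feature, value) -> count
    class_weight = {c: 0.0 for c in classes}
    for x, votes in zip(X, count_vectors):
        total = sum(votes)
        for c, v in zip(classes, votes):
            w = v / total                        # fraction of experts voting c
            prior[c] += w
            class_weight[c] += w
            for j, val in enumerate(x):
                cond[(c, j, val)] += w

    def predict(x):
        def score(c):
            p = prior[c]
            for j, val in enumerate(x):
                p *= cond[(c, j, val)] / (class_weight[c] + domains[j])
            return p
        return max(classes, key=score)

    return predict

# Two discrete morphological features per neuron; five experts vote per neuron.
X = [(0, 1), (0, 1), (1, 0), (1, 0)]
votes = [(5, 0), (4, 1), (0, 5), (1, 4)]   # vote counts for classes 'a' and 'b'
predict = fit_weighted_nb(X, votes, ['a', 'b'], domains=[2, 2])
```

A 4-1 expert split still pulls the instance mostly toward the majority class, but the minority votes soften the estimated conditional distributions instead of being discarded, which is the point of learning from count vectors rather than from a single consensus label.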