21 resultados para Statistical modeling technique

em Universidad Politécnica de Madrid


Relevância:

90.00% 90.00%

Publicador:

Resumo:

EWT solar cells start from drilled wafers with approximately 100 holes/cm2. These holes act as stress concentrators leading to a reduction in the mechanical strength of this type of wafers. The viability of cells with higher density of holes has been studied. To this end, sets of wafers with different density of holes have been characterized. The ring on ring test has been employed and FE models have been developed to simulate the test. The statistical evaluation permits to draw conclusions about the reduction of the strength depending on the density of holes. Moreover, the stress concentration around the holes has been studied by means of the FE method employing the sub-modeling technique. The maximum principal stress of EWT wafers with twice the density of holes of commercial ones is almost the same. However, the mutual interaction between the stress concentration effects around neighboring holes is only observed for wafers with a density of 200 holes/cm2

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper we present a new tool to perform guided HAZOP analyses. This tool uses a functional model of the process that merges its functional and its structural information in a natural way. The functional modeling technique used is called D-higraphs. This tool solves some of the problems and drawbacks of other existing methodologies for the automation of HAZOPs. The applicability and easy understanding of the proposed methodology is shown in an industrial case.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

An innovative background modeling technique that is able to accurately segment foreground regions in RGB-D imagery (RGB plus depth) has been presented in this paper. The technique is based on a Bayesian framework that efficiently fuses different sources of information to segment the foreground. In particular, the final segmentation is obtained by considering a prediction of the foreground regions, carried out by a novel Bayesian Network with a depth-based dynamic model, and, by considering two independent depth and color-based mixture of Gaussians background models. The efficient Bayesian combination of all these data reduces the noise and uncertainties introduced by the color and depth features and the corresponding models. As a result, more compact segmentations, and refined foreground object silhouettes are obtained. Experimental results with different databases suggest that the proposed technique outperforms existing state-of-the-art algorithms.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Accurate characterization of the radio channel in tunnels is of great importance for new signaling and train control communications systems. To model this environment, measurements have been taken at 2.4 GHz in a real environment in Madrid subway. The measurements were carried out with four base station transmitters installed in a 2-km tunnel and using a mobile receiver installed on a standard train. First, with an optimum antenna configuration, all the propagation characteristics of a complex subway environment, including near shadowing, path loss,shadow fading, fast fading, level crossing rate (LCR), and average fade duration (AFD), have been measured and computed. Thereafter, comparisons of propagation characteristics in a double-track tunnel (9.8-m width) and a single-track tunnel (4.8-m width) have been made. Finally, all the measurement results have been shown in a complete table for accurate statistical modeling.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients' voices introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10 % relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33 % when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5 % allowing the system to be used in OSA early detection. Tests showed that nonlinear features and MFCCs are lightly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients' voices, which should be found in continuous speech.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

La presente investigación, tiene como objetivo analizar las influencias que ejercen los recursos intangibles (Gestión del Conocimiento, Marca, Reputación Organizacional y Responsabilidad Social) en la gestión estratégica de las instituciones de educación superior (IES) y el impacto de los mismos en los procesos de innovación a través del valor añadido que se transfiere al entorno. Se considera importante realizar un estudio sobre este tema dado que son las IES las encargadas de proporcionar los conocimientos y los nuevos hallazgos en innovaciones tecnológicas, que son transferidas al tejido productivo de las regiones, lo que proporciona crecimiento económico y mejoras en la calidad de vida. El estudio se enmarca dentro de los postulados de la teoría de los recursos y las capacidades (TRC) y de los intangibles, los cuales sirven de base a la investigación. Se planteó un sistema de hipótesis subdividido en dos vías de influencias. La primera, donde se analizan las influencias directas que ejercen los recursos intangibles sobre los resultados de las IES. La otra vía es la indirecta, que estudia las influencias que ejercen los recursos intangibles gestionados estratégicamente sobre los resultados de las IES. Esta investigación se ha concebido como no experimental, de tipo exploratorio, basada en el paradigma que busca explicar un fenómeno (variable dependiente) a través del comportamiento de las variables independientes. Es un estudio transversal, cuantitativo, que intenta describir las causas del fenómeno. Con el objeto de determinar las influencias o relaciones de causalidad que subyacen entre las variables, se utilizó la técnica del modelo de ecuaciones estructurales (SEM). La población objeto de estudio estuvo constituida por los 857 individuos pertenecientes a los consejos directivos de las IES, que forman parte de las base de datos que gestiona el Consorcio de Escuelas de Ingeniería de Latinoamérica y del Caribe y la Universidad Politécnica de Madrid, con un tamaño de muestra significativa de 250 directivos, lo que representa el 29,42% de la población. Como fuentes de recolección de información se utilizaron fuentes primarias y secundarias. Para recabar la información primaria se diseñó un cuestionario (ad hoc), el cual fue validado por expertos. La información de fuentes secundarias se extrajo de la bases de datos de la Red Iberoamericana de Ciencia y Tecnología (RICYT). Los resultados obtenidos indican que las influencias directas que pueden ejercer los recursos intangibles (Gestión del Conocimiento, Marca, Reputación Organizacional y Responsabilidad Social) no son significativas, por ello se rechazaron todas las hipótesis de la vía de influencia directa. Asimismo, de acuerdo con el contraste realizado al submodelo que representa la vía de influencia indirecta, resultaron significativas las influencias que ejercen los intangibles Gestión del Conocimiento y Reputación Organizacional, gestionadas estratégicamente sobre los resultados con valor añadido generado por las IES y transferidos al entorno. Sin embargo, no se apoyan todas las hipótesis del modelo, debido a que los constructos Marca y Responsabilidad Social resultaron no significativos. Las teorías sobre intangibles enmarcadas en la TRC no son del todo robustas y requieren de mayores esfuerzos por parte de los investigadores para lograr definir los constructos a utilizar. De igual forma, se sigue corroborando el desfase que existe entre las teorías que sustentan la investigación y las comprobaciones empíricas de las mismas. Además, se evidencia que las IES enfocan su actuación hacia la academia, por encima de las otras funciones, otorgando a la enseñanza e investigación y a la reputación organizacional una mayor importancia. Sin embargo, debido a su naturaleza no empresarial, las IES siguen manteniendo una filosofía de gestión enfocada a la generación y transmisión de conocimientos que crean reputación. Se excluyen los intangibles Marca y Responsabilidad Social, por considerar que no aportan valor a sus procesos internos o que están inmersos dentro de otros recursos intangibles. En conclusión, se corrobora el atraso de la gestión estratégica que presentan las IES en Latinoamérica. Se comprueba la no aplicación de postulados básicos de la gerencia moderna que contribuyan al manejo eficiente de todos sus recursos y al logro de sus objetivos. Esto deriva en la necesidad de modernizar la visión estratégica de las IES y en crear mejores mecanismos para lograr reconocer, mantener, proteger y desarrollar los Recursos Intangibles que poseen, realizando combinaciones de recursos óptimas, que maximicen la creación de valor para sí mismas y para la sociedad a la que pertenecen. ABSTRACT This research aims to analyze the influences exerted by intangible resources (Knowledge Management, Brand, Organizational Reputation and Social Responsibility) in the strategic management of higher education institutions (HEIs) and their impact in the innovation processes through the added value that is transferred to the environment. It is considered important to conduct a study on this issue since HEIs are responsible for providing knowledge and new findings on technological innovations, which are then, transferred to the productive fabric of these regions, providing economic growth and improvements in quality of life. The study is framed within the tenets of the Theory of Resources and Capabilities (TRC) and of intangibles which underlie this research. A system of hypotheses was raised which was subdivided into two pathways of influences. In the first system the direct influences exerted by intangible resources on the results of the IES are analyzed. The other system focuses on the indirect influences exerted by the strategically managed intangible resources on the HEIs results. This research is designed as experimental, exploratory and based on the paradigm that seeks to explain a phenomenon (the dependent variable) through the behavior of the independent variables. It is a crosssectional, quantitative study, which attempts to describe the causes of the phenomenon. In order to determine the influences or causal relationships among variables the structural equation modeling technique (SEM) was used. The population under study consisted of 857 individuals from the boards of HEIs, which are part of the database managed by the Consortium of Engineering Schools in Latin America and the Caribbean and the Technical University of Madrid, with a significant sample size of 250 managers which represents 29.42% of the population. As sources of information gathering primary and secondary sources were used. To collect primary information an ad-hoc questionnaire which was validated by experts was designed. The secondary information was extracted from the database of the Latin American Network of Science and Technology (RICYT). The results obtained indicate that the direct influences that intangible resources (Knowledge Management, Brand, Organizational Reputation and Social Responsibility) can exert are not significant. Therefore, all hypotheses related to direct influence were rejected. Also, according to the test made with the system which represents the indirect channel of influence, significant influences were exerted on the results with added value generated by the HEIs by the intangibles Knowledge Management and Organizational Reputation when they were managed strategically. However, all model hypotheses are not supported, because the constructs Brand and Social Responsibility were not significant. Theories of intangibles within the framework of the Theory of Resources and Capabilities are not entirely robust and require greater efforts by researchers to define the constructs to be used. Similarly the existing gap between the theories underpinning research and the empirical tests continues to be corroborated. In addition, there is evidence that HEIs focus their action on the academy neglecting the other functions, giving more importance to teaching, research and organizational reputation. However, due to their non-business nature, HEIs still maintain a management philosophy focused on the generation and transmission of knowledge which leads to reputation. The intangibles Brand and Social Responsibility are excluded, considering that they do not add value to their internal processes or are embedded within other intangible resources. In conclusion, the backwardness of HEIs’ strategic management in Latin America is confirmed. The lack of application of the basic principles of modern management that contribute to the efficient administration of all the resources and the achievement of objectives is proven. This leads to the need to modernize the strategic vision of HEIs and the need for better mechanisms to recognize, maintain, protect and develop the intangible resources they possess, achieving optimal combinations of resources in order to maximize the creation of value for them and for the society to which they belong.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The micrometeorological mass-balance integrated horizontal flux (IHF) technique has been commonly employed for measuring ammonia (NH3) emissions inon-field experiments. However, the inverse-dispersion modeling technique, such as the backward Lagrangian stochastic (bLS) modeling approach, is currently highlighted as offering flexibility in plot design and requiring a minimum number of samplers (Ro et al., 2013). The objective of this study was to make a comparison between the bLS technique with the IHF technique for estimating NH3 emission from flexible bag storage and following landspreading of dairy cattle slurry. Moreover, considering that NH3 emission in storage could have been non uniform, the effect on bLS estimates of a single point and multiple downwind concentration measurements was tested, as proposed by Sanz et al. (2010).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An image processing observational technique for the stereoscopic reconstruction of the wave form of oceanic sea states is developed. The technique incorporates the enforcement of any given statistical wave law modeling the quasi Gaussianity of oceanic waves observed in nature. The problem is posed in a variational optimization framework, where the desired wave form is obtained as the minimizer of a cost functional that combines image observations, smoothness priors and a weak statistical constraint. The minimizer is obtained combining gradient descent and multigrid methods on the necessary optimality equations of the cost functional. Robust photometric error criteria and a spatial intensity compensation model are also developed to improve the performance of the presented image matching strategy. The weak statistical constraint is thoroughly evaluated in combination with other elements presented to reconstruct and enforce constraints on experimental stereo data, demonstrating the improvement in the estimation of the observed ocean surface.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer could be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data or when estimating large covariance matrices. Regularization is usually used, in addition, to improve the bias-variance tradeoff of an estimation. Then, the definition of regularization is quite general, and, although the introduction of a penalty is probably the most popular type, it is just one out of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role for reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to present a compact review of L1-regularization and its applications in statistical and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing learning, proposing techniques for kernel design that are suitable for high dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. Supervised classification advances deal, on the one hand, with the application of regularization for obtaining a na¨ıve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner. Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter. El pragmatismo es la principal motivación de la regularización. Podemos entender la regularización como una modificación del estimador de máxima verosimilitud, de tal manera que se pueda dar una respuesta cuando la configuración del problema es inestable. A modo de ejemplo, podemos mencionar el ajuste de modelos paramétricos o no paramétricos cuando hay más parámetros que casos en el conjunto de datos, o la estimación de grandes matrices de covarianzas. Se suele recurrir a la regularización, además, para mejorar el compromiso sesgo-varianza en una estimación. Por tanto, la definición de regularización es muy general y, aunque la introducción de una función de penalización es probablemente el método más popular, éste es sólo uno de entre varias posibilidades. En esta tesis se ha trabajado en aplicaciones de regularización para obtener representaciones dispersas, donde sólo se usa un subconjunto de las entradas. En particular, la regularización L1 juega un papel clave en la búsqueda de dicha dispersión. La mayor parte de las contribuciones presentadas en la tesis giran alrededor de la regularización L1, aunque también se exploran otras formas de regularización (que igualmente persiguen un modelo disperso). Además de presentar una revisión de la regularización L1 y sus aplicaciones en estadística y aprendizaje de máquina, se ha desarrollado metodología para regresión, clasificación supervisada y aprendizaje de estructura en modelos gráficos. Dentro de la regresión, se ha trabajado principalmente en métodos de regresión local, proponiendo técnicas de diseño del kernel que sean adecuadas a configuraciones de alta dimensionalidad y funciones de regresión dispersas. También se presenta una aplicación de las técnicas de regresión regularizada para modelar la respuesta de neuronas reales. Los avances en clasificación supervisada tratan, por una parte, con el uso de regularización para obtener un clasificador naive Bayes y, por otra parte, con el desarrollo de un algoritmo que usa regularización por grupos de una manera eficiente y que se ha aplicado al diseño de interfaces cerebromáquina. Finalmente, se presenta una heurística para inducir la estructura de redes Bayesianas Gaussianas usando regularización L1 a modo de filtro.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we use ARIMA modelling to estimate a set of characteristics of a short-term indicator (for example, the index of industrial production), as trends, seasonal variations, cyclical oscillations, unpredictability, deterministic effects (as a strike), etc. Thus for each sector and product (more than 1000), we construct a vector of values corresponding to the above-mentioned characteristics, that can be used for data editing.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. An extensive experimental study shows the e�ectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely �-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on `1-regularization for multi-objective feature subset selection in classi�cation, where six di�erent measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two di�erent Bayesian classi�ers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The characteristics of the power-line communication (PLC) channel are difficult to model due to the heterogeneity of the networks and the lack of common wiring practices. To obtain the full variability of the PLC channel, random channel generators are of great importance for the design and testing of communication algorithms. In this respect, we propose a random channel generator that is based on the top-down approach. Basically, we describe the multipath propagation and the coupling effects with an analytical model. We introduce the variability into a restricted set of parameters and, finally, we fit the model to a set of measured channels. The proposed model enables a closed-form description of both the mean path-loss profile and the statistical correlation function of the channel frequency response. As an example of application, we apply the procedure to a set of in-home measured channels in the band 2-100 MHz whose statistics are available in the literature. The measured channels are divided into nine classes according to their channel capacity. We provide the parameters for the random generation of channels for all nine classes, and we show that the results are consistent with the experimental ones. Finally, we merge the classes to capture the entire heterogeneity of in-home PLC channels. In detail, we introduce the class occurrence probability, and we present a random channel generator that targets the ensemble of all nine classes. The statistics of the composite set of channels are also studied, and they are compared to the results of experimental measurement campaigns in the literature.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Wake effect represents one of the most important aspects to be analyzed at the engineering phase of every wind farm since it supposes an important power deficit and an increase of turbulence levels with the consequent decrease of the lifetime. It depends on the wind farm design, wind turbine type and the atmospheric conditions prevailing at the site. Traditionally industry has used analytical models, quick and robust, which allow carry out at the preliminary stages wind farm engineering in a flexible way. However, new models based on Computational Fluid Dynamics (CFD) are needed. These models must increase the accuracy of the output variables avoiding at the same time an increase in the computational time. Among them, the elliptic models based on the actuator disk technique have reached an extended use during the last years. These models present three important problems in case of being used by default for the solution of large wind farms: the estimation of the reference wind speed upstream of each rotor disk, turbulence modeling and computational time. In order to minimize the consequence of these problems, this PhD Thesis proposes solutions implemented under the open source CFD solver OpenFOAM and adapted for each type of site: a correction on the reference wind speed for the general elliptic models, the semi-parabollic model for large offshore wind farms and the hybrid model for wind farms in complex terrain. All the models are validated in terms of power ratios by means of experimental data derived from real operating wind farms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixtures of polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems. Resumen La morfología neuronal es una característica clave en el estudio de los circuitos cerebrales, ya que está altamente relacionada con el procesado de información y con los roles funcionales. La morfología neuronal afecta al proceso de integración de las señales de entrada y determina las neuronas que reciben las salidas de otras neuronas. Las diferentes partes de la neurona pueden operar de forma semi-independiente de acuerdo a la localización espacial de las conexiones sinápticas. Por tanto, existe un interés considerable en el análisis de la microanatomía de las células nerviosas, ya que constituye una excelente herramienta para comprender mejor el funcionamiento de la corteza cerebral. Sin embargo, las propiedades morfológicas, moleculares y electrofisiológicas de las células neuronales son extremadamente variables. Excepto en algunos casos especiales, esta variabilidad morfológica dificulta la definición de un conjunto de características que distingan claramente un tipo neuronal. Además, existen diferentes tipos de neuronas en regiones particulares del cerebro. La variabilidad neuronal hace que el análisis y el modelado de la morfología neuronal sean un importante reto científico. La incertidumbre es una propiedad clave en muchos problemas reales. La teoría de la probabilidad proporciona un marco para modelar y razonar bajo incertidumbre. Los modelos gráficos probabilísticos combinan la teoría estadística y la teoría de grafos con el objetivo de proporcionar una herramienta con la que trabajar bajo incertidumbre. En particular, nos centraremos en las redes bayesianas, el modelo más utilizado dentro de los modelos gráficos probabilísticos. En esta tesis hemos diseñado nuevos métodos para aprender redes bayesianas, inspirados por y aplicados al problema del modelado y análisis de datos morfológicos de neuronas. La morfología de una neurona puede ser cuantificada usando una serie de medidas, por ejemplo, la longitud de las dendritas y el axón, el número de bifurcaciones, la dirección de las dendritas y el axón, etc. Estas medidas pueden ser modeladas como datos continuos o discretos. A su vez, los datos continuos pueden ser lineales (por ejemplo, la longitud o la anchura de una dendrita) o direccionales (por ejemplo, la dirección del axón). Estos datos pueden llegar a seguir distribuciones de probabilidad muy complejas y pueden no ajustarse a ninguna distribución paramétrica conocida. El modelado de este tipo de problemas con redes bayesianas híbridas incluyendo variables discretas, lineales y direccionales presenta una serie de retos en relación al aprendizaje a partir de datos, la inferencia, etc. En esta tesis se propone un método para modelar y simular árboles dendríticos basales de neuronas piramidales usando redes bayesianas para capturar las interacciones entre las variables del problema. Para ello, se mide un amplio conjunto de variables de las dendritas y se aplica un algoritmo de aprendizaje con el que se aprende la estructura y se estiman los parámetros de las distribuciones de probabilidad que constituyen las redes bayesianas. Después, se usa un algoritmo de simulación para construir dendritas virtuales mediante el muestreo de valores de las redes bayesianas. Finalmente, se lleva a cabo una profunda evaluaci ón para verificar la capacidad del modelo a la hora de generar dendritas realistas. En esta primera aproximación, las variables fueron discretizadas para poder aprender y muestrear las redes bayesianas. A continuación, se aborda el problema del aprendizaje de redes bayesianas con diferentes tipos de variables. Las mixturas de polinomios constituyen un método para representar densidades de probabilidad en redes bayesianas híbridas. Presentamos un método para aprender aproximaciones de densidades unidimensionales, multidimensionales y condicionales a partir de datos utilizando mixturas de polinomios. El método se basa en interpolación con splines, que aproxima una densidad como una combinación lineal de splines. Los algoritmos propuestos se evalúan utilizando bases de datos artificiales. Además, las mixturas de polinomios son utilizadas como un método no paramétrico de estimación de densidades para clasificadores basados en redes bayesianas. Después, se estudia el problema de incluir información direccional en redes bayesianas. Este tipo de datos presenta una serie de características especiales que impiden el uso de las técnicas estadísticas clásicas. Por ello, para manejar este tipo de información se deben usar estadísticos y distribuciones de probabilidad específicos, como la distribución univariante von Mises y la distribución multivariante von Mises–Fisher. En concreto, en esta tesis extendemos el clasificador naive Bayes al caso en el que las distribuciones de probabilidad condicionada de las variables predictoras dada la clase siguen alguna de estas distribuciones. Se estudia el caso base, en el que sólo se utilizan variables direccionales, y el caso híbrido, en el que variables discretas, lineales y direccionales aparecen mezcladas. También se estudian los clasificadores desde un punto de vista teórico, derivando sus funciones de decisión y las superficies de decisión asociadas. El comportamiento de los clasificadores se ilustra utilizando bases de datos artificiales. Además, los clasificadores son evaluados empíricamente utilizando bases de datos reales. También se estudia el problema de la clasificación de interneuronas. Desarrollamos una aplicación web que permite a un grupo de expertos clasificar un conjunto de neuronas de acuerdo a sus características morfológicas más destacadas. Se utilizan medidas de concordancia para analizar el consenso entre los expertos a la hora de clasificar las neuronas. Se investiga la idoneidad de los términos anatómicos y de los tipos neuronales utilizados frecuentemente en la literatura a través del análisis de redes bayesianas y la aplicación de algoritmos de clustering. Además, se aplican técnicas de aprendizaje supervisado con el objetivo de clasificar de forma automática las interneuronas a partir de sus valores morfológicos. A continuación, se presenta una metodología para construir un modelo que captura las opiniones de todos los expertos. Primero, se genera una red bayesiana para cada experto y se propone un algoritmo para agrupar las redes bayesianas que se corresponden con expertos con comportamientos similares. Después, se induce una red bayesiana que modela la opinión de cada grupo de expertos. Por último, se construye una multired bayesiana que modela las opiniones del conjunto completo de expertos. El análisis del modelo consensuado permite identificar diferentes comportamientos entre los expertos a la hora de clasificar las neuronas. Además, permite extraer un conjunto de características morfológicas relevantes para cada uno de los tipos neuronales mediante inferencia con la multired bayesiana. Estos descubrimientos se utilizan para validar el modelo y constituyen información relevante acerca de la morfología neuronal. Por último, se estudia un problema de clasificación en el que la etiqueta de clase de los datos de entrenamiento es incierta. En cambio, disponemos de un conjunto de etiquetas para cada instancia. Este problema está inspirado en el problema de la clasificación de neuronas, en el que un grupo de expertos proporciona una etiqueta de clase para cada instancia de manera individual. Se propone un método para aprender redes bayesianas utilizando vectores de cuentas, que representan el número de expertos que seleccionan cada etiqueta de clase para cada instancia. Estas redes bayesianas se evalúan utilizando bases de datos artificiales de problemas de aprendizaje supervisado.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Computing the modal parameters of large structures in Operational Modal Analysis often requires to process data from multiple non simultaneously recorded setups of sensors. These setups share some sensors in common, the so-called reference sensors that are fixed for all the measurements, while the other sensors are moved from one setup to the next. One possibility is to process the setups separately what result in different modal parameter estimates for each setup. Then the reference sensors are used to merge or glue the different parts of the mode shapes to obtain global modes, while the natural frequencies and damping ratios are usually averaged. In this paper we present a state space model that can be used to process all setups at once so the global mode shapes are obtained automatically and subsequently only a value for the natural frequency and damping ratio of each mode is computed. We also present how this model can be estimated using maximum likelihood and the Expectation Maximization algorithm. We apply this technique to real data measured at a footbridge.