19 results for Molecular modeling algorithms

at Universidad Politécnica de Madrid


Relevance:

100.00%

Publisher:

Abstract:

The modeling phase is fundamental both in the analysis of a dynamic system and in the design of a control system. This phase is even more critical when it must be carried out online and the only information about the system comes from input/output data. This paper presents adaptation algorithms for fuzzy systems based on the extended Kalman filter, which make it possible to obtain accurate models without renouncing the computational efficiency that characterizes the Kalman filter, and which allow the algorithms to run online with the process.
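The abstract does not reproduce the update equations. As a rough, minimal sketch of the underlying idea only: for a Takagi–Sugeno fuzzy model that is linear in its consequent parameters, the (extended) Kalman filter reduces to a cheap recursive parameter update. Rule centers, widths, noise variances and the sin(x) target below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

# Sketch: online adaptation of the consequent parameters of a
# Takagi-Sugeno fuzzy model with a Kalman-filter update. Rule centers,
# widths and noise levels are illustrative assumptions.

centers = np.array([-1.0, 0.0, 1.0])    # assumed Gaussian rule centers
width = 0.7                             # assumed common membership width
theta = np.zeros(2 * len(centers))      # consequents [a_i..., b_i...]
P = 100.0 * np.eye(len(theta))          # parameter covariance
R = 0.01                                # assumed measurement noise variance

def regressor(x):
    w = np.exp(-0.5 * ((x - centers) / width) ** 2)
    w /= w.sum()                        # normalized firing strengths
    return np.concatenate([w * x, w])   # model output is h(x) @ theta

def kalman_update(x, y):
    global theta, P
    h = regressor(x)
    innovation = y - h @ theta
    S = h @ P @ h + R                   # innovation variance (scalar)
    K = P @ h / S                       # Kalman gain
    theta = theta + K * innovation
    P = P - np.outer(K, h @ P)

# toy stream: learn y = sin(x) from input/output data only
rng = np.random.default_rng(0)
for _ in range(2000):
    x = rng.uniform(-1.5, 1.5)
    kalman_update(x, np.sin(x) + 0.05 * rng.normal())
```

Because the consequents enter linearly, each update costs only a few vector operations, which is the kind of efficiency that makes online use with the process feasible.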

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a new category of CAD applications devoted to the definition and parameterization of hull forms, called programmed design. Programmed design relies on two prerequisites. The first is a product model with a variety of types large enough to face the modeling of any type of ship. The second is a design language dedicated to creating the product model. The main purpose of the language is to publish the modeling algorithms of the application in the designer's knowledge domain so that the designer can create parametric model scripts. Programmed design is an evolution of parametric design, but it is not just parametric design: it is a tool to create parametric design tools. It provides a methodology to extract design knowledge by abstracting a design experience in order to store and reuse it. Programmed design concerns the organizational and architectural aspects of CAD applications, not the development of modeling algorithms; it is built on top of, and relies on, the existing algorithms provided by a comprehensive product model. Programmed design can be useful for developing new applications, for supporting the evolution of existing applications, or even for integrating different types of application into a single one. A three-level software architecture is proposed to ease the implementation of programmed design. These levels are the conceptual level, based on the design language; the mathematical level, based on the geometric formulation of the product model; and the visual level, based on the polyhedral representation of the model as required by the graphics card. Finally, some scenarios for the use of programmed design are discussed, for instance the development of specialized parametric hull form generators for a ship type or a family of ships, or the creation of palettes of hull form components to be used as parametric design patterns. Two new reverse-engineering processes that can considerably improve the application have also been identified: the creation of the mathematical level from the visual level and the creation of the conceptual level from the mathematical level.
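As a loose illustration only of what a "parametric model script" could look like, the following sketch abstracts a design experience (a fullness distribution driven by the prismatic coefficient) into a reusable generator. The Section type and the formula are hypothetical stand-ins; the application's actual design language and product model are its own:

```python
from dataclasses import dataclass

# Sketch of a "parametric model script": a design experience captured as
# a reusable generator over a hypothetical product-model primitive.

@dataclass
class Section:
    x: float        # longitudinal position of a transverse section
    beam: float     # section breadth
    draft: float    # section depth

def hull_sections(length, beam, draft, cp, n=11):
    """Generate hull sections with a parabolic fullness distribution."""
    sections = []
    for i in range(n):
        x = length * i / (n - 1)
        t = 2.0 * x / length - 1.0          # -1 at one end, +1 at the other
        fullness = max(0.0, 1.0 - (1.0 - cp) * 4.0 * t * t)
        sections.append(Section(x, beam * fullness, draft))
    return sections

# a specialized generator for one ship type is just the script plus its
# stored parameters, reusable as a parametric design pattern:
tanker = hull_sections(length=180.0, beam=32.0, draft=12.0, cp=0.82)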

Relevance:

40.00%

Publisher:

Abstract:

At present, all methods in evolutionary computation are bioinspired by the fundamental principles of neo-Darwinism, as well as by vertical gene transfer. Virus transduction is one of the key mechanisms of horizontal gene propagation in microorganisms (e.g. bacteria). In the present paper, we model and simulate a transduction operator, exploring the possible role and usefulness of transduction in a genetic algorithm. The genetic algorithm including transduction has been named PETRI (an abbreviation of Promoting Evolution Through Reiterated Infection). Our results show how PETRI reaches higher fitness values as the transduction probability comes close to 100%. The conclusion is that transduction improves the performance of a genetic algorithm, assuming a population divided among several sub-populations or 'bacterial colonies'.
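A minimal sketch of a transduction operator inside a genetic algorithm, under assumptions the abstract does not state (OneMax fitness, fragment excised from the fittest host), might look like this; PETRI's actual operator is not reproduced:

```python
import random

# Sketch of a transduction operator: a "virus" excises a gene fragment
# from a fit host in one bacterial colony (sub-population) and inserts
# it into an individual of another colony. All parameters illustrative.

GENES, COLONIES, SIZE = 32, 4, 20

def fitness(ind):
    return sum(ind)                     # OneMax toy fitness

colonies = [[[random.randint(0, 1) for _ in range(GENES)]
             for _ in range(SIZE)] for _ in range(COLONIES)]

def transduction(colonies, p_transduction=0.9):
    if random.random() > p_transduction:
        return
    src, dst = random.sample(range(COLONIES), 2)
    host = max(colonies[src], key=fitness)       # virus picks a fit host
    a = random.randrange(GENES)
    b = random.randrange(a + 1, GENES + 1)
    fragment = host[a:b]                         # excised gene fragment
    victim = random.choice(colonies[dst])
    victim[a:b] = fragment                       # horizontal gene transfer

for generation in range(100):
    transduction(colonies)
    # selection, crossover and mutation of each colony would go here
```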

Relevance:

40.00%

Publisher:

Abstract:

The mechanisms of growth of a circular void by plastic deformation were studied by means of molecular dynamics in two dimensions (2D). While previous molecular dynamics (MD) simulations in three dimensions (3D) have been limited to small voids (up to ≈10 nm in radius), this strategy allows us to study the behavior of voids of up to 100 nm in radius. MD simulations showed that plastic deformation was triggered by the nucleation of dislocations at the atomic steps of the void surface over the whole range of void sizes studied. The yield stress, defined as the stress necessary to nucleate stable dislocations, decreased with temperature, but the void growth rate was not very sensitive to this parameter. Simulations under uniaxial tension, uniaxial deformation and biaxial deformation showed that the void growth rate increased very rapidly with multiaxiality but did not depend on the initial void radius. These results were compared with previous 3D MD and 2D dislocation dynamics simulations to establish a map of mechanisms and size effects for plastic void growth in crystalline solids.
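One post-processing step such a study needs is measuring the void radius from atomic configurations. The sketch below is an assumption-laden stand-in for whatever analysis the authors actually used: it estimates the radius of the empty region by counting unoccupied cells of a grid laid over the 2D box:

```python
import numpy as np

# Sketch: estimate the void radius in a 2D atomic configuration by
# counting grid cells left empty by the atoms. The MD engine, potentials
# and loading conditions of the paper are not reproduced here.

def void_radius(positions, box, cell=1.0):
    """positions: (N, 2) coordinates; box: (Lx, Ly); cell >= lattice spacing."""
    nx, ny = int(box[0] / cell), int(box[1] / cell)
    occupied = np.zeros((nx, ny), dtype=bool)
    ix = np.clip((positions[:, 0] / cell).astype(int), 0, nx - 1)
    iy = np.clip((positions[:, 1] / cell).astype(int), 0, ny - 1)
    occupied[ix, iy] = True
    empty_area = (~occupied).sum() * cell * cell
    return np.sqrt(empty_area / np.pi)    # radius of the equal-area circle

# synthetic check: a 2D lattice (0.5 spacing) with a circular hole, r = 10
xs, ys = np.meshgrid(np.arange(0, 100, 0.5), np.arange(0, 100, 0.5))
pos = np.column_stack([xs.ravel(), ys.ravel()])
pos = pos[np.hypot(pos[:, 0] - 50, pos[:, 1] - 50) > 10.0]
print(void_radius(pos, box=(100.0, 100.0)))      # ~10
```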

Relevance:

30.00%

Publisher:

Abstract:

Pb17Li is today a reference breeder material in diverse fusion R&D programs worldwide. Extracting the dynamic and structural properties of liquid LiPb mixtures via molecular dynamics simulations represents a crucial step for multiscale modeling efforts aimed at understanding the suitability of this compound for future nuclear fusion technologies. At present, a Li-Pb cross potential is not available in the literature. Here we present our first results on the validation of two semi-empirical potentials for Li and Pb in the liquid phase. These results establish a solid base, a preliminary but crucial step toward implementing a LiPb cross potential. Structural and thermodynamic analyses confirm that the implemented potentials for Li and Pb are realistic for simulating both elements in the liquid phase.
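The structural side of such a validation typically compares a radial distribution function g(r) computed from MD snapshots against experimental curves for the liquid metals. A minimal sketch, assuming orthorhombic periodic boundaries and with random toy coordinates standing in for real trajectory frames:

```python
import numpy as np

# Sketch of the structural validation step: g(r) from one snapshot with
# periodic boundaries, to be compared with experimental liquid Li/Pb data.

def rdf(positions, box, r_max, bins=100):
    n = len(positions)
    d = positions[:, None, :] - positions[None, :, :]
    d -= box * np.round(d / box)              # minimum-image convention
    r = np.sqrt((d ** 2).sum(-1))[np.triu_indices(n, k=1)]
    hist, edges = np.histogram(r, bins=bins, range=(0.0, r_max))
    rho = n / box.prod()                      # number density
    shells = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    ideal = rho * shells * n / 2.0            # expected pair counts, ideal gas
    return 0.5 * (edges[1:] + edges[:-1]), hist / ideal

rng = np.random.default_rng(1)
box = np.array([30.0, 30.0, 30.0])
r, g = rdf(rng.uniform(0.0, 30.0, (500, 3)), box, r_max=10.0)  # g ~ 1 here
```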


Relevance:

30.00%

Publisher:

Abstract:

Energy management has always been recognized as a challenge in mobile systems, especially in modern OS-based mobile systems where multitasking is widely supported. Nowadays, it is common for a mobile system user to run multiple applications simultaneously while having a target battery lifetime in mind for a specific application. Traditional OS-level power management (PM) policies make their best effort to save energy under performance constraints but fail to guarantee a target lifetime, leaving the painful trade-off between the total performance of the applications and the target lifetime to the user. This thesis provides a new way to deal with the problem. It advocates that a strong energy-aware PM scheme should first guarantee a user-specified battery lifetime to a target application by restricting the average power of the less important applications and, in addition, maximize the total performance of the applications without harming that lifetime guarantee. To support this, energy, instead of CPU time or transmission bandwidth, should be managed globally by the OS as a first-class resource. As the first stage of a complete PM scheme, this thesis presents energy-based fair queuing scheduling, a novel class of energy-aware scheduling algorithms which, in combination with a mechanism for restricting the battery discharge rate, systematically manage energy as a first-class resource with the objective of guaranteeing a user-specified battery lifetime for a target application in OS-based mobile systems. Energy-based fair queuing carries traditional fair queuing over to the energy management domain. It assigns a power share to each task and manages energy by serving energy to tasks in proportion to their assigned power shares. The proportional energy use establishes a proportional share of the system power among tasks, which guarantees a minimum power for each task and thus avoids energy starvation of any task. Energy-based fair queuing treats all tasks equally as one type and supports periodic time-sensitive tasks by allocating each of them a share of system power adequate to meet the highest energy demand across all periods. However, an overly conservative power share is usually required to guarantee that all time constraints are met. To provide more effective and flexible support for various types of time-sensitive tasks in general-purpose operating systems, an extra real-time-friendly mechanism is introduced that combines priority-based scheduling with energy-based fair queuing. Since a method is available to control the maximum time a time-sensitive task can run with priority, power control and the meeting of time constraints can be traded off flexibly. A SystemC-based test bench was designed to assess the algorithms. Simulation results show the success of energy-based fair queuing in achieving proportional energy use, meeting time constraints, and trading off properly between the two.
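The thesis' algorithms are not reproduced in this abstract, but the core serving rule of fair queuing transplanted to energy can be sketched compactly: always run the task with the smallest virtual energy time, i.e. consumed energy normalized by its power share. Task names, shares and per-quantum energies below are invented for illustration; the thesis adds discharge-rate restriction and a real-time priority mechanism on top:

```python
import heapq

# Sketch of the serving rule only: pick the task with the smallest
# virtual energy time (consumed energy / power share).

class EnergyFairQueue:
    def __init__(self, shares):               # shares: {task: power share}
        self.shares = shares
        self.heap = [(0.0, t) for t in shares]
        heapq.heapify(self.heap)

    def pick(self):
        return self.heap[0][1]                # task to run next

    def charge(self, task, energy_joules):
        v, t = heapq.heappop(self.heap)       # account consumed energy
        assert t == task
        heapq.heappush(self.heap, (v + energy_joules / self.shares[t], t))

# the target application gets a 3x power share over background tasks
q = EnergyFairQueue({"video": 3.0, "sync": 1.0, "indexer": 1.0})
for _ in range(6):
    task = q.pick()
    q.charge(task, energy_joules=0.5)   # energy measured per quantum
    print(task)                         # "video" is served ~3x the energy
```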

Relevance:

30.00%

Publisher:

Abstract:

This paper proposes a new multi-objective estimation of distribution algorithm (EDA) based on joint modeling of objectives and variables. This EDA uses a multi-dimensional Bayesian network as its probabilistic model. In this way it can capture the dependencies between objectives and between variables and objectives, as well as the dependencies between variables learnt by other Bayesian network-based EDAs. This model leads to a problem decomposition that helps the proposed algorithm find better trade-off solutions to the multi-objective problem. In addition to approximating the Pareto set, the algorithm is also able to estimate the structure of the multi-objective problem. To apply the algorithm to many-objective problems, it includes four different ranking methods proposed in the literature for this purpose. The algorithm is applied to the set of walking fish group (WFG) problems, and its optimization performance is compared with an evolutionary algorithm and another multi-objective EDA. The experimental results show that the proposed algorithm performs significantly better on many of the problems and for different objective space dimensions, and achieves results comparable with the other algorithms on some of them.
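The paper's probabilistic model is a multi-dimensional Bayesian network; as a simplified stand-in that still shows the joint modeling of variables and objectives, the sketch below fits a single multivariate Gaussian over (x, f(x)) pairs of selected solutions and samples new candidates conditioned on attractive objective values. The bi-objective toy problem, the crude scalarized selection and all settings are assumptions:

```python
import numpy as np

# Simplified stand-in for joint modeling of variables and objectives:
# fit a Gaussian over (x, f(x)) and sample x | f(x) = good target.

rng = np.random.default_rng(2)

def objectives(x):
    return np.array([np.sum(x ** 2), np.sum((x - 2.0) ** 2)])

n_var, pop = 5, 200
X = rng.uniform(-4.0, 4.0, (pop, n_var))
for gen in range(30):
    F = np.array([objectives(x) for x in X])
    order = np.argsort(F.sum(axis=1))         # crude scalarized selection
    elite = np.hstack([X, F])[order[: pop // 2]]
    mu = elite.mean(axis=0)
    cov = np.cov(elite.T) + 1e-6 * np.eye(n_var + 2)
    A = cov[:n_var, :n_var]                   # variable block
    B = cov[:n_var, n_var:]                   # variable-objective block
    C = cov[n_var:, n_var:]                   # objective block
    target = F[order[0]]                      # best objective vector seen
    mu_c = mu[:n_var] + B @ np.linalg.solve(C, target - mu[n_var:])
    cov_c = A - B @ np.linalg.solve(C, B.T)   # conditional Gaussian x | f
    cov_c = 0.5 * (cov_c + cov_c.T)           # enforce symmetry
    X = rng.multivariate_normal(mu_c, cov_c, size=pop)
```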

Relevance:

30.00%

Publisher:

Abstract:

Due to recent scientific and technological advances in information systems, it is now possible to perform almost any application on a mobile device. The drive to make such devices more intelligent opens an opportunity to design data mining algorithms that are able to execute autonomously on local devices to provide the device with knowledge. The problem behind autonomous mining is the proper configuration of the algorithm to produce the most appropriate results. Contextual information, together with resource information about the device, has a strong impact both on the feasibility of a particular execution and on the production of the proper patterns. On the other hand, the performance of an algorithm, expressed in terms of efficacy and efficiency, depends highly on the features of the dataset to be analyzed together with the parameter values of a particular implementation of the algorithm. However, few existing approaches deal with the autonomous configuration of data mining algorithms, and in any case they do not deal with contextual or resource information. Both issues are of particular significance for social network applications. In fact, the widespread use of social networks, and consequently the amount of information shared, have made modeling context in social applications a priority. Resource consumption also plays a crucial role on such platforms, as users access social networks mainly from their mobile devices. This PhD thesis addresses the aforementioned open issues, focusing on (i) analyzing the behavior of algorithms, (ii) mapping contextual and resource information to find the most appropriate configuration, and (iii) applying the model to the case of a social recommender. Four main contributions are presented:
- The EE-Model: able to predict the behavior of a data mining algorithm in terms of resources consumed and the accuracy of the mining model it will obtain.
- The SC-Mapper: maps a situation defined by the context and resource state to a data mining configuration.
- SOMAR: a social activity (events and informal goings-on) recommender for mobile devices.
- D-SOMAR: an evolution of SOMAR which incorporates the configurator in order to provide updated recommendations.
Finally, the experimental validation of the proposed contributions using synthetic and real datasets allows us to achieve the objectives and answer the research questions proposed for this dissertation.
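The EE-Model and SC-Mapper are learned models; as a deliberately naive illustration of the mapping idea only, one can picture the situation-to-configuration step as choosing the configuration whose recorded situation (context plus resource state) is closest to the current one. The table, its fields and the configurations below are invented:

```python
# Naive illustration of situation-to-configuration mapping. The thesis
# learns these mappings; this fixed table is only a stand-in.

SITUATIONS = [
    # (battery %, free memory MB, user moving?) -> mining configuration
    ((90, 512, 0), {"algorithm": "k-means", "k": 20, "max_iter": 300}),
    ((40, 256, 1), {"algorithm": "k-means", "k": 8, "max_iter": 100}),
    ((10, 64, 1), {"algorithm": "mini-batch", "k": 4, "max_iter": 30}),
]

def map_situation(battery, free_mb, moving):
    def distance(entry):
        (b, m, mv), _ = entry
        return (abs(b - battery) / 100.0 + abs(m - free_mb) / 512.0
                + abs(mv - moving))
    return min(SITUATIONS, key=distance)[1]

# scarce resources on the move -> a cheap configuration is selected
print(map_situation(battery=15, free_mb=128, moving=1))
```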

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Knowledge of pesticide selectivity to natural enemies is necessary for the successful implementation of biological and chemical control methods in integrated pest management (IPM) programs. Diacylhydrazine (DAH)-based ecdysone agonists, also known as molting-accelerating compounds (MACs), are considered a selective group of insecticides, and their compatibility with predatory Heteroptera, which are used as biological control agents, is known. However, their molecular mode of action has not been explored in beneficial insects such as Orius laevigatus (Fieber) (Hemiptera: Anthocoridae). RESULTS: In this work, in vivo toxicity assays demonstrated that the DAH-based RH-5849, tebufenozide and methoxyfenozide have no toxic effect on O. laevigatus. The ligand-binding domain (LBD) of the ecdysone receptor (EcR) of O. laevigatus was sequenced, and a homology protein model was constructed which confirmed a cavity structure with 12 α-helices, harboring the natural insect molting hormone 20-hydroxyecdysone. Docking studies, however, showed that a steric clash occurs for the DAH-based insecticides due to the restricted extent of the ligand-binding cavity of the EcR of O. laevigatus. CONCLUSIONS: The insect toxicity assays demonstrated that MACs are selective for O. laevigatus. The modeling/docking experiments indicate that these pesticides do not bind to the LBD of the EcR of O. laevigatus, supporting the finding that they show no biological effects in the predatory bug. These data help to explain the compatible use of MACs together with predatory bugs in IPM programs. Keywords: Orius laevigatus, selectivity, diacylhydrazine insecticides, ecdysone receptor, homology modelling, docking studies.
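The steric-clash result comes from docking software not shown here; a toy geometric check of the same kind flags ligand-receptor atom pairs that approach closer than the sum of their van der Waals radii minus a tolerance. Radii and coordinates below are illustrative only:

```python
import numpy as np

# Toy steric-clash check: a pair clashes when its distance is below the
# sum of van der Waals radii minus a tolerance. Illustrative values only.

VDW = {"C": 1.70, "N": 1.55, "O": 1.52, "H": 1.20}   # radii in angstroms

def clashes(ligand, receptor, tolerance=0.4):
    """ligand/receptor: lists of (element, xyz). Returns clashing pairs."""
    found = []
    for el_l, xyz_l in ligand:
        for el_r, xyz_r in receptor:
            d = np.linalg.norm(np.asarray(xyz_l) - np.asarray(xyz_r))
            if d < VDW[el_l] + VDW[el_r] - tolerance:
                found.append((el_l, el_r, round(float(d), 2)))
    return found

# a ligand carbon 2.0 A from a cavity oxygen is flagged as a clash
print(clashes([("C", (0.0, 0.0, 0.0))], [("O", (0.0, 0.0, 2.0))]))
```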

Relevance:

30.00%

Publisher:

Abstract:

The characteristics of the power-line communication (PLC) channel are difficult to model due to the heterogeneity of the networks and the lack of common wiring practices. To obtain the full variability of the PLC channel, random channel generators are of great importance for the design and testing of communication algorithms. In this respect, we propose a random channel generator that is based on the top-down approach. Basically, we describe the multipath propagation and the coupling effects with an analytical model. We introduce the variability into a restricted set of parameters and, finally, we fit the model to a set of measured channels. The proposed model enables a closed-form description of both the mean path-loss profile and the statistical correlation function of the channel frequency response. As an example of application, we apply the procedure to a set of in-home measured channels in the band 2-100 MHz whose statistics are available in the literature. The measured channels are divided into nine classes according to their channel capacity. We provide the parameters for the random generation of channels for all nine classes, and we show that the results are consistent with the experimental ones. Finally, we merge the classes to capture the entire heterogeneity of in-home PLC channels. In detail, we introduce the class occurrence probability, and we present a random channel generator that targets the ensemble of all nine classes. The statistics of the composite set of channels are also studied, and they are compared to the results of experimental measurement campaigns in the literature.
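As a sketch of the top-down idea, a classical multipath formulation, a sum of paths with random gains, cable attenuation and propagation delay, can generate random responses on the 2-100 MHz grid. The numeric parameters below are illustrative placeholders, not the fitted per-class values the paper provides:

```python
import numpy as np

# Sketch of a top-down random PLC channel generator: random path
# gains/lengths in a standard multipath propagation model. Parameter
# values are placeholders, not the paper's fitted statistics.

rng = np.random.default_rng(3)

def random_channel(f, n_paths=15, a0=2e-3, a1=4e-10, k=1.0, v=1.5e8,
                   max_len=800.0):
    """f: frequency grid (Hz); returns a complex frequency response H(f)."""
    d = rng.uniform(30.0, max_len, n_paths)          # path lengths (m)
    g = rng.uniform(-1.0, 1.0, n_paths) / d          # signed path gains
    att = np.exp(-(a0 + a1 * f[:, None] ** k) * d)   # cable attenuation
    phase = np.exp(-2j * np.pi * f[:, None] * d / v) # propagation delay
    return (g * att * phase).sum(axis=1)

f = np.linspace(2e6, 100e6, 2049)        # the 2-100 MHz band of the paper
H = random_channel(f)
path_loss_db = 20.0 * np.log10(np.abs(H))
```

Fitting such a generator to a measured class then amounts to tuning the gain, attenuation and path-length statistics until the mean path-loss profile and the correlation of H(f) match the measurements.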

Relevance:

30.00%

Publisher:

Abstract:

Irradiation with swift heavy ions (SHI), roughly defined as those having atomic masses larger than 15 and energies exceeding 1 MeV/amu, may lead to significant modification of the irradiated material in a nanometric region around the (straight) ion trajectory (latent tracks). In the case of amorphous silica, SHI irradiation creates nano-tracks of higher density than the virgin material (densification). As a result, the refractive index is increased with respect to that of the surroundings. Moreover, track overlapping leads to continuous amorphous layers that present a significant contrast with respect to the pristine substrate. We have recently demonstrated that SHI irradiation produces a large number of point defects, easily detectable by a number of experimental techniques (work presented in the parallel conference ICDIM). The mechanisms of energy transfer from SHI to the target material have their origin in the high electronic excitation induced in the solid. A number of phenomenological approaches have been employed to describe these mechanisms: Coulomb explosion, thermal spike, non-radiative exciton decay, bond weakening. However, a detailed microscopic description is missing due to the difficulty of modeling the time evolution of the electronic excitation. In this work we have employed molecular dynamics (MD) calculations to determine whether the irradiation effects are related to the thermal phenomena described by MD (in the ps domain) or to electronic phenomena (sub-ps domain), e.g., exciton localization. We have carried out simulations of up to 100 ps with large boxes (30×30×8 nm³) using a home-modified version of MDCASK that allows us to define a central hot cylinder (ion track) from which heat flows to the surrounding cold bath (unirradiated sample). We observed that once the cylinder has cooled down, the Si and O coordination numbers are 4 and 2, respectively, as in virgin silica. On the other hand, the density of the (cold) cylinder increases with respect to that of silica and, furthermore, the ring size of the silica network decreases. Both effects are in agreement with the observed densification. In conclusion, purely thermal effects do not explain the generation of point defects upon irradiation, but they do account for the silica densification.
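The coordination-number check reported above is straightforward to reproduce on a snapshot. A minimal sketch, assuming a Si-O bond cutoff of about 2.0 Å and periodic boundaries:

```python
import numpy as np

# Sketch of the coordination-number analysis: count O neighbors of each
# Si (and vice versa) within an assumed Si-O cutoff, with the
# minimum-image convention. Virgin silica should give ~4 and ~2.

def coordination(si, o, box, cutoff=2.0):
    d = si[:, None, :] - o[None, :, :]
    d -= box * np.round(d / box)            # periodic boundaries
    dist = np.sqrt((d ** 2).sum(axis=-1))
    bonded = dist < cutoff
    return bonded.sum(axis=1).mean(), bonded.sum(axis=0).mean()

# toy check on a single SiO4 unit (a real run would read the cooled-track
# snapshot from the MD output)
box = np.array([20.0, 20.0, 20.0])
si = np.array([[10.0, 10.0, 10.0]])
o = si + np.array([[1.6, 0, 0], [-1.6, 0, 0], [0, 1.6, 0], [0, -1.6, 0]])
print(coordination(si, o, box))             # -> (4.0, 1.0)
```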

Relevance:

30.00%

Publisher:

Abstract:

Understanding the structure and dynamics of the intricate network of connections among people who consume products through the Internet is an extremely useful asset for studying emergent properties related to social behavior. This knowledge could be useful, for example, to improve the performance of personal recommendation algorithms. In this contribution, we analyzed five-year records of movie-rating transactions provided by Netflix, a movie rental platform where users rate movies from an online catalog. This dataset can be studied as a bipartite user-item network whose structure evolves in time. Even though several topological properties of subsets of this bipartite network have been reported with a model that combines random and preferential attachment mechanisms [Beguerisse Díaz et al., 2010], there are still many aspects worth exploring, as they are connected to relevant phenomena underlying the evolution of the network. In this work, we test the hypothesis that bursty human behavior is essential to describe how a bipartite user-item network evolves in time. To that end, we propose a novel model that combines, for user nodes, a network growth prescription based on a preferential attachment mechanism acting not only in the topological domain (i.e. based on node degrees) but also in the time domain. In the case of items, the model mixes degree preferential attachment and random selection. With these ingredients, the model is not only able to reproduce the asymptotic degree distribution, but also shows excellent agreement with the Netflix data in several time-dependent topological properties.
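A toy version of the growth ingredients, degree-and-recency preferential attachment for users and a degree/random mixture for items, can be sketched as follows; the rates and the recency kernel are assumptions, not the model actually fitted to the Netflix data:

```python
import random

# Toy bipartite growth: the user endpoint of each new rating is chosen
# preferentially by degree AND recency (bursty behavior); the item
# endpoint mixes degree preferential attachment with uniform choice.

user_events, item_events = [0], [0]     # one entry per rating = degree
last_active = {0: 0}                    # last step each user was active

def pick_user(t, beta=1.0):
    # each event entry weights its user by recency; a user's total
    # weight is then ~ degree x recency
    w = [1.0 / (1.0 + beta * (t - last_active[u])) for u in user_events]
    return random.choices(user_events, weights=w)[0]

def pick_item(p_random=0.3):
    if random.random() < p_random:
        return random.choice(list(set(item_events)))  # uniform over items
    return random.choice(item_events)                 # degree-preferential

for t in range(1, 3000):
    if random.random() < 0.1:                         # a new user arrives
        u = max(last_active) + 1
    else:
        u = pick_user(t)
    last_active[u] = t
    if random.random() < 0.05:                        # catalog grows
        item_events.append(max(item_events) + 1)
    user_events.append(u)                             # record the rating
    item_events.append(pick_item())
```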

Relevance:

30.00%

Publisher:

Abstract:

Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons that receive its output. Different parts of a neuron can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells, since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problem using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model's ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixture-of-polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines.
The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts' classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.
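Of the building blocks above, the directional naive Bayes is the easiest to sketch. Assuming a single angular feature (e.g. an axon direction in radians) and the standard moment-based approximation for the von Mises concentration κ, a two-class classifier looks like this; the data are synthetic and the dissertation's hybrid classifiers are richer:

```python
import numpy as np

# Sketch: naive Bayes with a von Mises class-conditional for one
# directional feature. Kappa uses the standard approximation from
# directional statistics; training data here are synthetic.

def fit_von_mises(theta):
    C, S = np.cos(theta).mean(), np.sin(theta).mean()
    R = np.hypot(C, S)                        # mean resultant length
    mu = np.arctan2(S, C)
    if R < 0.53:
        kappa = 2 * R + R ** 3 + 5 * R ** 5 / 6
    elif R < 0.85:
        kappa = -0.4 + 1.39 * R + 0.43 / (1 - R)
    else:
        kappa = 1 / (R ** 3 - 4 * R ** 2 + 3 * R)
    return mu, kappa

def log_von_mises(theta, mu, kappa):
    return kappa * np.cos(theta - mu) - np.log(2 * np.pi * np.i0(kappa))

# train: one von Mises per class, plus class priors
rng = np.random.default_rng(4)
data = {0: rng.vonmises(0.0, 4.0, 200), 1: rng.vonmises(np.pi, 4.0, 200)}
params = {c: fit_von_mises(x) for c, x in data.items()}
priors = {c: len(x) / 400 for c, x in data.items()}

def predict(theta):
    return max(params, key=lambda c: np.log(priors[c])
               + log_von_mises(theta, *params[c]))

print(predict(0.2), predict(3.0))   # -> 0 1
```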

Relevance:

30.00%

Publisher:

Abstract:

The neutrophil gelatinase-associated lipocalin (NGAL) protein is attracting great interest because of its antibacterial properties, exerted by modulating iron content in competition with the iron acquisition processes developed by pathogenic bacteria, which rely on selective ferric iron chelators (siderophores). Besides its known high affinity for enterobactin, the most important siderophore, NGAL has recently been shown to bind Fe(III) coordinated by catechols. The selective binding of Fe(III)-catechol ligands to NGAL is studied here using iron coordination structures with one, two, and three catecholate ligands. By means of a computational approach consisting of B3LYP/6-311G(d,p) quantum calculations for the geometries, electron properties and electrostatic potentials of the ligands, flexible protein–ligand docking calculations, analyses of protein–ligand interfaces, and Poisson–Boltzmann electrostatic potentials for the proteins, we study the binding of iron catecholate ligands to NGAL as a central member of the lipocalin family of proteins. This approach provides a modeling basis for exploring in silico the selective binding of iron catecholate ligands, giving a detailed picture of their interactions in terms of electrostatic effects and the network of hydrogen bonds in the protein binding pocket.
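Interface analyses of this kind often start from a simple geometric hydrogen-bond criterion. A minimal sketch, assuming a donor-acceptor distance cutoff of 3.5 Å and ignoring donor-H-acceptor angles; the coordinates below are invented, not docking output:

```python
import numpy as np

# Toy interface analysis: candidate hydrogen bonds from a pure distance
# criterion (donor-acceptor < 3.5 A). Real analyses also test angles.

def hbond_pairs(donors, acceptors, d_max=3.5):
    """donors (N,3), acceptors (M,3): returns (donor, acceptor) index pairs."""
    d = np.linalg.norm(donors[:, None, :] - acceptors[None, :, :], axis=-1)
    return [(int(i), int(j)) for i, j in zip(*np.where(d < d_max))]

# two ligand catechol oxygens against three pocket polar atoms
ligand_donors = np.array([[0.0, 0.0, 0.0], [7.0, 0.0, 0.0]])
pocket_acceptors = np.array([[2.9, 0.0, 0.0],
                             [10.0, 0.0, 0.0],
                             [7.0, 3.1, 0.0]])
print(hbond_pairs(ligand_donors, pocket_acceptors))
# -> [(0, 0), (1, 1), (1, 2)]
```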