16 results for Continuous random network
at Universidad Politécnica de Madrid
Abstract:
Probabilistic modeling is the defining characteristic of estimation of distribution algorithms (EDAs), and it determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. ℓ1-regularization is a type of this technique with the appealing variable selection property, which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented and analyzed from different aspects when used for optimization in a high-dimensional setting, where the population size of the EDA scales logarithmically with the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization and is able to achieve significantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random field model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization, and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, specifically models inspired by multi-dimensional Bayesian network classifiers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objective-variable and objective-objective relationships.
An extensive experimental study shows the effectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely α-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on ℓ1-regularization for multi-objective feature subset selection in classification, where six different measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two different Bayesian classifiers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
Abstract:
An integrated approach composed of a random utility-based multiregional input-output model and a road transport network model was developed for evaluating the application of a fee to heavy-goods vehicles (HGVs) in Spain. For this purpose, a distance-based charge scenario (in euros per vehicle kilometer) for HGVs was evaluated for a selected motorway network in Spain. Although the aim of this charging policy was to increase the efficiency of transport, the approach strongly identified direct and indirect impacts on the regional economy. Estimates of the magnitude and extent of indirect effects on aggregated macroeconomic indicators (employment and gross domestic product) are provided. The macroeconomic effects of the charging policy were found to be positive for some regions and negative for other regions.
Abstract:
Learning the structure of a graphical model from data is a common task in a wide range of practical applications. In this paper, we focus on Gaussian Bayesian networks, i.e., on continuous data and directed acyclic graphs with a joint probability density of all variables given by a Gaussian. We propose to work in an equivalence class search space, specifically using the k-greedy equivalence search algorithm. This, combined with regularization techniques to guide the structure search, can learn sparse networks close to the one that generated the data. We provide results on some synthetic networks and on modeling the gene network of the two biological pathways regulating the biosynthesis of isoprenoids in the plant Arabidopsis thaliana.
Abstract:
This paper presents an algorithm for generating scale-free networks with adjustable clustering coefficient. The algorithm is based on a random walk procedure combined with a triangle generation scheme which takes into account genetic factors; this way, preferential attachment and clustering control are implemented using only local information. Simulations are presented which support the validity of the scheme, characterizing its tuning capabilities.
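The general idea of growing a network with a random walk plus a triangle-closing step can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the algorithm from the paper: the function name `grow_network`, the parameters `walk_length` and `p_triangle`, and the single-edge seed graph are all hypothetical, and the "genetic factors" mentioned in the abstract are not modelled. Each new node uses only local information: it lands on a node by walking along edges (which tends to favour well-connected nodes) and, with probability `p_triangle`, also links to one neighbour of that node, closing a triangle.

```python
import random

def grow_network(n_nodes, walk_length=3, p_triangle=0.5, seed=None):
    """Grow an undirected graph one node at a time (hypothetical sketch)."""
    if n_nodes < 2:
        raise ValueError("n_nodes must be at least 2")
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # seed graph: a single edge
    for new in range(2, n_nodes):
        # Random walk: start anywhere, then follow edges for a few hops.
        current = rng.choice(sorted(adj))
        for _ in range(walk_length):
            current = rng.choice(sorted(adj[current]))
        targets = {current}
        # Triangle step: also attach to one neighbour of the landing node.
        if rng.random() < p_triangle:
            targets.add(rng.choice(sorted(adj[current])))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
    return adj
```

In this sketch `p_triangle` is the tuning knob for clustering: at 0 the graph stays a tree with no triangles, and at 1 every new node closes exactly one triangle.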
Abstract:
The extraordinary growth of new information technologies, the development of the Internet, electronic commerce, e-government, mobile telephony, and future cloud computing and storage have provided great benefits in all areas of society. Alongside these benefits come new challenges for the protection of information, such as the loss of confidentiality and integrity of electronic documents. Cryptography plays a key role by providing the tools needed to ensure the security of these new media. It is imperative to intensify research in this area to meet the growing demand for new secure cryptographic techniques. The theory of chaotic nonlinear dynamical systems and the theory of cryptography together give rise to chaotic cryptography, which is the field of study of this thesis. The link between cryptography and chaotic systems is still the subject of intense study. The combination of apparently stochastic behavior, sensitivity to initial conditions and parameters, ergodicity, mixing, and the fact that periodic points are dense suggests that chaotic orbits resemble random sequences. This fact, together with the ability to synchronize multiple chaotic systems, first described by Pecora and Carroll, has generated an avalanche of research papers relating cryptography and chaos. Chaotic cryptography addresses two fundamental design paradigms. In the first paradigm, chaotic cryptosystems are designed in continuous time, mainly on the basis of chaotic synchronization techniques; they are implemented with analog circuits or by computer simulation. In the second paradigm, chaotic cryptosystems are constructed in discrete time and generally do not depend on chaos synchronization techniques. The contributions of this thesis involve three aspects of chaotic cryptography. The first is a theoretical analysis of the geometric properties of some of the chaotic attractors most commonly used in the design of chaotic cryptosystems.
The second is the cryptanalysis of continuous chaotic cryptosystems, and the third consists of three new designs of cryptographically secure chaotic pseudorandom generators. The main accomplishments contained in this thesis are as follows. Development of a method for determining the parameters of some double scroll chaotic systems, including the Lorenz system and Chua’s circuit. First, some geometrical characteristics of the chaotic system are used to reduce the parameter search space. Next, a scheme based on the synchronization of chaotic systems is built. The geometric properties are employed as a matching criterion to determine the values of the parameters with the desired accuracy. The method is not affected by a moderate amount of noise in the waveform. The proposed method has been applied to find security flaws in continuous chaotic encryption systems. Based on the previous results, the chaotic ciphers proposed by Wang and Bu and those proposed by Xu and Li are cryptanalyzed. We propose some solutions to improve the cryptosystems, although they are very limited because these systems are not suitable for use in cryptography. Development of a method for determining the parameters of the Lorenz system when it is used in the design of a two-channel cryptosystem. The method uses the geometric properties of the Lorenz system to reduce the parameter search space; the parameters are then accurately determined from the ciphertext. The method has been applied to the cryptanalysis of an encryption scheme proposed by Jiang. In 2005, Gunay et al. proposed a chaotic encryption system based on a cellular neural network implementation of Chua’s circuit. This scheme has been cryptanalyzed, and some gaps in its security design have been identified. Based on the theoretical results on digital chaotic systems and the cryptanalysis of several recently proposed chaotic ciphers, a family of pseudorandom generators has been designed using finite precision.
The design is based on the coupling of several piecewise linear chaotic maps. Based on the above results, a new family of chaotic pseudorandom generators named Trident has been designed. These generators have been specially designed to meet the needs of real-time encryption in mobile technology. Building on the above results, this thesis proposes another family of pseudorandom generators called Trifork. These generators are based on a combination of perturbed Lagged Fibonacci generators. This family of generators is cryptographically secure and suitable for use in real-time encryption. Detailed analysis shows that the proposed pseudorandom generator can provide fast encryption speed and a high level of security at the same time.

The extraordinary rise of new information technologies, the development of the Internet, electronic commerce, e-government, mobile telephony, and future cloud computing and storage have brought great benefits to all areas of society. Alongside these come new challenges for the protection of information, such as identity theft and the loss of confidentiality and integrity of electronic documents. Cryptography plays a fundamental role by providing the tools needed to guarantee the security of these new media, but it is imperative to intensify research in this field to meet the growing demand for new secure cryptographic techniques. The theory of nonlinear dynamical systems, together with cryptography, gives rise to "chaotic cryptography", which is the field of study of this thesis. The link between cryptography and chaotic systems remains the subject of intense study.
The combination of apparently stochastic behavior, sensitivity to initial conditions and parameters, ergodicity, mixing, and the density of periodic points makes chaotic orbits resemble random sequences, which suggests their potential use for masking messages. This fact, together with the possibility of synchronizing several chaotic systems, first described in the work of Pecora and Carroll, has generated an avalanche of research papers proposing many ideas on how to build secure communication systems, thereby relating cryptography and chaos. Chaotic cryptography addresses two fundamental design paradigms. In the first, chaotic cryptosystems are designed using analog circuits, mainly based on chaotic synchronization techniques; in the second, chaotic cryptosystems are built on discrete circuits or computers and generally do not depend on chaos synchronization techniques. Our contribution in this thesis involves three aspects of chaotic encryption. First, a theoretical analysis is made of the geometric properties of some of the chaotic systems most commonly used in the design of continuous chaotic cryptosystems; second, continuous chaotic ciphers are cryptanalyzed on the basis of that analysis; and finally, three new designs are proposed for fast, cryptographically secure pseudorandom sequence generators. The first part of this thesis presents a critical analysis of the security of chaotic cryptosystems, concluding that the great majority of continuous chaotic encryption algorithms, whether physically implemented or numerically programmed, have serious drawbacks for protecting the confidentiality of information, since they are insecure and inefficient.
Likewise, a large proportion of the discrete chaotic cryptosystems proposed are considered insecure, and others have not been attacked, so more cryptanalysis work is considered necessary. This part concludes by pointing out the main weaknesses found in the cryptosystems analyzed and some recommendations for improving them. In the second part, a cryptanalysis method is designed that identifies the parameters, which generally form part of the key, of encryption algorithms based on Lorenz and similar chaotic systems that use drive-response synchronization schemes. This method is based on certain geometric characteristics of the Lorenz attractor. The method has been used to cryptanalyze three encryption algorithms efficiently. Finally, two other recently proposed encryption schemes are cryptanalyzed. The third part of the thesis covers the design of cryptographically secure pseudorandom sequence generators based on chaotic maps, together with the statistical tests that corroborate their randomness properties. These generators can be used to develop stream ciphers and to meet the needs of real-time encryption. An important issue in the design of discrete chaotic ciphers is the dynamical degradation caused by finite precision; however, most designers of discrete chaotic ciphers have not seriously considered this aspect. This thesis emphasizes the importance of the issue and contributes to clarifying it with some initial considerations. Since the theoretical questions about the dynamical degradation of digital chaotic systems have not been fully resolved, this work uses some practical solutions to avoid this theoretical difficulty.
Among the possible techniques, several solutions are proposed and evaluated, such as bit rotation and bit shift operations, which, combined with dynamic parameter variation and cross perturbation, provide an excellent remedy for the problem of dynamical degradation. Beyond the security problems related to dynamical degradation, many cryptosystems are broken because of careless design, not because of essential flaws in digital chaotic systems. This fact has been taken into account in this thesis, and the design of cryptographically secure chaotic pseudorandom generators has been achieved.
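For readers unfamiliar with the building block named in this abstract, a plain additive lagged Fibonacci generator can be sketched as below. This is a textbook sketch and not the Trifork design: the abstract does not specify the perturbation, the lags, or the word size, so the lags (24, 55), the modulus 2^32, and the function name `lagged_fibonacci` are assumptions chosen for illustration. An unperturbed generator like this one is not cryptographically secure.

```python
from collections import deque

def lagged_fibonacci(seed_words, short_lag=24, long_lag=55, modulus=2**32):
    """Yield x_n = (x_{n-short_lag} + x_{n-long_lag}) mod modulus, forever."""
    if len(seed_words) != long_lag:
        raise ValueError("need exactly long_lag seed words")
    if all(w % 2 == 0 for w in seed_words):
        raise ValueError("at least one seed word must be odd")
    # The deque always holds the last long_lag outputs, oldest first.
    state = deque((w % modulus for w in seed_words), maxlen=long_lag)
    while True:
        # state[0] is x_{n-long_lag}; state[-short_lag] is x_{n-short_lag}.
        nxt = (state[-short_lag] + state[0]) % modulus
        state.append(nxt)  # maxlen discards the oldest word automatically
        yield nxt
```

Calling `next()` on the generator returns one word at a time; the state is only `long_lag` words, which is part of why generators of this type are fast.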
Abstract:
A charging policy for heavy goods vehicles has been introduced in European Union (EU) member countries to reflect the costs of construction and maintenance of infrastructure as well as externalities such as congestion, accidents and environmental impact. In this context, EU countries approved the Eurovignette directive (1999/62/EC) and its amending directive (2006/38/EC), which established a legal framework to regulate the system of tolls. Even if that regulation seeks to increase the efficiency of freight transport, it will trigger direct and indirect effects on Spain’s regional economies by increasing transport costs. This paper presents the development of a multiregional Input-Output (MRIO) methodology with elastic trade coefficients to predict interregional trade, using transport attributes integrated in multinomial logit models. This method is highly useful for carrying out an ex-ante evaluation of transport policies because it captures the sensitivity of road freight transport costs and determines the regional distributive and substitution economic effects in countries like Spain, whose socio-demographic and economic attributes differ from region to region. It will thus be possible to determine cost-effective strategies under different policy scenarios. The MRIO model is then used to determine the impact on the employment rate of imposing a charge in the Madrid-Sevilla corridor in Spain. This methodology is important for measuring the impact on the employment rate, since it is one of the main macroeconomic indicators of Spain’s regional and national economic situation. Previous research (DESTINO) using an MRIO method estimated the employment impacts of a road pricing policy across Spanish regions, considering a fuel tax charge (€/liter) over the entire shortest-cost-path network for freight transport. It found that the variation in employment is expected to be substantial for some regions and negligible for others.
For example, this Spanish case study showed reductions in regional employment of between 16.1% (Rioja) and 1.4% (Madrid region). This range of variation seems to be related either to the intensity of freight transport in each region or to the dependency of regions on transport-intensive economic sectors. In fact, regions with freight-transport-intensive sectors will lose more jobs, while regions with a predominantly service economy undergo a fairly insignificant loss of employment. This paper focuses on evaluating a freight transport vehicle-kilometer charge (€/km) in a non-tolled motorway corridor (A-4) between Madrid and Sevilla (517 km). The results show that the employment reductions caused by the road pricing policy are not as high as those reported in the previous research, because this corridor does not affect the whole freight transport system of Spain.
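The role of a multinomial logit model in making trade coefficients respond to transport cost can be illustrated with a small sketch. It is a generic logit share formula, not the model estimated in the paper: the function name `logit_trade_shares`, the sensitivity parameter `beta`, and the idea of using a single generalized cost per origin region are assumptions for illustration only. The share of a destination's demand supplied by origin i is taken as exp(-beta * cost_i) divided by the sum of the same term over all origins.

```python
import math

def logit_trade_shares(transport_costs, beta=1.0):
    """Map {origin_region: transport cost} to {origin_region: trade share}."""
    # Subtracting the smallest cost keeps the exponentials numerically stable
    # and does not change the resulting shares.
    c_min = min(transport_costs.values())
    expo = {r: math.exp(-beta * (c - c_min)) for r, c in transport_costs.items()}
    total = sum(expo.values())
    return {r: e / total for r, e in expo.items()}
```

Under these assumptions, raising the cost of shipping from one region (for example by adding a per-kilometer charge) lowers that region's share and shifts trade toward the others, which is the substitution effect the abstract refers to.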
Abstract:
A connectivity function defined by the 3D Euler number is a topological indicator and can be related to hydraulic properties (Vogel and Roth, 2001). This study aims to develop connectivity Euler indexes as indicators of the ability of soils for fluid percolation. The starting point was a 3D grey image acquired by X-ray computed tomography of a soil at a bulk density of 1.2 mg cm-3. This image was used in the simulation of 40000 particles following a directed random walk algorithm with 7 binarization thresholds. The data consisted of 7 files containing the simulated end points of the 40000 random walks, obtained in Ruiz-Ramos et al. (2010). MATLAB software was used for computing the frequency matrix of the number of particles arriving at every end point of the random walks and for their 3D representation.
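The frequency computation described above amounts to counting how many walks finish at each voxel. The study used MATLAB; the following is a hedged Python analogue, with a hypothetical function name `endpoint_frequencies` and an assumed input format of one (x, y, z) coordinate triple per walk.

```python
from collections import Counter

def endpoint_frequencies(end_points):
    """Count how many random walks end at each (x, y, z) voxel."""
    # Tuples are hashable, so each distinct voxel becomes one Counter key.
    return Counter(tuple(p) for p in end_points)
```

For example, `endpoint_frequencies(points).most_common(10)` would list the ten voxels reached by the most particles, assuming `points` holds the end points read from one threshold file.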
Abstract:
This work sets out an innovative methodology that aims to facilitate the implementation and continuous improvement of Social Responsibility. The methodology takes account of strategic-economic, social and environmental questions and allows the impact of each of these aspects on the stakeholders and on each of the value areas to be measured. It can be extrapolated to all kinds of organisations regardless of their size and sector, and it admits scalable models. A marked feature that sets it apart from other methodologies is that it eliminates subjectivity from the qualitative aspects and introduces an algorithm to quantify them.
Abstract:
This work evaluates a spline-based smoothing method applied to the output of a glucose predictor. Methods: Our on-line prediction algorithm is based on a neural network model (NNM). We trained/validated the NNM with a prediction horizon of 30 minutes using 39/54 profiles of patients monitored with the Guardian® Real-Time continuous glucose monitoring system. The NNM output is smoothed by fitting a causal cubic spline. The assessment parameters are the error (RMSE), the mean delay (MD) and the high-frequency noise (HFCrms). The HFCrms is the root-mean-square value of the high-frequency components isolated with a zero-delay non-causal filter. HFCrms is 2.90±1.37 mg/dl for the original profiles.
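The error measure named above, RMSE, is the square root of the mean squared difference between predicted and reference values. A minimal sketch follows; the function name `rmse` and the assumption that predictions and reference glucose readings are already aligned sample by sample (in the same units, such as mg/dl) are illustrative choices, not details taken from the study.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equally long, aligned sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```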
Abstract:
We propose distributed algorithms for sampling networks based on a new class of random walks that we call Centrifugal Random Walks (CRW). A CRW is a random walk that starts at a source and always moves away from it. We propose CRW algorithms for connected networks with arbitrary probability distributions, and for grids and networks with regular concentric connectivity with distance based distributions. All CRW sampling algorithms select a node with the exact probability distribution, do not need warm-up, and end in a number of hops bounded by the network diameter.
Abstract:
Sampling a network with a given probability distribution has been identified as a useful operation. In this paper we propose distributed algorithms for sampling networks, so that nodes are selected by a special node, called the source, with a given probability distribution. All these algorithms are based on a new class of random walks that we call Random Centrifugal Walks (RCW). An RCW is a random walk that starts at the source and always moves away from it. Firstly, an algorithm to sample any connected network using RCW is proposed. The algorithm assumes that each node has a weight, so that the sampling process must select a node with a probability proportional to its weight. This algorithm requires a preprocessing phase before the sampling of nodes. In particular, a minimum diameter spanning tree (MDST) is created in the network, and then node weights are efficiently aggregated using the tree. The good news is that the preprocessing is done only once, regardless of the number of sources and the number of samples taken from the network. After that, every sample is taken with an RCW whose length is bounded by the network diameter. Secondly, RCW algorithms that do not require preprocessing are proposed for grids and networks with regular concentric connectivity, for the case when the probability of selecting a node is a function of its distance to the source. The key features of the RCW algorithms (unlike previous Markovian approaches) are that (1) they do not need to warm up (stabilize), (2) the sampling always finishes in a number of hops bounded by the network diameter, and (3) they select a node with the exact probability distribution.
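The tree-based case can be illustrated with a centralized sketch: once weights are aggregated up a spanning tree, a walk that only moves away from the source can stop at each node with the right probability. This is one standard way to obtain exact weight-proportional sampling on a rooted tree and is offered as an assumption about how such a walk might work, not as the paper's distributed algorithm. The names `subtree_weights` and `centrifugal_sample`, the dictionary-based tree, and the requirement that all weights be strictly positive are hypothetical.

```python
import random

def subtree_weights(children, weight, root):
    """Aggregate bottom-up: total[v] = weight[v] + totals of v's children."""
    order, stack = [], [root]
    while stack:  # iterative traversal: every parent is listed before its children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    total = {}
    for v in reversed(order):  # so children are processed before parents
        total[v] = weight[v] + sum(total[c] for c in children.get(v, []))
    return total

def centrifugal_sample(children, weight, total, root, rng=random):
    """Walk away from the root and return the node where the walk stops."""
    v = root
    while True:
        kids = children.get(v, [])
        # Stop at v with probability weight[v] / total[v] (always at a leaf).
        if not kids or rng.random() * total[v] < weight[v]:
            return v
        # Otherwise descend to a child in proportion to its subtree weight.
        v = rng.choices(kids, weights=[total[c] for c in kids])[0]
```

With these rules the walk reaches node v with probability total[v] / total[root] and stops there with probability weight[v] / total[v], so v is selected with probability weight[v] / total[root]. No warm-up is needed, and the number of hops is at most the depth of the tree.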
Abstract:
In this paper, a simulation tool for assisting the deployment of wireless sensor networks (WSNs) is introduced, and its simulation results are verified in a specific indoor environment. The simulation tool supports two modes: a deterministic mode and a stochastic mode. The deterministic mode is environment dependent, so information about the environment must be provided beforehand; a ray tracing method and a deterministic propagation model are employed to increase the accuracy of the estimated coverage, connectivity and routing. The stochastic mode is useful for large-scale random deployment without prior knowledge of geographic information. The Dynamic Source Routing (DSR) protocol and the Ad hoc On-Demand Distance Vector (AODV) routing protocol are implemented in order to calculate the topology of the WSN. The tool thus gives a direct view of WSN performance and helps users find potential problems before real deployment. Finally, a case study is carried out at the Centro de Electronica Industrial (CEI), where the simulation results on coverage, connectivity and routing are verified by measurement.
Abstract:
Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). 
These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixtures of polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics. 
Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. 
A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.

Summary: Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional roles. Neuronal morphology affects the process of integrating input signals and determines which neurons receive the outputs of other neurons. The different parts of a neuron can operate semi-independently according to the spatial location of the synaptic connections. There is therefore considerable interest in analyzing the microanatomy of nerve cells, since it is an excellent tool for better understanding how the cerebral cortex works. However, the morphological, molecular and electrophysiological properties of neuronal cells are extremely variable. Except in some special cases, this morphological variability makes it difficult to define a set of features that clearly distinguish a neuronal type.
In addition, different types of neurons exist in particular regions of the brain. Neuronal variability makes the analysis and modeling of neuronal morphology a major scientific challenge. Uncertainty is a key property of many real-world problems. Probability theory provides a framework for modeling and reasoning under uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for working under uncertainty. In particular, we focus on Bayesian networks, the most widely used probabilistic graphical model. In this thesis we design new methods for learning Bayesian networks, inspired by and applied to the problem of modeling and analyzing neuronal morphological data. The morphology of a neuron can be quantified using a set of measurements, for example the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as continuous or discrete data. Continuous data can in turn be linear (e.g., the length or width of a dendrite) or directional (e.g., the direction of the axon). Such data may follow very complex probability distributions and may not fit any known parametric distribution. Modeling this kind of problem with hybrid Bayesian networks that include discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. This thesis proposes a method for modeling and simulating basal dendritic trees of pyramidal neurons, using Bayesian networks to capture the interactions between the variables of the problem. 
To this end, a large set of variables is measured from the dendrites, and a learning algorithm is applied to learn the structure and estimate the parameters of the probability distributions that make up the Bayesian networks. A simulation algorithm is then used to build virtual dendrites by sampling values from the Bayesian networks. Finally, a thorough evaluation is carried out to verify the model's ability to generate realistic dendrites. In this first approach, the variables were discretized so that the Bayesian networks could be learned and sampled. Next, we address the problem of learning Bayesian networks with different types of variables. Mixtures of polynomials are a method for representing probability densities in hybrid Bayesian networks. We present a method for learning approximations of one-dimensional, multidimensional and conditional densities from data using mixtures of polynomials. The method is based on spline interpolation, which approximates a density as a linear combination of splines. The proposed algorithms are evaluated using artificial datasets. In addition, mixtures of polynomials are used as a nonparametric density estimation method for Bayesian network classifiers. Then, the problem of including directional information in Bayesian networks is studied. This kind of data has a number of special characteristics that prevent the use of classical statistical techniques. Therefore, specific statistics and probability distributions, such as the univariate von Mises distribution and the multivariate von Mises–Fisher distribution, must be used to handle this kind of information. 
Specifically, in this thesis we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow one of these distributions. We study the base case, in which only directional variables are used, and the hybrid case, in which discrete, linear and directional variables are mixed. The classifiers are also studied from a theoretical point of view, deriving their decision functions and the associated decision surfaces. The behavior of the classifiers is illustrated using artificial datasets. In addition, the classifiers are empirically evaluated on real datasets. The problem of interneuron classification is also studied. We develop a web application that allows a group of experts to classify a set of neurons according to their most prominent morphological features. Agreement measures are used to analyze the consensus among the experts when classifying the neurons. The suitability of the anatomical terms and neuronal types commonly used in the literature is investigated through the analysis of Bayesian networks and the application of clustering algorithms. In addition, supervised learning techniques are applied to automatically classify the interneurons from their morphological values. Next, a methodology is presented for building a model that captures the opinions of all the experts. First, a Bayesian network is generated for each expert, and an algorithm is proposed for clustering the Bayesian networks that correspond to experts with similar behaviors. Then, a Bayesian network that models the opinion of each group of experts is induced. Finally, a Bayesian multinet that models the opinions of the whole set of experts is built. 
The analysis of the consensus model makes it possible to identify different behaviors among the experts when classifying the neurons. It also makes it possible to extract a set of relevant morphological features for each neuronal type by performing inference with the Bayesian multinet. These findings are used to validate the model and provide relevant information about neuronal morphology. Finally, we study a classification problem in which the class label of the training data is uncertain. Instead, a set of labels is available for each instance. This problem is inspired by the neuron classification problem, in which a group of experts individually provides a class label for each instance. A method is proposed for learning Bayesian networks using count vectors, which represent the number of experts who select each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.
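As a small illustration of why directional data need their own statistics, the following sketch compares the arithmetic mean of two angles with their circular mean direction. It is not taken from the thesis; the function names and the sample angles are assumptions chosen for illustration only.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of angles given in degrees, returned in [0, 360).

    Each angle is mapped to a unit vector; the mean direction is the
    angle of the summed vector.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


def mean_resultant_length(angles_deg):
    """Length of the average unit vector: near 1 for concentrated data,
    near 0 for data spread around the circle."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.hypot(s, c) / n


if __name__ == "__main__":
    angles = [350.0, 10.0]  # hypothetical directions, both close to 0 degrees
    print("arithmetic mean:", sum(angles) / len(angles))
    print("circular mean:  ", circular_mean_deg(angles))
    print("resultant length:", mean_resultant_length(angles))
```

The arithmetic mean points in the opposite direction from both observations, whereas the circular mean lies between them. Distributions such as the von Mises are parameterized by a mean direction of this kind.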
Resumo:
The rise of the Internet of Things (IoT) and its associated technologies has enabled its use in a variety of application domains, including forest ecosystem monitoring, disaster and emergency management, home automation, industrial automation, smart city services, building energy efficiency, intrusion detection and body signal monitoring, among many others. The drawback of an IoT network is that, once deployed, it is left unattended: it is subject to, among other things, changing weather conditions and exposed to natural disasters, software or hardware failures, and malicious third-party attacks, so such networks can be considered failure-prone. The main requirement for the nodes of an IoT network is that they must be able to keep working despite errors in the system itself. The network's ability to recover from unexpected internal and external failures is what is currently known as the "resilience" of the network. Therefore, when designing and deploying IoT applications or services, the network is expected to be fault-tolerant, self-configuring, self-adapting and self-optimizing with respect to new conditions that may arise during operation. This leads to the analysis of a fundamental problem in the study of IoT networks, the "connectivity" problem. A network is said to be connected if every pair of nodes in the network can find at least one communication path between them. However, the network can become disconnected for several reasons, such as a battery running out or a node being destroyed. 
It is therefore necessary to manage the resilience of the network in order to maintain connectivity among its nodes, so that each IoT node can provide continuous services to other nodes, other networks, or other services and applications. In this context, the main objective of this doctoral thesis is the study of the IoT connectivity problem, and more specifically the development of models for the analysis and management of resilience, put into practice through wireless sensor networks (WSNs), in order to improve the fault tolerance of the nodes that make up the network. This challenge is addressed from two angles. On the one hand, unlike other conventional networked devices, nodes in an IoT network are prone to losing connection because they are deployed in isolated environments or under extreme conditions. On the other hand, nodes usually have limited processing, storage and battery resources, among others, which requires the design of their resilience management to be lightweight, distributed and energy-efficient. In this sense, this thesis develops self-adaptive techniques that allow an IoT network to be resilient to node failures from the perspective of topology control. To this end, techniques based on fuzzy logic and proportional-integral-derivative (PID) control are used to improve network connectivity, bearing in mind that energy must be conserved as much as possible. Likewise, the control algorithm must be distributed because centralized approaches are generally not feasible for large-scale deployments. 
This thesis involves several challenges concerning network connectivity, including: the creation and analysis of mathematical models that describe the network, a proposal for a self-adaptive control system that responds to node failures, the optimization of the control system parameters, validation through an implementation following a software engineering approach, and finally evaluation in a real application. In view of these challenges, this work justifies, through mathematical analysis, the relationship between the "degree of a node" (defined as the number of nodes in the neighborhood of the node in question) and network connectivity, and demonstrates the effectiveness of several types of controllers that adjust the transmission power of the network nodes in response to failures, taking energy consumption into account as part of the control objectives. The work also carries out an evaluation and comparison with other representative algorithms, showing that the approach developed is more tolerant of random node failures and more energy-efficient. In addition, the use of bio-inspired algorithms has made it possible to optimize the control parameters of large dynamic networks. Regarding implementation in a real system, the proposals of this thesis have been integrated into an OSGi ("Open Services Gateway Initiative") programming model in order to create a self-adaptive middleware that improves resilience management, especially the runtime reconfiguration of software components after a failure has occurred. In conclusion, the results of this doctoral thesis contribute to theoretical research on, and the practical application of, resilient topology control in large distributed networks. 
The designs and algorithms presented can be seen as a novel trial of some techniques for the coming era of the IoT. The main contributions of this thesis are summarized below: (1) Properties related to network connectivity have been analyzed mathematically. For example, it is studied how the probability of network connectivity varies when the communication range of the nodes is modified, as well as the minimum number of nodes that must be added to a disconnected system to reconnect it. (2) Control systems based on fuzzy logic have been proposed to reach the desired node degree while maintaining full network connectivity. Different types of fuzzy-logic controllers have been evaluated through simulations, and the results have been compared with other representative algorithms. (3) The two-loop control system, a simpler and more applicable approach, has been investigated further, and its control parameters have been optimized using heuristic algorithms such as the cross-entropy method (CE), particle swarm optimization (PSO) and differential evolution (DE). (4) Most of the designs presented here have been evaluated through simulation; in addition, part of the work has been implemented and validated in a real application by combining self-adaptive software techniques, such as those of a service-oriented architecture (SOA). ABSTRACT The advent of the Internet of Things (IoT) enables a tremendous number of applications, such as forest monitoring, disaster management, home automation, factory automation, smart city, etc. However, various kinds of unexpected disturbances may cause node failure in the IoT, for example battery depletion, software/hardware malfunction issues and malicious attacks. 
So, it can be considered that the IoT is prone to failure. The ability of the network to recover from unexpected internal and external failures is known as "resilience" of the network. Resilience usually serves as an important non-functional requirement when designing IoT, which can further be broken down into "self-*" properties, such as self-adaptive, self-healing, self-configuring, self-optimization, etc. One of the consequences that node failure brings to the IoT is that some nodes may be disconnected from others, such that they are not capable of providing continuous services for other nodes, networks, and applications. In this sense, the main objective of this dissertation focuses on the IoT connectivity problem. A network is regarded as connected if any pair of different nodes can communicate with each other either directly or via a limited number of intermediate nodes. More specifically, this thesis focuses on the development of models for analysis and management of resilience, implemented through the Wireless Sensor Networks (WSNs), which is a challenging task. On the one hand, unlike other conventional network devices, nodes in the IoT are more likely to be disconnected from each other due to their deployment in a hostile or isolated environment. On the other hand, nodes are resource-constrained in terms of limited processing capability, storage and battery capacity, which requires that the design of the resilience management for IoT has to be lightweight, distributed and energy-efficient. In this context, the thesis presents self-adaptive techniques for IoT, with the aim of making the IoT resilient against node failures from the network topology control point of view. The fuzzy-logic and proportional-integral-derivative (PID) control techniques are leveraged to improve the network connectivity of the IoT in response to node failures, meanwhile taking into consideration that energy consumption must be preserved as much as possible. 
The control algorithm itself is designed to be distributed, because centralized approaches are usually not feasible in large-scale IoT deployments. The thesis involves various aspects concerning network connectivity, including: creation and analysis of mathematical models describing the network, proposal of self-adaptive control systems that respond to node failures, control system parameter optimization, implementation following a software engineering approach, and evaluation in a real application. This thesis also justifies the relation between the "node degree" (the number of neighbors of a node) and network connectivity through mathematical analysis, and proves the effectiveness of various types of controllers that can adjust the transmission power of the IoT nodes in response to node failures. The controllers also take energy consumption into consideration as part of the control goals. The proposals are evaluated and compared with other representative algorithms. The simulation results show that the proposals in this thesis can tolerate more random node failures and save more energy than those representative algorithms. Additionally, the simulations demonstrate that the use of bio-inspired algorithms allows the parameters of the controller to be optimized. With respect to the implementation in a real system, the programming model called OSGi (Open Service Gateway Initiative) is integrated with the proposals in order to create a self-adaptive middleware, in particular one that reconfigures the software components at runtime when failures occur. The outcomes of this thesis contribute to theoretical research and practical applications of resilient topology control for large and distributed networks. The presented controller designs and optimization algorithms can be viewed as novel trials of the control and optimization techniques for the coming era of the IoT. 
The contributions of this thesis can be summarized as follows: (1) The fault-tolerance probability of a large-scale stochastic network is analyzed mathematically. It is studied how the probability of network connectivity depends on the communication range of the nodes, and what minimum number of neighbors must be added for network re-connection. (2) A fuzzy-logic control system is proposed, which obtains the desired node degree and in turn maintains network connectivity under node failures. Different types of fuzzy-logic controllers are evaluated by simulation, and the results demonstrate improved fault tolerance compared with other representative algorithms. (3) A simpler but more applicable approach, the two-loop control system, is further investigated, and its control parameters are optimized using heuristic algorithms such as Cross Entropy (CE), Particle Swarm Optimization (PSO), and Differential Evolution (DE). (4) Most of the designs are evaluated by means of simulations, but part of the proposals are implemented and tested in a real-world application by combining the self-adaptive software technique and the control algorithms presented in this thesis.
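The dependence of connectivity on communication range mentioned in contribution (1) can be explored numerically. The sketch below is not the thesis's analytical model: it assumes nodes placed uniformly at random in a unit square, a link between any two nodes closer than a common range, and a plain Monte Carlo estimate. All function names and parameter values are hypothetical.

```python
import random
from collections import deque


def is_connected(points, radius):
    """True if every pair of nodes is joined by some multi-hop path,
    where two nodes are linked when their distance is <= radius."""
    n = len(points)
    if n == 0:
        return True
    r2 = radius * radius
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search from node 0
        i = queue.popleft()
        xi, yi = points[i]
        for j in range(n):
            if j not in seen:
                xj, yj = points[j]
                if (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                    seen.add(j)
                    queue.append(j)
    return len(seen) == n


def connectivity_probability(n_nodes, radius, trials=200, seed=0):
    """Fraction of random deployments (assumed uniform in the unit
    square) that form a connected network."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(n_nodes)]
        hits += is_connected(pts, radius)
    return hits / trials


if __name__ == "__main__":
    for r in (0.1, 0.2, 0.3, 0.4):
        print(r, connectivity_probability(50, r))
```

Because the same seed reproduces the same deployments, the estimate cannot decrease as the range grows, which makes it easy to see where the network goes from mostly disconnected to mostly connected.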
Resumo:
Inspections are used to prevent tax evasion or any other unlawful behavior. The effect of inspections depends on the network topology and the contagion rule. The network is modeled as a Watts–Strogatz small-world network that is tuned from regular to random. Two contagion rules are applied: continuous and discontinuous. The equilibrium populations of payers and evaders are obtained in terms of these system parameters.
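For readers unfamiliar with the topology named above, here is a minimal sketch of the textbook Watts–Strogatz construction: a ring lattice whose edges are rewired with probability p, so that p = 0 gives a regular network and p = 1 an essentially random one. The abstract does not say which variant or parameters the authors used, so the function name, the arguments and the rewiring details are assumptions. The contagion rules are not modeled here.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Return an adjacency dict {node: set(neighbours)}.

    n: number of nodes, k: even number of lattice neighbours per node,
    p: probability of rewiring each lattice edge.
    """
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular ring lattice: link each node to k/2 neighbours on each side.
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    # Rewire the far end of each lattice edge with probability p,
    # avoiding self-loops and duplicate edges.
    for d in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + d) % n
            if j in adj[i] and rng.random() < p:
                candidates = [w for w in range(n)
                              if w != i and w not in adj[i]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(w)
                    adj[w].add(i)
    return adj


if __name__ == "__main__":
    g = watts_strogatz(20, 4, 0.1, seed=42)
    print(sorted(len(nbrs) for nbrs in g.values()))
```

Rewiring moves one endpoint of an existing edge, so the total number of edges stays at n·k/2 for every value of p; only the degree distribution and path lengths change.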