991 resultados para PROBABILISTIC NETWORKS


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We study the time scales associated with diffusion processes that take place on multiplex networks, i.e., on a set of networks linked through interconnected layers. To this end, we propose the construction of a supra-Laplacian matrix, which consists of a dimensional lifting of the Laplacian matrix of each layer of the multiplex network. We use perturbative analysis to reveal analytically the structure of eigenvectors and eigenvalues of the complete network in terms of the spectral properties of the individual layers. The spectrum of the supra-Laplacian allows us to understand the physics of diffusionlike processes on top of multiplex networks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We uncover the global organization of clustering in real complex networks. To this end, we ask whether triangles in real networks organize as in maximally random graphs with given degree and clustering distributions, or as in maximally ordered graph models where triangles are forced into modules. The answer comes by way of exploring m-core landscapes, where the m-core is defined, akin to the k-core, as the maximal subgraph with edges participating in at least m triangles. This property defines a set of nested subgraphs that, contrarily to k-cores, is able to distinguish between hierarchical and modular architectures. We find that the clustering organization in real networks is neither completely random nor ordered although, surprisingly, it is more random than modular. This supports the idea that the structure of real networks may in fact be the outcome of self-organized processes based on local optimization rules, in contrast to global optimization principles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Résumé Ce travail de thèse étudie des moyens de formalisation permettant d'assister l'expert forensique dans la gestion des facteurs influençant l'évaluation des indices scientifiques, tout en respectant des procédures d'inférence établies et acceptables. Selon une vue préconisée par une partie majoritaire de la littérature forensique et juridique - adoptée ici sans réserve comme point de départ - la conceptualisation d'une procédure évaluative est dite 'cohérente' lors qu'elle repose sur une implémentation systématique de la théorie des probabilités. Souvent, par contre, la mise en oeuvre du raisonnement probabiliste ne découle pas de manière automatique et peut se heurter à des problèmes de complexité, dus, par exemple, à des connaissances limitées du domaine en question ou encore au nombre important de facteurs pouvant entrer en ligne de compte. En vue de gérer ce genre de complications, le présent travail propose d'investiguer une formalisation de la théorie des probabilités au moyen d'un environment graphique, connu sous le nom de Réseaux bayesiens (Bayesian networks). L'hypothèse principale que cette recherche envisage d'examiner considère que les Réseaux bayesiens, en concert avec certains concepts accessoires (tels que des analyses qualitatives et de sensitivité), constituent une ressource clé dont dispose l'expert forensique pour approcher des problèmes d'inférence de manière cohérente, tant sur un plan conceptuel que pratique. De cette hypothèse de travail, des problèmes individuels ont été extraits, articulés et abordés dans une série de recherches distinctes, mais interconnectées, et dont les résultats - publiés dans des revues à comité de lecture - sont présentés sous forme d'annexes. D'un point de vue général, ce travail apporte trois catégories de résultats. Un premier groupe de résultats met en évidence, sur la base de nombreux exemples touchant à des domaines forensiques divers, l'adéquation en termes de compatibilité et complémentarité entre des modèles de Réseaux bayesiens et des procédures d'évaluation probabilistes existantes. Sur la base de ces indications, les deux autres catégories de résultats montrent, respectivement, que les Réseaux bayesiens permettent également d'aborder des domaines auparavant largement inexplorés d'un point de vue probabiliste et que la disponibilité de données numériques dites 'dures' n'est pas une condition indispensable pour permettre l'implémentation des approches proposées dans ce travail. Le présent ouvrage discute ces résultats par rapport à la littérature actuelle et conclut en proposant les Réseaux bayesiens comme moyen d'explorer des nouvelles voies de recherche, telles que l'étude de diverses formes de combinaison d'indices ainsi que l'analyse de la prise de décision. Pour ce dernier aspect, l'évaluation des probabilités constitue, dans la façon dont elle est préconisée dans ce travail, une étape préliminaire fondamentale de même qu'un moyen opérationnel.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work focuses on the prediction of the two main nitrogenous variables that describe the water quality at the effluent of a Wastewater Treatment Plant. We have developed two kind of Neural Networks architectures based on considering only one output or, in the other hand, the usual five effluent variables that define the water quality: suspended solids, biochemical organic matter, chemical organic matter, total nitrogen and total Kjedhal nitrogen. Two learning techniques based on a classical adaptative gradient and a Kalman filter have been implemented. In order to try to improve generalization and performance we have selected variables by means genetic algorithms and fuzzy systems. The training, testing and validation sets show that the final networks are able to learn enough well the simulated available data specially for the total nitrogen

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Forensic scientists face increasingly complex inference problems for evaluating likelihood ratios (LRs) for an appropriate pair of propositions. Up to now, scientists and statisticians have derived LR formulae using an algebraic approach. However, this approach reaches its limits when addressing cases with an increasing number of variables and dependence relationships between these variables. In this study, we suggest using a graphical approach, based on the construction of Bayesian networks (BNs). We first construct a BN that captures the problem, and then deduce the expression for calculating the LR from this model to compare it with existing LR formulae. We illustrate this idea by applying it to the evaluation of an activity level LR in the context of the two-trace transfer problem. Our approach allows us to relax assumptions made in previous LR developments, produce a new LR formula for the two-trace transfer problem and generalize this scenario to n traces.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We analyze the process of informational exchange through complex networks by measuring network efficiencies. Aiming to study nonclustered systems, we propose a modification of this measure on the local level. We apply this method to an extension of the class of small worlds that includes declustered networks and show that they are locally quite efficient, although their clustering coefficient is practically zero. Unweighted systems with small-world and scale-free topologies are shown to be both globally and locally efficient. Our method is also applied to characterize weighted networks. In particular we examine the properties of underground transportation systems of Madrid and Barcelona and reinterpret the results obtained for the Boston subway network.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose a class of models of social network formation based on a mathematical abstraction of the concept of social distance. Social distance attachment is represented by the tendency of peers to establish acquaintances via a decreasing function of the relative distance in a representative social space. We derive analytical results (corroborated by extensive numerical simulations), showing that the model reproduces the main statistical characteristics of real social networks: large clustering coefficient, positive degree correlations, and the emergence of a hierarchy of communities. The model is confronted with the social network formed by people that shares confidential information using the Pretty Good Privacy (PGP) encryption algorithm, the so-called web of trust of PGP.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose This paper aims to analyse various aspects of an academic social network: the profile of users, the reasons for its use, its perceived benefits and the use of other social media for scholarly purposes. Design/methodology/approach The authors examined the profiles of the users of an academic social network. The users were affiliated with 12 universities. The following were recorded for each user: sex, the number of documents uploaded, the number of followers, and the number of people being followed. In addition, a survey was sent to the individuals who had an email address in their profile. Findings Half of the users of the social network were academics and a third were PhD students. Social sciences scholars accounted for nearly half of all users. Academics used the service to get in touch with other scholars, disseminate research results and follow other scholars. Other widely employed social media included citation indexes, document creation, edition and sharing tools and communication tools. Users complained about the lack of support for the utilisation of these tools. Research limitations/implications The results are based on a single case study. Originality/value This study provides new insights on the impact of social media in academic contexts by analysing the user profiles and benefits of a social network service that is specifically targeted at the academic community.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the general regression neural networks (GRNN) as a nonlinear regression method for the interpolation of monthly wind speeds in complex Alpine orography. GRNN is trained using data coming from Swiss meteorological networks to learn the statistical relationship between topographic features and wind speed. The terrain convexity, slope and exposure are considered by extracting features from the digital elevation model at different spatial scales using specialised convolution filters. A database of gridded monthly wind speeds is then constructed by applying GRNN in prediction mode during the period 1968-2008. This study demonstrates that using topographic features as inputs in GRNN significantly reduces cross-validation errors with respect to low-dimensional models integrating only geographical coordinates and terrain height for the interpolation of wind speed. The spatial predictability of wind speed is found to be lower in summer than in winter due to more complex and weaker wind-topography relationships. The relevance of these relationships is studied using an adaptive version of the GRNN algorithm which allows to select the useful terrain features by eliminating the noisy ones. This research provides a framework for extending the low-dimensional interpolation models to high-dimensional spaces by integrating additional features accounting for the topographic conditions at multiple spatial scales. Copyright (c) 2012 Royal Meteorological Society.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Effective coordination is key to many situations that affect the well-being of two or more humans. Social coordination can be studied in coordination games between individuals located on networks of contacts. We study the behavior of humans in the laboratory when they play the Stag Hunt game - a game that has a risky but socially efficient equilibrium and an inefficient but safe equilibrium. We contrast behavior on a cliquish network to behavior on a random network. The cliquish network is highly clustered and resembles more closely to actual social networks than the random network. In contrast to simulations, we find that human players dynamics do not converge to the efficient outcome more often in the cliquish network than in the random network. Subjects do not use pure myopic best-reply as an individual update rule. Numerical simulations agree with laboratory results once we implement the actual individual updating rule that human subjects use in our laboratory experiments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The neuropathology of Alzheimer disease is characterized by senile plaques, neurofibrillary tangles and cell death. These hallmarks develop according to the differential vulnerability of brain networks, senile plaques accumulating preferentially in the associative cortical areas and neurofibrillary tangles in the entorhinal cortex and the hippocampus. We suggest that the main aetiological hypotheses such as the beta-amyloid cascade hypothesis or its variant, the synaptic beta-amyloid hypothesis, will have to consider neural networks not just as targets of degenerative processes but also as contributors of the disease's progression and of its phenotype. Three domains of research are highlighted in this review. First, the cerebral reserve and the redundancy of the network's elements are related to brain vulnerability. Indeed, an enriched environment appears to increase the cerebral reserve as well as the threshold of disease's onset. Second, disease's progression and memory performance cannot be explained by synaptic or neuronal loss only, but also by the presence of compensatory mechanisms, such as synaptic scaling, at the microcircuit level. Third, some phenotypes of Alzheimer disease, such as hallucinations, appear to be related to progressive dysfunction of neural networks as a result, for instance, of a decreased signal to noise ratio, involving a diminished activity of the cholinergic system. Overall, converging results from studies of biological as well as artificial neural networks lead to the conclusion that changes in neural networks contribute strongly to Alzheimer disease's progression.