Biblioteca Digital

20 resultados para Software testing. Problem-oriented programming. Teachingmethodology

em Université de Lausanne, Switzerland

Adaptive strategy for the statistical analysis of connectomes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study an adaptive statistical approach to analyze brain networks represented by brain connection matrices of interregional connectivity (connectomes). Our approach is at a middle level between a global analysis and single connections analysis by considering subnetworks of the global brain network. These subnetworks represent either the inter-connectivity between two brain anatomical regions or by the intra-connectivity within the same brain anatomical region. An appropriate summary statistic, that characterizes a meaningful feature of the subnetwork, is evaluated. Based on this summary statistic, a statistical test is performed to derive the corresponding p-value. The reformulation of the problem in this way reduces the number of statistical tests in an orderly fashion based on our understanding of the problem. Considering the global testing problem, the p-values are corrected to control the rate of false discoveries. Finally, the procedure is followed by a local investigation within the significant subnetworks. We contrast this strategy with the one based on the individual measures in terms of power. We show that this strategy has a great potential, in particular in cases where the subnetworks are well defined and the summary statistics are properly chosen. As an application example, we compare structural brain connection matrices of two groups of subjects with a 22q11.2 deletion syndrome, distinguished by their IQ scores.

Grundriss innovativer Polizeiansätze. Eine kritische Begutachtung verschiedener Strategien und Tätigkeiten und deren Implementierung in der Schweiz in Theorie und Praxis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis contains three parts. The first one offers the theoretical basement, where the history of the police from their beginning in the early 19th century to this day is shown. The emphasis however is laid on the last 40 years, which gave birth to a multitude of innovations, such as community, problem-oriented, hot-spots or zero-tolerance policing. Those innovations are described in detail and are critically commented. At the end of this section, I present a scheme, where all the approaches are classified as strategic or methodic innovations, but united under a model called "modern policing". The fact that the innovations are not competitive but rather complementary is the most important finding of this examination. The second part of this work deals with a unique survey about the implementation of four innovations and eight problem- and community-oriented activities in 85 Swiss police forces. This explorative study shows that in the last 15 years the Swiss police forces have increasingly adopted innovative approaches. The most frequent innovation is community policing, which has been implemented all over the country. Due to the results, we can also assume that the implementation of the innovations is mostly substantial and profound. However, particularly in the area of problem-solving there is still a need for improvements. The third section consists of a scientific evaluation of a temporary special unit of the municipal police Zurich, which, during nine months, fought against public drug dealing and illegal prostitution in a particular neighborhood called Langstrasse. The effects of this hot-spot project were measured with police data, observations and several population surveys. In general, the special unit achieved a positive outcome and helped to defuse the hot-spot. Additionally, a survey conducted within the police department showed that the personal attitude towards the special unit differed widely between the policemen. We found significant differences between both police regions East and West, rank-and-file and higher ranking officers, different ages and the personal connection to the special unit. In fact, the higher the rank, the lower the age, and the closer the relationship, the more positive the officers were towards the unit.

Monitoring network optimisation for spatial data classification using support vector machines

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper presents a novel method for monitoring network optimisation, based on a recent machine learning technique known as support vector machine. It is problem-oriented in the sense that it directly answers the question of whether the advised spatial location is important for the classification model. The method can be used to increase the accuracy of classification models by taking a small number of additional measurements. Traditionally, network optimisation is performed by means of the analysis of the kriging variances. The comparison of the method with the traditional approach is presented on a real case study with climate data.

Des faux documents d'identité au renseignement forensique : développement d'une approche systématique et transversale du traitement de la donnée forensique à des fins de renseignement criminel

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La fabrication, la distribution et l'usage de fausses pièces d'identité constituent une menace pour la sécurité autant publique que privée. Ces faux documents représentent en effet un catalyseur pour une multitude de formes de criminalité, des plus anodines aux formes les plus graves et organisées. La dimension, la complexité, la faible visibilité, ainsi que les caractères répétitif et évolutif de la fraude aux documents d'identité appellent des réponses nouvelles qui vont au-delà d'une approche traditionnelle au cas par cas ou de la stratégie du tout technologique dont la perspective historique révèle l'échec. Ces nouvelles réponses passent par un renforcement de la capacité de comprendre les problèmes criminels que posent la fraude aux documents d'identité et les phénomènes qui l'animent. Cette compréhension est tout bonnement nécessaire pour permettre d'imaginer, d'évaluer et de décider les solutions et mesures les plus appropriées. Elle requière de développer les capacités d'analyse et la fonction de renseignement criminel qui fondent en particulier les modèles d'action de sécurité les plus récents, tels que l'intelligence-led policing ou le problem-oriented policing par exemple. Dans ce contexte, le travail doctoral adopte une position originale en postulant que les fausses pièces d'identité se conçoivent utilement comme la trace matérielle ou le vestige résultant de l'activité de fabrication ou d'altération d'un document d'identité menée par les faussaires. Sur la base de ce postulat fondamental, il est avancé que l'exploitation scientifique, méthodique et systématique de ces traces au travers d'un processus de renseignement forensique permet de générer des connaissances phénoménologiques sur les formes de criminalité qui fabriquent, diffusent ou utilisent les fausses pièces d'identité, connaissances qui s'intègrent et se mettent avantageusement au service du renseignement criminel. A l'appui de l'épreuve de cette thèse de départ et de l'étude plus générale du renseignement forensique, le travail doctoral propose des définitions et des modèles. Il décrit des nouvelles méthodes de profilage et initie la constitution d'un catalogue de formes d'analyses. Il recourt également à des expérimentations et des études de cas. Les résultats obtenus démontrent que le traitement systématique de la donnée forensique apporte une contribution utile et pertinente pour le renseignement criminel stratégique, opérationnel et tactique, ou encore la criminologie. Combiné aux informations disponibles par ailleurs, le renseignement forensique produit est susceptible de soutenir l'action de sécurité dans ses dimensions répressive, proactive, préventive et de contrôle. En particulier, les méthodes de profilage des fausses pièces d'identité proposées permettent de révéler des tendances au travers de jeux de données étendus, d'analyser des modus operandi ou d'inférer une communauté ou différence de source. Ces méthodes appuient des moyens de détection et de suivi des séries, des problèmes et des phénomènes criminels qui s'intègrent dans le cadre de la veille opérationnelle. Ils permettent de regrouper par problèmes les cas isolés, de mettre en évidence les formes organisées de criminalité qui méritent le plus d'attention, ou de produire des connaissances robustes et inédites qui offrent une perception plus profonde de la criminalité. Le travail discute également les difficultés associées à la gestion de données et d'informations propres à différents niveaux de généralité, ou les difficultés relatives à l'implémentation du processus de renseignement forensique dans la pratique. Ce travail doctoral porte en premier lieu sur les fausses pièces d'identité et leur traitement par les protagonistes de l'action de sécurité. Au travers d'une démarche inductive, il procède également à une généralisation qui souligne que les observations ci-dessus ne valent pas uniquement pour le traitement systématique des fausses pièces d'identité, mais pour celui de tout type de trace dès lors qu'un profil en est extrait. Il ressort de ces travaux une définition et une compréhension plus transversales de la notion et de la fonction de renseignement forensique. The production, distribution and use of false identity documents constitute a threat to both public and private security. Fraudulent documents are a catalyser for a multitude of crimes, from the most trivial to the most serious and organised forms. The dimension, complexity, low visibility as well as the repetitive and evolving character of the production and use of false identity documents call for new solutions that go beyond the traditional case-by-case approach, or the technology-focused strategy whose failure is revealed by the historic perspective. These new solutions require to strengthen the ability to understand crime phenomena and crime problems posed by false identity documents. Such an understanding is pivotal in order to be able to imagine, evaluate and decide on the most appropriate measures and responses. Therefore, analysis capacities and crime intelligence functions, which found the most recent policing models such as intelligence-led policing or problem-oriented policing for instance, have to be developed. In this context, the doctoral research work adopts an original position by postulating that false identity documents can be usefully perceived as the material remnant resulting from the criminal activity undertook by forgers, namely the manufacture or the modification of identity documents. Based on this fundamental postulate, it is proposed that a scientific, methodical and systematic processing of these traces through a forensic intelligence approach can generate phenomenological knowledge on the forms of crime that produce, distribute and use false identity documents. Such knowledge should integrate and serve advantageously crime intelligence efforts. In support of this original thesis and of a more general study of forensic intelligence, the doctoral work proposes definitions and models. It describes new profiling methods and initiates the construction of a catalogue of analysis forms. It also leverages experimentations and case studies. Results demonstrate that the systematic processing of forensic data usefully and relevantly contributes to strategic, tactical and operational crime intelligence, and also to criminology. Combined with alternative information available, forensic intelligence may support policing in its repressive, proactive, preventive and control activities. In particular, the proposed profiling methods enable to reveal trends among extended datasets, to analyse modus operandi, or to infer that false identity documents have a common or different source. These methods support the detection and follow-up of crime series, crime problems and phenomena and therefore contribute to crime monitoring efforts. They enable to link and regroup by problems cases that were previously viewed as isolated, to highlight organised forms of crime which deserve greatest attention, and to elicit robust and novel knowledge offering a deeper perception of crime. The doctoral research work discusses also difficulties associated with the management of data and information relating to different levels of generality, or difficulties associated with the implementation in practice of the forensic intelligence process. The doctoral work focuses primarily on false identity documents and their treatment by policing stakeholders. However, through an inductive process, it makes a generalisation which underlines that observations do not only apply to false identity documents but to any kind of trace as soon as a profile is extracted. A more transversal definition and understanding of the concept and function of forensic intelligence therefore derives from the doctoral work.

Points to consider for prioritizing clinical genetic testing services: a European consensus process oriented at accountability for reasonableness.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Given the cost constraints of the European health-care systems, criteria are needed to decide which genetic services to fund from the public budgets, if not all can be covered. To ensure that high-priority services are available equitably within and across the European countries, a shared set of prioritization criteria would be desirable. A decision process following the accountability for reasonableness framework was undertaken, including a multidisciplinary EuroGentest/PPPC-ESHG workshop to develop shared prioritization criteria. Resources are currently too limited to fund all the beneficial genetic testing services available in the next decade. Ethically and economically reflected prioritization criteria are needed. Prioritization should be based on considerations of medical benefit, health need and costs. Medical benefit includes evidence of benefit in terms of clinical benefit, benefit of information for important life decisions, benefit for other people apart from the person tested and the patient-specific likelihood of being affected by the condition tested for. It may be subject to a finite time window. Health need includes the severity of the condition tested for and its progression at the time of testing. Further discussion and better evidence is needed before clearly defined recommendations can be made or a prioritization algorithm proposed. To our knowledge, this is the first time a clinical society has initiated a decision process about health-care prioritization on a European level, following the principles of accountability for reasonableness. We provide points to consider to stimulate this debate across the EU and to serve as a reference for improving patient management.

Testing mediation: The endogeneity problem and the solution

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A mediator is a dependent variable, m (e.g., charisma), that is thought to channel the effect of an independent variable, x (e.g., receiving training or not), on another dependent variable (e.g., subordinate satisfaction), y. In experimental settings x is manipulated-subjects are randomized to treatment-to isolate the causal effect of x on other variables. If m is not or cannot be manipulated, which is often the case, its causal effect on other variables cannot be determined; thus, standard mediation tests cannot inform policy or practice. I will show how an econometric procedure, called instrumental-variable estimation, can examine mediation in such cases.

Assessing patient's belief on HIV testing before elective surgery: room for improvements in Switzerland !

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background: ln Switzerland no HIV test is performed without the patient's consent based on a Voluntary Counseling and Testing policy (VCT). We hypothesized that a substantial proportion of patients going through an elective surgery falsely believed that an HIV test was performed on a routine basis and that the lack of transmission of result was interpreted as being HIV negative. Method: All patients with elective orthopedic surgery during 2007 were contacted by phone in 2008. A structured questionnaire assessed their belief about routine preoperative blood analysis (diabetes, coagulation function, HIV test and cholesterol level) as well as result awareness and interpretation. Variables included age and gender. Analysis were conducted using the software JMP 6.0.3. Results: 1123 patients were included. 130 (12 %) were excluded (Le. unreachable, unable to communicate on the phone, not operated). 993 completed the survey (89 %). Median age was 51 (16-79). 50 % were female. 376 (38 %) patients thought they had an HIV test performed before surgery but none of them had one. 298 (79 %) interpreted the absence of result as a negative HIV test. A predictive factor to believe an HIV test had been done was an age below 50 years old (45 % vs 33 % for 16-49 years old and 50-79 years old respectively, p < 0.001). No difference was observed between genders. Conclusion: ln Switzerland, nearly 40 % of the patients falsely thought an HIV test had been performed on a routine basis before surgery and were erroneously reassured about their HIV status. These results should either improve the information given to the patient regarding preoperative exams, or motivate public health policy to consider HIV opt-out screening instead of VCT strategy.

Methods for testing association between uncertain genotypes and quantitative traits.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Interpretability and power of genome-wide association studies can be increased by imputing unobserved genotypes, using a reference panel of individuals genotyped at higher marker density. For many markers, genotypes cannot be imputed with complete certainty, and the uncertainty needs to be taken into account when testing for association with a given phenotype. In this paper, we compare currently available methods for testing association between uncertain genotypes and quantitative traits. We show that some previously described methods offer poor control of the false-positive rate (FPR), and that satisfactory performance of these methods is obtained only by using ad hoc filtering rules or by using a harsh transformation of the trait under study. We propose new methods that are based on exact maximum likelihood estimation and use a mixture model to accommodate nonnormal trait distributions when necessary. The new methods adequately control the FPR and also have equal or better power compared to all previously described methods. We provide a fast software implementation of all the methods studied here; our new method requires computation time of less than one computer-day for a typical genome-wide scan, with 2.5 M single nucleotide polymorphisms and 5000 individuals.

Effects of motive-oriented therapeutic relationship in early-phase treatment of borderline personality disorder: a pilot study of a randomized trial.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motive-oriented therapeutic relationship (MOTR, also called complementary therapeutic relationship) was postulated to be a particularly helpful therapeutic ingredient in the early-phase treatment of patients with personality disorders, in particular borderline personality disorder (BPD). The present pilot study of randomized controlled trial using an add-on design aims to investigate the effects of MOTR in early-phase treatment (up to session 10), with BPD patients on therapeutic alliance, session impact, and outcome. In total, N = 25 patients participated in the study. BPD patients were randomly allocated to a manual-based investigation process in 10 sessions or to the same investigation process infused with MOTR. Adherence ratings were performed and yielded satisfactory results. The results suggested a specific effectiveness of MOTR on the interpersonal problem area, on the quality of the therapeutic alliance and the quality of the therapeutic relationship, as rated by the patient. These results may have important clinical implications for the early-phase treatment of patients presenting with BPD.

Object-oriented Bayesian networks for evaluating DIP-STR profiling results from unbalanced DNA mixtures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The genetic characterization of unbalanced mixed stains remains an important area where improvement is imperative. In fact, with current methods for DNA analysis (Polymerase Chain Reaction with the SGM Plus™ multiplex kit), it is generally not possible to obtain a conventional autosomal DNA profile of the minor contributor if the ratio between the two contributors in a mixture is smaller than 1:10. This is a consequence of the fact that the major contributor's profile 'masks' that of the minor contributor. Besides known remedies to this problem, such as Y-STR analysis, a new compound genetic marker that consists of a Deletion/Insertion Polymorphism (DIP), linked to a Short Tandem Repeat (STR) polymorphism, has recently been developed and proposed elsewhere in literature [1]. The present paper reports on the derivation of an approach for the probabilistic evaluation of DIP-STR profiling results obtained from unbalanced DNA mixtures. The procedure is based on object-oriented Bayesian networks (OOBNs) and uses the likelihood ratio as an expression of the probative value. OOBNs are retained in this paper because they allow one to provide a clear description of the genotypic configuration observed for the mixed stain as well as for the various potential contributors (e.g., victim and suspect). These models also allow one to depict the assumed relevance relationships and perform the necessary probabilistic computations.

Testing the predictive adaptive response in a host-parasite system

Relevância:

30.00% 30.00%

Publicador:

Resumo:

1. Harsh environmental conditions experienced during development can reduce the performance of the same individuals in adulthood. However, the 'predictive adaptive response' hypothesis postulates that if individuals adapt their phenotype during development to the environments where they are likely to live in the future, individuals exposed to harsh conditions in early life perform better when encountering the same harsh conditions in adulthood compared to those never exposed to these conditions before. 2. Using the common vole (Microtus arvalis) as study organism, we tested how exposure to flea parasitism during the juvenile stage affects the physiology (haematocrit, resistance to oxidative stress, resting metabolism, spleen mass, and testosterone), morphology (body mass, testis mass) and motor performance (open field activity and swimming speed) of the same individuals when infested with fleas in adulthood. According to the 'predictive adaptive response' hypothesis, we predicted that voles parasitized at the adult stage would perform better if they had already been parasitized with fleas at the juvenile stage. 3. We found that voles exposed to fleas in adulthood had a higher metabolic rate if already exposed to fleas when juvenile, compared to voles free of fleas when juvenile and voles free of fleas in adulthood. Independently of juvenile parasitism, adult parasitism impaired adult haematocrit and motor performances. Independently of adult parasitism, juvenile parasitism slowed down crawling speed in adult female voles. 4. Our results suggest that juvenile parasitism has long-term effects that do not protect from the detrimental effects of adult parasitism. On the contrary, experiencing parasitism in early-life incurs additional costs upon adult parasitism measured in terms of higher energy expenditure, rather than inducing an adaptive shift in the developmental trajectory. 5. Hence, our study provides experimental evidence for long term costs of parasitism. We found no support for a predictive adaptive response in this host-parasite system.

Testing new identity models and processes in French-speaking adolescents and emerging adults students

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Developing a sense of identity is a crucial psychosocial task for young people. The purpose of this study was to evaluate identity development in French-speaking adolescents and emerging adults (in France and Switzerland) using a process-oriented model of identity formation including five dimensions (i.e., exploration in breadth, commitment making, exploration in depth, identification with commitment, and ruminative exploration). The study included participants from three different samples (total N = 2239, 66.7% women): two samples of emerging adult student and one sample of adolescents. Results confirmed the hypothesized five-factor dimensional model of identity in our three samples and provided evidence for convergent validity of the model. The results also indicated that exploration in depth might be subdivided in two aspects: a first form of exploration in depth leading to a better understanding and to an increase of the strength of current commitments and a second form of exploration in depth leading to a re-evaluation and a reconsideration of current commitments. Further, the identity status cluster solution that emerged is globally in line with previous literature (i.e., achievement, foreclosure, moratorium, carefree diffusion, diffused diffusion, undifferentiated). However, despite a structural similarity, we found variations in identity profiles because identity development is shaped by cultural context. These specific variations are discussed in light of social, educational and economic differences between France and the French-speaking part of Switzerland. Implications and suggestions for future research are offered.

Modular support for developing mobile ad hoc applications

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Abstract This PhD thesis addresses the issue of alleviating the burden of developing ad hoc applications. Such applications have the particularity of running on mobile devices, communicating in a peer-to-peer manner and implement some proximity-based semantics. A typical example of such application can be a radar application where users see their avatar as well as the avatars of their friends on a map on their mobile phone. Such application become increasingly popular with the advent of the latest generation of mobile smart phones with their impressive computational power, their peer-to-peer communication capabilities and their location detection technology. Unfortunately, the existing programming support for such applications is limited, hence the need to address this issue in order to alleviate their development burden. This thesis specifically tackles this problem by providing several tools for application development support. First, it provides the location-based publish/subscribe service (LPSS), a communication abstraction, which elegantly captures recurrent communication issues and thus allows to dramatically reduce the code complexity. LPSS is implemented in a modular manner in order to be able to target two different network architectures. One pragmatic implementation is aimed at mainstream infrastructure-based mobile networks, where mobile devices can communicate through fixed antennas. The other fully decentralized implementation targets emerging mobile ad hoc networks (MANETs), where no fixed infrastructure is available and communication can only occur in a peer-to-peer fashion. For each of these architectures, various implementation strategies tailored for different application scenarios that can be parametrized at deployment time. Second, this thesis provides two location-based message diffusion protocols, namely 6Shot broadcast and 6Shot multicast, specifically aimed at MANETs and fine tuned to be used as building blocks for LPSS. Finally this thesis proposes Phomo, a phone motion testing tool that allows to test proximity semantics of ad hoc applications without having to move around with mobile devices. These different developing support tools have been packaged in a coherent middleware framework called Pervaho.

A novel in vitro metabolomics approach for neurotoxicity testing, proof of principle for methyl mercury chloride and caffeine

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There is a need for more efficient methods giving insight into the complex mechanisms of neurotoxicity. Testing strategies including in vitro methods have been proposed to comply with this requirement. With the present study we aimed to develop a novel in vitro approach which mimics in vivo complexity, detects neurotoxicity comprehensively, and provides mechanistic insight. For this purpose we combined rat primary re-aggregating brain cell cultures with a mass spectrometry (MS)-based metabolomics approach. For the proof of principle we treated developing re-aggregating brain cell cultures for 48h with the neurotoxicant methyl mercury chloride (0.1-100muM) and the brain stimulant caffeine (1-100muM) and acquired cellular metabolic profiles. To detect toxicant-induced metabolic alterations the profiles were analysed using commercial software which revealed patterns in the multi-parametric dataset by principal component analyses (PCA), and recognised the most significantly altered metabolites. PCA revealed concentration-dependent cluster formations for methyl mercury chloride (0.1-1muM), and treatment-dependent cluster formations for caffeine (1-100muM) at sub-cytotoxic concentrations. Four relevant metabolites responsible for the concentration-dependent alterations following methyl mercury chloride treatment could be identified using MS-MS fragmentation analysis. These were gamma-aminobutyric acid, choline, glutamine, creatine and spermine. Their respective mass ion intensities demonstrated metabolic alterations in line with the literature and suggest that the metabolites could be biomarkers for mechanisms of neurotoxicity or neuroprotection. In addition, we evaluated whether the approach could identify neurotoxic potential by testing eight compounds which have target organ toxicity in the liver, kidney or brain at sub-cytotoxic concentrations. PCA revealed cluster formations largely dependent on target organ toxicity indicating possible potential for the development of a neurotoxicity prediction model. With such results it could be useful to perform a validation study to determine the reliability, relevance and applicability of this approach to neurotoxicity screening. Thus, for the first time we show the benefits and utility of in vitro metabolomics to comprehensively detect neurotoxicity and to discover new biomarkers.

Machine learning for geospatial data : algorithms, software tools and case studies

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.

«
1
2
»