149 results for machine recognition
at Université de Lausanne, Switzerland
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, that is, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models is the consideration of real-space constraints such as geomorphology, networks and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms, several interesting case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Abstract:
1. Identifying the boundary of a species' niche from observational and environmental data is a common problem in ecology and conservation biology, and a variety of techniques have been developed or applied to model niches and predict distributions. Here, we examine the performance of some pattern-recognition methods as ecological niche models (ENMs). In particular, one-class pattern recognition is a flexible and seldom-used methodology for modelling ecological niches and distributions from presence-only data. The development of one-class methods that perform comparably to two-class methods (for presence/absence data) would remove modelling decisions about sampling pseudo-absences or background data points when absence points are unavailable.
2. We studied nine methods for one-class classification and seven methods for two-class classification (five common to both), all primarily used in pattern recognition and therefore not common in species distribution and ecological niche modelling, across a set of 106 mountain plant species for which presence-absence data were available. We assessed accuracy using standard metrics and compared trade-offs in omission and commission errors between classification groups, as well as effects of prevalence and spatial autocorrelation on accuracy.
3. One-class models fit to presence-only data were comparable to two-class models fit to presence-absence data when performance was evaluated with a measure weighting omission and commission errors equally. One-class models were superior for reducing omission errors (i.e. yielding higher sensitivity), and two-class models were superior for reducing commission errors (i.e. yielding higher specificity). For these methods, spatial autocorrelation was only influential when prevalence was low.
4. These results differ from previous efforts to evaluate alternative modelling approaches to build ENMs and are particularly noteworthy because the data are from exhaustively sampled populations, minimizing false absence records. Accurate, transferable models of species' ecological niches and distributions are needed to advance ecological research and are crucial for effective environmental planning and conservation; the pattern-recognition approaches studied here show good potential for future modelling studies. This study also provides an introduction to promising methods for ecological modelling inherited from the pattern-recognition discipline.
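The error terms used in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `error_rates` and the toy presence/absence vectors are hypothetical, and the true skill statistic (TSS = sensitivity + specificity - 1) is shown only as one example of a measure that weights omission and commission errors equally; the abstract does not name the measure actually used.

```python
def error_rates(observed, predicted):
    """Compute omission/commission rates from 0/1 presence-absence lists.

    observed, predicted: equal-length sequences of 1 (presence) and 0 (absence).
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # presences predicted present
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # presences missed (omission)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # absences predicted present (commission)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # absences predicted absent

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "omission": 1 - sensitivity,
        "commission": 1 - specificity,
        # Assumed example of an equally weighted measure (not named in the abstract).
        "tss": sensitivity + specificity - 1,
    }


# Hypothetical toy data: four presence sites followed by four absence sites.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
rates = error_rates(observed, predicted)
print(rates)
```

A model that lowers the omission rate raises sensitivity, and one that lowers the commission rate raises specificity, which is the trade-off the abstract reports between one-class and two-class models.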
Abstract:
Summary. This thesis is devoted to the analysis, modelling and visualisation of spatially referenced environmental data using machine learning algorithms. Machine learning can be considered, in a broad sense, as a subfield of artificial intelligence that is particularly concerned with the development of techniques and algorithms that allow a machine to learn from data. In this thesis, machine learning algorithms are adapted for application to environmental data and spatial prediction. Why machine learning? Because most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient for modelling. They can solve classification, regression and probability density modelling problems in high-dimensional spaces composed of spatially referenced informative variables ("geo-features") in addition to geographical coordinates. Moreover, they are ideal for implementation as decision-support tools for environmental questions ranging from pattern recognition to modelling and prediction, including automatic mapping. Their efficiency is comparable to that of geostatistical models in the space of geographical coordinates, but they are indispensable for high-dimensional data that include geo-features. The most important and most popular machine learning algorithms are presented theoretically and implemented as software for the environmental sciences.
The main algorithms described are the multilayer perceptron (MLP), the best-known algorithm in artificial intelligence; the general regression neural network (GRNN); the probabilistic neural network (PNN); self-organising maps (SOM); Gaussian mixture models (GMM); radial basis function networks (RBF); and mixture density networks (MDN). This range of algorithms covers varied tasks such as classification, regression and probability density estimation. Exploratory data analysis (EDA) is the first step of any data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are treated both with the traditional geostatistical approach, experimental variography, and according to the principles of machine learning. Experimental variography, which studies the relationships between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and makes it possible to detect the presence of spatial patterns that can be described by a statistic. The machine learning approach to ESDA is presented through the application of the k-nearest neighbours method, which is very simple and has excellent interpretation and visualisation qualities. An important part of the thesis deals with topical subjects such as the automatic mapping of spatial data. The general regression neural network is proposed to solve this task efficiently.
The performance of the GRNN is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, on which the GRNN significantly outperforms all other methods, particularly in emergency situations. The thesis consists of four chapters: theory, applications, software tools and guided examples. An important part of the work is a collection of software: Machine Learning Office. This software collection has been developed over the last 15 years and has been used to teach numerous courses, including international workshops in China, France, Italy, Ireland and Switzerland, as well as in fundamental and applied research projects. The case studies considered cover a broad spectrum of real low- and high-dimensional geo-environmental problems, such as air, soil and water pollution by radioactive products and heavy metals, the classification of soil types and hydrogeological units, uncertainty mapping for decision support, and the assessment of natural hazards (landslides, avalanches). Complementary tools for exploratory data analysis and visualisation have also been developed, with care taken to create a user-friendly, easy-to-use interface.
Machine Learning for geospatial data: algorithms, software tools and case studies
Abstract. The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning?
In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation. Exploratory data analysis (EDA) is the initial and a very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations, which helps to reveal the presence of spatial patterns, at least those described by two-point statistics.
A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a currently active topic, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed during the last 15 years and were used both in many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
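To make the GRNN idea concrete, here is a minimal Python sketch of its prediction step, which in its standard form is a Gaussian-kernel weighted average of the training values (the Nadaraya-Watson estimator). This is not the Machine Learning Office implementation: the function name `grnn_predict`, the single isotropic bandwidth `sigma` and the toy data are assumptions made for illustration only.

```python
import math


def grnn_predict(train_xy, train_z, query_xy, sigma):
    """Predict a value at query_xy as a Gaussian-kernel weighted average.

    train_xy: list of coordinate tuples; train_z: measured values at those points.
    sigma: kernel bandwidth (assumed isotropic here; the only free parameter).
    """
    weights = []
    for point in train_xy:
        squared_distance = sum((a - b) ** 2 for a, b in zip(point, query_xy))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    return sum(w * z for w, z in zip(weights, train_z)) / total


# Hypothetical toy data: four measurements on the corners of a unit square.
train_xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_z = [1.0, 2.0, 3.0, 4.0]

# The centre is equidistant from all corners, so the prediction is their mean.
centre = grnn_predict(train_xy, train_z, (0.5, 0.5), sigma=0.5)
# With a small bandwidth, the prediction at a data point is dominated by that point.
near_corner = grnn_predict(train_xy, train_z, (0.0, 0.0), sigma=0.1)
print(centre, near_corner)
```

Because the model has a single bandwidth to tune, for example by cross-validation, it lends itself to the kind of automatic mapping described above.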
Abstract:
Inbreeding avoidance is predicted to induce sex biases in dispersal. But which sex should disperse? In polygynous species, females pay higher costs of inbreeding and thus might be expected to disperse more, but empirical evidence consistently reveals male biases. Here, we show that theoretical expectations change drastically if females are allowed to avoid inbreeding via kin recognition. At high inbreeding loads, females should prefer immigrants over residents, thereby boosting male dispersal. At lower inbreeding loads, by contrast, inclusive fitness benefits should induce females to prefer relatives, thereby promoting male philopatry. This result points to disruptive effects of sexual selection. The inbreeding load that females are ready to accept is surprisingly high. In the absence of search costs, females should prefer related partners as long as delta < r/(1+r), where r is relatedness and delta is the fecundity loss relative to an outbred mating. This amounts to fitness losses of up to one-fifth for a half-sib mating and one-third for a full-sib mating, which lie in the upper range of inbreeding depression values currently reported in natural populations. The observation of active inbreeding avoidance in a polygynous species thus suggests that inbreeding depression exceeds this threshold in the species under scrutiny or that inbred matings at least partly forfeit other mating opportunities for males. Our model also shows that female choosiness should decline rapidly with search costs, stemming from, for example, reproductive delays. Species under strong time constraints on reproduction should thus be tolerant of inbreeding.
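The threshold delta < r/(1+r) can be checked with a few lines of Python. This is a sketch of the arithmetic only, not the authors' model; the function names are hypothetical, and the relatedness values r = 1/4 for half-sibs and r = 1/2 for full sibs are the standard values for outbred diploids, assumed here because the abstract does not state them.

```python
from fractions import Fraction


def max_tolerated_inbreeding_load(r):
    """Return r / (1 + r), the fecundity loss below which a relative is preferred."""
    r = Fraction(r)
    return r / (1 + r)


def prefers_relative(delta, r):
    """True if the fecundity loss delta is below the threshold for relatedness r."""
    return Fraction(delta) < max_tolerated_inbreeding_load(r)


# Assumed standard relatedness values (not stated in the abstract).
half_sib = max_tolerated_inbreeding_load(Fraction(1, 4))
full_sib = max_tolerated_inbreeding_load(Fraction(1, 2))
print(half_sib, full_sib)
```

Exact fractions are used so that the results can be compared directly with the one-fifth and one-third figures quoted in the abstract.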
Abstract:
Brittle cornea syndrome (BCS) is an autosomal recessive disorder characterised by extreme corneal thinning and fragility. Corneal rupture can therefore occur either spontaneously or following minimal trauma in affected patients. Two genes, ZNF469 and PRDM5, have now been identified, in which causative pathogenic mutations collectively account for the condition in nearly all patients with BCS ascertained to date. Therefore, effective molecular diagnosis is now available for affected patients, and those at risk of being heterozygous carriers for BCS. We have previously identified mutations in ZNF469 in 14 families (in addition to 6 reported by others in the literature), and in PRDM5 in 8 families (with 1 further family now published by others). Clinical features include extreme corneal thinning with rupture, high myopia, blue sclerae, deafness of mixed aetiology with hypercompliant tympanic membranes, and variable skeletal manifestations. Corneal rupture may be the presenting feature of BCS, and it is possible that this may be incorrectly attributed to non-accidental injury. Mainstays of management include the prevention of ocular rupture by provision of protective polycarbonate spectacles, careful monitoring of visual and auditory function, and assessment for skeletal complications such as developmental dysplasia of the hip. Effective management depends upon appropriate identification of affected individuals, which may be challenging given the phenotypic overlap of BCS with other connective tissue disorders.
Abstract:
Interleukin 1 beta (IL-1 beta) is a potent proinflammatory factor during viral infection. Its production is tightly controlled by transcription of Il1b dependent on the transcription factor NF-kappaB and subsequent processing of pro-IL-1 beta by an inflammasome. However, the sensors and mechanisms that facilitate RNA virus-induced production of IL-1 beta are not well defined. Here we report a dual role for the RNA helicase RIG-I in RNA virus-induced proinflammatory responses. Whereas RIG-I-mediated activation of NF-kappaB required the signaling adaptor MAVS and a complex of the adaptors CARD9 and Bcl-10, RIG-I also bound to the adaptor ASC to trigger caspase-1-dependent inflammasome activation by a mechanism independent of MAVS, CARD9 and the Nod-like receptor protein NLRP3. Our results identify the CARD9-Bcl-10 module as an essential component of the RIG-I-dependent proinflammatory response and establish RIG-I as a sensor able to activate the inflammasome in response to certain RNA viruses.
Abstract:
This work compares the structural and dynamic features of the wild-type alpha 1b-adrenergic receptor (AR) with those of the D142A active mutant and the agonist-bound state. The two active receptor forms were compared in their isolated states as well as in their ability to form homodimers and to recognize the G alpha q beta 1 gamma 2 heterotrimer. The analysis of the isolated structures revealed that, although the mutation- and agonist-induced active states of the alpha 1b-AR are different, they share several structural peculiarities, including (a) the release of some constraining interactions found in the wild-type receptor and (b) the opening of a cytosolic crevice formed by the second and third intracellular loops and the cytosolic extensions of helices 5 and 6. Accordingly, their tendency to form homodimers also shows commonalities and differences. In both active receptor forms, helix 6 plays a crucial role in mediating homodimerization. However, the homodimeric models result from different interhelical assemblies. Along the same lines, in both active receptor forms the opened cytosolic crevice recognizes similar domains on the G protein. However, the docking solutions are differently populated, and the receptor-G protein preorientation models suggest that the final complexes should be characterized by different interaction patterns.
Abstract:
A right-handed man developed a sudden, transient amnestic syndrome associated with bilateral hemorrhage of the hippocampi, probably due to Urbach-Wiethe disease. In the 3rd month, despite significant hippocampal structural damage on imaging, only a milder degree of retrograde and anterograde amnesia persisted on detailed neuropsychological examination. On systematic testing of recognition of facial and vocal expression of emotion, we found an impairment of the vocal perception of fear, but not that of other emotions, such as joy, sadness and anger. Such selective impairment of fear perception was not present in the recognition of facial expression of emotion. Thus emotional perception varies according to the different aspects of emotions and the different modality of presentation (faces versus voices). This is consistent with the idea that there may be multiple emotion systems. The study of emotional perception in this unique case of bilateral involvement of the hippocampus suggests that this structure may play a critical role in the recognition of fear in vocal expression, possibly dissociated from that of other emotions and from that of fear in facial expression. In view of recent data suggesting that the amygdala plays a role in the recognition of fear in the auditory as well as in the visual modality, this could suggest that the hippocampus may be part of the auditory pathway of fear recognition.
Abstract:
Pyochelin (Pch) and enantiopyochelin (EPch) are enantiomeric siderophores, with three chiral centers, produced under iron limitation conditions by Pseudomonas aeruginosa and Pseudomonas fluorescens, respectively. After iron chelation in the extracellular medium, Pch-Fe and EPch-Fe are recognized and transported by their specific outer-membrane transporters: FptA in P. aeruginosa and FetA in P. fluorescens. Structural analysis of FetA-EPch-Fe and FptA-Pch-Fe, combined with mutagenesis and docking studies, revealed the structural basis of the stereospecific recognition of these enantiomers by their respective transporters. Whereas FetA and FptA have a low sequence identity but high structural homology, the Pch and EPch binding pockets do not share any structural homology, but display similar physicochemical properties. The stereospecific recognition of both enantiomers by their corresponding transporters is imposed by the configuration of the siderophore's C4'' and C2'' chiral centers. This recognition involves specific hydrogen bonds between the Arg91 guanidinium group and EPch-Fe for FetA and between the Leu117-Leu116 main chain and Pch-Fe for FptA. FetA and FptA are the first membrane receptors to be structurally described with opposite binding enantioselectivities for their ligands, giving insights into the structural basis of their enantiospecificity.
Abstract:
Background: Natural Killer (NK) cells are thought to protect from residual leukemic cells in patients receiving stem cell transplantation. However, multiple retrospective analyses of patient data have yielded conflicting conclusions regarding a putative role of NK cells and the essential NK cell recognition events mediating a protective effect against leukemia. Further, an NK cell-mediated protective effect against primary leukemia in vivo has not been shown directly.
Methodology/Principal Findings: Here we addressed whether NK cells have the potential to control chronic myeloid leukemia (CML) arising from the transplantation of BCR-ABL1 oncogene-expressing primary bone marrow precursor cells into lethally irradiated recipient mice. These analyses identified missing-self recognition as the only NK cell-mediated recognition strategy that is able to significantly protect from the development of CML disease in vivo.
Conclusion: Our data provide a proof of principle that NK cells can control primary leukemic cells in vivo. Since the presence of NK cells reduced the abundance of leukemia-propagating cancer stem cells, the data raise the possibility that NK cell recognition has the potential to cure CML, which may be difficult using small molecule BCR-ABL1 inhibitors. Finally, our findings validate approaches to treat leukemia using antibody-based blockade of self-specific inhibitory MHC class I receptors.
Abstract:
The Cinque Torri group (Cortina d'Ampezzo, Italy) is an articulated system of unstable carbonate rock monoliths located in a very important tourism area and therefore characterized by a significant risk. The instability phenomena involved represent an example of lateral spreading developed over a larger deep-seated gravitational slope deformation (DSGSD) area. After the recent fall of a monolith of more than 10 000 m³, a scientific study was initiated to monitor the more unstable sectors and to characterize the past movements as a fundamental tool for predicting future movements and for hazard assessment. To gain greater insight into the ongoing lateral spreading process, a method for the quantitative analysis of rotational movements associated with the lateral spreading has been developed, applied and validated. The method is based on: i) detailed geometrical characterization of the area by means of laser scanner techniques; ii) recognition of the discontinuity sets and definition of a reference frame for each set; iii) correlation between the obtained reference frames related to a specific sector and a stable external reference frame; and iv) determination of the 3D rotations in terms of Euler angles to describe the present settlement of the Cinque Torri system with respect to the surrounding stable areas. In this way, significant information on the processes involved in the fragmentation and spreading of a former dolomitic plateau into different rock cliffs has been gained. The method is suitable for application to similar case studies.
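Steps iii) and iv) of the method can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' procedure: each reference frame is assumed to be given as a 3x3 matrix whose columns are orthonormal axes in common world coordinates, the ZYX (yaw, pitch, roll) Euler convention is an arbitrary choice because the abstract does not specify one, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    """Transpose a 3x3 matrix (the inverse, for a rotation matrix)."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rotation_zyx(yaw, pitch, roll):
    """Build R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def relative_rotation(stable_frame, sector_frame):
    """Rotation that carries the stable frame's axes onto the sector frame's axes."""
    return matmul(sector_frame, transpose(stable_frame))


def euler_zyx(r):
    """Extract (yaw, pitch, roll) in radians; valid away from pitch = +/-90 degrees."""
    pitch = -math.asin(max(-1.0, min(1.0, r[2][0])))
    yaw = math.atan2(r[1][0], r[0][0])
    roll = math.atan2(r[2][1], r[2][2])
    return yaw, pitch, roll


# Hypothetical example: a sector rotated by (10, -5, 2) degrees relative to a stable frame.
stable = rotation_zyx(0.3, 0.1, -0.2)
true_angles = (math.radians(10.0), math.radians(-5.0), math.radians(2.0))
sector = matmul(rotation_zyx(*true_angles), stable)
recovered = euler_zyx(relative_rotation(stable, sector))
print([round(math.degrees(angle), 6) for angle in recovered])
```

Other Euler conventions give different angle triples for the same rotation, so the convention has to be fixed before results from different sectors are compared.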
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Abstract:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model-dependent (geostatistics) and data-driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.