30 results for Learning Models


Relevance: 30.00%

Abstract:

1. Identifying the boundary of a species' niche from observational and environmental data is a common problem in ecology and conservation biology, and a variety of techniques have been developed or applied to model niches and predict distributions. Here, we examine the performance of some pattern-recognition methods as ecological niche models (ENMs). In particular, one-class pattern recognition is a flexible and seldom-used methodology for modelling ecological niches and distributions from presence-only data. The development of one-class methods that perform comparably to two-class methods (for presence/absence data) would remove modelling decisions about sampling pseudo-absences or background data points when absence points are unavailable. 2. We studied nine methods for one-class classification and seven methods for two-class classification (five common to both), all primarily used in pattern recognition and therefore not common in species distribution and ecological niche modelling, across a set of 106 mountain plant species for which presence-absence data were available. We assessed accuracy using standard metrics and compared trade-offs in omission and commission errors between classification groups, as well as the effects of prevalence and spatial autocorrelation on accuracy. 3. One-class models fit to presence-only data were comparable to two-class models fit to presence-absence data when performance was evaluated with a measure weighting omission and commission errors equally. One-class models were superior for reducing omission errors (i.e. yielding higher sensitivity), and two-class models were superior for reducing commission errors (i.e. yielding higher specificity). For these methods, spatial autocorrelation was influential only when prevalence was low.

4. These results differ from previous efforts to evaluate alternative modelling approaches to building ENMs, and are particularly noteworthy because the data are from exhaustively sampled populations, minimizing false absence records. Accurate, transferable models of species' ecological niches and distributions are needed to advance ecological research and are crucial for effective environmental planning and conservation; the pattern-recognition approaches studied here show good potential for future modelling studies. This study also provides an introduction to promising methods for ecological modelling inherited from the pattern-recognition discipline.
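The one-class idea described in point 1 can be illustrated with a minimal presence-only sketch: fit a Gaussian envelope to presence records and threshold the Mahalanobis distance. This is a stand-in for, not one of, the nine one-class methods the study evaluates, and the two environmental variables and all data are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical presence-only records drawn from a species' climatic niche
# (2 invented variables, e.g. temperature and moisture).
presence = rng.normal(loc=[15.0, 60.0], scale=[2.0, 5.0], size=(200, 2))

# One-class "envelope" model: fit a Gaussian to the presence data and accept
# any site whose Mahalanobis distance falls below a quantile of the training
# distances (the quantile controls the omission/commission trade-off).
mu = presence.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(presence, rowvar=False))

def mahalanobis(x):
    d = x - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", d, cov_inv, d))

threshold = np.quantile(mahalanobis(presence), 0.95)  # ~5% training omission

# Classify two candidate sites without any absence or pseudo-absence data.
inside = np.array([[15.0, 61.0]])   # near the niche centre
outside = np.array([[30.0, 10.0]])  # far outside the niche
print(mahalanobis(inside) <= threshold)   # expected: [ True]
print(mahalanobis(outside) <= threshold)  # expected: [False]
```

Lowering the quantile tightens the envelope, trading commission errors for omission errors, which mirrors the sensitivity/specificity trade-off reported in point 3.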

Relevance: 30.00%

Abstract:

The present research deals with an application of artificial neural networks to multitask learning from spatial environmental data. The real case study (sediment contamination of Lake Geneva) involves 8 pollutants. There are different relationships between these variables, from linear correlations to strong nonlinear dependencies. The main idea is to construct subsets of pollutants which can be efficiently modeled together within the multitask framework. The proposed two-step approach is based on: 1) a criterion of nonlinear predictability of each variable k, obtained by analyzing all possible models composed of the remaining variables, using a General Regression Neural Network (GRNN) as the model; 2) multitask learning of the best model using a multilayer perceptron, and spatial predictions. The results of the study are analyzed using both machine learning and geostatistical tools.
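Step 1 of the two-step approach can be sketched with a GRNN, which is equivalent to Nadaraya-Watson kernel regression. The variables below are synthetic stand-ins for the pollutants (one variable is a nonlinear function of another, one is pure noise), and the fixed kernel width and simple split-half validation are simplifications:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the pollutant data: z2 depends non-linearly on z1,
# while z3 is independent noise.
n = 300
z1 = rng.uniform(-2, 2, n)
z2 = np.sin(z1) + 0.1 * rng.normal(size=n)
z3 = rng.normal(size=n)
Z = np.column_stack([z1, z2, z3])

def grnn_predict(X_train, y_train, X_test, sigma=0.3):
    """GRNN = Nadaraya-Watson kernel regression with a Gaussian kernel."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

def predictability(Z, k, sigma=0.3):
    """Hold-out R^2 of variable k predicted from all the other variables."""
    X, y = np.delete(Z, k, axis=1), Z[:, k]
    half = len(y) // 2
    y_hat = grnn_predict(X[:half], y[:half], X[half:], sigma)
    return 1 - (y[half:] - y_hat).var() / y[half:].var()

# z2 should be far more predictable from (z1, z3) than z3 is from (z1, z2),
# so z1 and z2 would be grouped for multitask learning, but not z3.
print(predictability(Z, 1), predictability(Z, 2))
```

Variables with high nonlinear predictability from the rest would then be grouped into a subset and modelled jointly in step 2.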

Relevance: 30.00%

Abstract:

This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability for non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad hoc topographic indices as covariates in statistical models in complex terrain. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performance. Here we examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, where MKL is able to exploit a large set of terrain features computed at multiple spatial scales when predicting mean wind speed in an Alpine region.
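The core MKL idea of one dedicated kernel per feature can be roughly sketched as follows. Instead of the paper's numerical optimization, the kernel weights here are set by a simple kernel-target alignment heuristic, and the terrain-like features are synthetic, so this is only an illustration of how per-feature kernel weights aid interpretability and feature selection:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical terrain-like features: y depends on x0 only; x1 is a nuisance.
n = 200
X = rng.uniform(-1, 1, (n, 2))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.normal(size=n)

def rbf_kernel(a, b, sigma=0.3):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

# One dedicated kernel per feature (the "sub-problems" of MKL).
kernels = [rbf_kernel(X[:, j], X[:, j]) for j in range(X.shape[1])]

# Crude MKL heuristic: weight each kernel by its kernel-target alignment
# with the centred target, then fit kernel ridge regression on the sum.
yc = y - y.mean()
align = np.array([(yc @ K @ yc) / np.linalg.norm(K) for K in kernels])
weights = align / align.sum()

K = sum(w * Kj for w, Kj in zip(weights, kernels))
alpha = np.linalg.solve(K + 1e-2 * np.eye(n), y)  # kernel ridge coefficients
y_hat = K @ alpha  # in-sample prediction with the composite kernel

print("kernel weights:", weights)  # interpretable: x0 should dominate
```

The learned weight vector plays the interpretability role described in the abstract: a near-zero weight flags a feature (or topographic index) the model can do without.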

Relevance: 30.00%

Abstract:

The paper presents the Multiple Kernel Learning (MKL) approach as a modelling and data exploratory tool and applies it to the problem of wind speed mapping. Support Vector Regression (SVR) is used to predict spatial variations of the mean wind speed from terrain features (slopes, terrain curvature, directional derivatives) generated at different spatial scales. Multiple Kernel Learning is applied to learn kernels for individual features and thematic feature subsets, in the context of both feature selection and optimal parameter determination. An empirical study on real-life data confirms the usefulness of MKL as a tool that enhances the interpretability of data-driven models.

Relevance: 30.00%

Abstract:

Combining probability theory and graph theory, Bayesian networks currently enjoy widespread interest as a means for studying factors that affect the coherent evaluation of scientific evidence in forensic science. Paper I of this series intends to contribute to the discussion of Bayesian networks as a framework that is helpful for both illustrating and implementing statistical procedures commonly employed for the study of uncertainties (e.g. the estimation of unknown quantities). While the respective statistical procedures are widely described in the literature, the primary aim of this paper is to offer an essentially non-technical introduction to how interested readers may use these analytical approaches - with the help of Bayesian networks - for processing their own forensic science data. Attention is mainly drawn to the structure and underlying rationale of a series of basic and context-independent network fragments that users may incorporate as building blocks when constructing larger inference models. As an example of how this may be done, the proposed concepts will be used in a second paper (Part II) to specify graphical probability networks whose purpose is to assist forensic scientists in the evaluation of scientific evidence encountered in the context of forensic document examination (i.e. results of the analysis of black toners present on printed or copied documents).
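A minimal sketch of the kind of context-independent network fragment the paper describes: a single hypothesis node H with one evidence node E, evaluated by enumeration. All probability values are hypothetical, chosen only to show the mechanics of updating and of the likelihood ratio, and are not taken from the paper:

```python
# Two-node Bayesian network fragment H -> E, evaluated by enumeration.
# H: source hypothesis; E: observed comparison result (e.g. a toner match).
P_H = {"same_source": 0.5, "different_source": 0.5}   # hypothetical prior
P_E_given_H = {                                        # hypothetical likelihoods
    "same_source": {"match": 0.95, "no_match": 0.05},
    "different_source": {"match": 0.10, "no_match": 0.90},
}

def posterior(evidence):
    """P(H | E = evidence) by enumeration over the hypothesis states."""
    joint = {h: P_H[h] * P_E_given_H[h][evidence] for h in P_H}
    z = sum(joint.values())
    return {h: p / z for h, p in joint.items()}

post = posterior("match")
lr = P_E_given_H["same_source"]["match"] / P_E_given_H["different_source"]["match"]
print(post["same_source"])  # 0.95*0.5 / (0.95*0.5 + 0.10*0.5) ≈ 0.905
print(lr)                   # likelihood ratio = 9.5
```

Larger inference models are assembled by chaining such fragments, with each child node's conditional table playing the role the likelihoods play here.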

Relevance: 30.00%

Abstract:

Glucose-dependent insulinotropic polypeptide (GIP) is a key incretin hormone, released from the intestine after a meal, that produces a glucose-dependent insulin secretion. The GIP receptor (GIPR) is expressed on pyramidal neurons in the cortex and hippocampus, and GIP is synthesized in a subset of neurons in the brain. However, the role of the GIPR in neuronal signaling is not clear. In this study, we used a mouse strain with GIPR gene deletion (GIPR KO) to elucidate the role of the GIPR in neuronal communication and brain function. Compared with C57BL/6 control mice, GIPR KO mice displayed higher locomotor activity in an open-field task. Impairments of recognition and of spatial learning and memory in GIPR KO mice were found in the object recognition task and a spatial water maze task, respectively. In an object location task, no impairment was found. GIPR KO mice also showed impaired synaptic plasticity in paired-pulse facilitation and a block of long-term potentiation in area CA1 of the hippocampus. Moreover, a large decrease in the number of neuronal progenitor cells was found in the dentate gyrus of transgenic mice, although the number of young neurons was not changed. Together the results suggest that GIP receptors play an important role in cognition, neurotransmission, and cell proliferation.

Relevance: 30.00%

Abstract:

BACKGROUND: Randomized controlled trials (RCTs) may be discontinued because of apparent harm, benefit, or futility. Other RCTs are discontinued early because of insufficient recruitment. Trial discontinuation has ethical implications: participants consent on the premise of contributing to new medical knowledge, Research Ethics Committees (RECs) spend considerable effort reviewing study protocols, and limited resources for conducting research are wasted. Currently, little is known regarding the frequency and characteristics of discontinued RCTs. METHODS/DESIGN: Our aims are, first, to determine the prevalence of RCT discontinuation for specific reasons; second, to determine whether the risk of RCT discontinuation for specific reasons differs between investigator- and industry-initiated RCTs; third, to identify risk factors for RCT discontinuation due to insufficient recruitment; fourth, to determine at what stage RCTs are discontinued; and fifth, to examine the publication history of discontinued RCTs. We are currently assembling a multicenter cohort of RCTs based on protocols approved between 2000 and 2002/3 by 6 RECs in Switzerland, Germany, and Canada. We are extracting data on RCT characteristics and planned recruitment for all included protocols. Completion and publication status is determined using information from correspondence between investigators and RECs, publications identified through literature searches, or by contacting the investigators. We will use multivariable regression models to identify risk factors for trial discontinuation due to insufficient recruitment. We aim to include over 1000 RCTs, of which an anticipated 150 will have been discontinued due to insufficient recruitment. DISCUSSION: Our study will provide insights into the prevalence and characteristics of discontinued RCTs.
Effective recruitment strategies and the anticipation of problems are key issues in the planning and evaluation of trials by investigators, Clinical Trial Units, RECs and funding agencies. Identification and modification of barriers to successful study completion at an early stage could help to reduce the risk of trial discontinuation, save limited resources, and enable RCTs to better meet their ethical requirements.
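The planned multivariable regression can be sketched on synthetic trial-level data. The two candidate risk factors (industry sponsorship, planned sample size) are plausible examples, and all effect sizes are invented for illustration; this is not an analysis of the study's cohort:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic trial-level data with a built-in "true" effect for illustration:
# outcome = discontinuation due to insufficient recruitment (0/1).
n = 1000
industry = rng.integers(0, 2, n).astype(float)     # industry-initiated trial?
log_size = rng.normal(5.0, 1.0, n)                 # log planned sample size
true_logit = -1.0 - 0.8 * industry + 0.4 * (log_size - 5.0)
discontinued = (rng.uniform(size=n) < 1 / (1 + np.exp(-true_logit))).astype(float)

# Multivariable logistic regression fitted by plain gradient ascent
# on the average log-likelihood.
X = np.column_stack([np.ones(n), industry, log_size - 5.0])
beta = np.zeros(3)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (discontinued - p) / n

print("odds ratios:", np.exp(beta[1:]))  # industry OR < 1, size OR > 1 here
```

In the actual study, the same model form would relate observed discontinuation to the candidate risk factors extracted from the REC protocols.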

Relevance: 30.00%

Abstract:

Automatic environmental monitoring networks, supported by wireless communication technologies, now provide large and ever-increasing volumes of data. The use of this information in natural hazard research is an important issue. Particularly useful for risk assessment and decision making are spatial maps of hazard-related parameters produced from point observations and available auxiliary information. The purpose of this article is to present and explore appropriate tools to process large amounts of available data and produce predictions at fine spatial scales. These are the algorithms of machine learning, which are aimed at non-parametric robust modelling of non-linear dependencies from empirical data. The computational efficiency of the data-driven methods allows prediction maps to be produced in real time, which makes them preferable to physical models for operational use in risk assessment and mitigation. This situation is encountered in particular in the spatial prediction of climatic variables (topo-climatic mapping). In the complex topography of mountainous regions, meteorological processes are highly influenced by the relief. The article shows how these relations, possibly regionalized and non-linear, can be modelled from data using information from digital elevation models. The particular illustration of the developed methodology concerns the mapping of temperatures (including situations of Föhn and temperature inversion) given measurements taken from the Swiss meteorological monitoring network. The range of methods used in the study includes data-driven feature selection, support vector algorithms and artificial neural networks.
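A toy version of data-driven topo-climatic mapping: predict temperature in a "geo-feature" space that augments coordinates with DEM elevation. The inverse-distance-weighted k-NN below stands in for the support vector and neural network models the article uses, and the stations and lapse-rate data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stations: coordinates (km) plus elevation (km) from a DEM.
# Temperature follows a lapse rate of about -6.5 degC per km of elevation.
n = 150
xy = rng.uniform(0, 100, (n, 2))
elev = rng.uniform(0.3, 3.0, n)
temp = 15.0 - 6.5 * elev + 0.3 * rng.normal(size=n)

# Prediction in geo-feature space (x, y, elevation): inverse-distance-weighted
# k-NN, with each feature rescaled so elevation carries real weight.
features = np.column_stack([xy / 100.0, elev / 3.0])

def knn_predict(q, k=10):
    d = np.linalg.norm(features - q, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-9)
    return (w * temp[idx]).sum() / w.sum()

# Two queries at the same map location but different DEM elevations:
# the high-elevation one should come out markedly colder.
low = knn_predict(np.array([0.5, 0.5, 0.5 / 3.0]))
high = knn_predict(np.array([0.5, 0.5, 2.8 / 3.0]))
print(low, high)
```

Evaluating such a predictor on a grid of DEM cells yields the real-time prediction maps the abstract refers to; feature selection amounts to choosing which DEM-derived columns enter `features`.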

Relevance: 30.00%

Abstract:

Machine Learning for geospatial data: algorithms, software tools and case studies

Abstract: The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence, mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning?
In a few words, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to implementation as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and prediction, as well as automatic data mapping. They are competitive in efficiency with geostatistical models in low-dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for the geo- and environmental sciences are presented in detail, from a theoretical description of the concepts to their software implementation. The main algorithms and models considered are the following: the multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks, and mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concept of exploratory spatial data analysis (ESDA) is considered using both a traditional geostatistical approach, experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations which helps to detect the presence of spatial patterns, at least those described by two-point statistics.

A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a topical problem, the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model for this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where it significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and have been used both in many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and in fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
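The experimental variography step described in the thesis can be sketched on a synthetic 1-D field. This is a bare-bones semivariogram estimator, not the Machine Learning Office implementation, and the field, lags and pair tolerance are invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic 1-D spatially correlated field (a scaled random walk), used to
# illustrate the experimental (semi)variogram.
x = np.linspace(0, 10, 400)
z = np.cumsum(rng.normal(size=x.size)) * 0.1

def semivariogram(x, z, lags, tol=0.02):
    """gamma(h) = 0.5 * mean squared difference over pairs ~h apart."""
    gamma = []
    for h in lags:
        sq_diffs = [(z[i] - z[j]) ** 2
                    for i in range(len(x)) for j in range(i + 1, len(x))
                    if abs((x[j] - x[i]) - h) < tol]
        gamma.append(0.5 * np.mean(sq_diffs))
    return np.array(gamma)

lags = np.array([0.1, 0.5, 2.0])
g = semivariogram(x, z, lags)
print(g)  # variance of increments grows with lag: spatial correlation present
```

A flat experimental variogram would instead indicate a pure nugget effect, i.e. no spatial pattern detectable by two-point statistics.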

Relevance: 30.00%

Abstract:

In order to understand the development of non-genetically encoded actions during an animal's lifespan, it is necessary to analyze the dynamics and evolution of learning rules producing behavior. Owing to the intrinsic stochastic and frequency-dependent nature of learning dynamics, these rules are often studied in evolutionary biology via agent-based computer simulations. In this paper, we show that stochastic approximation theory can help to qualitatively understand learning dynamics and formulate analytical models for the evolution of learning rules. We consider a population of individuals repeatedly interacting during their lifespan, and where the stage game faced by the individuals fluctuates according to an environmental stochastic process. Individuals adjust their behavioral actions according to learning rules belonging to the class of experience-weighted attraction learning mechanisms, which includes standard reinforcement and Bayesian learning as special cases. We use stochastic approximation theory in order to derive differential equations governing action play probabilities, which turn out to have qualitative features of mutator-selection equations. We then perform agent-based simulations to find the conditions where the deterministic approximation is closest to the original stochastic learning process for standard 2-action 2-player fluctuating games, where interaction between learning rules and preference reversal may occur. Finally, we analyze a simplified model for the evolution of learning in a producer-scrounger game, which shows that the exploration rate can interact in a non-intuitive way with other features of co-evolving learning rules. Overall, our analyses illustrate the usefulness of applying stochastic approximation theory in the study of animal learning.
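A minimal sketch of the experience-weighted attraction (EWA) update that defines the paper's class of learning rules, with hypothetical parameter values (phi, delta, rho, lambda) and a fixed 2-action stage game rather than the fluctuating game analysed in the paper:

```python
import numpy as np

def softmax(a, lam):
    """Logit choice rule mapping attractions to action play probabilities."""
    e = np.exp(lam * (a - a.max()))
    return e / e.sum()

def ewa_step(attractions, N, action, payoffs, phi=0.9, delta=0.5, rho=0.9):
    """One EWA update: the chosen action is reinforced by its realized payoff,
    unchosen actions by delta times their forgone payoff; with delta = 0 and
    phi = rho this collapses to standard reinforcement learning."""
    N_new = rho * N + 1
    weight = np.where(np.arange(len(attractions)) == action, 1.0, delta)
    a_new = (phi * N * attractions + weight * payoffs) / N_new
    return a_new, N_new

rng = np.random.default_rng(6)
A, N = np.zeros(2), 1.0
payoffs = np.array([1.0, 0.2])  # hypothetical stage game: action 0 is better
for _ in range(200):
    p = softmax(A, lam=3.0)
    action = int(rng.choice(2, p=p))
    A, N = ewa_step(A, N, action, payoffs)

print(softmax(A, lam=3.0))  # play probability concentrates on action 0
```

The stochastic-approximation analysis in the paper studies the expected motion of exactly this kind of update, replacing the sampled action by its probability-weighted average.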

Relevance: 30.00%

Abstract:

Abstract: The expansion of a recovering population - whether re-introduced or spontaneously returning - is shaped by (i) biological (intrinsic) factors such as the land tenure system or dispersal, (ii) the distribution and availability of resources (e.g. prey), (iii) habitat and landscape features, and (iv) human attitudes and activities. In order to develop efficient conservation and recovery strategies, we need to understand all these factors, predict the potential distribution, and explore ways to reach it. An increased number of lynx in the north-western Swiss Alps in the nineties led to a new controversy about the return of this cat. When the large carnivores were given legal protection in many European countries, most organizations and individuals promoting their protection did not foresee the consequences. Management plans describing how to handle conflicts with large predators are needed to find a balance between "overabundance" and extinction. Wildlife and conservation biologists need to evaluate the various threats confronting populations so that adequate management decisions can be taken. I developed a GIS probability model for the lynx, based on habitat information and radio-telemetry data from the Swiss Jura Mountains, in order to predict the potential distribution of the lynx in this mountain range, which is presently only partly occupied by lynx. Three of the 18 variables tested, describing land use, vegetation, and topography for each square kilometre, qualified to predict the probability of lynx presence. The resulting map was evaluated with data from dispersing subadult lynx. Young lynx that were not able to establish home ranges in what was identified as good lynx habitat did not survive their first year of independence, whereas the only one that died in good lynx habitat was illegally killed. Radio-telemetry fixes are often used as input data to calibrate habitat models.
Radio-telemetry is the only way to gather accurate and unbiased data on habitat use by elusive larger terrestrial mammals. However, it is time-consuming and expensive, and can therefore only be applied in limited areas. Habitat models extrapolated over large areas can in turn be problematic, as habitat characteristics and availability may change from one area to the other. I analysed the predictive power of Ecological Niche Factor Analysis (ENFA) in Switzerland with the lynx as focal species. According to my results, the optimal sampling strategy to predict species distribution in an Alpine area lacking available data would be to pool presence cells from contrasted regions (Jura Mountains, Alps), whereas in regions with a low ecological variance (Jura Mountains), only local presence cells should be used for the calibration of the model. Dispersal influences the dynamics and persistence of populations and the distribution and abundance of species, and gives communities and ecosystems their characteristic texture in space and time. Between 1988 and 2001, the spatio-temporal behaviour of subadult Eurasian lynx in two re-introduced populations in Switzerland was studied, based on 39 juvenile lynx, of which 24 were radio-tagged, to understand the factors influencing dispersal. Subadults become independent from their mothers at the age of 8-11 months. No sex bias was detected in either the dispersal rate or the distance moved. Lynx are conservative dispersers compared to bears and wolves, and settled within or close to known lynx occurrences. Dispersal distances reached in the high-density lynx population - shorter than those reported in other Eurasian lynx studies - are limited by habitat restrictions hindering connections with neighbouring metapopulations.
I postulated that high lynx density would lead to an expansion of the population and validated my predictions with data from the north-western Swiss Alps, where a strong increase in lynx abundance took place around 1995. The general hypothesis that high population density will foster the expansion of the population was not confirmed. This has consequences for the re-introduction and recovery of carnivores in a fragmented landscape. To establish a strong source population in one place might not be an optimal strategy. Rather, population nuclei should be founded in several neighbouring patches. Exchange between established neighbouring subpopulations will later take place, as adult lynx show a higher propensity to cross barriers than subadults. To estimate the potential population size of the lynx in the Jura Mountains and to assess possible corridors between this population and adjacent areas, I adapted a habitat probability model for lynx distribution in the Jura Mountains with new environmental data and extrapolated it over the entire mountain range. The model predicts a breeding population of 74-101 individuals, and of 51-79 individuals when continuous habitat patches < 50 km2 are disregarded. The Jura Mountains could one day be part of a metapopulation, as potential corridors exist to the adjoining areas (Alps, Vosges Mountains, and Black Forest). Monitoring of the population size and spatial expansion, and genetic surveillance, must be continued in the Jura Mountains, as the status of the population is still critical. ENFA was used to predict the potential distribution of lynx in the Alps. The resulting model divided the Alps into 37 suitable habitat patches ranging from 50 to 18,711 km2, covering a total area of about 93,600 km2. When using the range of lynx densities found in field studies in Switzerland, the Alps could host a population of 961 to 1,827 residents.
The results of the cost-distance analysis revealed that all patches were within reach of dispersing lynx, as the connection costs were in the range of the dispersal costs of radio-tagged subadult lynx moving through unfavourable habitat. Thus, the whole of the Alps could one day be considered a metapopulation. But experience suggests that only a few dispersers will cross unsuitable areas and barriers. This low migration rate may seldom allow the spontaneous foundation of new populations in unsettled areas. As an alternative to natural dispersal, the artificial transfer of individuals across barriers should be considered. Wildlife biologists can play a crucial role in developing adaptive management experiments to help managers learn by trial. The case of the lynx in Switzerland is a good example of fruitful cooperation between wildlife biologists, managers, decision makers and politicians in an adaptive management process. This cooperation resulted in a Lynx Management Plan, which was implemented in 2000 and updated in 2004 to give the cantons directives on how to handle lynx-related problems. This plan was put into practice, e.g. with regard to the translocation of lynx into unsettled areas.
Le modèle qui en résulte divise les Alpes en 37 sous-unités d'habitat favorable dont la surface varie de 50 à 18'711 km2, pour une superficie totale de 93'600 km2. En utilisant le spectre des densités observées dans les études radio-télémétriques effectuées en Suisse, les Alpes pourraient accueillir une population de lynx résidents variant de 961 à 1'827 individus. Les résultats des analyses de connectivité montrent que les sous-unités d'habitat favorable se situent à des distances telles que le coût de la dispersion pour l'espèce est admissible. L'ensemble des Alpes pourrait donc un jour former une métapopulation. Mais l'expérience montre que très peu d'individus traverseront des habitats peu favorables et des barrières au cours de leur dispersion. Ce faible taux de migration rendra difficile toute nouvelle implantation de populations dans des zones inoccupées. Une solution alternative existe cependant : transférer artificiellement des individus d'une zone à l'autre. Les biologistes spécialistes de la faune sauvage peuvent jouer un rôle important et complémentaire pour les gestionnaires de la faune, en les aidant à mener des expériences de gestion par essai. Le cas du lynx en Suisse est un bel exemple d'une collaboration fructueuse entre biologistes de la faune sauvage, gestionnaires, organes décisionnaires et politiciens. Cette coopération a permis l'élaboration du Concept Lynx Suisse qui est entré en vigueur en 2000 et remis à jour en 2004. Ce plan donne des directives aux cantons pour appréhender la problématique du lynx. Il y a déjà eu des applications concrètes sur le terrain, notamment par des translocations d'individus dans des zones encore inoccupées.
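The cost-distance analysis behind these connectivity results can be sketched as an accumulated least-cost computation over a resistance raster. The following toy example (hypothetical grid values and dispersal-cost threshold, not the study's data) uses Dijkstra's algorithm on a 4-neighbour grid:

```python
import heapq

def cost_distance(cost, source):
    """Accumulated least-cost distance from `source` over a resistance grid.

    Moving into a cell adds that cell's resistance value (4-neighbour moves),
    a simplification of the friction surfaces used by GIS cost-distance tools.
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist

# Hypothetical resistance raster: 1 = favourable habitat, 9 = barrier.
raster = [
    [1, 1, 9, 1],
    [1, 1, 9, 1],
    [1, 1, 1, 1],
]
acc = cost_distance(raster, (0, 0))
# A target patch counts as "within reach" if its accumulated cost stays
# below a dispersal-cost threshold estimated from radio-tagged subadults.
print(acc[0][3] <= 15)
```

The least-cost route goes around the barrier column, so the accumulated cost to the far patch reflects the detour rather than the straight-line distance.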


The paper presents some contemporary approaches to spatial environmental data analysis. The main topics concentrate on decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, and the development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model: machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms, used for modeling long-range spatial trends, followed by sequential simulations of the residuals. ML algorithms deliver non-linear solutions for spatially non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms by analyzing the quality and quantity of the spatially structured information extracted from the data. Sequential simulations provide an efficient assessment of uncertainty and spatial variability. A case study of the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be efficiently used in the decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.
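The trend-plus-residual idea behind MLRSS can be sketched in a few lines. This is an illustrative sketch on synthetic data, with a polynomial least-squares trend standing in for the paper's MLP/SVR, and a crude empirical semivariogram for the residual variography; the sequential-simulation step itself is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic deposition field: smooth long-range trend plus short-scale noise.
n = 400
xy = rng.uniform(0, 10, size=(n, 2))
truth = np.sin(xy[:, 0] / 3) + 0.05 * xy[:, 1]
z = truth + 0.1 * rng.standard_normal(n)

# 1) Trend model. A polynomial least-squares fit stands in for the paper's
#    MLP / support-vector regression (assumption: any smooth regressor works).
X = np.column_stack([np.ones(n), xy, xy**2, xy[:, 0] * xy[:, 1]])
coef, *_ = np.linalg.lstsq(X, z, rcond=None)
resid = z - X @ coef

# 2) Variography on the residuals: empirical semivariogram
#    gamma(h) = 0.5 * mean[(r_i - r_j)^2] over point pairs at lag ~h.
def semivariogram(xy, r, lags):
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    sq = 0.5 * (r[:, None] - r[None, :]) ** 2
    upper = np.triu(np.ones_like(d, dtype=bool), 1)  # each pair once
    return np.array([sq[(d >= lo) & (d < hi) & upper].mean()
                     for lo, hi in zip(lags[:-1], lags[1:])])

gamma = semivariogram(xy, resid, np.linspace(0, 5, 6))
# If the trend captured the long-range structure, gamma is roughly flat
# (pure nugget); any remaining structure is what MLRSS simulates sequentially.
print(gamma.round(3))
```

In the paper's workflow the residual semivariogram is exactly the diagnostic used to check how much spatially structured information the ML trend has extracted.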


It has been convincingly argued that computer simulation modeling differs from traditional science. If we understand simulation modeling as a new way of doing science, the manner in which scientists learn about the world through models must also be considered differently. This article examines how researchers learn about environmental processes through computer simulation modeling. Proposing a conceptual framework anchored in a performative philosophical approach, we examine two modeling projects undertaken by research teams in England, both aiming to inform flood risk management. One of the modeling teams operated in the research wing of a consultancy firm; the other consisted of university scientists taking part in an interdisciplinary project experimenting with public engagement. We found that in the first context the use of standardized software was critical to the process of improvisation: the obstacles that emerged concerned data, and they were resolved by exploiting affordances for generating, organizing, and combining scientific information in new ways. In the second context, an environmental competency group, the obstacles related to the computer program, and affordances emerged from combining experience-based knowledge with the scientists' skill, enabling a reconfiguration of the mathematical structure of the model and allowing the group to learn about local flooding.


Our consumption of groundwater, in particular as drinking water or for irrigation, has increased considerably over the years, and numerous problems have appeared as a result, ranging from the prospection of new resources to the remediation of polluted aquifers. Whatever the hydrogeological problem considered, the main challenge remains the characterization of the properties of the subsurface. A stochastic approach is then necessary to represent this uncertainty, by considering multiple geological scenarios and generating a large number of geostatistical realizations. Here we meet the main limitation of these approaches: the computational cost of simulating complex flow processes for each of these realizations. In the first part of the thesis, this problem is investigated in the context of uncertainty propagation, where an ensemble of realizations is identified as representing the properties of the subsurface. To propagate this uncertainty to the quantity of interest while limiting the computational cost, current methods rely on approximate flow models, which allow the identification of a subset of realizations representing the variability of the initial ensemble. The complex flow model is then evaluated only for this subset and, on the basis of these complex responses, inference is made. Our objective is to improve the performance of this approach by using all the available information. To this end, the subset of approximate and exact responses is used to build an error model, which then serves to correct the remaining approximate responses and to predict the response of the complex model. This method maximizes the use of the available information without any perceptible increase in computation time; uncertainty propagation thus becomes more accurate and more robust.
The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between the approximate and the complex flow models. In the second part of the thesis, this methodology is formalized mathematically by introducing a regression model between functional responses. As this problem is ill-posed, its dimensionality must be reduced. In this respect, the novelty of the work presented here lies in the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also makes it possible to diagnose the quality of the error model in this functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid, and the results obtained show that the error model allows a strong reduction in computation time while correctly estimating the uncertainty. Moreover, for each approximate response, the error model provides a prediction of the complex response. The concept of a functional error model is therefore relevant for uncertainty propagation, but also for Bayesian inference problems. Markov chain Monte Carlo (MCMC) methods are the algorithms most commonly used to generate geostatistical realizations consistent with the observations. However, these methods suffer from a very low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. A two-step approach, "two-stage MCMC", was introduced to avoid unnecessary simulations of the complex model through a preliminary evaluation of each proposal. In the third part of the thesis, the approximate flow model coupled with an error model serves as the preliminary evaluation for the two-stage MCMC.
We demonstrate an increase in the acceptance rate by a factor of 1.5 to 3 compared with a classical MCMC implementation. One question remains open: how to choose the size of the training set, and how to identify the realizations that optimize the construction of the error model. This requires an iterative strategy so that, with each new flow simulation, the error model is improved by incorporating the new information. This is developed in the fourth part of the thesis, where the methodology is applied to a saltwater-intrusion problem in a coastal aquifer. -- Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we are facing many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of realizations. The main limitation of this approach is the computational cost associated with performing complex flow simulations in each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization.
Due to computational constraints, state-of-the-art methods make use of approximate flow simulations to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run for this subset, based on which inference is made. Our objective is to increase the performance of this approach by using all of the available information and not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine learning approach. For the subset identified by a classical approach (here the distance kernel method) both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational costs and leads to an increase in accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning from a subset of realizations the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem by a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty.
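The functional error model can be sketched as a regression between principal-component scores of the proxy and exact response curves. The following numpy sketch uses assumed synthetic breakthrough-like curves and ordinary PCA on discretized curves as a stand-in for FPCA; it is an illustration of the idea, not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic ensemble: 60 realizations, breakthrough-like curves on 50 steps.
t = np.linspace(0, 1, 50)
n = 60
arrival = rng.uniform(0.3, 0.7, n)
exact = 1 / (1 + np.exp(-(t[None, :] - arrival[:, None]) / 0.05))
# Proxy: biased and over-smoothed version of the exact response.
proxy = 1 / (1 + np.exp(-(t[None, :] - 0.9 * arrival[:, None]) / 0.08))

learn = np.arange(15)        # subset where the expensive exact model was run
rest = np.arange(15, n)      # only the proxy is available here

def pca(curves, k):
    mean = curves.mean(axis=0)
    _, _, vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, vt[:k]      # mean curve and k principal components

mp, cp = pca(proxy[learn], 3)   # proxy-space basis (FPCA stand-in)
me, ce = pca(exact[learn], 3)   # exact-space basis

# Regress exact scores on proxy scores over the learning set.
sp = (proxy[learn] - mp) @ cp.T
se = (exact[learn] - me) @ ce.T
A, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(learn)), sp]), se,
                        rcond=None)

# Correct the remaining proxy responses: predicted "exact" curves.
sp_rest = (proxy[rest] - mp) @ cp.T
pred = np.column_stack([np.ones(len(rest)), sp_rest]) @ A @ ce + me

rmse_proxy = np.sqrt(((proxy[rest] - exact[rest]) ** 2).mean())
rmse_corr = np.sqrt(((pred - exact[rest]) ** 2).mean())
print(rmse_corr < rmse_proxy)   # corrected curves should beat the raw proxy
```

The diagnostic mentioned in the abstract corresponds to inspecting this regression in score space: if the proxy scores predict the exact scores poorly, the learning set needs to grow.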
The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, to perform Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide an approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of three with respect to one-stage MCMC results. An open question remains: how do we choose the size of the learning set and identify the realizations that optimize the construction of the error model? This requires devising an iterative strategy to construct the error model, such that, as new flow simulations are performed, the error model is iteratively improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.
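Two-stage MCMC screens each proposal with the cheap proxy likelihood and runs the exact model only for proposals that survive the first stage; the proxy terms in the second-stage ratio keep the chain's stationary distribution exact. A toy one-parameter sketch (assumed likelihoods and tuning, not the thesis code):

```python
import math
import random

random.seed(2)

def loglik_exact(m):
    # "Expensive" model: observed datum d = 1.2, forward response m**2.
    return -0.5 * ((m * m - 1.2) / 0.1) ** 2

def loglik_proxy(m):
    # Cheap, slightly biased approximation of the same forward response.
    return -0.5 * ((0.95 * m * m - 1.2) / 0.1) ** 2

m, n_exact, samples = 1.0, 0, []
le_m = loglik_exact(m)
for _ in range(5000):
    prop = m + random.gauss(0, 0.3)   # symmetric random-walk proposal
    # Stage 1: screen with the proxy; most poor proposals die here cheaply.
    if math.log(random.random()) < loglik_proxy(prop) - loglik_proxy(m):
        # Stage 2: run the exact model only for screened proposals. The
        # proxy terms below correct for the stage-1 filter, so the chain
        # still targets the exact posterior.
        n_exact += 1
        le_p = loglik_exact(prop)
        log_a2 = (le_p - le_m) + (loglik_proxy(m) - loglik_proxy(prop))
        if math.log(random.random()) < log_a2:
            m, le_m = prop, le_p
    samples.append(m)

# Far fewer exact evaluations than iterations, yet the chain samples the
# exact posterior (concentrated near m = sqrt(1.2) for this start point).
print(n_exact, sum(samples[1000:]) / len(samples[1000:]))
```

The count `n_exact` is the quantity the abstract's acceptance-rate comparison is about: among the exact simulations actually performed, a much larger fraction is accepted than in one-stage MCMC.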