894 results for Data modelling
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, that is, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached by using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms, several interesting case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides); assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Abstract:
Debris flow hazard modelling at medium (regional) scale has been the subject of various studies in recent years. In this study, hazard zonation was carried out, incorporating information about debris flow initiation probability (spatial and temporal), and the delimitation of the potential runout areas. Debris flow hazard zonation was carried out in the area of the Consortium of Mountain Municipalities of Valtellina di Tirano (Central Alps, Italy). The complexity of the phenomenon, the scale of the study, the variability of local conditioning factors, and the lack of data limited the use of process-based models for the runout zone delimitation. Firstly, a map of hazard initiation probabilities was prepared for the study area, based on the available susceptibility zoning information, and the analysis of two sets of aerial photographs for the temporal probability estimation. Afterwards, the hazard initiation map was used as one of the inputs for an empirical GIS-based model (Flow-R), developed at the University of Lausanne (Switzerland). An estimation of the debris flow magnitude was neglected as the main aim of the analysis was to prepare a debris flow hazard map at medium scale. A digital elevation model, with a 10 m resolution, was used together with land use, geology and debris flow hazard initiation maps as inputs of the Flow-R model to restrict potential areas within each hazard initiation probability class to locations where debris flows are most likely to initiate. Afterwards, runout areas were calculated using multiple flow direction and energy-based algorithms. Maximum probable runout zones were calibrated using documented past events and aerial photographs. Finally, two debris flow hazard maps were prepared. The first simply delimits five hazard zones, while the second incorporates the information about debris flow spreading direction probabilities, showing areas more likely to be affected by future debris flows.
Limitations of the modelling arise mainly from the models applied and the analysis scale, which neglect local controlling factors of debris flow hazard. The presented approach of debris flow hazard analysis, associating automatic detection of the source areas and a simple assessment of the debris flow spreading, provided results for subsequent hazard and risk studies. However, for the validation and transferability of the parameters and results to other study areas, more testing is needed.
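The multiple flow direction step mentioned above can be illustrated with a minimal sketch. It assumes a generic Holmgren-style rule, in which each downslope neighbour receives a share proportional to its slope raised to an exponent; this is not necessarily the variant implemented in Flow-R. The function name, the neighbour ordering and the default exponent are hypothetical choices made for illustration.

```python
import math


def flow_proportions(center_elev, neighbour_elevs, cell_size=10.0, exponent=4.0):
    """Split flow from one cell among its 8 neighbours (illustrative sketch).

    neighbour_elevs is assumed to be ordered N, NE, E, SE, S, SW, W, NW,
    so odd indices are the diagonal neighbours.
    """
    weights = []
    for k, z in enumerate(neighbour_elevs):
        # Diagonal neighbours are farther away than cardinal ones.
        distance = cell_size * (math.sqrt(2.0) if k % 2 == 1 else 1.0)
        slope = (center_elev - z) / distance  # tan(beta), positive downhill
        # Only downslope neighbours receive flow.
        weights.append(slope ** exponent if slope > 0 else 0.0)
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(weights)  # pit or flat cell: nothing is routed
    return [w / total for w in weights]
```

With an exponent of 1 the flow spreads widely across all downslope neighbours; larger exponents concentrate it towards the steepest descent.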
Abstract:
The paper describes how to integrate audience measurement and site visibility as the main research approaches in outdoor advertising research in a single concept. Details are given on how GPS is used on a large scale in Switzerland for mobility analysis and audience measurement. Furthermore, the development of a software solution is introduced that allows the integration of all mobility data and poster location information. Finally, a model and its results are presented for the calculation of the coverage of individual poster campaigns and for the calculation of the number of contacts generated by each billboard.
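The two quantities named above, contacts per billboard and campaign coverage, can be sketched in a heavily simplified form. The sketch assumes that the mobility data have already been reduced to (person, billboard) passage pairs and that every passage counts as one contact; the visibility weighting described in the paper is not modelled. All names are hypothetical.

```python
from collections import defaultdict


def campaign_stats(passages, campaign_billboards, population):
    """Count contacts per billboard and the share of people reached.

    passages: iterable of (person_id, billboard_id) pairs (assumed input format).
    campaign_billboards: set of billboard ids belonging to the campaign.
    population: number of people in the measured sample.
    """
    contacts = defaultdict(int)
    reached = set()
    for person, billboard in passages:
        if billboard in campaign_billboards:
            contacts[billboard] += 1  # every passage is one contact
            reached.add(person)       # a person counts once for coverage
    coverage = len(reached) / population
    return dict(contacts), coverage
```

Coverage counts each person at most once, whereas contacts accumulate over repeated passages, which is why the two figures can diverge strongly for commuter routes.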
Abstract:
Research has demonstrated that landscape or watershed scale processes can influence instream aquatic ecosystems, in terms of the impacts of delivery of fine sediment, solutes and organic matter. Testing such impacts upon populations of organisms (i.e. at the catchment scale) has not proven straightforward and differences have emerged in the conclusions reached. This is: (1) partly because different studies have focused upon different scales of enquiry; but also (2) because the emphasis upon upstream land cover has rarely addressed the extent to which such land covers are hydrologically connected, and hence able to deliver diffuse pollution, to the drainage network. However, there is a third issue. In order to develop suitable hydrological models, we need to conceptualise the process cascade. To do this, we need to know what matters to the organism being impacted by the hydrological system, such that we can identify which processes need to be modelled. Acquiring such knowledge is not easy, especially for organisms like fish that might occupy very different locations in the river over relatively short periods of time. However, and inevitably, hydrological modellers have started by building up piecemeal the aspects of the problem that we think matter to fish. Herein, we report two developments: (a) for the case of sediment-associated diffuse pollution from agriculture, a risk-based modelling framework, SCIMAP, has been developed, which is distinct because it has an explicit focus upon hydrological connectivity; and (b) we use spatially distributed ecological data to infer the processes and the associated process parameters that matter to salmonid fry. We apply the model to spatially distributed salmon and fry data from the River Eden, Cumbria, England. The analysis shows, quite surprisingly, that arable land covers are relatively unimportant as drivers of fry abundance.
What matters most is intensive pasture, a land cover that could be associated with a number of stressors on salmonid fry (e.g. pesticides, fine sediment) and which allows us to identify a series of risky field locations, where this land cover is readily connected to the river system by overland flow. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Aim, Location: Although the alpine mouse Apodemus alpicola has been given species status since 1989, no distribution map has ever been constructed for this endemic alpine rodent in Switzerland. Based on redetermined museum material and using the Ecological-Niche Factor Analysis (ENFA), habitat-suitability maps were computed for A. alpicola, and also for the co-occurring A. flavicollis and A. sylvaticus. Methods: In the particular case of habitat-suitability models, classical approaches (GLMs, GAMs, discriminant analysis, etc.) generally require presence and absence data. The presence records provided by museums can clearly give useful information about species distribution and ecology and have already been used for knowledge-based mapping. In this paper, we apply the ENFA, which requires only presence data, to build a habitat-suitability map of three species of Apodemus on the basis of museum skull collections. Results: Interspecific niche comparisons showed that A. alpicola is very specialized concerning habitat selection, meaning that its habitat differs unequivocally from the average conditions in Switzerland, while both A. flavicollis and A. sylvaticus could be considered as 'generalists' in the study area. Main conclusions: Although an adequate sampling design is the best way to collect ecological data for predictive modelling, this is a time- and money-consuming process and there are cases where time is simply not available, as for instance with endangered species conservation. On the other hand, museums, herbaria and other similar institutions hold huge presence data sets. By applying the ENFA to such data it is possible to rapidly construct a habitat-suitability model. The ENFA method not only provides two key measurements regarding the niche of a species (i.e. marginality and specialization), but also has ecological meaning, and allows the scientist to compare directly the niches of different species.
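The two niche measurements named above can be illustrated for a single environmental variable. The sketch assumes the commonly quoted univariate forms, marginality as the distance between the species mean and the global mean scaled by 1.96 global standard deviations, and specialization as the ratio of global to species standard deviation. The full ENFA is a multivariate factor analysis and is not reproduced here; the function and argument names are hypothetical.

```python
import statistics


def marginality_specialization(global_values, species_values):
    """Univariate marginality and specialization (illustrative sketch).

    global_values: the variable sampled over the whole study area.
    species_values: the same variable at the species presence records.
    """
    m_g = statistics.mean(global_values)
    s_g = statistics.pstdev(global_values)
    m_s = statistics.mean(species_values)
    s_s = statistics.pstdev(species_values)
    if s_g == 0 or s_s == 0:
        raise ValueError("both samples need non-zero spread")
    # How far the species optimum lies from the average conditions.
    marginality = abs(m_g - m_s) / (1.96 * s_g)
    # How narrow the species range is relative to the available range.
    specialization = s_g / s_s
    return marginality, specialization
```

Under these definitions a specialist occupying a narrow, unusual part of the gradient yields a high marginality and a specialization well above 1, whereas a generalist yields values near 0 and 1 respectively.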
Abstract:
OBJECTIVE: To compare the pharmacokinetic and pharmacodynamic characteristics of angiotensin II receptor antagonists as a therapeutic class. DESIGN: Population pharmacokinetic-pharmacodynamic modelling study. METHODS: The data of 14 phase I studies with 10 different drugs were analysed. A common population pharmacokinetic model (two compartments, mixed zero- and first-order absorption, two metabolite compartments) was applied to the 2685 drug and 900 metabolite concentration measurements. A standard nonlinear mixed effect modelling approach was used to estimate the drug-specific parameters and their variabilities. Similarly, a pharmacodynamic model was applied to the 7360 effect measurements, i.e. the decrease of peak blood pressure response to intravenous angiotensin challenge recorded by finger photoplethysmography. The concentration of drug and metabolite in an effect compartment was assumed to translate into receptor blockade [maximum effect (Emax) model with first-order link]. RESULTS: A general pharmacokinetic-pharmacodynamic (PK-PD) model for angiotensin antagonism in healthy individuals was successfully built up for the 10 drugs studied. Representatives of this class differ in their pharmacokinetic and pharmacodynamic profiles. Their effects on blood pressure are dose-dependent, but the time course of the effect varies between the drugs. CONCLUSIONS: The characterisation of PK-PD relationships for these drugs gives the opportunity to optimise therapeutic regimens and to suggest dosage adjustments in specific conditions. Such a model can be used to further refine the use of this class of drugs.
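The Emax model with a first-order link to an effect compartment can be sketched as follows. The sketch assumes the standard textbook form, dCe/dt = ke0 (Cp - Ce) and E = Emax Ce / (EC50 + Ce), integrated with a simple explicit Euler step. The parameter names ke0, emax and ec50 are conventional labels rather than names taken from the study, and the population (mixed-effects) estimation used by the authors is not attempted.

```python
def simulate_effect(times, plasma_conc, ke0, emax, ec50):
    """Effect over time for a given plasma concentration profile (sketch).

    times: increasing sample times.
    plasma_conc: plasma concentration Cp at each time.
    ke0: first-order rate constant linking plasma and effect compartment.
    emax, ec50: maximum effect and concentration giving half of it.
    """
    ce = 0.0  # effect-compartment concentration, assumed to start empty
    effects = [emax * ce / (ec50 + ce)]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        # Explicit Euler step of dCe/dt = ke0 * (Cp - Ce).
        ce += ke0 * (plasma_conc[i - 1] - ce) * dt
        effects.append(emax * ce / (ec50 + ce))
    return effects
```

The delay introduced by ke0 is what lets the effect lag behind the plasma concentration, which is one way the time course of the effect can differ between drugs with similar plasma profiles.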
Abstract:
Schistosomiasis mansoni is not just a physical disease, but is related to social and behavioural factors as well. Snails of the genus Biomphalaria are intermediate hosts for Schistosoma mansoni, which infects humans through water. The objective of this study is to classify the risk of schistosomiasis in the state of Minas Gerais (MG). We focus on socioeconomic and demographic features, basic sanitation features, the presence of accumulated water bodies, dense vegetation in the summer and winter seasons and related terrain characteristics. We draw on the decision tree approach to infection risk modelling and mapping. The model robustness was properly verified. The main variables that were selected by the procedure included the terrain's water accumulation capacity, temperature extremes and the Human Development Index. In addition, the model was used to generate two maps, one that included risk classification for the whole of MG and another that included classification errors. The resulting map was 62.9% accurate.
Abstract:
Spatial data on species distributions are available in two main forms, point locations and distribution maps (polygon ranges and grids). The first are often temporally and spatially biased, and too discontinuous, to be useful (untransformed) in spatial analyses. A variety of modelling approaches are used to transform point locations into maps. We discuss the attributes that point location data and distribution maps must satisfy in order to be useful in conservation planning. We recommend that before point location data are used to produce and/or evaluate distribution models, the dataset should be assessed under a set of criteria, including sample size, age of data, environmental/geographical coverage, independence, accuracy, time relevance and (often forgotten) representation of areas of permanent and natural presence of the species. Distribution maps must satisfy additional attributes if used for conservation analyses and strategies, including minimizing commission and omission errors, credibility of the source/assessors and availability for public screening. We review currently available databases for mammals globally and show that they are highly variable in complying with these attributes. The heterogeneity and weakness of spatial data seriously constrain their utility to global and also sub-global scale conservation analyses.
Abstract:
One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies and ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models, an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
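The data layout described above, an incidence matrix plus a conditional compositional matrix, can be sketched with a small helper. It only separates the two pieces; neither of the binomial conditional logistic normal models is fitted. The function name and the use of None for absent parts are hypothetical choices.

```python
def split_composition(rows):
    """Split compositions into incidence and conditional composition (sketch).

    rows: list of compositions, each a list of non-negative parts in which
    an exact 0 is treated as an essential zero.
    Returns (incidence, conditional): incidence holds 1 where a part is
    present and 0 where it is an essential zero; conditional holds the
    non-zero parts rescaled to sum to 1, with None for absent parts.
    """
    incidence = []
    conditional = []
    for row in rows:
        present = [1 if x > 0 else 0 for x in row]
        total = sum(x for x in row if x > 0)
        incidence.append(present)
        # Stage two only concerns how the unit is shared among present parts.
        conditional.append([x / total if x > 0 else None for x in row])
    return incidence, conditional
```

Keeping the two matrices separate mirrors the two-stage idea: the first stage models where zeros occur, the second models the shares among the parts that are present.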
Abstract:
The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce a convincing mathematical-statistical model. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes represented by the 2000 and 2007 samples, respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater, respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features useful to understand how the volcanic system works, opening new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.
Abstract:
A four compartment model of the cardiovascular system is developed. To allow for easy interpretation and to minimise the number of parameters, an effort was made to keep the model as simple as possible. A sensitivity analysis is first carried out to determine which are the most important model parameters to characterise the blood pressure signal. A four stage process is then described which accurately determines all parameter values. This process is applied to data from three patients and good agreement is shown in all cases.
Abstract:
Aim: We investigated the late Quaternary history of two closely related and partly sympatric species of Primula from the south-western European Alps, P. latifolia Lapeyr. and P. marginata Curtis, by combining phylogeographical and palaeodistribution modelling approaches. In particular, we were interested in whether the two approaches were congruent and identified the same glacial refugia. Location: South-western European Alps. Methods: For the phylogeographical analysis we included 353 individuals from 28 populations of P. marginata and 172 individuals from 15 populations of P. latifolia and used amplified fragment length polymorphisms (AFLPs). For palaeodistribution modelling, species distribution models (SDMs) were based on extant species occurrences and then projected to climate models (CCSM, MIROC) of the Last Glacial Maximum (LGM), approximately 21 ka. Results: The locations of the modelled LGM refugia were confirmed by various indices of genetic variation. The refugia of the two species were largely geographically isolated, overlapping only 6% to 11% of the species' total LGM distribution. This overlap decreased when the position of the glacial ice sheet and the differential elevational and edaphic distributions of the two species were considered. Main conclusions: The combination of phylogeography and palaeodistribution modelling proved useful in locating putative glacial refugia of two alpine species of Primula. The phylogeographical data allowed us to identify those parts of the modelled LGM refugial area that were likely source areas for recolonization. The use of SDMs predicted LGM refugial areas substantially larger and geographically more divergent than could have been predicted by phylogeographical data alone.
Abstract:
Summary Breast cancer is the most common cancer in women and accounts for almost 30% of all new cancer cases in Europe. The number of deaths from breast cancer in Europe is estimated at more than 130,000 per year. These figures explain the considerable social impact of this disease. The objectives of this thesis were: (1) to identify the predispositions and biological mechanisms responsible for the establishment of specific breast cancer subtypes; (2) to validate them in a "human-in-mouse" in vivo model; and (3) to develop treatments specific to each identified breast cancer subtype. The first objective was achieved through the analysis of tumour gene expression data produced in our laboratory. The microarray data were generated from 49 breast tumour biopsies from patients participating in the clinical trial EORTC 10994/BIG00-01. The data were very rich in information and allowed me to validate earlier data from other gene expression studies in breast tumours. In addition, this analysis allowed me to identify a new biological subtype of breast cancer. In the first part of the thesis, I describe the identification of apocrine breast tumours by microarray analysis and the potential implications of this discovery for clinical applications. The second objective was achieved by establishing a human breast cancer model based on primary human mammary epithelial cells (HMECs) derived from breast reductions. I chose to adapt a previously described suspension cell culture system based on mammospheres and decided to express genes using lentiviruses. In the second part of my thesis I describe the establishment of a cell culture system that allows the quantitative transformation of HMECs.
Subsequently, I established a xenograft model in immunodeficient NOD/SCID mice, which makes it possible to model the human disease in the mouse. In the third part of my thesis I describe and discuss the results I obtained by establishing an oestrogen-dependent model of breast cancer through quantitative transformation of HMECs with defined genes, identified by analysis of gene expression data in breast cancer. The transformed cells in our model were oestrogen-dependent for growth, diploid and genetically normal even after prolonged cell culture in vitro. The cells formed tumours in our xenograft model and gave rise to disseminated peritoneal and liver metastases. To achieve the third objective of my thesis, I defined and examined treatment strategies that reduce tumours and metastases. I produced a genetically defined, oestrogen receptor-positive model of breast cancer that makes it possible to model human oestrogen-dependent breast cancer in the mouse. This model allows the study of the mechanisms involved in the formation of tumours and metastases. Abstract Breast cancer is the most common cancer in women and accounts for nearly 30% of all new cancer cases in Europe. The number of deaths from breast cancer in Europe is estimated to be over 130,000 each year, which underlines the social impact of the disease. The goals of this thesis were first, to identify biological features and mechanisms responsible for the establishment of specific breast cancer subtypes, second, to validate them in a human-in-mouse in vivo model and third, to develop specific treatments for identified breast cancer subtypes. The first objective was achieved via the analysis of tumour gene expression data produced in our lab.
The microarray data were generated from 49 breast tumour biopsies that were collected from patients enrolled in the clinical trial EORTC 10994/BIG00-01. The data set was very rich in information and allowed me to validate data of previous breast cancer gene expression studies and to identify biological features of a novel breast cancer subtype. In the first part of the thesis I focus on the identification of molecular apocrine breast tumours by microarray analysis and the potential implication of this finding for the clinics. The second objective was attained by the production of a human breast cancer model system based on primary human mammary epithelial cells (HMECs) derived from reduction mammoplasties. I have chosen to adopt a previously described suspension culture system based on mammospheres and expressed selected target genes using lentiviral expression constructs. In the second part of my thesis I mainly focus on the establishment of a cell culture system allowing for quantitative transformation of HMECs. I then established a xenograft model in immunodeficient NOD/SCID mice, which makes it possible to model human disease in a mouse. In the third part of my thesis I describe and discuss the results that I obtained while establishing an oestrogen-dependent model of breast cancer by quantitative transformation of HMECs with defined genes identified after breast cancer gene expression data analysis. The transformed cells in our model are oestrogen-dependent for growth and remain diploid and genetically normal even after prolonged cell culture in vitro. The cells form tumours and form disseminated peritoneal and liver metastases in our xenograft model. Along the lines of the third objective of my thesis I defined and tested treatment schemes that reduce tumours and metastases.
I have generated a genetically defined model of oestrogen receptor alpha positive human breast cancer that makes it possible to model human oestrogen-dependent breast cancer in a mouse and enables the study of mechanisms involved in tumorigenesis and metastasis.
Abstract:
Predicting which species will occur together in the future, and where, remains one of the greatest challenges in ecology, and requires a sound understanding of how the abiotic and biotic environments interact with dispersal processes and history across scales. Biotic interactions and their dynamics influence species' relationships to climate, and this also has important implications for predicting future distributions of species. It is already well accepted that biotic interactions shape species' spatial distributions at local spatial extents, but the role of these interactions beyond local extents (e.g. 10 km² to global extents) is usually dismissed as unimportant. In this review we consolidate evidence for how biotic interactions shape species distributions beyond local extents and review methods for integrating biotic interactions into species distribution modelling tools. Drawing upon evidence from contemporary and palaeoecological studies of individual species ranges, functional groups, and species richness patterns, we show that biotic interactions have clearly left their mark on species distributions and realised assemblages of species across all spatial extents. We demonstrate this with examples from within and across trophic groups. A range of species distribution modelling tools is available to quantify species-environment relationships and predict species occurrence, such as: (i) integrating pairwise dependencies, (ii) using integrative predictors, and (iii) hybridising species distribution models (SDMs) with dynamic models. These methods have typically only been applied to interacting pairs of species at a single time, require a priori ecological knowledge about which species interact, and due to data paucity must assume that biotic interactions are constant in space and time. To better inform the future development of these models across spatial scales, we call for accelerated collection of spatially and temporally explicit species data.
Ideally, these data should be sampled to reflect variation in the underlying environment across large spatial extents, and at fine spatial resolution. Simplified ecosystems where there are relatively few interacting species and sometimes a wealth of existing ecosystem monitoring data (e.g. arctic, alpine or island habitats) offer settings where the development of modelling tools that account for biotic interactions may be less difficult than elsewhere.