860 results for conceptual data modelling


Relevance: 30.00%

Publisher:

Abstract:

The paper presents an approach for mapping of precipitation data. The main goal is to perform spatial predictions and simulations of precipitation fields using geostatistical methods (ordinary kriging, kriging with external drift) as well as machine learning algorithms (neural networks). More practically, the objective is to reproduce simultaneously both the spatial patterns and the extreme values. This objective is best reached by models integrating geostatistics and machine learning algorithms. To demonstrate how such models work, two case studies have been considered: first, a 2-day accumulation of heavy precipitation and second, a 6-day accumulation of extreme orographic precipitation. The first example is used to compare the performance of two optimization algorithms (conjugate gradients and Levenberg-Marquardt) of a neural network for the reproduction of extreme values. Hybrid models, which combine geostatistical and machine learning algorithms, are also treated in this context. The second dataset is used to analyze the contribution of radar Doppler imagery when used as external drift or as input in the models (kriging with external drift and neural networks). Model assessment is carried out by comparing independent validation errors as well as analyzing data patterns.
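
A minimal sketch of the hybrid idea summarised above, not the authors' implementation: a neural network captures the nonlinear trend from coordinates and elevation, and a Gaussian-process (kriging-type) interpolator is fitted to its residuals. The data, column meanings and hyperparameters are invented for illustration.

```python
# Hybrid geostatistics / machine-learning sketch: NN trend + GP-interpolated residuals.
# Illustrative only; station data and hyperparameters are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(300, 2))           # station coordinates (km), synthetic
z = rng.uniform(200, 2500, size=(300, 1))        # station elevation (m), synthetic
rain = 5 + 0.01 * z[:, 0] + 10 * np.sin(X[:, 0] / 20) + rng.normal(0, 2, 300)

features = np.hstack([X, z])                     # geo-features used as NN inputs
nn = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=5000, random_state=0)
nn.fit(features, rain)                           # large-scale, nonlinear trend

residuals = rain - nn.predict(features)          # spatially correlated leftovers
gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0) + WhiteKernel(),
                              normalize_y=True)
gp.fit(X, residuals)                             # kriging-like interpolation of residuals

# Prediction at a new location = NN trend + interpolated residual
X_new, z_new = np.array([[50.0, 50.0]]), np.array([[1200.0]])
print(nn.predict(np.hstack([X_new, z_new])) + gp.predict(X_new))
```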

Relevance: 30.00%

Publisher:

Abstract:

This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve the basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached by using different nonlinear feature selection/feature extraction tools. To demonstrate the application of machine learning algorithms, several case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazard risk analysis (avalanches, landslides); and assessments of renewable resources (wind fields) with SVM and ANN models. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2 and 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional geostatistical models.
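
The high-dimensional geo-feature setting described above can be illustrated with a short Support Vector Regression sketch; the features, target and parameters below are synthetic placeholders, not those of the report.

```python
# Sketch of SVM-based mapping in a geo-feature space (coordinates plus terrain-derived
# features); purely illustrative, all data and parameters are invented.
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 500
coords = rng.uniform(0, 50, size=(n, 2))            # x, y (km)
terrain = rng.normal(size=(n, 8))                   # e.g. slope, curvature, DEM derivatives
X = np.hstack([coords, terrain])                    # 10-dimensional geo-feature space
y = coords[:, 0] * 0.3 + terrain[:, 0] ** 2 + rng.normal(0, 0.5, n)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
scores = cross_val_score(model, X, y, cv=5, scoring="neg_root_mean_squared_error")
print("CV RMSE:", -scores.mean())
```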

Relevance: 30.00%

Publisher:

Abstract:

Debris flow hazard modelling at medium (regional) scale has been the subject of various studies in recent years. In this study, hazard zonation was carried out, incorporating information about debris flow initiation probability (spatial and temporal) and the delimitation of the potential runout areas. Debris flow hazard zonation was carried out in the area of the Consortium of Mountain Municipalities of Valtellina di Tirano (Central Alps, Italy). The complexity of the phenomenon, the scale of the study, the variability of local conditioning factors and the lack of data limited the use of process-based models for the runout zone delimitation. Firstly, a map of hazard initiation probabilities was prepared for the study area, based on the available susceptibility zoning information and the analysis of two sets of aerial photographs for the temporal probability estimation. Afterwards, the hazard initiation map was used as one of the inputs for an empirical GIS-based model (Flow-R), developed at the University of Lausanne (Switzerland). An estimation of the debris flow magnitude was omitted, as the main aim of the analysis was to prepare a debris flow hazard map at medium scale. A digital elevation model with a 10 m resolution was used together with land use, geology and debris flow hazard initiation maps as inputs of the Flow-R model, to restrict potential areas within each hazard initiation probability class to locations where debris flows are most likely to initiate. Afterwards, runout areas were calculated using multiple flow direction and energy-based algorithms. Maximum probable runout zones were calibrated using documented past events and aerial photographs. Finally, two debris flow hazard maps were prepared. The first simply delimits five hazard zones, while the second incorporates information about debris flow spreading direction probabilities, showing areas more likely to be affected by future debris flows. Limitations of the modelling arise mainly from the models applied and the analysis scale, which neglect local controlling factors of debris flow hazard. The presented approach to debris flow hazard analysis, associating automatic detection of the source areas with a simple assessment of the debris flow spreading, provided results for subsequent hazard and risk studies. However, for the validation and transferability of the parameters and results to other study areas, more testing is needed.
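
A toy sketch of the multiple-flow-direction spreading step mentioned above, not the Flow-R code: source "probability" is passed to lower neighbours in proportion to the elevation drop. The DEM, source cell and weighting are invented.

```python
# Toy multiple-flow-direction (MFD) spreading on a small DEM grid; a simplified,
# slope-proportional redistribution of source probability to lower neighbours.
import numpy as np

dem = np.array([[9., 8., 7., 6.],
                [8., 7., 6., 5.],
                [7., 6., 5., 4.],
                [6., 5., 4., 3.]])
spread = np.zeros_like(dem)
spread[0, 0] = 1.0                          # debris-flow initiation (source) cell

neigh = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
for flat in np.argsort(dem, axis=None)[::-1]:      # process cells from highest to lowest
    i, j = divmod(int(flat), dem.shape[1])
    drops, targets = [], []
    for di, dj in neigh:
        ni, nj = i + di, j + dj
        if 0 <= ni < dem.shape[0] and 0 <= nj < dem.shape[1] and dem[ni, nj] < dem[i, j]:
            drops.append(dem[i, j] - dem[ni, nj])
            targets.append((ni, nj))
    if targets and spread[i, j] > 0:
        weights = np.array(drops) / sum(drops)     # slope-proportional partition
        for (ni, nj), w in zip(targets, weights):
            spread[ni, nj] += w * spread[i, j]     # pass probability downslope

print(np.round(spread, 3))                  # relative likelihood of each cell being reached
```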

Relevance: 30.00%

Publisher:

Abstract:

The paper describes how to integrate audience measurement and site visibility, the main research approaches in outdoor advertising, into a single concept. Details are given on how GPS is used on a large scale in Switzerland for mobility analysis and audience measurement. Furthermore, the development of a software solution is introduced that allows the integration of all mobility data and poster location information. Finally, a model and its results are presented for the calculation of the coverage of individual poster campaigns and of the number of contacts generated by each billboard.
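
A minimal sketch of the coverage and contact logic described above, under the assumption that a contact is a GPS track point falling within a fixed visibility radius of a poster; all coordinates, identifiers and the radius are invented.

```python
# Sketch of coverage / contact counting: GPS track points within a visibility radius
# of each poster site count as contacts; coverage is the share of people reached.
import numpy as np
import pandas as pd

gps = pd.DataFrame({                       # hypothetical mobility data
    "person": ["a", "a", "b", "b", "c"],
    "x": [10.0, 55.0, 12.0, 80.0, 11.0],
    "y": [20.0, 60.0, 22.0, 90.0, 19.0],
})
gps["reached"] = False
posters = pd.DataFrame({"poster": [1, 2], "x": [11.0, 81.0], "y": [21.0, 89.0]})
radius = 5.0                               # assumed visibility distance

records = []
for _, p in posters.iterrows():
    near = np.hypot(gps["x"] - p["x"], gps["y"] - p["y"]) <= radius
    records.append({"poster": p["poster"], "contacts": int(near.sum())})
    gps.loc[near, "reached"] = True

contacts = pd.DataFrame(records)                             # contacts per billboard
coverage = gps.groupby("person")["reached"].any().mean()     # share of people reached
print(contacts)
print("campaign coverage:", round(float(coverage), 2))
```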

Relevance: 30.00%

Publisher:

Abstract:

Research has demonstrated that landscape or watershed scale processes can influence instream aquatic ecosystems through the delivery of fine sediment, solutes and organic matter. Testing such impacts upon populations of organisms (i.e. at the catchment scale) has not proven straightforward, and differences have emerged in the conclusions reached. This is: (1) partly because different studies have focused upon different scales of enquiry; but also (2) because the emphasis upon upstream land cover has rarely addressed the extent to which such land covers are hydrologically connected, and hence able to deliver diffuse pollution, to the drainage network. However, there is a third issue. In order to develop suitable hydrological models, we need to conceptualise the process cascade. To do this, we need to know what matters to the organism being impacted by the hydrological system, so that we can identify which processes need to be modelled. Acquiring such knowledge is not easy, especially for organisms like fish that might occupy very different locations in the river over relatively short periods of time. However, and inevitably, hydrological modellers have started by building up piecemeal the aspects of the problem that we think matter to fish. Herein, we report two developments: (a) for the case of sediment-associated diffuse pollution from agriculture, a risk-based modelling framework, SCIMAP, has been developed, which is distinct because it has an explicit focus upon hydrological connectivity; and (b) we use spatially distributed ecological data to infer the processes, and the associated process parameters, that matter to salmonid fry. We apply the model to spatially distributed salmon and fry data from the River Eden, Cumbria, England. The analysis shows, quite surprisingly, that arable land covers are relatively unimportant as drivers of fry abundance. What matters most is intensive pasture, a land cover that could be associated with a number of stressors on salmonid fry (e.g. pesticides, fine sediment) and which allows us to identify a series of risky field locations where this land cover is readily connected to the river system by overland flow.
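
A minimal sketch of the connectivity-weighted risk idea, not the SCIMAP implementation: an in-field land-cover risk only contributes where hydrological connectivity to the channel is high. The land-cover weights and connectivity values are assumptions.

```python
# Minimal sketch of connectivity-weighted diffuse-pollution risk: in-field risk from
# land cover is only "delivered" where hydrological connectivity is high.
# All weights and the connectivity index are invented for illustration.
import numpy as np

land_cover = np.array([["pasture", "arable"],
                       ["woodland", "pasture"]])
cover_risk = {"pasture": 0.8, "arable": 0.5, "woodland": 0.1}   # assumed export risk
connectivity = np.array([[0.9, 0.2],                            # e.g. wetness / flow-path based
                         [0.4, 0.7]])

in_field = np.vectorize(cover_risk.get)(land_cover)
delivered_risk = in_field * connectivity        # risk that actually reaches the channel
print(delivered_risk)
```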

Relevance: 30.00%

Publisher:

Abstract:

Aim and location: Although the alpine mouse Apodemus alpicola has been given species status since 1989, no distribution map has ever been constructed for this endemic alpine rodent in Switzerland. Based on redetermined museum material and using Ecological-Niche Factor Analysis (ENFA), habitat-suitability maps were computed for A. alpicola, and also for the co-occurring A. flavicollis and A. sylvaticus. Methods: In the particular case of habitat suitability models, classical approaches (GLMs, GAMs, discriminant analysis, etc.) generally require presence and absence data. The presence records provided by museums can clearly give useful information about species distribution and ecology and have already been used for knowledge-based mapping. In this paper, we apply ENFA, which requires only presence data, to build a habitat-suitability map of three species of Apodemus on the basis of museum skull collections. Results: Interspecific niche comparisons showed that A. alpicola is very specialized with respect to habitat selection, meaning that its habitat differs unequivocally from the average conditions in Switzerland, while both A. flavicollis and A. sylvaticus can be considered 'generalists' in the study area. Main conclusions: Although an adequate sampling design is the best way to collect ecological data for predictive modelling, this is a time-consuming and costly process, and there are cases where time is simply not available, as for instance in endangered species conservation. On the other hand, museums, herbaria and other similar institutions hold huge presence data sets. By applying ENFA to such data it is possible to rapidly construct a habitat suitability model. The ENFA method not only provides two key measurements regarding the niche of a species (i.e. marginality and specialization), but also has ecological meaning and allows the scientist to compare directly the niches of different species.
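
A short sketch of the two univariate quantities behind ENFA, marginality and specialization, computed from presence-only records against the whole study area. The data are synthetic and no factor extraction is performed, so this is only a schematic illustration of the measures, not a full ENFA.

```python
# Per-variable marginality and specialization from presence-only records versus the
# whole study area; synthetic data, schematic only (no ENFA factor extraction).
import numpy as np

rng = np.random.default_rng(2)
area_env = rng.normal(loc=0.0, scale=1.0, size=(10000, 3))    # e.g. elevation, slope, forest cover
presence_env = rng.normal(loc=1.2, scale=0.4, size=(80, 3))   # values at museum presence records

# Marginality: how far the species mean sits from the area mean (in area SD units);
# specialization: how much narrower the species niche is than the available range.
marginality = np.abs(presence_env.mean(0) - area_env.mean(0)) / (1.96 * area_env.std(0))
specialization = area_env.std(0) / presence_env.std(0)         # >1 means narrower niche

for name, m, s in zip(["elevation", "slope", "forest"], marginality, specialization):
    print(f"{name}: marginality={m:.2f}, specialization={s:.2f}")
```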

Relevance: 30.00%

Publisher:

Abstract:

OBJECTIVE: To compare the pharmacokinetic and pharmacodynamic characteristics of angiotensin II receptor antagonists as a therapeutic class. DESIGN: Population pharmacokinetic-pharmacodynamic modelling study. METHODS: The data of 14 phase I studies with 10 different drugs were analysed. A common population pharmacokinetic model (two compartments, mixed zero- and first-order absorption, two metabolite compartments) was applied to the 2685 drug and 900 metabolite concentration measurements. A standard nonlinear mixed-effect modelling approach was used to estimate the drug-specific parameters and their variabilities. Similarly, a pharmacodynamic model was applied to the 7360 effect measurements, i.e. the decrease of the peak blood pressure response to an intravenous angiotensin challenge recorded by finger photoplethysmography. The concentration of drug and metabolite in an effect compartment was assumed to translate into receptor blockade [maximum effect (Emax) model with first-order link]. RESULTS: A general pharmacokinetic-pharmacodynamic (PK-PD) model for angiotensin antagonism in healthy individuals was successfully built for the 10 drugs studied. Representatives of this class show distinct pharmacokinetic and pharmacodynamic profiles. Their effects on blood pressure are dose-dependent, but the time course of the effect varies between the drugs. CONCLUSIONS: The characterisation of PK-PD relationships for these drugs gives the opportunity to optimise therapeutic regimens and to suggest dosage adjustments in specific conditions. Such a model can be used to further refine the use of this class of drugs.
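
A minimal sketch of the effect-compartment Emax link described in the methods: plasma concentration drives an effect compartment with a first-order rate ke0, and the effect follows an Emax model. The concentration profile and all parameter values are invented.

```python
# Effect-compartment Emax sketch: dCe/dt = ke0*(Cp - Ce), E = Emax*Ce/(EC50 + Ce).
# Illustrative parameters only; not the fitted population model.
import numpy as np

t = np.linspace(0, 24, 241)                 # time (h)
Cp = 100 * np.exp(-0.2 * t)                 # hypothetical plasma concentration (ng/mL)

ke0, Emax, EC50 = 0.5, 100.0, 20.0          # assumed link-rate and Emax parameters
Ce = np.zeros_like(Cp)
dt = t[1] - t[0]
for k in range(1, len(t)):                  # Euler integration of the effect compartment
    Ce[k] = Ce[k - 1] + dt * ke0 * (Cp[k - 1] - Ce[k - 1])

effect = Emax * Ce / (EC50 + Ce)            # % blockade of the angiotensin II response
print(f"peak effect {effect.max():.1f}% at t = {t[effect.argmax()]:.1f} h")
```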

Relevance: 30.00%

Publisher:

Abstract:

Schistosomiasis mansoni is not just a physical disease, but is related to social and behavioural factors as well. Snails of the Biomphalaria genus are an intermediate host for Schistosoma mansoni and infect humans through water. The objective of this study is to classify the risk of schistosomiasis in the state of Minas Gerais (MG). We focus on socioeconomic and demographic features, basic sanitation features, the presence of accumulated water bodies, dense vegetation in the summer and winter seasons and related terrain characteristics. We draw on the decision tree approach to infection risk modelling and mapping. The model robustness was properly verified. The main variables that were selected by the procedure included the terrain's water accumulation capacity, temperature extremes and the Human Development Index. In addition, the model was used to generate two maps, one that included risk classification for the entire of MG and another that included classification errors. The resulting map was 62.9% accurate.
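
A brief sketch of a decision-tree risk classifier of the kind used above; the predictor names follow the abstract (terrain water accumulation, temperature extremes, HDI), but the data, thresholds and labels are entirely synthetic.

```python
# Decision-tree risk classification sketch with synthetic data; variable names follow
# the abstract but values and the "risk" rule are invented for illustration.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(3)
n = 400
X = np.column_stack([
    rng.uniform(0, 1, n),        # terrain water-accumulation capacity
    rng.uniform(10, 40, n),      # temperature extreme (deg C)
    rng.uniform(0.4, 0.9, n),    # Human Development Index
])
# synthetic label: wetter terrain, hotter extremes and lower HDI -> higher risk
risk = ((X[:, 0] > 0.5) & (X[:, 1] > 25) & (X[:, 2] < 0.7)).astype(int)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, risk)
print(export_text(tree, feature_names=["water_accum", "temp_extreme", "hdi"]))
```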

Relevance: 30.00%

Publisher:

Abstract:

In their safety evaluations of bisphenol A (BPA), the U.S. Food and Drug Administration (FDA) and a counterpart in Europe, the European Food Safety Authority (EFSA), have given special prominence to two industry-funded studies that adhered to standards defined by Good Laboratory Practices (GLP). These same agencies have given much less weight in risk assessments to a large number of independently replicated non-GLP studies conducted with government funding by the leading experts in various fields of science from around the world. OBJECTIVES: We reviewed differences between industry-funded GLP studies of BPA conducted by commercial laboratories for regulatory purposes and non-GLP studies conducted in academic and government laboratories to identify hazards and molecular mechanisms mediating adverse effects. We examined the methods and results in the GLP studies that were pivotal in the draft decision of the U.S. FDA declaring BPA safe in relation to findings from studies that were competitive for U.S. National Institutes of Health (NIH) funding, peer-reviewed for publication in leading journals, subject to independent replication, but rejected by the U.S. FDA for regulatory purposes. DISCUSSION: Although the U.S. FDA and EFSA have deemed two industry-funded GLP studies of BPA to be superior to hundreds of studies funded by the U.S. NIH and NIH counterparts in other countries, the GLP studies on which the agencies based their decisions have serious conceptual and methodologic flaws. In addition, the U.S. FDA and EFSA have mistakenly assumed that GLP yields valid and reliable scientific findings (i.e., "good science"). Their rationale for favoring GLP studies over hundreds of publicly funded studies ignores the central factor in determining the reliability and validity of scientific findings, namely, independent replication, and use of the most appropriate and sensitive state-of-the-art assays, neither of which is an expectation of industry-funded GLP research. CONCLUSIONS: Public health decisions should be based on studies using appropriate protocols with appropriate controls and the most sensitive assays, not GLP. Relevant NIH-funded research using state-of-the-art techniques should play a prominent role in safety evaluations of chemicals.

Relevance: 30.00%

Publisher:

Abstract:

Roughly fifteen years ago, the Church of Jesus Christ of Latter-day Saints published a new proposed standard file format, called GEDCOM. It was designed to allow different genealogy programs to exchange data. Five years later, in May 2000, the GENTECH Data Modeling Project appeared, with the support of the Federation of Genealogical Societies (FGS) and other American genealogical societies. It attempted to define a genealogical logical data model to facilitate data exchange between different genealogical programs. Although genealogists deal with an enormous variety of data sources, one of the central concepts of this data model was that all genealogical data could be broken down into a series of short, formal genealogical statements. This was something more versatile than merely exporting and importing data records with predefined fields. The project was finally absorbed in 2004 by the National Genealogical Society (NGS). Despite being a genealogical reference in many applications, these models have serious drawbacks when adapting to different cultural and social environments. At present there is no formal proposal for a recognized standard to represent the family domain. Here we propose an alternative conceptual model, largely inherited from the aforementioned models. The design is intended to overcome their limitations; however, its major innovation lies in applying the ontological paradigm when modelling statements and entities.
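
A minimal sketch of a statement-centred model in the spirit described above: every piece of evidence is a short, sourced statement linking entities. The class and field names are hypothetical, not the authors' schema.

```python
# Statement-based genealogical data model sketch; hypothetical classes and fields,
# illustrating evidence expressed as short, sourced statements between entities.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entity:
    id: str
    kind: str                  # e.g. "person", "place", "event"
    labels: List[str] = field(default_factory=list)

@dataclass
class Statement:
    subject: Entity            # who or what the statement is about
    predicate: str             # e.g. "born_in", "child_of", "married_to"
    obj: Entity                # the related entity
    source: str                # citation of the record the statement comes from
    date: str = ""             # optional date qualifier

anna = Entity("p1", "person", ["Anna Puig"])
girona = Entity("pl1", "place", ["Girona"])
evidence = [Statement(anna, "born_in", girona,
                      source="Parish register, vol. 3, f. 12", date="1861-04-02")]
for s in evidence:
    print(f"{s.subject.labels[0]} {s.predicate} {s.obj.labels[0]} [{s.source}]")
```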

Relevance: 30.00%

Publisher:

Abstract:

Spatial data on species distributions are available in two main forms: point locations and distribution maps (polygon ranges and grids). The former are often temporally and spatially biased, and too discontinuous to be useful (untransformed) in spatial analyses. A variety of modelling approaches are used to transform point locations into maps. We discuss the attributes that point location data and distribution maps must satisfy in order to be useful in conservation planning. We recommend that before point location data are used to produce and/or evaluate distribution models, the dataset should be assessed against a set of criteria, including sample size, age of data, environmental/geographical coverage, independence, accuracy, time relevance and (often forgotten) representation of areas of permanent and natural presence of the species. Distribution maps must satisfy additional attributes if used for conservation analyses and strategies, including minimizing commission and omission errors, credibility of the source/assessors and availability for public screening. We review currently available databases for mammals globally and show that they are highly variable in complying with these attributes. The heterogeneity and weakness of spatial data seriously constrain their utility for global and sub-global scale conservation analyses.
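
A small sketch of how such a pre-modelling screening could look in practice; the criteria mirror some of those listed above, but the thresholds are arbitrary placeholders rather than recommendations.

```python
# Simple pass/fail screening of a presence-only point dataset; thresholds are
# illustrative placeholders, not recommended values.
import pandas as pd

def screen_occurrences(df: pd.DataFrame, min_n: int = 50, max_age_years: int = 30,
                       max_coord_error_km: float = 5.0) -> dict:
    """Return simple pass/fail flags for a presence-only point dataset."""
    current_year = pd.Timestamp.now().year
    return {
        "sample_size_ok": len(df) >= min_n,
        "age_ok": (current_year - df["year"]).max() <= max_age_years,
        "accuracy_ok": (df["coord_error_km"] <= max_coord_error_km).all(),
        "no_duplicates": not df.duplicated(["x", "y", "year"]).any(),
    }

records = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0],
                        "year": [2001, 2015], "coord_error_km": [0.5, 2.0]})
print(screen_occurrences(records))
```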

Relevance: 30.00%

Publisher:

Abstract:

One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies, ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models, an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
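
A minimal sketch of the two-stage structure described above, used here only to simulate data: a Bernoulli incidence stage decides which parts are essential zeros, and a logistic-normal stage distributes the unit over the non-zero parts. Probabilities and covariances are invented, and no estimation is attempted.

```python
# Two-stage essential-zero simulation sketch: stage 1 = incidence (which parts are
# non-zero), stage 2 = logistic-normal composition over the non-zero parts.
import numpy as np

rng = np.random.default_rng(4)
D = 4                                        # number of compositional parts
p_present = np.array([0.95, 0.7, 0.5, 0.9])  # stage 1: incidence probabilities (assumed)
mu, cov = np.zeros(D), 0.3 * np.eye(D)       # stage 2: log-scale mean and covariance

def simulate_row():
    present = rng.random(D) < p_present      # incidence vector
    if not present.any():
        present[0] = True                    # keep at least one non-zero part
    y = np.exp(rng.multivariate_normal(mu, cov))
    y = np.where(present, y, 0.0)
    return present.astype(int), y / y.sum()  # closure over the non-zero parts only

incidence, composition = zip(*(simulate_row() for _ in range(5)))
print(np.array(incidence))                   # incidence matrix
print(np.round(np.array(composition), 3))    # conditional compositional matrix
```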

Relevance: 30.00%

Publisher:

Abstract:

The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce a convincing mathematical-statistical modelling. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes represented by the 2000 and 2007 samples, respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater, respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features useful to understand how the volcanic system works, opening new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.
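
A simplified sketch of the end-member idea: an intermediate composition is expressed as a mixture of the 2000 and 2007 compositions, with the mixing proportion estimated by least squares in log-ratio space. The compositions are invented, and this is not the authors' inverse-modelling procedure.

```python
# Two end-member mixing sketch for compositional data: estimate the contribution of
# the 2000 end-member to an intermediate sample by least squares in clr space.
import numpy as np

def clr(x):
    """Centred log-ratio transform of a composition."""
    logx = np.log(x)
    return logx - logx.mean()

end_2000 = np.array([0.80, 0.15, 0.05])       # e.g. H2O, CO2, total S (parts of 1), invented
end_2007 = np.array([0.60, 0.30, 0.10])
sample = np.array([0.70, 0.22, 0.08])         # intermediate fumarole composition, invented

a, b, y = clr(end_2000), clr(end_2007), clr(sample)
# y ~= alpha*a + (1-alpha)*b  ->  (y - b) ~= alpha*(a - b), solved by least squares
alpha = float(np.dot(y - b, a - b) / np.dot(a - b, a - b))
print(f"estimated contribution of the 2000 end-member: {alpha:.2f}")
```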

Relevance: 30.00%

Publisher:

Abstract:

A four compartment model of the cardiovascular system is developed. To allow for easy interpretation and to minimise the number of parameters, an effort was made to keep the model as simple as possible. A sensitivity analysis is first carried out to determine which are the most important model parameters to characterise the blood pressure signal. A four stage process is then described which accurately determines all parameter values. This process is applied to data from three patients and good agreement is shown in all cases.
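
A rough sketch of a lumped-parameter, four-compartment circulation of the general kind described above, not the authors' model: each compartment has a compliance, flows between compartments follow pressure differences across resistances, and the heart is reduced to a crude time-varying elastance. All parameter values are invented.

```python
# Lumped-parameter four-compartment circulation sketch (illustrative, not the paper's
# model): pressures P = V/C, flows Q = dP/R around a closed loop, simple one-way valves.
import numpy as np

C = np.array([1.0, 1.5, 30.0, 60.0])         # compliances (mL/mmHg), invented
R = np.array([1.0, 1.2, 0.05, 0.08])         # resistances (mmHg*s/mL) between compartments
V = np.array([120.0, 500.0, 2500.0, 400.0])  # initial volumes (mL), invented

def elastance(t, period=0.8):
    """Crude time-varying elastance factor for the 'heart' compartment."""
    phase = (t % period) / period
    return 1.0 + 2.0 * np.exp(-((phase - 0.3) / 0.12) ** 2)

dt, T = 1e-3, 5.0
for step in range(int(T / dt)):
    t = step * dt
    P = V / C
    P[0] *= elastance(t)                      # systolic boost in compartment 0
    Q = (P - np.roll(P, -1)) / R              # flow from compartment i to i+1 (closed loop)
    Q = np.maximum(Q, 0.0)                    # simple valves: no backflow
    V = V + dt * (np.roll(Q, 1) - Q)          # volume balance in each compartment

print("final pressures (mmHg):", np.round(V / C, 1))
```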