829 results for Bayesian mapping
Abstract:
During 2012-2013, the homicide rate in El Salvador fell from 69.9 to 42.2 per 100,000 population following a truce between the leaders of the two major gangs, “Mara Salvatrucha” and “Barrio 18”, and the government. Despite the apparent success of the truce, it was speculated that the drop in murders could have been due to the killers simply hiding the bodies of their victims. This paper aims to determine whether gangs did in fact disappear their victims in order to cut the official murder counts, or whether they committed these crimes for other reasons. The results suggest that Salvadoran gangs had been using disappearance as a means of gaining sustained social control over residents of already gang-dominated areas, and that, together with homicide, disappearance is part of a process of territorial spread and strategic strengthening by which these groups are enhancing their capabilities to interfere in the alliances of Mexican drug trafficking organizations with Central American criminal organizations specializing in the transshipment of drugs and in providing access to local markets to distribute and sell drugs. Our findings show that the risk of disappearance was already high before the truce was in place, and that it remains high while undergoing geographic expansion.
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Abstract:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model-dependent (geostatistics) and data-driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.
Abstract:
In geographical epidemiology, maps of disease rates and disease risk provide a spatial perspective for researching disease etiology. For rare diseases or when the population base is small, the rate and risk estimates may be unstable. Empirical Bayesian (EB) methods have been used to spatially smooth the estimates by permitting an area estimate to "borrow strength" from its neighbors. Such EB methods include the use of a Gamma model, of a James-Stein estimator, and of a conditional autoregressive (CAR) process. A fully Bayesian analysis of the CAR process is proposed. One advantage of this fully Bayesian analysis is that it can be implemented simply by repeated sampling from the posterior densities; a Markov chain Monte Carlo technique such as the Gibbs sampler is not necessary. Direct resampling from the posterior densities provides exact small-sample inferences instead of the approximate asymptotic analyses of maximum likelihood methods (Clayton & Kaldor, 1987). Further, the proposed CAR model allows covariates to be included. A simulation demonstrates the effect of sample size on the fully Bayesian analysis of the CAR process. The methods are applied to lip cancer data from Scotland, and the results are compared.
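The "borrowing strength" idea behind the Gamma model mentioned above can be illustrated with a minimal sketch. It assumes a Poisson likelihood for the observed counts and a Gamma(alpha, beta) prior (shape, rate) on each area's relative risk, so the posterior mean is (O + alpha) / (E + beta). The function names and the fixed hyperparameters are illustrative assumptions; in an empirical Bayes analysis alpha and beta would be estimated from the data, and this sketch does not implement the CAR process proposed in the abstract.

```python
def raw_rates(observed, expected):
    """Unsmoothed relative risk per area: observed / expected count."""
    return [o / e for o, e in zip(observed, expected)]


def gamma_smoothed_rates(observed, expected, alpha, beta):
    """Posterior mean relative risk under a Poisson-Gamma model.

    Assumes O_i ~ Poisson(E_i * theta_i) and theta_i ~ Gamma(alpha, beta)
    with shape alpha and rate beta (prior mean alpha / beta).
    alpha and beta are supplied by the caller here (an assumption);
    empirical Bayes would estimate them from all areas.
    """
    return [(o + alpha) / (e + beta) for o, e in zip(observed, expected)]


# Hypothetical counts: one small area with zero cases, two larger areas.
observed = [0, 5, 50]
expected = [1.0, 5.0, 50.0]

raw = raw_rates(observed, expected)
smoothed = gamma_smoothed_rates(observed, expected, alpha=2.0, beta=2.0)
```

With these made-up numbers, the small area's raw rate of 0 is pulled strongly toward the prior mean of 1, while areas with large expected counts are barely moved.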
Abstract:
The evolution and population dynamics of avian coronaviruses (AvCoVs) remain underexplored. In the present study, in-depth phylogenetic and Bayesian phylogeographic studies were conducted to investigate the evolutionary dynamics of AvCoVs detected in wild and synanthropic birds. A total of 500 samples, including tracheal and cloacal swabs collected from 312 wild birds belonging to 42 species, were analysed using molecular assays. A total of 65 samples (13%) from 22 bird species were positive for AvCoV. Molecular evolution analyses revealed that the sequences from samples collected in Brazil did not cluster with any of the AvCoV S1 gene sequences deposited in the GenBank database. Bayesian framework analysis estimated an AvCoV strain from Sweden (1999) as the most recent common ancestor of the AvCoVs detected in this study. Furthermore, the analysis inferred an increase in the AvCoV dynamic demographic population in different wild and synanthropic bird species, suggesting that birds may be potential new hosts responsible for spreading this virus.
Abstract:
The Genetic Investigation of Anthropometric Traits (GIANT) consortium identified 14 loci in European Ancestry (EA) individuals associated with waist-to-hip ratio (WHR) adjusted for body mass index. These loci are wide and narrowing the signals remains necessary. Twelve of 14 loci identified in GIANT EA samples retained strong associations with WHR in our joint EA/individuals of African Ancestry (AA) analysis (log-Bayes factor >6.1). Trans-ethnic analyses at five loci (TBX15-WARS2, LYPLAL1, ADAMTS9, LY86 and ITPR2-SSPN) substantially narrowed the signals to smaller sets of variants, some of which are in regions that have evidence of regulatory activity. By leveraging varying linkage disequilibrium structures across different populations, single-nucleotide polymorphisms (SNPs) with strong signals and narrower credible sets from trans-ethnic meta-analysis of central obesity provide more precise localizations of potential functional variants and suggest a possible regulatory role. Meta-analysis results for WHR were obtained from 77 167 EA participants from GIANT and 23 564 AA participants from the African Ancestry Anthropometry Genetics Consortium. For fine mapping we interrogated SNPs within ± 250 kb flanking regions of 14 previously reported index SNPs from loci discovered in EA populations by performing trans-ethnic meta-analysis of results from the EA and AA meta-analyses. We applied a Bayesian approach that leverages allelic heterogeneity across populations to combine meta-analysis results and aids in fine-mapping shared variants at these locations. We annotated variants using information from the ENCODE Consortium and Roadmap Epigenomics Project to prioritize variants for possible functionality.
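How a credible set can be built from per-variant Bayes factors is sketched below. This is not the consortium's actual pipeline. It rests on several assumptions: exactly one causal variant per locus, equal prior probability for every variant, and natural-log Bayes factors as input (the abstract does not state the base). The SNP names and values are hypothetical.

```python
import math


def posterior_probabilities(log_bfs):
    """Convert natural-log Bayes factors into posterior probabilities.

    Assumes a single causal variant and equal priors, so each posterior
    is BF_j / sum(BF). The maximum is subtracted first for stability.
    """
    top = max(log_bfs.values())
    weights = {snp: math.exp(lbf - top) for snp, lbf in log_bfs.items()}
    total = sum(weights.values())
    return {snp: w / total for snp, w in weights.items()}


def credible_set(log_bfs, coverage=0.99):
    """Smallest set of variants whose posteriors sum to at least `coverage`."""
    post = posterior_probabilities(log_bfs)
    ranked = sorted(post.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical log Bayes factors for three variants at one locus.
example = {"snp_a": 10.0, "snp_b": 8.0, "snp_c": 2.0}
set_99 = credible_set(example, coverage=0.99)
set_50 = credible_set(example, coverage=0.50)
```

A narrower credible set means fewer variants are needed to reach the target coverage, which is the sense in which trans-ethnic analysis "narrows the signal".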
Abstract:
PURPOSE: According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. METHOD: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). RESULTS: The automated classification groups lithological units well in terms of their IRC characteristics. In particular, the IRC differences within metamorphic rocks such as gneiss are clearly revealed by this method. The maps produced by random forests soundly represent the regional differences in IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variation in IRC data with random forests. Additionally, the variable importance evaluated by random forests shows that building characteristics are less important predictors of IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. CONCLUSION: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication.
Future approaches should consider taking into account further variables like soil gas radon measurements as well as more detailed geological information.
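The pair-wise Kolmogorov distance that feeds the clustering step can be sketched as follows. The distance is taken here to be the largest absolute gap between two empirical cumulative distribution functions, which is an assumption about the exact definition used in the study. The k-medoids step itself is not shown, and the unit names and measurement values are made up for illustration.

```python
from bisect import bisect_right


def kolmogorov_distance(sample_a, sample_b):
    """Largest absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The supremum is attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def pairwise_distances(groups):
    """Distance for every unordered pair of named groups."""
    names = sorted(groups)
    return {
        (p, q): kolmogorov_distance(groups[p], groups[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Hypothetical radon measurements grouped by lithological unit.
units = {
    "gneiss": [1, 2, 3, 4],
    "granite": [3, 4, 5, 6],
    "limestone": [1, 2, 3, 4],
}
distances = pairwise_distances(units)
```

The resulting distance table is the kind of input a k-medoids routine needs, since that method works from pairwise dissimilarities and does not need coordinates.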
Abstract:
Many weeds occur in patches but farmers frequently spray whole fields to control the weeds in these patches. Given a geo-referenced weed map, technology exists to confine spraying to these patches. Adoption of patch spraying by arable farmers has, however, been negligible partly due to the difficulty of constructing weed maps. Building on previous DEFRA and HGCA projects, this proposal aims to develop and evaluate a machine vision system to automate the weed mapping process. The project thereby addresses the principal technical stumbling block to widespread adoption of site specific weed management (SSWM). The accuracy of weed identification by machine vision based on a single field survey may be inadequate to create herbicide application maps. We therefore propose to test the hypothesis that sufficiently accurate weed maps can be constructed by integrating information from geo-referenced images captured automatically at different times of the year during normal field activities. Accuracy of identification will also be increased by utilising a priori knowledge of weeds present in fields. To prove this concept, images will be captured from arable fields on two farms and processed offline to identify and map the weeds, focussing especially on black-grass, wild oats, barren brome, couch grass and cleavers. As advocated by Lutman et al. (2002), the approach uncouples the weed mapping and treatment processes and builds on the observation that patches of these weeds are quite stable in arable fields. There are three main aspects to the project. 1) Machine vision hardware. Hardware component parts of the system are one or more cameras connected to a single board computer (Concurrent Solutions LLC) and interfaced with an accurate Global Positioning System (GPS) supplied by Patchwork Technology. The camera(s) will take separate measurements for each of the three primary colours of visible light (red, green and blue) in each pixel. 
The basic proof of concept can be achieved in principle using a single camera system, but in practice systems with more than one camera may need to be installed so that larger fractions of each field can be photographed. Hardware will be reviewed regularly during the project in response to feedback from other work packages and updated as required. 2) Image capture and weed identification software. The machine vision system will be attached to toolbars of farm machinery so that images can be collected during different field operations. Images will be captured at different ground speeds, in different directions and at different crop growth stages as well as in different crop backgrounds. Once geo-referenced images have been captured in the field, Murray State and Reading Universities will develop image analysis software to identify weed species, with advice from The Arable Group. A wide range of pattern recognition techniques, in particular Bayesian networks, will be used to advance the state of the art in machine vision-based weed identification and mapping. Weed identification algorithms used by others are inadequate for this project as we intend to collect and correlate images collected at different growth stages. Plants grown for this purpose by Herbiseed will be used in the first instance. In addition, our image capture and analysis system will include plant characteristics such as leaf shape, size, vein structure, colour and textural pattern, some of which are not detectable by other machine vision systems or are omitted by their algorithms. From the list of features observable with our machine vision system, we will determine those that can be used to distinguish weed species of interest. 3) Weed mapping. Geo-referenced maps of weeds in arable fields (Reading University and Syngenta) will be produced with advice from The Arable Group and Patchwork Technology.
Natural infestations will be mapped in the fields but we will also introduce specimen plants in pots to facilitate more rigorous system evaluation and testing. Manual weed maps of the same fields will be generated by Reading University, Syngenta and Peter Lutman so that the accuracy of automated mapping can be assessed. The principal hypothesis and concept to be tested is that by combining maps from several surveys, a weed map with acceptable accuracy for end users can be produced. If the concept is proved and can be commercialised, systems could be retrofitted at low cost onto existing farm machinery. The outputs of the weed mapping software would then link with the precision farming options already built into many commercial sprayers, allowing their use for targeted, site-specific herbicide applications. Immediate economic benefits would, therefore, arise directly from reducing herbicide costs. SSWM will also reduce the overall pesticide load on the crop and so may reduce pesticide residues in food and drinking water, and reduce adverse impacts of pesticides on non-target species and beneficials. Farmers may even choose to leave unsprayed some non-injurious, environmentally beneficial, low-density weed infestations. These benefits fit very well with the anticipated legislation emerging in the new EU Thematic Strategy for Pesticides, which will encourage more targeted use of pesticides and greater uptake of Integrated Crop (Pest) Management approaches, and also with the requirements of the Water Framework Directive to reduce levels of pesticides in water bodies. The greater precision of weed management offered by SSWM is therefore a key element in preparing arable farming systems for the future, where policy makers and consumers want to minimise pesticide use and the carbon footprint of farming while maintaining food production and security.
The mapping technology could also be used on organic farms to identify areas of fields needing mechanical weed control, thereby reducing both carbon footprints and damage to crops by, for example, spring tines. Objectives: i. To develop a prototype machine vision system for automated image capture during agricultural field operations; ii. To prove the concept that images captured by the machine vision system over a series of field operations can be processed to identify and geo-reference specific weeds in the field; iii. To generate weed maps from the geo-referenced weed plants/patches identified in objective (ii).
Abstract:
The advent of molecular markers has created opportunities for a better understanding of quantitative inheritance and for developing novel strategies for genetic improvement of agricultural species, using information on quantitative trait loci (QTL). A QTL analysis relies on accurate genetic marker maps. At present, most statistical methods used for map construction ignore the fact that molecular data may be read with error. Often, however, there is ambiguity about some marker genotypes. A Bayesian MCMC approach for inferences about a genetic marker map when random miscoding of genotypes occurs is presented, and simulated and real data sets are analyzed. The results suggest that unless there is strong reason to believe that genotypes are ascertained without error, the proposed approach provides more reliable inference on the genetic map.
Abstract:
In this work we propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMRs) collected in many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies that avoid the bias introduced by aggregated analyses. Starting from the observed disease counts and the expected disease counts calculated from reference population disease rates, an SMR is derived in each area as the MLE under a Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the small population underlying the area or because of the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates across the map, aiming to detect anomalies and possible high-risk areas, have been proposed in the literature under both the classical and the Bayesian paradigm. Our proposal approaches this issue with a decision-oriented method, which focuses on multiple testing control, without however abandoning the preliminary-study perspective that an analysis of SMR indicators is expected to keep. We implement control of the false discovery rate (FDR), a quantity widely used to address multiple comparison problems in the field of microarray data analysis but not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses. The small areas issue raises difficulties in applying traditional methods for FDR estimation, which are usually based only on knowledge of the p-values (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value provide weak power in small areas, where the expected number of disease cases is small. Moreover, the tests cannot be assumed to be independent when spatial correlation between SMRs is expected, nor are they identically distributed when the population underlying the map is heterogeneous.
The Bayesian paradigm offers a way to overcome the inappropriateness of p-value-based methods. Another peculiarity of the present work is that it proposes a fully Bayesian hierarchical model for FDR estimation when testing many null hypotheses of absence of risk. We use concepts from Bayesian models for disease mapping, referring in particular to the Besag, York and Mollié model (1991), often used in practice for its flexible prior assumption on the distribution of risks across regions. The borrowing of strength between prior and likelihood typical of a hierarchical Bayesian model has the advantage of evaluating a single test (i.e. a test in a single area) by means of all observations in the map under study, rather than just by means of the single observation. This improves the power of the test in small areas and addresses more appropriately the spatial correlation issue, which suggests that relative risks are closer in spatially contiguous regions. The proposed model aims to estimate the FDR by means of the MCMC-estimated posterior probabilities $b_i$ of the null hypothesis (absence of risk) for each area. An estimate of the expected FDR conditional on the data ($\widehat{\mathrm{FDR}}$) can be calculated for any set of $b_i$'s corresponding to areas declared high-risk (where the null hypothesis is rejected) by averaging the $b_i$'s themselves. The $\widehat{\mathrm{FDR}}$ can be used to provide an easy decision rule for selecting high-risk areas, i.e. selecting as many areas as possible such that the $\widehat{\mathrm{FDR}}$ does not exceed a prefixed value; we call these $\widehat{\mathrm{FDR}}$-based decision (or selection) rules. The sensitivity and specificity of such rules depend on the accuracy of the FDR estimate: over-estimation of the FDR causes a loss of power, and under-estimation produces a loss of specificity. Moreover, our model retains the ability to provide estimates of the relative risk values, as in the Besag, York and Mollié model (1991).
A simulation study was set up to evaluate the model's performance in terms of FDR estimation accuracy, sensitivity and specificity of the decision rule, and goodness of estimation of the relative risks. We chose a real map from which we generated several spatial scenarios whose disease counts vary according to the degree of spatial correlation, the size of the areas, the number of areas where the null hypothesis is true and the risk level in the latter areas. In summarizing the simulation results we always consider the FDR estimation in sets constituted by all $b_i$'s lower than a threshold $t$. We show graphs of the $\widehat{\mathrm{FDR}}$ and the true FDR (known by simulation) plotted against the threshold $t$ to assess the FDR estimation. By varying the threshold we can learn which FDR values can be accurately estimated by a practitioner willing to apply the model (by the closeness between $\widehat{\mathrm{FDR}}$ and true FDR). By plotting the calculated sensitivity and specificity (both known by simulation) against the $\widehat{\mathrm{FDR}}$ we can check the sensitivity and specificity of the corresponding $\widehat{\mathrm{FDR}}$-based decision rules. To investigate the degree of over-smoothing of the relative risk estimates we compare box-plots of such estimates in high-risk areas (known by simulation), obtained by both our model and the classic Besag, York and Mollié model. All the summary tools are worked out for all simulated scenarios (54 scenarios in total). Results show that the FDR is well estimated (in the worst case we get an over-estimation, hence a conservative FDR control) in scenarios with small areas, low risk levels and spatially correlated risks, which are our primary targets. In such scenarios we have good estimates of the FDR for all values less than or equal to 0.10. The sensitivity of $\widehat{\mathrm{FDR}}$-based decision rules is generally low but specificity is high. In such scenarios the use of a selection rule based on $\widehat{\mathrm{FDR}} = 0.05$ or $\widehat{\mathrm{FDR}} = 0.10$ can be suggested.
In cases where the number of true alternative hypotheses (number of true high-risk areas) is small, FDR values up to 0.15 are also well estimated, and decision rules based on $\widehat{\mathrm{FDR}} = 0.15$ gain power while maintaining a high specificity. On the other hand, in scenarios with non-small areas and non-small risk levels the FDR is under-estimated except for very small values (much lower than 0.05); this results in a loss of specificity of a decision rule based on $\widehat{\mathrm{FDR}} = 0.05$. In such scenarios decision rules based on $\widehat{\mathrm{FDR}} = 0.05$ or, even worse, $\widehat{\mathrm{FDR}} = 0.10$ cannot be suggested because the true FDR is actually much higher. As regards relative risk estimation, our model achieves almost the same results as the classic Besag, York and Mollié model. For this reason, our model is interesting for its ability to perform both the estimation of relative risk values and FDR control, except in scenarios with non-small areas and large risk levels. A case study is finally presented to show how the method can be used in epidemiology.
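The averaging step and the selection rule described in this abstract can be sketched in a few lines. The sketch takes the per-area posterior probabilities of the null hypothesis as given (in the thesis they come from an MCMC fit, which is not reproduced here). It reads the rule as "select as many areas as possible while the estimated FDR stays at or below a prefixed target"; that reading is an assumption. Function names and the example probabilities are hypothetical.

```python
def estimated_fdr(null_probs):
    """Estimated FDR of a rejected set: mean posterior null probability."""
    if not null_probs:
        return 0.0
    return sum(null_probs) / len(null_probs)


def select_high_risk(null_probs, target_fdr):
    """Indices of the largest set whose estimated FDR is <= target_fdr.

    Areas are added in order of increasing null probability, so the
    running mean never decreases and the first violation ends the search.
    """
    order = sorted(range(len(null_probs)), key=lambda i: null_probs[i])
    selected, running_sum = [], 0.0
    for i in order:
        candidate_mean = (running_sum + null_probs[i]) / (len(selected) + 1)
        if candidate_mean > target_fdr:
            break
        running_sum += null_probs[i]
        selected.append(i)
    return selected


# Hypothetical posterior null probabilities for five areas.
probs = [0.01, 0.60, 0.03, 0.20, 0.02]
strict = select_high_risk(probs, target_fdr=0.05)
loose = select_high_risk(probs, target_fdr=0.10)
strict_fdr = estimated_fdr([probs[i] for i in strict])
```

Raising the target from 0.05 to 0.10 admits one more area in this toy example, which mirrors the trade-off between power and specificity discussed in the abstract.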
Abstract:
In this master's thesis I evaluated the performance of an Ultra-Wide Bandwidth (UWB) radar system for mapping indoor environments. In particular, I used a statistical Bayesian approach that is able to combine all the measurements collected by the radar while accounting for system non-idealities such as the error in the estimated antenna pointing direction or in the estimated radar position. First I verified through simulations that the system was able to provide a sufficiently accurate reconstruction of the surrounding environment despite the limitations imposed by UWB technology; in particular, the transmitted power of UWB pulses is limited by international regulations. Motivated by the promising simulation results, I subsequently carried out a measurement campaign in a real indoor environment using a commercial UWB device. The results showed that the UWB radar system is capable of providing an accurate reconstruction of indoor environments even with non-directional antennas.
Abstract:
Most human-designed environments have specific geometric characteristics. Polygonal, rectangular and circular shapes are common in them, along with a series of typical relations between different elements of the environment. Introducing this kind of knowledge into the map-building process of a mobile robot can notably improve the quality and accuracy of the resulting maps. It can also make them more useful for higher-level reasoning. When map building is formulated in a Bayesian probabilistic framework, a complete specification of the problem requires considering certain prior information about the type of environment. Prior knowledge can be applied in several ways; this thesis presents two different frameworks: one based on geometric primitives and another that uses a representation close to the space of raw measurements. A feature-based approach implicitly imposes a certain prior model on the environment. In this sense, developing a solution to the SLAM problem by optimizing a graph of geometric features is a first step towards new map-building methods for structured environments. In the first of the two proposed frameworks, the system deduces the prior information to apply in each case from an extensive collection of possible generic geometric models, following an Expectation-Maximization method to find the most probable structure and map. The representation of the structure of the environment is based on a hierarchical approach, with different levels of abstraction for the geometric elements that may describe it. Various experiments were carried out to show the versatility and good performance of the proposed method.
In the second framework, the user can define different structure models for the environment through groups of constraints and local energies between neighbouring points in a data set from that environment. The group of constraints applied to each group of points depends on the topology, which is inferred by the system itself. In this way, new generic structure models for the environment can be incorporated with great flexibility and ease. Several experiments were carried out to demonstrate the flexibility and good results of the proposed approach. Abstract: Most human-designed environments present specific geometrical characteristics. In them, it is easy to find polygonal, rectangular and circular shapes, with a series of typical relations between different elements of the environment. Introducing this kind of knowledge in the mapping process of mobile robots can notably improve the quality and accuracy of the resulting maps. It can also make them more suitable for higher-level reasoning applications. When mapping is formulated in a Bayesian probabilistic framework, a complete specification of the problem requires considering a prior for the environment. The prior over the structure of the environment can be applied in several ways; this dissertation presents two different frameworks, one using a feature-based approach and another one employing a dense representation close to the measurements space. A feature-based approach implicitly imposes a prior for the environment. In this sense, feature-based graph SLAM was a first step towards a new mapping solution for structured scenarios. In the first framework, the prior is inferred by the system from a wide collection of feature-based priors, following an Expectation-Maximization approach to obtain the most probable structure and the most probable map.
The representation of the structure of the environment is based on a hierarchical model with different levels of abstraction for the geometrical elements describing it. Various experiments were conducted to show the versatility and the good performance of the proposed method. In the second framework, different priors can be defined by the user as sets of local constraints and energies for consecutive points in a range scan from a given environment. The set of constraints applied to each group of points depends on the topology, which is inferred by the system. This way, flexible and generic priors can be incorporated very easily. Several tests were carried out to demonstrate the flexibility and the good results of the proposed approach.
Abstract:
The impact of Parkinson's disease and its treatment on patients' health-related quality of life can be estimated either by means of generic measures such as the European Quality of Life-5 Dimensions (EQ-5D) or specific measures such as the 8-item Parkinson's Disease Questionnaire (PDQ-8). In clinical studies, PDQ-8 may be used instead of EQ-5D due to a lack of resources, time or clinical interest in generic measures. Nevertheless, PDQ-8 cannot be applied in cost-effectiveness analyses, which require generic measures and quantitative utility scores such as EQ-5D. To deal with this problem, a commonly used solution is the prediction of EQ-5D from PDQ-8. In this paper, we propose a new probabilistic method to predict EQ-5D from PDQ-8 using multi-dimensional Bayesian network classifiers. Our approach is evaluated using five-fold cross-validation experiments carried out on a Parkinson's data set containing 488 patients, and is compared with two additional Bayesian network-based approaches, two commonly used mapping methods (ordinary least squares and censored least absolute deviations), and a deterministic model. Experimental results are promising in terms of predictive performance as well as the identification of dependence relationships among EQ-5D and PDQ-8 items that the mapping approaches are unable to detect.
Abstract:
The generative topographic mapping (GTM) model was introduced by Bishop et al. (1998, Neural Comput. 10(1), 215-234) as a probabilistic reformulation of the self-organizing map (SOM). It offers a number of advantages compared with the standard SOM, and has already been used in a variety of applications. In this paper we report on several extensions of the GTM, including an incremental version of the EM algorithm for estimating the model parameters, the use of local subspace models, extensions to mixed discrete and continuous data, semi-linear models which permit the use of high-dimensional manifolds whilst avoiding computational intractability, Bayesian inference applied to hyper-parameters, and an alternative framework for the GTM based on Gaussian processes. All of these developments directly exploit the probabilistic structure of the GTM, thereby allowing the underlying modelling assumptions to be made explicit. They also highlight the advantages of adopting a consistent probabilistic framework for the formulation of pattern recognition algorithms.
Abstract:
In this paper we discuss a fast Bayesian extension to kriging algorithms which has been used successfully for fast, automatic mapping in emergency conditions in the Spatial Interpolation Comparison 2004 (SIC2004) exercise. The application of kriging to automatic mapping raises several issues such as robustness, scalability, speed and parameter estimation. Various ad-hoc solutions have been proposed and used extensively but they lack a sound theoretical basis. In this paper we show how observations can be projected onto a representative subset of the data without losing significant information. This allows the complexity of the algorithm to grow as $O(nm^2)$, where $n$ is the total number of observations and $m$ is the size of the subset of the observations retained for prediction. The main contribution of this paper is to further extend this projective method through the application of space-limited covariance functions, which can be used as an alternative to the commonly used covariance models. In many real-world applications the correlation between observations essentially vanishes beyond a certain separation distance. It therefore makes sense to use a covariance model that encodes this belief, since it leads to sparse covariance matrices for which optimised sparse matrix techniques can be used. In the presence of extreme values we show that space-limited covariance functions offer an additional benefit: they maintain the smoothness locally but at the same time lead to a more robust, and compact, global model. We show the performance of this technique coupled with the sparse extension to the kriging algorithm on synthetic data and outline a number of computational benefits such an approach brings. To test the relevance to automatic mapping we apply the method to the data used in a recent comparison of interpolation techniques (SIC2004) to map the levels of background ambient gamma radiation. © Springer-Verlag 2007.
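Why a space-limited covariance function gives a sparse covariance matrix can be seen in a small sketch. It uses the spherical covariance model as one example of a compactly supported function; the abstract does not say which function the authors used, so this choice is an assumption, as are the function names and coordinates. Only nonzero entries are stored, and the double loop is for clarity, not speed.

```python
import math


def spherical_cov(distance, sill=1.0, support=1.0):
    """Spherical covariance: exactly zero at or beyond `support`."""
    if distance >= support:
        return 0.0
    r = distance / support
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def sparse_cov_matrix(points, sill=1.0, support=1.0):
    """Covariance matrix stored as {(i, j): value} for nonzero entries only."""
    entries = {}
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            value = spherical_cov(math.hypot(xi - xj, yi - yj), sill, support)
            if value > 0.0:
                entries[(i, j)] = value
    return entries


# Hypothetical 2-D locations forming two well-separated pairs.
locations = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.0, 0.5)]
cov = sparse_cov_matrix(locations, sill=1.0, support=1.0)
fill_ratio = len(cov) / len(locations) ** 2
```

In this toy layout only half of the sixteen matrix entries are nonzero, because points in different clusters are farther apart than the support radius. With many widely spread observations the fraction of stored entries falls much further.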