8 results for Bayesian statistical decision theory
in DRUM (Digital Repository at the University of Maryland)
Abstract:
As fungal pathogen outbreaks have become more frequent worldwide across taxa, so have species extirpations and communities persisting with the pathogen. This phenomenon raises questions, such as: “what leads to host extinction during an outbreak?” and “how are hosts persisting once the pathogen establishes?” But the data on host populations and communities across life stages before and after pathogen arrival rarely exist to answer these questions. Over the past three to four decades, the amphibian-killing fungus Batrachochytrium dendrobatidis (Bd) spread in a wave-like manner across Central America, leading to rapid species extirpations and population declines. I collected data on tadpole and adult amphibians in El Copé, Panama before, during, and after the Bd outbreak to answer these questions. I used Bayesian statistical approaches to account for imperfect host and pathogen detection of marked and unmarked individuals. In the tadpole community, within 11 months of Bd's arrival, density and occupancy rapidly declined. Species losses were phylogenetically correlated, with glass frogs disappearing first, and tree frogs and poison-dart frogs remaining. I found that tadpole communities resembled one another more strongly after the outbreak than they did before Bd invasion. I found no tadpoles within 22 months of the outbreak and limited signs of recovery within 10 years. In contrast, at the same site, for a population of male glass frogs, Espadarana prosoblepon, I found that 10 years post-outbreak, the population was consistently half its historic abundance, and that the lack of recruits to the population explained why the population had not rebounded, rather than high pathogen-induced mortality. And finally, examining the entire amphibian community, I found high pathogen prevalence, low infection intensities, and high survival rates of uninfected and infected hosts.
Bd transmission risk, i.e., the probability a susceptible host becomes infected, did not relate to host density, pathogen prevalence, or infection intensity; it was uniform across the study area. My results are especially relevant to conservation biologists aiming to predict the future impacts of Bd outbreaks, those trying to manage persisting populations, and those interested in reintroducing species back into wild amphibian communities.
Abstract:
Finding rare events in multidimensional data is an important detection problem that has applications in many fields, such as risk estimation in the insurance industry, finance, flood prediction, medical diagnosis, quality assurance, security, or safety in transportation. The occurrence of such anomalies is so infrequent that there is usually not enough training data to learn an accurate statistical model of the anomaly class. In some cases, such events may have never been observed, so the only information that is available is a set of normal samples and an assumed pairwise similarity function. Such a metric may only be known up to a certain number of unspecified parameters, which would either need to be learned from training data or fixed by a domain expert. Sometimes, the anomalous condition may be formulated algebraically, such as a measure exceeding a predefined threshold, but nuisance variables may complicate the estimation of such a measure. Change detection methods used in time series analysis are not easily extendable to the multidimensional case, where discontinuities are not localized to a single point. On the other hand, in higher dimensions, data exhibits more complex interdependencies, and there is redundancy that could be exploited to adaptively model the normal data. In the first part of this dissertation, we review the theoretical framework for anomaly detection in images and previous anomaly detection work done in the context of crack detection and detection of anomalous components in railway tracks. In the second part, we propose new anomaly detection algorithms. The fact that curvilinear discontinuities in images are sparse with respect to the frame of shearlets allows us to pose this anomaly detection problem as basis pursuit optimization. Therefore, we pose the problem of detecting curvilinear anomalies in noisy textured images as a blind source separation problem under sparsity constraints, and propose an iterative shrinkage algorithm to solve it.
Taking advantage of the parallel nature of this algorithm, we describe how this method can be accelerated using graphics processing units (GPUs). Then, we propose a new method for finding defective components on railway tracks using cameras mounted on a train. We describe how to extract features and use a combination of classifiers to solve this problem. Then, we scale anomaly detection to bigger datasets with complex interdependencies. We show that the anomaly detection problem naturally fits in the multitask learning framework. The first task consists of learning a compact representation of the good samples, while the second task consists of learning the anomaly detector. Using deep convolutional neural networks, we show that it is possible to train a deep model with a limited number of anomalous examples. In sequential detection problems, the presence of time-variant nuisance parameters affects the detection performance. In the last part of this dissertation, we present a method for adaptively estimating the threshold of sequential detectors using Extreme Value Theory in a Bayesian framework. Finally, conclusions on the results obtained are provided, followed by a discussion of possible future work.
Abstract:
Traffic demand increases are pushing aging ground transportation infrastructures to their theoretical capacity. The result of this demand is traffic bottlenecks that are a major cause of delay on urban freeways. In addition, the queues associated with those bottlenecks increase the probability of a crash while adversely affecting environmental measures such as emissions and fuel consumption. With limited resources available for network expansion, traffic professionals have developed active traffic management systems (ATMS) in an attempt to mitigate the negative consequences of traffic bottlenecks. Among these ATMS strategies, variable speed limits (VSL) and ramp metering (RM) have been gaining international interest for their potential to improve safety, mobility, and environmental measures at freeway bottlenecks. Though previous studies have shown the tremendous potential of VSL and VSL paired with ramp metering (VSLRM) control, little guidance has been developed to assist decision makers in the planning phase of a congestion mitigation project that is considering VSL or VSLRM control. To address this need, this study has developed a comprehensive decision/deployment support tool for the application of VSL and VSLRM control in recurrently congested environments. The decision tool will assist practitioners in deciding the most appropriate control strategy at a candidate site, which candidate sites have the most potential to benefit from the suggested control strategy, and how to most effectively design the field deployment of the suggested control strategy at each implementation site. To do so, the tool comprises three key modules: (1) Decision Module, (2) Benefits Module, and (3) Deployment Guidelines Module. Each module uses commonly known traffic flow and geometric parameters as inputs to statistical models and empirically based procedures to provide guidance on the application of VSL and VSLRM at each candidate site.
These models and procedures were developed from the outputs of simulated experiments, calibrated with field data. To demonstrate the application of the tool, a list of real-world candidate sites was selected from the Maryland State Highway Administration Mobility Report. Here, field data from each candidate site was input into the tool to illustrate the step-by-step process required for efficient planning of VSL or VSLRM control. The output of the tool includes the suggested control system at each site, a ranking of the sites based on the expected benefit-to-cost ratio, and guidelines on how to deploy the VSL signs, ramp meters, and detectors at the deployment site(s). This research has the potential to assist traffic engineers in the planning of VSL and VSLRM control, thus enhancing the procedure for allocating limited resources for mobility and safety improvements on highways plagued by recurrent congestion.
Abstract:
An inference task is one in which some known set of information is used to produce an estimate about an unknown quantity. Existing theories of how humans make inferences include specialized heuristics that allow people to make these inferences in familiar environments quickly and without unnecessarily complex computation. Specialized heuristic processing may be unnecessary, however; other research suggests that the same patterns in judgment can be explained by existing patterns in encoding and retrieving memories. This dissertation compares and attempts to reconcile three alternate explanations of human inference. After justifying three hierarchical Bayesian versions of existing inference models, the three models are compared on simulated, observed, and experimental data. The results suggest that the three models capture different patterns in human behavior but, based on posterior prediction using laboratory data, potentially ignore important determinants of the decision process.
Abstract:
I examine determinants of refugee return after conflicts. I argue that institutional constraints placed on the executive provide a credible commitment that signals to refugees that the conditions required for durable return will be created. This results in increased return flows for refugees. Further, when credible commitments are stronger in the country of origin than in the country of asylum, the level of return increases. Finally, I find that specific commitments made to refugees in the peace agreement do not lead to increased return because they are not credible without institutional constraints. Using data on returnees that has only recently been made available, along with network analysis and an original coding of the provisions in refugee agreements, statistical results are found to support this theory. An examination of cases in Djibouti, Sierra Leone, and Liberia provides additional support for this argument.
Abstract:
In a microscopic setting, humans behave in rich and unexpected ways. In a macroscopic setting, however, distinctive patterns of group behavior emerge, leading statistical physicists to search for an underlying mechanism. The aim of this dissertation is to analyze the macroscopic patterns of competing ideas in order to discern the mechanics of how group opinions form at the microscopic level. First, we explore the competition of answers in online Q&A (question and answer) boards. We find that a simple individual-level model can capture important features of user behavior, especially as the number of answers to a question grows. Our model further suggests that the wisdom of crowds may be constrained by information overload, in which users are unable to thoroughly evaluate each answer and therefore tend to use heuristics to pick what they believe is the best answer. Next, we explore models of opinion spread among voters to explain observed universal statistical patterns such as rescaled vote distributions and logarithmic vote correlations. We introduce a simple model that can explain both properties, as well as why it takes so long for large groups to reach consensus. An important feature of the model that facilitates agreement with data is that individuals become more stubborn (unwilling to change their opinion) over time. Finally, we explore potential underlying mechanisms for opinion formation in juries, by comparing data to various types of models. We find that different null hypotheses in which jurors do not interact when reaching a decision are in strong disagreement with data compared to a simple interaction model. These findings provide conceptual and mechanistic support for previous work that has found mutual influence can play a large role in group decisions. In addition, by matching our models to data, we are able to infer the time scales over which individuals change their opinions for different jury contexts. 
We find that these values increase as a function of the trial time, suggesting that jurors and judicial panels exhibit a kind of stubbornness similar to what we include in our model of voting behavior.
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability.
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Ultimately, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
This document is the Online Supplement to ‘Myopic Allocation Policy with Asymptotically Optimal Sampling Rate,’ to be published in the IEEE Transactions on Automatic Control in 2017.