940 resultados para Bayesian hierarchical models
Resumo:
Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is faced in a non-Gaussian setting; in fact, hourly rainfall in Emilia-Romagna region, in Italy, is characterized by abundance of zero values and right-skeweness of the distribution of positive amounts. Rain gauge direct measurements on sparsely distributed locations and hourly cumulated radar grids are provided by the ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, exploiting rain gauges as reference measure. Rain probability and amounts are modeled via linear relationships with radar in the log scale; spatial correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and Gamma distribution for rainfall positive amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications differently addressing COSP are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C, linked to R software. Communication and evaluation of probabilistic, point and interval predictions is investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Rank Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error addressing the predictive mean and median, respectively). Calibration is reached and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with incorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.
Resumo:
In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being more efficient computationally than other Bayesian approaches. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. Conclusions based on a real dataset of cancer cases in Taiwan are similar albeit less conclusive with respect to comparing the approaches. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models.
Resumo:
Time series models relating short-term changes in air pollution levels to daily mortality counts typically assume that the effects of air pollution on the log relative rate of mortality do not vary with time. However, these short-term effects might plausibly vary by season. Changes in the sources of air pollution and meteorology can result in changes in characteristics of the air pollution mixture across seasons. The authors develop Bayesian semi-parametric hierarchical models for estimating time-varying effects of pollution on mortality in multi-site time series studies. The methods are applied to the updated National Morbidity and Mortality Air Pollution Study database for the period 1987--2000, which includes data for 100 U.S. cities. At the national level, a 10 micro-gram/m3 increase in PM(10) at lag 1 is associated with a 0.15 (95% posterior interval: -0.08, 0.39),0.14 (-0.14, 0.42), 0.36 (0.11, 0.61), and 0.14 (-0.06, 0.34) percent increase in mortality for winter, spring, summer, and fall, respectively. An analysis by geographical regions finds a strong seasonal pattern in the northeast (with a peak in summer) and little seasonal variation in the southern regions of the country. These results provide useful information for understanding particle toxicity and guiding future analyses of particle constituent data.
Resumo:
OBJECTIVE: Hierarchical modeling has been proposed as a solution to the multiple exposure problem. We estimate associations between metabolic syndrome and different components of antiretroviral therapy using both conventional and hierarchical models. STUDY DESIGN AND SETTING: We use discrete time survival analysis to estimate the association between metabolic syndrome and cumulative exposure to 16 antiretrovirals from four drug classes. We fit a hierarchical model where the drug class provides a prior model of the association between metabolic syndrome and exposure to each antiretroviral. RESULTS: One thousand two hundred and eighteen patients were followed for a median of 27 months, with 242 cases of metabolic syndrome (20%) at a rate of 7.5 cases per 100 patient years. Metabolic syndrome was more likely to develop in patients exposed to stavudine, but was less likely to develop in those exposed to atazanavir. The estimate for exposure to atazanavir increased from hazard ratio of 0.06 per 6 months' use in the conventional model to 0.37 in the hierarchical model (or from 0.57 to 0.81 when using spline-based covariate adjustment). CONCLUSION: These results are consistent with trials that show the disadvantage of stavudine and advantage of atazanavir relative to other drugs in their respective classes. The hierarchical model gave more plausible results than the equivalent conventional model.
Resumo:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Following, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to a large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model which is flexible to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects which decreases computation time for model fitting. Also, our proposed models provides a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
Resumo:
The spatial context is critical when assessing present-day climate anomalies, attributing them to potential forcings and making statements regarding their frequency and severity in a long-term perspective. Recent international initiatives have expanded the number of high-quality proxy-records and developed new statistical reconstruction methods. These advances allow more rigorous regional past temperature reconstructions and, in turn, the possibility of evaluating climate models on policy-relevant, spatio-temporal scales. Here we provide a new proxy-based, annually-resolved, spatial reconstruction of the European summer (June–August) temperature fields back to 755 CE based on Bayesian hierarchical modelling (BHM), together with estimates of the European mean temperature variation since 138 BCE based on BHM and composite-plus-scaling (CPS). Our reconstructions compare well with independent instrumental and proxy-based temperature estimates, but suggest a larger amplitude in summer temperature variability than previously reported. Both CPS and BHM reconstructions indicate that the mean 20th century European summer temperature was not significantly different from some earlier centuries, including the 1st, 2nd, 8th and 10th centuries CE. The 1st century (in BHM also the 10th century) may even have been slightly warmer than the 20th century, but the difference is not statistically significant. Comparing each 50 yr period with the 1951–2000 period reveals a similar pattern. Recent summers, however, have been unusually warm in the context of the last two millennia and there are no 30 yr periods in either reconstruction that exceed the mean average European summer temperature of the last 3 decades (1986–2015 CE). A comparison with an ensemble of climate model simulations suggests that the reconstructed European summer temperature variability over the period 850–2000 CE reflects changes in both internal variability and external forcing on multi-decadal time-scales. For pan-European temperatures we find slightly better agreement between the reconstruction and the model simulations with high-end estimates for total solar irradiance. Temperature differences between the medieval period, the recent period and the Little Ice Age are larger in the reconstructions than the simulations. This may indicate inflated variability of the reconstructions, a lack of sensitivity and processes to changes in external forcing on the simulated European climate and/or an underestimation of internal variability on centennial and longer time scales.
Resumo:
Many studies have shown relationships between air pollution and the rate of hospital admissions for asthma. A few studies have controlled for age-specific effects by adding separate smoothing functions for each age group. However, it has not yet been reported whether air pollution effects are significantly different for different age groups. This lack of information is the motivation for this study, which tests the hypothesis that air pollution effects on asthmatic hospital admissions are significantly different by age groups. Each air pollutant's effect on asthmatic hospital admissions by age groups was estimated separately. In this study, daily time-series data for hospital admission rates from seven cities in Korea from June 1999 through 2003 were analyzed. The outcome variable, daily hospital admission rates for asthma, was related to five air pollutants which were used as the independent variables, namely particulate matter <10 micrometers (μm) in aerodynamic diameter (PM10), carbon monoxide (CO), ozone (O3), nitrogen dioxide (NO2), and sulfur dioxide (SO2). Meteorological variables were considered as confounders. Admission data were divided into three age groups: children (<15 years of age), adults (ages 15-64), and elderly (≥ 65 years of age). The adult age group was considered to be the reference group for each city. In order to estimate age-specific air pollution effects, the analysis was separated into two stages. In the first stage, Generalized Additive Models (GAMs) with cubic spline for smoothing were applied to estimate the age-city-specific air pollution effects on asthmatic hospital admission rates by city and age group. In the second stage, the Bayesian Hierarchical Model with non-informative prior which has large variance was used to combine city-specific effects by age groups. The hypothesis test showed that the effects of PM10, CO and NO2 were significantly different by age groups. Assuming that the air pollution effect for adults is zero as a reference, age-specific air pollution effects were: -0.00154 (95% confidence interval(CI)= (-0.0030,-0.0001)) for children and 0.00126 (95% CI = (0.0006, 0.0019)) for the elderly for PM 10; -0.0195 (95% CI = (-0.0386,-0.0004)) for children for CO; and 0.00494 (95% CI = (0.0028, 0.0071)) for the elderly for NO2. Relative rates (RRs) were 1.008 (95% CI = (1.000-1.017)) in adults and 1.021 (95% CI = (1.012-1.030)) in the elderly for every 10 μg/m3 increase of PM10 , 1.019 (95% CI = (1.005-1.033)) in adults and 1.022 (95% CI = (1.012-1.033)) in the elderly for every 0.1 part per million (ppm) increase of CO; 1.006 (95%CI = (1.002-1.009)) and 1.019 (95%CI = (1.007-1.032)) in the elderly for every 1 part per billion (ppb) increase of NO2 and SO2, respectively. Asthma hospital admissions were significantly increased for PM10 and CO in adults, and for PM10, CO, NO2 and SO2 in the elderly.^
Resumo:
Negli ultimi anni i modelli VAR sono diventati il principale strumento econometrico per verificare se può esistere una relazione tra le variabili e per valutare gli effetti delle politiche economiche. Questa tesi studia tre diversi approcci di identificazione a partire dai modelli VAR in forma ridotta (tra cui periodo di campionamento, set di variabili endogene, termini deterministici). Usiamo nel caso di modelli VAR il test di Causalità di Granger per verificare la capacità di una variabile di prevedere un altra, nel caso di cointegrazione usiamo modelli VECM per stimare congiuntamente i coefficienti di lungo periodo ed i coefficienti di breve periodo e nel caso di piccoli set di dati e problemi di overfitting usiamo modelli VAR bayesiani con funzioni di risposta di impulso e decomposizione della varianza, per analizzare l'effetto degli shock sulle variabili macroeconomiche. A tale scopo, gli studi empirici sono effettuati utilizzando serie storiche di dati specifici e formulando diverse ipotesi. Sono stati utilizzati tre modelli VAR: in primis per studiare le decisioni di politica monetaria e discriminare tra le varie teorie post-keynesiane sulla politica monetaria ed in particolare sulla cosiddetta "regola di solvibilità" (Brancaccio e Fontana 2013, 2015) e regola del GDP nominale in Area Euro (paper 1); secondo per estendere l'evidenza dell'ipotesi di endogeneità della moneta valutando gli effetti della cartolarizzazione delle banche sul meccanismo di trasmissione della politica monetaria negli Stati Uniti (paper 2); terzo per valutare gli effetti dell'invecchiamento sulla spesa sanitaria in Italia in termini di implicazioni di politiche economiche (paper 3). La tesi è introdotta dal capitolo 1 in cui si delinea il contesto, la motivazione e lo scopo di questa ricerca, mentre la struttura e la sintesi, così come i principali risultati, sono descritti nei rimanenti capitoli. Nel capitolo 2 sono esaminati, utilizzando un modello VAR in differenze prime con dati trimestrali della zona Euro, se le decisioni in materia di politica monetaria possono essere interpretate in termini di una "regola di politica monetaria", con specifico riferimento alla cosiddetta "nominal GDP targeting rule" (McCallum 1988 Hall e Mankiw 1994; Woodford 2012). I risultati evidenziano una relazione causale che va dallo scostamento tra i tassi di crescita del PIL nominale e PIL obiettivo alle variazioni dei tassi di interesse di mercato a tre mesi. La stessa analisi non sembra confermare l'esistenza di una relazione causale significativa inversa dalla variazione del tasso di interesse di mercato allo scostamento tra i tassi di crescita del PIL nominale e PIL obiettivo. Risultati simili sono stati ottenuti sostituendo il tasso di interesse di mercato con il tasso di interesse di rifinanziamento della BCE. Questa conferma di una sola delle due direzioni di causalità non supporta un'interpretazione della politica monetaria basata sulla nominal GDP targeting rule e dà adito a dubbi in termini più generali per l'applicabilità della regola di Taylor e tutte le regole convenzionali della politica monetaria per il caso in questione. I risultati appaiono invece essere più in linea con altri approcci possibili, come quelli basati su alcune analisi post-keynesiane e marxiste della teoria monetaria e più in particolare la cosiddetta "regola di solvibilità" (Brancaccio e Fontana 2013, 2015). Queste linee di ricerca contestano la tesi semplicistica che l'ambito della politica monetaria consiste nella stabilizzazione dell'inflazione, del PIL reale o del reddito nominale intorno ad un livello "naturale equilibrio". Piuttosto, essi suggeriscono che le banche centrali in realtà seguono uno scopo più complesso, che è il regolamento del sistema finanziario, con particolare riferimento ai rapporti tra creditori e debitori e la relativa solvibilità delle unità economiche. Il capitolo 3 analizza l’offerta di prestiti considerando l’endogeneità della moneta derivante dall'attività di cartolarizzazione delle banche nel corso del periodo 1999-2012. Anche se gran parte della letteratura indaga sulla endogenità dell'offerta di moneta, questo approccio è stato adottato raramente per indagare la endogeneità della moneta nel breve e lungo termine con uno studio degli Stati Uniti durante le due crisi principali: scoppio della bolla dot-com (1998-1999) e la crisi dei mutui sub-prime (2008-2009). In particolare, si considerano gli effetti dell'innovazione finanziaria sul canale dei prestiti utilizzando la serie dei prestiti aggiustata per la cartolarizzazione al fine di verificare se il sistema bancario americano è stimolato a ricercare fonti più economiche di finanziamento come la cartolarizzazione, in caso di politica monetaria restrittiva (Altunbas et al., 2009). L'analisi si basa sull'aggregato monetario M1 ed M2. Utilizzando modelli VECM, esaminiamo una relazione di lungo periodo tra le variabili in livello e valutiamo gli effetti dell’offerta di moneta analizzando quanto la politica monetaria influisce sulle deviazioni di breve periodo dalla relazione di lungo periodo. I risultati mostrano che la cartolarizzazione influenza l'impatto dei prestiti su M1 ed M2. Ciò implica che l'offerta di moneta è endogena confermando l'approccio strutturalista ed evidenziando che gli agenti economici sono motivati ad aumentare la cartolarizzazione per una preventiva copertura contro shock di politica monetaria. Il capitolo 4 indaga il rapporto tra spesa pro capite sanitaria, PIL pro capite, indice di vecchiaia ed aspettativa di vita in Italia nel periodo 1990-2013, utilizzando i modelli VAR bayesiani e dati annuali estratti dalla banca dati OCSE ed Eurostat. Le funzioni di risposta d'impulso e la scomposizione della varianza evidenziano una relazione positiva: dal PIL pro capite alla spesa pro capite sanitaria, dalla speranza di vita alla spesa sanitaria, e dall'indice di invecchiamento alla spesa pro capite sanitaria. L'impatto dell'invecchiamento sulla spesa sanitaria è più significativo rispetto alle altre variabili. Nel complesso, i nostri risultati suggeriscono che le disabilità strettamente connesse all'invecchiamento possono essere il driver principale della spesa sanitaria nel breve-medio periodo. Una buona gestione della sanità contribuisce a migliorare il benessere del paziente, senza aumentare la spesa sanitaria totale. Tuttavia, le politiche che migliorano lo stato di salute delle persone anziane potrebbe essere necessarie per una più bassa domanda pro capite dei servizi sanitari e sociali.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
Multiple hierarchical models of representative democracies in which, for instance, voters elect county representatives, county representatives elect district representatives, district representatives elect state representatives and state representatives a president, reduces the number of electors a representative is answerable for, and therefore, considering each level separately, these models could come closer to direct democracy. In this paper we show that worst case policy bias increases with the number of hierarchical levels. This also means that the opportunities of a gerrymanderer increase in the number of hierarchical levels.
Resumo:
Stable isotope analysis has emerged as one of the primary means for examining the structure and dynamics of food webs, and numerous analytical approaches are now commonly used in the field. Techniques range from simple, qualitative inferences based on the isotopic niche, to Bayesian mixing models that can be used to characterize food-web structure at multiple hierarchical levels. We provide a comprehensive review of these techniques, and thus a single reference source to help identify the most useful approaches to apply to a given data set. We structure the review around four general questions: (1) what is the trophic position of an organism in a food web?; (2) which resource pools support consumers?; (3) what additional information does relative position of consumers in isotopic space reveal about food-web structure?; and (4) what is the degree of trophic variability at the intrapopulation level? For each general question, we detail different approaches that have been applied, discussing the strengths and weaknesses of each. We conclude with a set of suggestions that transcend individual analytical approaches, and provide guidance for future applications in the field.
Resumo:
This research explores Bayesian updating as a tool for estimating parameters probabilistically by dynamic analysis of data sequences. Two distinct Bayesian updating methodologies are assessed. The first approach focuses on Bayesian updating of failure rates for primary events in fault trees. A Poisson Exponentially Moving Average (PEWMA) model is implemnented to carry out Bayesian updating of failure rates for individual primary events in the fault tree. To provide a basis for testing of the PEWMA model, a fault tree is developed based on the Texas City Refinery incident which occurred in 2005. A qualitative fault tree analysis is then carried out to obtain a logical expression for the top event. A dynamic Fault Tree analysis is carried out by evaluating the top event probability at each Bayesian updating step by Monte Carlo sampling from posterior failure rate distributions. It is demonstrated that PEWMA modeling is advantageous over conventional conjugate Poisson-Gamma updating techniques when failure data is collected over long time spans. The second approach focuses on Bayesian updating of parameters in non-linear forward models. Specifically, the technique is applied to the hydrocarbon material balance equation. In order to test the accuracy of the implemented Bayesian updating models, a synthetic data set is developed using the Eclipse reservoir simulator. Both structured grid and MCMC sampling based solution techniques are implemented and are shown to model the synthetic data set with good accuracy. Furthermore, a graphical analysis shows that the implemented MCMC model displays good convergence properties. A case study demonstrates that Likelihood variance affects the rate at which the posterior assimilates information from the measured data sequence. Error in the measured data significantly affects the accuracy of the posterior parameter distributions. Increasing the likelihood variance mitigates random measurement errors, but casuses the overall variance of the posterior to increase. Bayesian updating is shown to be advantageous over deterministic regression techniques as it allows for incorporation of prior belief and full modeling uncertainty over the parameter ranges. As such, the Bayesian approach to estimation of parameters in the material balance equation shows utility for incorporation into reservoir engineering workflows.
Resumo:
This dissertation contributes to the rapidly growing empirical research area in the field of operations management. It contains two essays, tackling two different sets of operations management questions which are motivated by and built on field data sets from two very different industries --- air cargo logistics and retailing.
The first essay, based on the data set obtained from a world leading third-party logistics company, develops a novel and general Bayesian hierarchical learning framework for estimating customers' spillover learning, that is, customers' learning about the quality of a service (or product) from their previous experiences with similar yet not identical services. We then apply our model to the data set to study how customers' experiences from shipping on a particular route affect their future decisions about shipping not only on that route, but also on other routes serviced by the same logistics company. We find that customers indeed borrow experiences from similar but different services to update their quality beliefs that determine future purchase decisions. Also, service quality beliefs have a significant impact on their future purchasing decisions. Moreover, customers are risk averse; they are averse to not only experience variability but also belief uncertainty (i.e., customer's uncertainty about their beliefs). Finally, belief uncertainty affects customers' utilities more compared to experience variability.
The second essay is based on a data set obtained from a large Chinese supermarket chain, which contains sales as well as both wholesale and retail prices of un-packaged perishable vegetables. Recognizing the special characteristics of this particularly product category, we develop a structural estimation model in a discrete-continuous choice model framework. Building on this framework, we then study an optimization model for joint pricing and inventory management strategies of multiple products, which aims at improving the company's profit from direct sales and at the same time reducing food waste and thus improving social welfare.
Collectively, the studies in this dissertation provide useful modeling ideas, decision tools, insights, and guidance for firms to utilize vast sales and operations data to devise more effective business strategies.
Resumo:
Bayesian nonparametric models, such as the Gaussian process and the Dirichlet process, have been extensively applied for target kinematics modeling in various applications including environmental monitoring, traffic planning, endangered species tracking, dynamic scene analysis, autonomous robot navigation, and human motion modeling. As shown by these successful applications, Bayesian nonparametric models are able to adjust their complexities adaptively from data as necessary, and are resistant to overfitting or underfitting. However, most existing works assume that the sensor measurements used to learn the Bayesian nonparametric target kinematics models are obtained a priori or that the target kinematics can be measured by the sensor at any given time throughout the task. Little work has been done for controlling the sensor with bounded field of view to obtain measurements of mobile targets that are most informative for reducing the uncertainty of the Bayesian nonparametric models. To present the systematic sensor planning approach to leaning Bayesian nonparametric models, the Gaussian process target kinematics model is introduced at first, which is capable of describing time-invariant spatial phenomena, such as ocean currents, temperature distributions and wind velocity fields. The Dirichlet process-Gaussian process target kinematics model is subsequently discussed for modeling mixture of mobile targets, such as pedestrian motion patterns.
Novel information theoretic functions are developed for these introduced Bayesian nonparametric target kinematics models to represent the expected utility of measurements as a function of sensor control inputs and random environmental variables. A Gaussian process expected Kullback Leibler divergence is developed as the expectation of the KL divergence between the current (prior) and posterior Gaussian process target kinematics models with respect to the future measurements. Then, this approach is extended to develop a new information value function that can be used to estimate target kinematics described by a Dirichlet process-Gaussian process mixture model. A theorem is proposed that shows the novel information theoretic functions are bounded. Based on this theorem, efficient estimators of the new information theoretic functions are designed, which are proved to be unbiased with the variance of the resultant approximation error decreasing linearly as the number of samples increases. Computational complexities for optimizing the novel information theoretic functions under sensor dynamics constraints are studied, and are proved to be NP-hard. A cumulative lower bound is then proposed to reduce the computational complexity to polynomial time.
Three sensor planning algorithms are developed according to the assumptions on the target kinematics and the sensor dynamics. For problems where the control space of the sensor is discrete, a greedy algorithm is proposed. The efficiency of the greedy algorithm is demonstrated by a numerical experiment with data of ocean currents obtained by moored buoys. A sweep line algorithm is developed for applications where the sensor control space is continuous and unconstrained. Synthetic simulations as well as physical experiments with ground robots and a surveillance camera are conducted to evaluate the performance of the sweep line algorithm. Moreover, a lexicographic algorithm is designed based on the cumulative lower bound of the novel information theoretic functions, for the scenario where the sensor dynamics are constrained. Numerical experiments with real data collected from indoor pedestrians by a commercial pan-tilt camera are performed to examine the lexicographic algorithm. Results from both the numerical simulations and the physical experiments show that the three sensor planning algorithms proposed in this dissertation based on the novel information theoretic functions are superior at learning the target kinematics with
little or no prior knowledge
Resumo:
Temporal replicate counts are often aggregated to improve model fit by reducing zero-inflation and count variability, and in the case of migration counts collected hourly throughout a migration, allows one to ignore nonindependence. However, aggregation can represent a loss of potentially useful information on the hourly or seasonal distribution of counts, which might impact our ability to estimate reliable trends. We simulated 20-year hourly raptor migration count datasets with known rate of change to test the effect of aggregating hourly counts to daily or annual totals on our ability to recover known trend. We simulated data for three types of species, to test whether results varied with species abundance or migration strategy: a commonly detected species, e.g., Northern Harrier, Circus cyaneus; a rarely detected species, e.g., Peregrine Falcon, Falco peregrinus; and a species typically counted in large aggregations with overdispersed counts, e.g., Broad-winged Hawk, Buteo platypterus. We compared accuracy and precision of estimated trends across species and count types (hourly/daily/annual) using hierarchical models that assumed a Poisson, negative binomial (NB) or zero-inflated negative binomial (ZINB) count distribution. We found little benefit of modeling zero-inflation or of modeling the hourly distribution of migration counts. For the rare species, trends analyzed using daily totals and an NB or ZINB data distribution resulted in a higher probability of detecting an accurate and precise trend. In contrast, trends of the common and overdispersed species benefited from aggregation to annual totals, and for the overdispersed species in particular, trends estimating using annual totals were more precise, and resulted in lower probabilities of estimating a trend (1) in the wrong direction, or (2) with credible intervals that excluded the true trend, as compared with hourly and daily counts.