947 resultados para Empirical Bayes Methods


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the study of traffic safety, expected crash frequencies across sites are generally estimated via the negative binomial model, assuming time invariant safety. Since the time invariant safety assumption may be invalid, Hauer (1997) proposed a modified empirical Bayes (EB) method. Despite the modification, no attempts have been made to examine the generalisable form of the marginal distribution resulting from the modified EB framework. Because the hyper-parameters needed to apply the modified EB method are not readily available, an assessment is lacking on how accurately the modified EB method estimates safety in the presence of the time variant safety and regression-to-the-mean (RTM) effects. This study derives the closed form marginal distribution, and reveals that the marginal distribution in the modified EB method is equivalent to the negative multinomial (NM) distribution, which is essentially the same as the likelihood function used in the random effects Poisson model. As a result, this study shows that the gamma posterior distribution from the multivariate Poisson-gamma mixture can be estimated using the NM model or the random effects Poisson model. This study also shows that the estimation errors from the modified EB method are systematically smaller than those from the comparison group method by simultaneously accounting for the RTM and time variant safety effects. Hence, the modified EB method via the NM model is a generalisable method for estimating safety in the presence of the time variant safety and the RTM effects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Minimizing bycatch of seabirds is a major goal of the U.S. National Marine Fisheries Service. In Alaska waters, the bycatch (i.e., inadvertent catches) of seabirds has been an incidental result of demersal groundfish longline fishery operations. Notably, the endangered short-tailed albatross (Phoebastria albatrus) has been taken in this groundfish fishery. Bycatch rates of seabirds from individual vessels may be of particular interest because vessels with high bycatch rates may not be functioning effectively with seabird avoidance gears, and there may be a need for suggestions on how to use these avoidance gears more effectively. Therefore, bycatch estimates are usually made on an individual vessel basis and then summed to obtain the total estimate for the entire fleet.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham's-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising aymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains. © Institute of Mathematical Statistics, 2010.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

id 34 additional quiz resource

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Crash reduction factors (CRFs) are used to estimate the potential number of traffic crashes expected to be prevented from investment in safety improvement projects. The method used to develop CRFs in Florida has been based on the commonly used before-and-after approach. This approach suffers from a widely recognized problem known as regression-to-the-mean (RTM). The Empirical Bayes (EB) method has been introduced as a means to addressing the RTM problem. This method requires the information from both the treatment and reference sites in order to predict the expected number of crashes had the safety improvement projects at the treatment sites not been implemented. The information from the reference sites is estimated from a safety performance function (SPF), which is a mathematical relationship that links crashes to traffic exposure. The objective of this dissertation was to develop the SPFs for different functional classes of the Florida State Highway System. Crash data from years 2001 through 2003 along with traffic and geometric data were used in the SPF model development. SPFs for both rural and urban roadway categories were developed. The modeling data used were based on one-mile segments that contain homogeneous traffic and geometric conditions within each segment. Segments involving intersections were excluded. The scatter plots of data show that the relationships between crashes and traffic exposure are nonlinear, that crashes increase with traffic exposure in an increasing rate. Four regression models, namely, Poisson (PRM), Negative Binomial (NBRM), zero-inflated Poisson (ZIP), and zero-inflated Negative Binomial (ZINB), were fitted to the one-mile segment records for individual roadway categories. The best model was selected for each category based on a combination of the Likelihood Ratio test, the Vuong statistical test, and the Akaike's Information Criterion (AIC). The NBRM model was found to be appropriate for only one category and the ZINB model was found to be more appropriate for six other categories. The overall results show that the Negative Binomial distribution model generally provides a better fit for the data than the Poisson distribution model. In addition, the ZINB model was found to give the best fit when the count data exhibit excess zeros and over-dispersion for most of the roadway categories. While model validation shows that most data points fall within the 95% prediction intervals of the models developed, the Pearson goodness-of-fit measure does not show statistical significance. This is expected as traffic volume is only one of the many factors contributing to the overall crash experience, and that the SPFs are to be applied in conjunction with Accident Modification Factors (AMFs) to further account for the safety impacts of major geometric features before arriving at the final crash prediction. However, with improved traffic and crash data quality, the crash prediction power of SPF models may be further improved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts—variation over and above that accounted for by the Poisson density. The extra-variation – or dispersion – is theorized to capture unaccounted for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models—tantamount to assuming that unaccounted for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord, with exploration of additional dispersion functions, the use of an independent data set, and presents an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov Chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in mean structure while the remainder of them included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criteria (DIC) statistics. The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences of expected crash counts are likely to be different than factors that might help to explain unaccounted for variation in crashes across sites

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Identification of hot spots, also known as the sites with promise, black spots, accident-prone locations, or priority investigation locations, is an important and routine activity for improving the overall safety of roadway networks. Extensive literature focuses on methods for hot spot identification (HSID). A subset of this considerable literature is dedicated to conducting performance assessments of various HSID methods. A central issue in comparing HSID methods is the development and selection of quantitative and qualitative performance measures or criteria. The authors contend that currently employed HSID assessment criteria—namely false positives and false negatives—are necessary but not sufficient, and additional criteria are needed to exploit the ordinal nature of site ranking data. With the intent to equip road safety professionals and researchers with more useful tools to compare the performances of various HSID methods and to improve the level of HSID assessments, this paper proposes four quantitative HSID evaluation tests that are, to the authors’ knowledge, new and unique. These tests evaluate different aspects of HSID method performance, including reliability of results, ranking consistency, and false identification consistency and reliability. It is intended that road safety professionals apply these different evaluation tests in addition to existing tests to compare the performances of various HSID methods, and then select the most appropriate HSID method to screen road networks to identify sites that require further analysis. This work demonstrates four new criteria using 3 years of Arizona road section accident data and four commonly applied HSID methods [accident frequency ranking, accident rate ranking, accident reduction potential, and empirical Bayes (EB)]. The EB HSID method reveals itself as the superior method in most of the evaluation tests. In contrast, identifying hot spots using accident rate rankings performs the least well among the tests. The accident frequency and accident reduction potential methods perform similarly, with slight differences explained. The authors believe that the four new evaluation tests offer insight into HSID performance heretofore unavailable to analysts and researchers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Identifying crash “hotspots”, “blackspots”, “sites with promise”, or “high risk” locations is standard practice in departments of transportation throughout the US. The literature is replete with the development and discussion of statistical methods for hotspot identification (HSID). Theoretical derivations and empirical studies have been used to weigh the benefits of various HSID methods; however, a small number of studies have used controlled experiments to systematically assess various methods. Using experimentally derived simulated data—which are argued to be superior to empirical data, three hot spot identification methods observed in practice are evaluated: simple ranking, confidence interval, and Empirical Bayes. Using simulated data, sites with promise are known a priori, in contrast to empirical data where high risk sites are not known for certain. To conduct the evaluation, properties of observed crash data are used to generate simulated crash frequency distributions at hypothetical sites. A variety of factors is manipulated to simulate a host of ‘real world’ conditions. Various levels of confidence are explored, and false positives (identifying a safe site as high risk) and false negatives (identifying a high risk site as safe) are compared across methods. Finally, the effects of crash history duration in the three HSID approaches are assessed. The results illustrate that the Empirical Bayes technique significantly outperforms ranking and confidence interval techniques (with certain caveats). As found by others, false positives and negatives are inversely related. Three years of crash history appears, in general, to provide an appropriate crash history duration.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study proposes a framework of a model-based hot spot identification method by applying full Bayes (FB) technique. In comparison with the state-of-the-art approach [i.e., empirical Bayes method (EB)], the advantage of the FB method is the capability to seamlessly integrate prior information and all available data into posterior distributions on which various ranking criteria could be based. With intersection crash data collected in Singapore, an empirical analysis was conducted to evaluate the following six approaches for hot spot identification: (a) naive ranking using raw crash data, (b) standard EB ranking, (c) FB ranking using a Poisson-gamma model, (d) FB ranking using a Poisson-lognormal model, (e) FB ranking using a hierarchical Poisson model, and (f) FB ranking using a hierarchical Poisson (AR-1) model. The results show that (a) when using the expected crash rate-related decision parameters, all model-based approaches perform significantly better in safety ranking than does the naive ranking method, and (b) the FB approach using hierarchical models significantly outperforms the standard EB approach in correctly identifying hazardous sites.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Purpose: This study explored the spatial distribution of notified cryptosporidiosis cases and identified major socioeconomic factors associated with the transmission of cryptosporidiosis in Brisbane, Australia. Methods: We obtained the computerized data sets on the notified cryptosporidiosis cases and their key socioeconomic factors by statistical local area (SLA) in Brisbane for the period of 1996 to 2004 from the Queensland Department of Health and Australian Bureau of Statistics, respectively. We used spatial empirical Bayes rates smoothing to estimate the spatial distribution of cryptosporidiosis cases. A spatial classification and regression tree (CART) model was developed to explore the relationship between socioeconomic factors and the incidence rates of cryptosporidiosis. Results: Spatial empirical Bayes analysis reveals that the cryptosporidiosis infections were primarily concentrated in the northwest and southeast of Brisbane. A spatial CART model shows that the relative risk for cryptosporidiosis transmission was 2.4 when the value of the social economic index for areas (SEIFA) was over 1028 and the proportion of residents with low educational attainment in an SLA exceeded 8.8%. Conclusions: There was remarkable variation in spatial distribution of cryptosporidiosis infections in Brisbane. Spatial pattern of cryptosporidiosis seems to be associated with SEIFA and the proportion of residents with low education attainment.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Regional safety program managers face a daunting challenge in the attempt to reduce deaths, injuries, and economic losses that result from motor vehicle crashes. This difficult mission is complicated by the combination of a large perceived need, small budget, and uncertainty about how effective each proposed countermeasure would be if implemented. A manager can turn to the research record for insight, but the measured effect of a single countermeasure often varies widely from study to study and across jurisdictions. The challenge of converting widespread and conflicting research results into a regionally meaningful conclusion can be addressed by incorporating "subjective" information into a Bayesian analysis framework. Engineering evaluations of crashes provide the subjective input on countermeasure effectiveness in the proposed Bayesian analysis framework. Empirical Bayes approaches are widely used in before-and-after studies and "hot-spot" identification; however, in these cases, the prior information was typically obtained from the data (empirically), not subjective sources. The power and advantages of Bayesian methods for assessing countermeasure effectiveness are presented. Also, an engineering evaluation approach developed at the Georgia Institute of Technology is described. Results are presented from an experiment conducted to assess the repeatability and objectivity of subjective engineering evaluations. In particular, the focus is on the importance, methodology, and feasibility of the subjective engineering evaluation for assessing countermeasures.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Speeding is recognized as a major contributing factor in traffic crashes. In order to reduce speed-related crashes, the city of Scottsdale, Arizona implemented the first fixed-camera photo speed enforcement program (SEP) on a limited access freeway in the US. The 9-month demonstration program spanning from January 2006 to October 2006 was implemented on a 6.5 mile urban freeway segment of Arizona State Route 101 running through Scottsdale. This paper presents the results of a comprehensive analysis of the impact of the SEP on speeding behavior, crashes, and the economic impact of crashes. The impact on speeding behavior was estimated using generalized least square estimation, in which the observed speeds and the speeding frequencies during the program period were compared to those during other periods. The impact of the SEP on crashes was estimated using 3 evaluation methods: a before-and-after (BA) analysis using a comparison group, a BA analysis with traffic flow correction, and an empirical Bayes BA analysis with time-variant safety. The analysis results reveal that speeding detection frequencies (speeds> or =76 mph) increased by a factor of 10.5 after the SEP was (temporarily) terminated. Average speeds in the enforcement zone were reduced by about 9 mph when the SEP was implemented, after accounting for the influence of traffic flow. All crash types were reduced except rear-end crashes, although the estimated magnitude of impact varies across estimation methods (and their corresponding assumptions). When considering Arizona-specific crash related injury costs, the SEP is estimated to yield about $17 million in annual safety benefits.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Hot spot identification (HSID) plays a significant role in improving the safety of transportation networks. Numerous HSID methods have been proposed, developed, and evaluated in the literature. The vast majority of HSID methods reported and evaluated in the literature assume that crash data are complete, reliable, and accurate. Crash under-reporting, however, has long been recognized as a threat to the accuracy and completeness of historical traffic crash records. As a natural continuation of prior studies, the paper evaluates the influence that under-reported crashes exert on HSID methods. To conduct the evaluation, five groups of data gathered from Arizona Department of Transportation (ADOT) over the course of three years are adjusted to account for fifteen different assumed levels of under-reporting. Three identification methods are evaluated: simple ranking (SR), empirical Bayes (EB) and full Bayes (FB). Various threshold levels for establishing hotspots are explored. Finally, two evaluation criteria are compared across HSID methods. The results illustrate that the identification bias—the ability to correctly identify at risk sites--under-reporting is influenced by the degree of under-reporting. Comparatively speaking, crash under-reporting has the largest influence on the FB method and the least influence on the SR method. Additionally, the impact is positively related to the percentage of the under-reported PDO crashes and inversely related to the percentage of the under-reported injury crashes. This finding is significant because it reveals that despite PDO crashes being least severe and costly, they have the most significant influence on the accuracy of HSID.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Hot spot identification (HSID) aims to identify potential sites—roadway segments, intersections, crosswalks, interchanges, ramps, etc.—with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high risk site as safe (false negative), and consequently lead to the misuse the available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor injury and property damage only (PDO) crashes, challenges of accounting for crash severity into the methodology, and selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than the population mean like most methods in practice, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression. Application of a quantile regression model on equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to the society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitation of traditional NB model in dealing with preponderance of zeros problem or right skewed dataset.