985 results for Spatial Empirical Bayes Smoothing


Abstract:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts, that is, variation over and above that accounted for by the Poisson density. The extra-variation, or dispersion, is theorized to capture unaccounted-for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models, which is tantamount to assuming that unaccounted-for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord by exploring additional dispersion functions and by using an independent data set, which presents an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov Chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in the mean structure, while the remaining four included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criterion (DIC) statistics.
The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences on expected crash counts are likely to be different from factors that might help to explain unaccounted-for variation in crashes across sites.
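
As a rough illustration of the distinction drawn in this abstract, the fixed and covariate-dependent dispersion assumptions can be written as follows. The symbols ($\mu_i$, $\alpha_i$, $\mathbf{z}_i$, $\boldsymbol{\gamma}$) and the log-linear form of the dispersion function are assumptions chosen for illustration; the abstract does not state which functional forms were actually tested.

```latex
% Negative binomial (Poisson-gamma) crash count Y_i at site i with mean \mu_i:
\operatorname{Var}(Y_i) = \mu_i + \alpha_i \mu_i^{2}

% Fixed dispersion: the same value at every site
\alpha_i = \alpha

% Covariate-dependent dispersion (one possible form, assumed here):
\alpha_i = \exp\left(\mathbf{z}_i^{\top} \boldsymbol{\gamma}\right)
```

Under this assumed form, testing whether the non-intercept elements of $\boldsymbol{\gamma}$ differ from zero corresponds to testing whether extra-variation depends on covariates.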

Abstract:

Identification of hot spots, also known as the sites with promise, black spots, accident-prone locations, or priority investigation locations, is an important and routine activity for improving the overall safety of roadway networks. Extensive literature focuses on methods for hot spot identification (HSID). A subset of this considerable literature is dedicated to conducting performance assessments of various HSID methods. A central issue in comparing HSID methods is the development and selection of quantitative and qualitative performance measures or criteria. The authors contend that currently employed HSID assessment criteria—namely false positives and false negatives—are necessary but not sufficient, and additional criteria are needed to exploit the ordinal nature of site ranking data. With the intent to equip road safety professionals and researchers with more useful tools to compare the performances of various HSID methods and to improve the level of HSID assessments, this paper proposes four quantitative HSID evaluation tests that are, to the authors’ knowledge, new and unique. These tests evaluate different aspects of HSID method performance, including reliability of results, ranking consistency, and false identification consistency and reliability. It is intended that road safety professionals apply these different evaluation tests in addition to existing tests to compare the performances of various HSID methods, and then select the most appropriate HSID method to screen road networks to identify sites that require further analysis. This work demonstrates four new criteria using 3 years of Arizona road section accident data and four commonly applied HSID methods [accident frequency ranking, accident rate ranking, accident reduction potential, and empirical Bayes (EB)]. The EB HSID method reveals itself as the superior method in most of the evaluation tests. 
In contrast, identifying hot spots using accident rate rankings performs worst across the tests. The accident frequency and accident reduction potential methods perform similarly, with slight differences explained. The authors believe that the four new evaluation tests offer insight into HSID performance heretofore unavailable to analysts and researchers.
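
To show how two of the ranking methods compared above can disagree, here is a minimal Python sketch. It uses the weighted-average form of the EB estimate that is common in road safety work, assuming a negative binomial model with overdispersion `dispersion`; the site data, the function names, and the dispersion value are hypothetical and are not taken from the paper.

```python
def eb_estimate(observed, predicted, dispersion):
    """Empirical Bayes expected crash count for one site.

    observed   -- crash count recorded at the site
    predicted  -- count predicted for similar sites over the same period
    dispersion -- negative binomial overdispersion alpha (Var = mu + alpha * mu**2)
    """
    weight = 1.0 / (1.0 + dispersion * predicted)  # weight placed on the prediction
    return weight * predicted + (1.0 - weight) * observed


def rank_by_frequency(sites):
    """Site names ordered by raw observed count, highest first."""
    return [name for name, _, _ in sorted(sites, key=lambda s: s[1], reverse=True)]


def rank_by_eb(sites, dispersion):
    """Site names ordered by EB estimate, highest first."""
    scored = [(name, eb_estimate(obs, pred, dispersion)) for name, obs, pred in sites]
    return [name for name, _ in sorted(scored, key=lambda s: s[1], reverse=True)]


# Hypothetical sites: (name, observed crashes, predicted crashes)
example_sites = [("A", 9, 2.0), ("B", 8, 8.0), ("C", 3, 1.0)]

if __name__ == "__main__":
    print(rank_by_frequency(example_sites))
    print(rank_by_eb(example_sites, dispersion=0.5))
```

In this toy data, site A has the highest raw count but a low prediction, so its EB estimate is pulled toward the prediction and it drops below site B in the EB ranking.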

Abstract:

Speeding is recognized as a major contributing factor in traffic crashes. To reduce speed-related crashes, the city of Scottsdale, Arizona implemented the first fixed-camera photo speed enforcement program (SEP) on a limited access freeway in the US. The 9-month demonstration program spanning from January 2006 to October 2006 was implemented on a 6.5 mile urban freeway segment of Arizona State Route 101 running through Scottsdale. This paper presents the results of a comprehensive analysis of the impact of the SEP on speeding behavior, crashes, and the economic impact of crashes. The impact on speeding behavior was estimated using generalized least squares estimation, in which the observed speeds and the speeding frequencies during the program period were compared to those during other periods. The impact of the SEP on crashes was estimated using three evaluation methods: a before-and-after (BA) analysis using a comparison group, a BA analysis with traffic flow correction, and an empirical Bayes BA analysis with time-variant safety. The analysis results reveal that speeding detection frequencies (speeds ≥ 76 mph) increased by a factor of 10.5 after the SEP was (temporarily) terminated. Average speeds in the enforcement zone were reduced by about 9 mph when the SEP was implemented, after accounting for the influence of traffic flow. All crash types were reduced except rear-end crashes, although the estimated magnitude of impact varies across estimation methods (and their corresponding assumptions). When considering Arizona-specific crash-related injury costs, the SEP is estimated to yield about $17 million in annual safety benefits.

Abstract:

Identifying crash “hotspots”, “blackspots”, “sites with promise”, or “high risk” locations is standard practice in departments of transportation throughout the US. The literature is replete with the development and discussion of statistical methods for hotspot identification (HSID). Theoretical derivations and empirical studies have been used to weigh the benefits of various HSID methods; however, only a small number of studies have used controlled experiments to systematically assess various methods. Using experimentally derived simulated data, which are argued to be superior to empirical data, three hot spot identification methods observed in practice are evaluated: simple ranking, confidence interval, and Empirical Bayes. With simulated data, sites with promise are known a priori, in contrast to empirical data, where high risk sites are not known for certain. To conduct the evaluation, properties of observed crash data are used to generate simulated crash frequency distributions at hypothetical sites. A variety of factors is manipulated to simulate a host of ‘real world’ conditions. Various levels of confidence are explored, and false positives (identifying a safe site as high risk) and false negatives (identifying a high risk site as safe) are compared across methods. Finally, the effects of crash history duration in the three HSID approaches are assessed. The results illustrate that the Empirical Bayes technique significantly outperforms ranking and confidence interval techniques (with certain caveats). As found by others, false positives and negatives are inversely related. Three years of crash history appears, in general, to provide an appropriate crash history duration.
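
The kind of controlled experiment described in this abstract can be sketched in a few lines of Python. The sketch below is not the authors' design: it assumes gamma-distributed true site means, Poisson yearly counts, a fixed cutoff defining a truly high-risk site, and an EB estimate that uses the known gamma prior rather than an estimated one. All parameter values and function names are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def simulate_errors(n_sites=500, years=3, shape=2.0, scale=1.5, cutoff=5.0, seed=0):
    """Return {method: (false_positives, false_negatives)} for one simulated network."""
    rng = random.Random(seed)
    # True long-run yearly crash means, known only because the data are simulated.
    true_means = [rng.gammavariate(shape, scale) for _ in range(n_sites)]
    counts = [sum(poisson(rng, m) for _ in range(years)) for m in true_means]

    estimates = {
        # Simple ranking statistic: observed yearly average.
        "simple": [c / years for c in counts],
        # Gamma-Poisson posterior mean with the (assumed known) prior.
        "eb": [(shape + c) / (1.0 / scale + years) for c in counts],
    }
    results = {}
    for method, values in estimates.items():
        fp = sum(1 for v, m in zip(values, true_means) if v > cutoff and m <= cutoff)
        fn = sum(1 for v, m in zip(values, true_means) if v <= cutoff and m > cutoff)
        results[method] = (fp, fn)
    return results


if __name__ == "__main__":
    for method, (fp, fn) in simulate_errors().items():
        print(method, "false positives:", fp, "false negatives:", fn)
```

Changing `years` gives a crude way to explore the effect of crash history duration, and changing `cutoff` shows how false positives and false negatives trade off against each other.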

Abstract:

Hot spot identification (HSID) plays a significant role in improving the safety of transportation networks. Numerous HSID methods have been proposed, developed, and evaluated in the literature. The vast majority of HSID methods reported and evaluated in the literature assume that crash data are complete, reliable, and accurate. Crash under-reporting, however, has long been recognized as a threat to the accuracy and completeness of historical traffic crash records. As a natural continuation of prior studies, the paper evaluates the influence that under-reported crashes exert on HSID methods. To conduct the evaluation, five groups of data gathered from the Arizona Department of Transportation (ADOT) over the course of three years are adjusted to account for fifteen different assumed levels of under-reporting. Three identification methods are evaluated: simple ranking (SR), empirical Bayes (EB) and full Bayes (FB). Various threshold levels for establishing hotspots are explored. Finally, two evaluation criteria are compared across HSID methods. The results illustrate that the identification bias (the ability to correctly identify at-risk sites) under under-reporting is influenced by the degree of under-reporting. Comparatively speaking, crash under-reporting has the largest influence on the FB method and the least influence on the SR method. Additionally, the impact is positively related to the percentage of under-reported PDO crashes and inversely related to the percentage of under-reported injury crashes. This finding is significant because it reveals that, despite PDO crashes being the least severe and costly, they have the most significant influence on the accuracy of HSID.

Abstract:

Hot spot identification (HSID) aims to identify potential sites (roadway segments, intersections, crosswalks, interchanges, ramps, etc.) with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high risk site as safe (false negative), and consequently lead to the misuse of the available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues such as underreporting of minor injury and property damage only (PDO) crashes, the challenge of incorporating crash severity into the methodology, and the selection of a proper safety performance function to model crash data that are often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and a quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than on the population mean as most methods in practice do, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression.
Application of a quantile regression model to equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitations of the traditional NB model in dealing with a preponderance of zeros or a right-skewed dataset.

Abstract:

In a Bayesian learning setting, the posterior distribution of a predictive model arises from a trade-off between its prior distribution and the conditional likelihood of observed data. Such distribution functions usually rely on additional hyperparameters which need to be tuned in order to achieve optimum predictive performance; this operation can be efficiently performed in an Empirical Bayes fashion by maximizing the posterior marginal likelihood of the observed data. Since the score function of this optimization problem is in general characterized by the presence of local optima, it is necessary to resort to global optimization strategies, which require a large number of function evaluations. Given that the evaluation is usually computationally intensive and scales badly with the dataset size, the maximum number of observations that can be treated simultaneously is quite limited. In this paper, we consider the case of hyperparameter tuning in Gaussian process regression. A straightforward implementation of the posterior log-likelihood for this model requires O(N^3) operations for every iteration of the optimization procedure, where N is the number of examples in the input dataset. We derive a novel set of identities that allow, after an initial overhead of O(N^3), the evaluation of the score function, as well as the Jacobian and Hessian matrices, in O(N) operations. We prove that the proposed identities, which follow from the eigendecomposition of the kernel matrix, yield a reduction of several orders of magnitude in the computation time for the hyperparameter optimization problem. Notably, the proposed solution provides computational advantages even with respect to state-of-the-art approximations that rely on sparse kernel matrices.
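
To make the cost argument concrete, one common setting in which an eigendecomposition gives O(N) evaluations is sketched below. It assumes the hyperparameters are a kernel scale $\lambda$ and a noise variance $\sigma^2$, so that the covariance of the observations $y$ is $\lambda K + \sigma^2 I$ for a fixed kernel matrix $K$. This parameterization and the symbols are assumptions for illustration; the paper's own identities may be more general.

```latex
% One-off O(N^3) step: eigendecomposition of the kernel matrix
K = U S U^{\top}, \qquad S = \operatorname{diag}(s_1, \dots, s_N), \qquad \tilde{y} = U^{\top} y

% Log marginal likelihood, O(N) per evaluation for any (\lambda, \sigma^2):
\log p(y \mid \lambda, \sigma^{2})
  = -\frac{1}{2} \sum_{i=1}^{N} \frac{\tilde{y}_i^{2}}{\lambda s_i + \sigma^{2}}
    -\frac{1}{2} \sum_{i=1}^{N} \log\left(\lambda s_i + \sigma^{2}\right)
    -\frac{N}{2} \log 2\pi
```

Because every term is a sum over the $N$ eigenvalues, the first and second derivatives with respect to $\lambda$ and $\sigma^2$ are also sums of $N$ scalar terms.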

Abstract:

The historically reactive approach to identifying safety problems and mitigating them involves selecting black spots or hot spots by ranking locations based on crash frequency and severity. The approach focuses mainly on the corridor level and does not take into consideration the exposure rate (vehicle miles traveled) or the socio-demographic information of the study area, both of which are very important in the transportation planning process. A larger study analysis unit at the Transportation Analysis Zone (TAZ) level or the network planning level should be used to address the future development needs of the community and to incorporate safety into the long-range transportation planning process. In this study, existing planning tools (such as the PLANSAFE models presented in NCHRP Report 546) were evaluated for forecasting safety in small and medium-sized communities, particularly as related to changes in socio-demographic characteristics, traffic demand, road network, and countermeasures. The research also evaluated the applicability of the Empirical Bayes (EB) method to network-level analysis. In addition, application of the United States Road Assessment Program (usRAP) protocols at the local urban road network level was investigated. This research evaluated the applicability of these three methods for the City of Ames, Iowa. The outcome of this research is a systematic process and framework for considering road safety issues explicitly in the small and medium-sized community transportation planning process and for quantifying the safety impacts of new developments and policy programs.
More specifically, quantitative safety may be incorporated into the planning process through effective visualization and increased awareness of safety issues (usRAP), the identification of high-risk locations with potential for improvement (usRAP maps and EB), countermeasures for high-risk locations (EB before-and-after study and PLANSAFE), and socio-economic and demographic induced changes at the planning level (PLANSAFE).

Abstract:

We considered prediction techniques based on accelerated failure time models with random effects for correlated survival data. Besides the Bayesian approach through the empirical Bayes estimator, we also discussed the use of a classical predictor, the Empirical Best Linear Unbiased Predictor (EBLUP). To illustrate the use of these predictors, we considered applications to a real data set from the oil industry. More specifically, the data set involves the mean time between failures of petroleum-well equipment in the Bacia Potiguar. The goal of this study is to predict the risk/probability of failure in order to support a preventive maintenance program. The results show that both methods are suitable for predicting future failures, supporting good decisions about the use and economy of resources for preventive maintenance.

Abstract:

The James-Stein estimator is a biased shrinkage estimator with uniformly smaller risk than the sample mean estimator for the mean of a multivariate normal distribution, except in the one-dimensional or two-dimensional cases. In this work we have used more heuristic arguments and intensified the geometric treatment of the theory of the James-Stein estimator. New James-Stein-type shrinkage estimators are proposed, and the Mahalanobis metric is used to address the James-Stein estimator. To evaluate the performance of the proposed estimator relative to the sample mean estimator, we used Monte Carlo simulation to calculate the mean squared error. The results indicate that the new estimator performs better than the sample mean estimator.
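
For readers who want to reproduce the flavor of such a Monte Carlo comparison, a minimal Python sketch follows. It implements the classical James-Stein estimator that shrinks toward the origin with known unit variance, not the new Mahalanobis-based estimators proposed in this work, and it treats a single observation vector as the sample mean. The function names and simulation settings are hypothetical.

```python
import random


def james_stein(x, sigma2=1.0):
    """Classical James-Stein estimate from one draw x ~ N(theta, sigma2 * I), len(x) >= 3."""
    p = len(x)
    norm2 = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) * sigma2 / norm2  # shrinkage factor toward the origin
    return [shrink * v for v in x]


def monte_carlo_mse(theta, n_rep=2000, seed=0):
    """Average squared error of the raw observation and of James-Stein over n_rep draws."""
    rng = random.Random(seed)
    total_raw = total_js = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(t, 1.0) for t in theta]
        js = james_stein(x)
        total_raw += sum((a - t) ** 2 for a, t in zip(x, theta))
        total_js += sum((a - t) ** 2 for a, t in zip(js, theta))
    return total_raw / n_rep, total_js / n_rep


if __name__ == "__main__":
    mse_raw, mse_js = monte_carlo_mse([0.0] * 10)
    print("raw:", round(mse_raw, 2), "James-Stein:", round(mse_js, 2))
```

The gain is largest when the true mean vector is close to the shrinkage target and fades as it moves away, which is one reason alternative targets and metrics are of interest.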

Abstract:

Under a two-level hierarchical model, suppose that the distribution of the random parameter is known or can be estimated well. Data are generated via a fixed, but unobservable realization of this parameter. In this paper, we derive the smallest confidence region of the random parameter under a joint Bayesian/frequentist paradigm. On average this optimal region can be much smaller than the corresponding Bayesian highest posterior density region. The new estimation procedure is appealing when one deals with data generated under a highly parallel structure, for example, data from a trial with a large number of clinical centers involved or genome-wide gene-expression data for estimating individual gene- or center-specific parameters simultaneously. The new proposal is illustrated with a typical microarray data set and its performance is examined via a small simulation study.

Abstract:

Functional Magnetic Resonance Imaging (fMRI) is a non-invasive technique which is commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single voxel t-tests have been commonly used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of variance at each voxel is difficult. Thus, combining information across voxels in the statistical analysis of fMRI data is desirable in order to improve efficiency. Here we construct a hierarchical model and apply an Empirical Bayes framework to the analysis of group fMRI data, employing techniques used in high throughput genomic studies. The key idea is to shrink residual variances by combining information across voxels, and subsequently to construct an improved test statistic in lieu of the classical t-statistic. This hierarchical model results in a shrinkage of voxel-wise residual sample variances towards a common value. The shrunken estimator for voxel-specific variance components in the group analyses outperforms the classical residual error estimator in terms of mean squared error. Moreover, the shrunken test statistic decreases the false positive rate when testing differences in brain contrast maps across a wide range of simulation studies. This methodology was also applied to experimental data regarding a cognitive activation task.
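
The shrinkage step described in this abstract can be illustrated with the degrees-of-freedom weighted average used in moderated t-statistics from the genomics literature. The sketch below assumes that form and takes the prior variance `prior_var` and prior degrees of freedom `prior_df` as given, whereas in practice they would be estimated from all voxels. All names and numbers are hypothetical and are not the authors' implementation.

```python
import math


def shrink_variance(sample_var, df, prior_var, prior_df):
    """Shrink one voxel's residual variance toward a common prior value."""
    return (prior_df * prior_var + df * sample_var) / (prior_df + df)


def moderated_t(effect, sample_var, df, prior_var, prior_df, unscaled_var):
    """Test statistic using the shrunken variance in place of the sample variance.

    unscaled_var is the design-dependent factor v such that Var(effect) = v * sigma^2.
    """
    shrunk = shrink_variance(sample_var, df, prior_var, prior_df)
    return effect / math.sqrt(shrunk * unscaled_var)


if __name__ == "__main__":
    # A voxel with an unusually small sample variance: the classical t is inflated,
    # while the moderated statistic is pulled back by the common prior value.
    classical = 1.0 / math.sqrt(0.2 * 0.1)
    moderated = moderated_t(1.0, 0.2, df=10, prior_var=1.0, prior_df=10, unscaled_var=0.1)
    print(round(classical, 2), round(moderated, 2))
```

Voxels whose sample variance happens to be very small are the ones most likely to produce spurious large t-values, which is where the shrunken variance helps.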

Abstract:

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.

Abstract:

Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have some limitations due to the minimal assumptions made or with more specific assumptions are computationally intensive. Results: By converting to a z-score the value of the test statistic used to test the significance of each gene, we propose a simple two-component normal mixture that models adequately the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
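
A two-component normal mixture on z-scores, as described in this abstract, can be fitted with a few lines of EM. The Python sketch below assumes a theoretical N(0, 1) null component that is held fixed and a single normal alternative component; the starting values, iteration count, and function names are hypothetical, and the published method may differ in its details.

```python
import math
import random


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2) at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_null(z, pi0, mu1, sd1):
    """Posterior probability that a gene with z-score z is null."""
    null_part = pi0 * normal_pdf(z)                   # fixed N(0, 1) null (assumed)
    alt_part = (1.0 - pi0) * normal_pdf(z, mu1, sd1)  # non-null component
    return null_part / (null_part + alt_part)


def fit_mixture(zs, pi0=0.8, mu1=2.0, sd1=1.0, n_iter=200):
    """EM for pi0 * N(0, 1) + (1 - pi0) * N(mu1, sd1^2) with the null held fixed."""
    for _ in range(n_iter):
        tau0 = [posterior_null(z, pi0, mu1, sd1) for z in zs]  # E-step
        w1 = [1.0 - t for t in tau0]
        total1 = sum(w1)
        pi0 = sum(tau0) / len(zs)                               # M-step
        mu1 = sum(w * z for w, z in zip(w1, zs)) / total1
        var1 = sum(w * (z - mu1) ** 2 for w, z in zip(w1, zs)) / total1
        sd1 = max(math.sqrt(var1), 1e-3)  # guard against a collapsing component
    return pi0, mu1, sd1


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic z-scores: 900 null genes and 100 shifted genes (illustrative only).
    zs = [rng.gauss(0.0, 1.0) for _ in range(900)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
    pi0, mu1, sd1 = fit_mixture(zs)
    print("pi0:", round(pi0, 3), "mu1:", round(mu1, 3), "sd1:", round(sd1, 3))
```

Genes can then be declared differentially expressed when `posterior_null` falls below a chosen threshold.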

Abstract:

In 2010, the American Association of State Highway and Transportation Officials (AASHTO) released a safety analysis software system known as SafetyAnalyst. SafetyAnalyst implements the empirical Bayes (EB) method, which requires the use of Safety Performance Functions (SPFs). The system is equipped with a set of national default SPFs, and the software calibrates the default SPFs to represent the agency's safety performance. However, it is recommended that agencies generate agency-specific SPFs whenever possible. Many investigators support the view that the agency-specific SPFs represent the agency data better than the national default SPFs calibrated to agency data. Furthermore, it is believed that the crash trends in Florida are different from the states whose data were used to develop the national default SPFs. In this dissertation, Florida-specific SPFs were developed using the 2008 Roadway Characteristics Inventory (RCI) data and crash and traffic data from 2007-2010 for both total and fatal and injury (FI) crashes. The data were randomly divided into two sets, one for calibration (70% of the data) and another for validation (30% of the data). The negative binomial (NB) model was used to develop the Florida-specific SPFs for each of the subtypes of roadway segments, intersections and ramps, using the calibration data. Statistical goodness-of-fit tests were performed on the calibrated models, which were then validated using the validation data set. The results were compared in order to assess the transferability of the Florida-specific SPF models. The default SafetyAnalyst SPFs were calibrated to Florida data by adjusting the national default SPFs with local calibration factors. The performance of the Florida-specific SPFs and SafetyAnalyst default SPFs calibrated to Florida data were then compared using a number of methods, including visual plots and statistical goodness-of-fit tests. 
The plots of SPFs against the observed crash data were used to compare the prediction performance of the two models. Three goodness-of-fit tests, represented by the mean absolute deviance (MAD), the mean square prediction error (MSPE), and Freeman-Tukey R2 (R2FT), were also used for comparison in order to identify the better-fitting model. The results showed that Florida-specific SPFs yielded better prediction performance than the national default SPFs calibrated to Florida data. The performance of Florida-specific SPFs was further compared with that of the full SPFs, which include both traffic and geometric variables, in two major applications of SPFs, i.e., crash prediction and identification of high crash locations. The results showed that both SPF models yielded very similar performance in both applications. These empirical results support the use of the flow-only SPF models adopted in SafetyAnalyst, which require much less effort to develop compared to full SPFs.
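
Two of the comparison measures named in this abstract, together with a simple calibration factor, can be sketched in Python as follows. The calibration factor is assumed here to be the ratio of total observed to total predicted crashes, which is a common convention but is not stated in the abstract; the data and function names are hypothetical.

```python
def calibration_factor(observed, predicted):
    """Ratio of total observed to total predicted crashes (assumed definition)."""
    return sum(observed) / sum(predicted)


def mad(observed, predicted):
    """Mean absolute difference between predicted and observed counts."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mspe(observed, predicted):
    """Mean squared prediction error."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    # Hypothetical validation sites: observed counts and default-SPF predictions.
    observed = [2, 0, 5]
    default_pred = [1.5, 0.5, 4.0]
    factor = calibration_factor(observed, default_pred)
    calibrated = [factor * p for p in default_pred]
    print("factor:", round(factor, 3))
    print("MAD:", round(mad(observed, calibrated), 3))
    print("MSPE:", round(mspe(observed, calibrated), 3))
```

Lower MAD and MSPE on the validation data indicate the better-predicting model, so the same two functions can be applied to the agency-specific SPF and to the calibrated default SPF in turn.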