973 results for Empirical Bayes Methods
Abstract:
Traffic safety engineers are among the early adopters of Bayesian statistical tools for analyzing crash data. As in many other areas of application, empirical Bayes methods were their first choice, perhaps because they represent an intuitively appealing, yet relatively easy-to-implement, alternative to purely classical approaches. With the enormous progress in numerical methods made in recent years, and with the availability of free, easy-to-use software that permits implementing a fully Bayesian approach, there is now ample justification to progress towards fully Bayesian analyses of crash data. The fully Bayesian approach, in particular as implemented via multi-level hierarchical models, has many advantages over the empirical Bayes approach. In a full Bayesian analysis, prior information and all available data are seamlessly integrated into posterior distributions on which practitioners can base their inferences. All uncertainties are thus accounted for in the analyses, and there is no need to pre-process data to obtain Safety Performance Functions and other such prior estimates of the effect of covariates on the outcome of interest. In this light, fully Bayesian methods may well be less costly to implement and may result in safety estimates with more realistic standard errors. In this manuscript, we present the full Bayesian approach to analyzing traffic safety data and focus on highlighting the differences between the empirical Bayes and the full Bayes approaches. We use an illustrative example to discuss a step-by-step Bayesian analysis of the data and to show some of the types of inferences that are possible within the full Bayesian framework.
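The Poisson-Gamma conjugacy underlying both approaches can be sketched in a few lines; all numbers below are hypothetical. The point of contrast is that empirical Bayes fixes the Gamma hyperparameters at data-driven point estimates, while a full Bayes analysis would place a hyperprior on them and propagate that extra uncertainty into the posterior.

```python
import numpy as np

# Poisson likelihood for yearly crash counts at one site, Gamma prior on
# the site's underlying crash rate (hyperparameters assumed, not fitted).
a, b = 2.0, 0.5                     # Gamma(shape, rate) prior (assumed)
crashes = np.array([3, 5, 2, 4])    # yearly crash counts at one site (assumed)

# Conjugacy: the posterior is Gamma(a + sum(x), b + n).
a_post = a + crashes.sum()
b_post = b + len(crashes)
post_mean = a_post / b_post            # point estimate of the crash rate
post_sd = np.sqrt(a_post) / b_post     # EB understates this spread because it
                                       # treats (a, b) as known
```

Empirical Bayes would estimate (a, b) from all sites and then apply this update per site; the full Bayes alternative samples (a, b) jointly with the site rates, so the posterior standard deviation reflects hyperparameter uncertainty as well.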
Abstract:
We compare a set of empirical Bayes and composite estimators of the population means of the districts (small areas) of a country, and show that the natural modelling strategy of searching for a well fitting empirical Bayes model and using it for estimation of the area-level means can be inefficient.
Abstract:
Crash reduction factors (CRFs) are used to estimate the number of traffic crashes expected to be prevented by investment in safety improvement projects. The method used to develop CRFs in Florida has been based on the commonly used before-and-after approach, which suffers from a widely recognized problem known as regression-to-the-mean (RTM). The Empirical Bayes (EB) method has been introduced as a means of addressing the RTM problem. This method requires information from both the treatment and reference sites in order to predict the expected number of crashes had the safety improvement projects at the treatment sites not been implemented. The information from the reference sites is estimated from a safety performance function (SPF), a mathematical relationship that links crashes to traffic exposure. The objective of this dissertation was to develop SPFs for different functional classes of the Florida State Highway System. Crash data from years 2001 through 2003, along with traffic and geometric data, were used in the SPF model development. SPFs for both rural and urban roadway categories were developed. The modeling data were based on one-mile segments with homogeneous traffic and geometric conditions within each segment; segments involving intersections were excluded. Scatter plots of the data show that the relationships between crashes and traffic exposure are nonlinear, with crashes increasing with traffic exposure at an increasing rate. Four regression models, namely Poisson (PRM), Negative Binomial (NBRM), zero-inflated Poisson (ZIP), and zero-inflated Negative Binomial (ZINB), were fitted to the one-mile segment records for individual roadway categories. The best model was selected for each category based on a combination of the likelihood ratio test, the Vuong statistical test, and Akaike's Information Criterion (AIC).
The NBRM was found to be appropriate for only one category, and the ZINB model was found to be more appropriate for six other categories. The overall results show that the Negative Binomial distribution model generally provides a better fit for the data than the Poisson distribution model. In addition, the ZINB model was found to give the best fit when the count data exhibit excess zeros and over-dispersion, as was the case for most of the roadway categories. While model validation shows that most data points fall within the 95% prediction intervals of the models developed, the Pearson goodness-of-fit measure does not show statistical significance. This is expected, as traffic volume is only one of the many factors contributing to the overall crash experience, and the SPFs are to be applied in conjunction with Accident Modification Factors (AMFs) to further account for the safety impacts of major geometric features before arriving at the final crash prediction. However, with improved traffic and crash data quality, the crash prediction power of SPF models may be further improved.
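As a hedged illustration of how an SPF feeds the EB method, the sketch below assumes a power-form negative binomial SPF with made-up coefficients and overdispersion parameter (not the dissertation's fitted values) and combines its prediction with each segment's observed count using the standard EB weight:

```python
import numpy as np

# Hypothetical SPF: mu = exp(b0) * AADT**b1 (coefficients assumed).
b0, b1, k = -7.0, 0.85, 1.8                # assumed NB fit; k = overdispersion
aadt = np.array([5000., 12000., 30000.])   # traffic exposure per segment
observed = np.array([1, 6, 14])            # observed crashes per segment

mu = np.exp(b0) * aadt**b1        # SPF-predicted crashes per segment
w = 1.0 / (1.0 + mu / k)          # EB weight implied by the NB overdispersion
eb = w * mu + (1.0 - w) * observed   # shrunken (RTM-corrected) estimates
```

Each EB estimate lies between the SPF prediction and the observed count, which is exactly how the method tempers the regression-to-the-mean bias of a naive before-and-after comparison.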
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
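A minimal sketch of the local false discovery rate under a two-component mixture, using simulated z-scores, a Gaussian null, and a fixed prior null probability (the estimation of that probability, discussed above, is replaced here by an assumed value):

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)
# Simulated z-scores: 90% null N(0, 1), 10% shifted N(3, 1) (assumed mixture).
z = np.concatenate([rng.normal(0, 1, 900), rng.normal(3, 1, 100)])

pi0 = 0.9                        # assumed prior prob. of "not differentially expressed"
f = gaussian_kde(z)              # nonparametric estimate of the mixture density
# Local false discovery rate: lfdr(z) = pi0 * f0(z) / f(z), clipped to [0, 1].
lfdr = np.clip(pi0 * norm.pdf(z) / f(z), 0, 1)
```

Genes with small lfdr are declared differentially expressed; averaging lfdr over the rejected set gives the implied global FDR, which is how the bound mentioned above is enforced.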
Abstract:
OBJECTIVE To analyze the spatial distribution of risk for tuberculosis and its socioeconomic determinants in the city of Rio de Janeiro, Brazil. METHODS An ecological study on the association between the mean incidence rate of tuberculosis from 2004 to 2006 and socioeconomic indicators of the Censo Demográfico (Demographic Census) of 2000. The unit of analysis was the home district registered in the Sistema de Informação de Agravos de Notificação (Notifiable Diseases Information System) of Rio de Janeiro, Southeastern Brazil. The rates were standardized by sex and age group, and smoothed by the empirical Bayes method. Spatial autocorrelation was evaluated by Moran's I. Multiple linear regression models were studied and the appropriateness of incorporating a spatial component in the modeling was evaluated. RESULTS We observed a higher risk of the disease in some neighborhoods of the port and north regions, as well as a high incidence in the slums of Rocinha and Vidigal, in the south region, and Cidade de Deus, in the west. The final model identified a positive association for the following variables: percentage of permanent private households in which the head of the house earns three to five minimum wages; percentage of individual residents in the neighborhood; and percentage of people living in homes with more than two people per bedroom. CONCLUSIONS The spatial analysis identified areas of risk of tuberculosis incidence in the neighborhoods of the city of Rio de Janeiro and also found spatial dependence for the incidence of tuberculosis and some socioeconomic variables. However, the inclusion of the spatial component in the final model was not required during the modeling process.
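Empirical Bayes smoothing of area-level rates, as applied to the incidence rates here, can be sketched with the global moment (Marshall-style) estimator; the counts and populations below are invented for illustration:

```python
import numpy as np

# Hypothetical district-level case counts and populations.
cases = np.array([2, 15, 40, 5, 60])
pop = np.array([1000., 5000., 20000., 800., 30000.])

raw = cases / pop
m = cases.sum() / pop.sum()                       # global (pooled) rate
s2 = np.sum(pop * (raw - m) ** 2) / pop.sum()     # weighted between-area variance
a = max(s2 - m / pop.mean(), 0.0)                 # moment estimate of prior variance
w = a / (a + m / pop)                             # shrinkage weight per district
smoothed = w * raw + (1.0 - w) * m                # EB-smoothed rates
```

Districts with small populations get small weights and are pulled strongly toward the global rate, which stabilizes the unreliable raw rates before mapping or regression.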
Abstract:
OBJECTIVE To describe the spatial distribution of avoidable hospitalizations due to tuberculosis in the municipality of Ribeirao Preto, SP, Brazil, and to identify spatial and space-time clusters of risk for the occurrence of these events. METHODS This is a descriptive, ecological study based on the hospitalization records of the Hospital Information System for residents of Ribeirao Preto, SP, Southeastern Brazil, from 2006 to 2012. Only cases with recorded addresses were considered for the spatial analyses, and these were geocoded. We used kernel density estimation to identify the densest areas, the local empirical Bayes rate as the method for smoothing the incidence rates of hospital admissions, and the scan statistic for identifying clusters of risk. The software packages ArcGIS 10.2, TerraView 4.2.2, and SaTScan™ were used in the analysis. RESULTS We identified 169 hospitalizations due to tuberculosis. Most patients were men (n = 134; 79.2%), with a mean age of 48 years (SD = 16.2). The predominant clinical form was pulmonary, confirmed through microscopic examination of expectorated sputum (n = 66; 39.0%). We geocoded 159 cases (94.0%). We observed a non-random spatial distribution of avoidable hospitalizations due to tuberculosis, concentrated in the northern and western regions of the municipality. Through the scan statistic, three spatial clusters of risk for hospitalization due to tuberculosis were identified: one in the northern region of the municipality (relative risk [RR] = 3.4; 95%CI 2.7–4.4); a second in the central region, where there is a prison unit (RR = 28.6; 95%CI 22.4–36.6); and a last one in the southern region, an area of protection (low risk) against hospitalization (RR = 0.2; 95%CI 0.2–0.3). We did not identify any space-time clusters.
CONCLUSIONS The investigation identified priority areas for the control and surveillance of tuberculosis, as well as the profile of the affected population, highlighting important aspects to be considered in the management and organization of health care services aimed at effectiveness in primary health care.
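The scan statistic used to flag such clusters can be illustrated with Kulldorff's Poisson log-likelihood ratio for a single candidate window; the observed and expected counts below are hypothetical, not the study's values:

```python
import math

def scan_llr(c, e, C):
    """Kulldorff Poisson log-likelihood ratio for a window with c observed
    and e expected cases out of C total; 0 if the window is not elevated."""
    if c / e <= (C - c) / (C - e):
        return 0.0
    return c * math.log(c / e) + (C - c) * math.log((C - c) / (C - e))

def relative_risk(c, e, C):
    """Observed/expected inside the window relative to outside it."""
    return (c / e) / ((C - c) / (C - e))

# Hypothetical elevated window (e.g. around a prison unit):
llr = scan_llr(30, 10, 169)
rr = relative_risk(30, 10, 169)
```

In SaTScan the window with the largest LLR is the most likely cluster, and its significance is assessed by Monte Carlo replication rather than a closed-form distribution.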
Abstract:
The historically reactive approach to identifying safety problems and mitigating them involves selecting black spots or hot spots by ranking locations based on crash frequency and severity. The approach focuses mainly on the corridor level without taking into consideration the exposure rate (vehicle miles traveled) and the socio-demographic information of the study area, which are very important in the transportation planning process. A larger analysis unit at the Transportation Analysis Zone (TAZ) level, or the network planning level, should be used to address the future development needs of the community and to incorporate safety into the long-range transportation planning process. In this study, existing planning tools (such as the PLANSAFE models presented in NCHRP Report 546) were evaluated for forecasting safety in small and medium-sized communities, particularly as related to changes in socio-demographic characteristics, traffic demand, road network, and countermeasures. The research also evaluated the applicability of the Empirical Bayes (EB) method to network-level analysis. In addition, application of the United States Road Assessment Program (usRAP) protocols at the local urban road network level was investigated. This research evaluated the applicability of these three methods for the City of Ames, Iowa. The outcome of this research is a systematic process and framework for considering road safety issues explicitly in the small and medium-sized community transportation planning process and for quantifying the safety impacts of new developments and policy programs.
More specifically, quantitative safety may be incorporated into the planning process through effective visualization and increased awareness of safety issues (usRAP), the identification of high-risk locations with potential for improvement (usRAP maps and EB), countermeasures for high-risk locations (EB before-and-after study and PLANSAFE), and socio-economic and demographic induced changes at the planning level (PLANSAFE).
Abstract:
Highway agencies spend millions of dollars to ensure safe and efficient winter travel. However, the effectiveness of winter-weather maintenance practices on safety and mobility is somewhat difficult to quantify. Safety and Mobility Impacts of Winter Weather - Phase 1 investigated opportunities for improving traffic safety on state-maintained roads in Iowa during winter-weather conditions. In Phase 2, three Iowa Department of Transportation (DOT) high-priority sites were evaluated and realistic maintenance and operations mitigation strategies were identified. In this project, site prioritization techniques for identifying roadway segments with the potential for safety improvements related to winter-weather crashes were developed through traditional naïve statistical methods, using raw crash data for seven winter seasons and previously developed metrics. Additionally, crash frequency models were developed using integrated crash data for four winter seasons, with the objectives of identifying factors that affect crash frequency during winter seasons and of screening roadway segments using the empirical Bayes technique. Based on these prioritization techniques, 11 sites were identified and analyzed in conjunction with input from Iowa DOT district maintenance managers and snowplow operators and the Iowa DOT Road Weather Information System (RWIS) coordinator.
Abstract:
Cloud computing is a practically relevant paradigm in computing today, and testing is one of the distinct areas where it can be applied. This study addressed the applicability of cloud computing for testing within organizational and strategic contexts, focusing on issues related to the adoption, use, and effects of cloud-based testing. The study applied empirical research methods. The data was collected through interviews with practitioners from 30 organizations and analysed using the grounded theory method. The research process consisted of four phases. The first phase studied the definitions and perceptions related to cloud-based testing. The second phase observed cloud-based testing in real-life practice. The third phase analysed quality in the context of cloud application development. The fourth phase studied the applicability of cloud computing in the gaming industry. The results showed that cloud computing is relevant and applicable for testing and application development, as well as other areas, e.g., game development. The research identified the benefits, challenges, requirements, and effects of cloud-based testing, and formulated a roadmap and strategy for adopting cloud-based testing. The study also explored quality issues in cloud application development. As a special case, the research included a study on the applicability of cloud computing in game development. The results can be used by companies to enhance their processes for managing cloud-based testing, evaluating practical cloud-based testing work, and assessing the appropriateness of cloud-based testing for specific testing needs.
Abstract:
Pairs trading is an algorithmic trading strategy based on the historical co-movement of two separate assets, with trades executed on the basis of the degree of relative mispricing. The purpose of this study is to explore a new, alternative copula-based method for pairs trading. The objective is to find out whether the copula method generates more trading opportunities and higher profits than the more traditional distance and cointegration methods applied extensively in previous empirical studies. The methods are compared by selecting the top five pairs from stocks of large and medium-sized companies in the Finnish stock market. The research period covers the years 2006-2015. All the methods prove to be profitable and the Finnish stock market suitable for pairs trading. However, the copula method does not generate more trading opportunities or higher profits than the other methods. It seems that the limitations of the more traditional methods are not too restrictive for this particular sample data.
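The distance method that serves as a baseline here can be sketched in a few lines: normalize each price series to its first value and rank pairs by the sum of squared deviations (SSD) over the formation period. The prices below are synthetic:

```python
import numpy as np
from itertools import combinations

# Synthetic formation-period prices for three hypothetical stocks;
# "B" is constructed to co-move perfectly with "A".
prices = {
    "A": np.array([10., 10.5, 11., 10.8]),
    "B": np.array([20., 21., 22., 21.6]),
    "C": np.array([5., 4.5, 6., 5.5]),
}

def ssd(p, q):
    """Sum of squared deviations between price series normalized to start at 1."""
    return float(np.sum((p / p[0] - q / q[0]) ** 2))

# The pair with the smallest SSD is selected for trading.
best = min(combinations(prices, 2),
           key=lambda pq: ssd(prices[pq[0]], prices[pq[1]]))
```

In the trading period, a position is opened when the normalized spread of the selected pair diverges beyond a threshold (commonly two standard deviations of the formation-period spread) and closed when it reverts.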
Abstract:
We considered prediction techniques based on accelerated failure time models with random effects for correlated survival data. Besides the Bayesian approach through the empirical Bayes estimator, we also discussed the use of a classical predictor, the Empirical Best Linear Unbiased Predictor (EBLUP). In order to illustrate the use of these predictors, we considered applications to a real data set from the oil industry. More specifically, the data set involves the mean time between failures of petroleum-well equipment of the Bacia Potiguar. The goal of this study is to predict the risk/probability of failure in order to support a preventive maintenance program. The results show that both methods are suitable for predicting future failures, supporting good decisions in relation to the employment and economy of resources for preventive maintenance.
Abstract:
The use of saturated two-level designs is very popular, especially in industrial applications where the cost of experiments is high. Standard classical approaches are not appropriate for analyzing data from saturated designs, since we can only obtain estimates of the main factor effects and have no degrees of freedom left to estimate the error variance. In this paper, we propose the use of empirical Bayes procedures to obtain inferences for data from saturated designs. The proposed methodology is illustrated using a simulated data set.
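A sketch of the setting: main-effect contrasts from a small saturated two-level design, followed by a simple normal-normal shrinkage toward zero in the spirit of an empirical Bayes analysis. The design, responses, and assumed error variance are all illustrative, not the paper's (a saturated design leaves no degrees of freedom to estimate that variance classically, which is precisely the motivation above):

```python
import numpy as np

# 4-run saturated design for 3 factors in +1/-1 coding (a 2^(3-1) fraction).
X = np.array([
    [ 1,  1,  1],
    [ 1, -1, -1],
    [-1,  1, -1],
    [-1, -1,  1],
])
y = np.array([12.0, 9.0, 10.0, 8.5])   # hypothetical responses

# Classical main-effect contrasts: mean(y at +1) - mean(y at -1).
effects = 2 * X.T @ y / len(y)

# Illustrative normal-normal shrinkage toward zero with an assumed
# sampling variance for each effect and a moment estimate of the prior variance.
sigma2 = 0.5
tau2 = max(np.mean(effects**2) - sigma2, 0.0)
shrunk = (tau2 / (tau2 + sigma2)) * effects
```

Shrinking all effects by a common factor borrows strength across factors, which is one way an empirical Bayes treatment can separate active effects from noise without a classical error estimate.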
Abstract:
Model-based calibration of steady-state engine operation is commonly performed with highly parameterized empirical models that are accurate but not very robust, particularly when predicting highly nonlinear responses such as diesel smoke emissions. To address this problem, and to boost the accuracy of more robust non-parametric methods to the same level, GT-Power was used to transform the empirical model input space into multiple input spaces that simplified the input-output relationship and improved the accuracy and robustness of smoke predictions made by three commonly used empirical modeling methods: Multivariate Regression, Neural Networks, and the k-Nearest Neighbor method. The availability of multiple input spaces allowed the development of two committee techniques: a "Simple Committee" technique that averaged predictions from a set of 10 pre-selected input spaces chosen using the training data, and a "Minimum Variance Committee" technique in which the input spaces for each prediction were chosen on the basis of disagreement between the three modeling methods. This latter technique equalized the performance of the three modeling methods. The successively increasing improvements resulting from the use of a single best transformed input space (Best Combination technique), the Simple Committee technique, and the Minimum Variance Committee technique were verified with hypothesis testing. The transformed input spaces were also shown to improve outlier detection and to improve k-Nearest Neighbor performance when predicting dynamic emissions with steady-state training data. An unexpected finding was that the benefits of input space transformation were unaffected by changes in the hardware or in the calibration of the underlying GT-Power model.
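The two committee ideas can be sketched with made-up prediction arrays; this is a schematic of the selection logic only, not the paper's models:

```python
import numpy as np

# preds[s, m, i] = prediction from input space s, modeling method m,
# for test point i (3 spaces, 3 methods, 4 points; values are synthetic).
rng = np.random.default_rng(1)
preds = rng.normal(loc=5.0, scale=1.0, size=(3, 3, 4))

# "Simple Committee": average each method's predictions over the
# pre-selected input spaces.
simple = preds.mean(axis=0)                 # shape (methods, points)

# "Minimum Variance Committee": per test point, pick the input space on
# which the methods disagree least, then average the methods there.
disagreement = preds.var(axis=1)            # shape (spaces, points)
best_space = disagreement.argmin(axis=0)    # shape (points,)
min_var = preds[best_space, :, np.arange(4)].mean(axis=1)
```

Using inter-method disagreement as a proxy for local model quality is what lets the minimum-variance selection equalize methods of different intrinsic flexibility.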