813 resultados para Prediction models


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Statistical modeling of traffic crashes has been of interest to researchers for decades. Over the most recent decade many crash models have accounted for extra-variation in crash counts—variation over and above that accounted for by the Poisson density. The extra-variation – or dispersion – is theorized to capture unaccounted for variation in crashes across sites. The majority of studies have assumed fixed dispersion parameters in over-dispersed crash models—tantamount to assuming that unaccounted for variation is proportional to the expected crash count. Miaou and Lord [Miaou, S.P., Lord, D., 2003. Modeling traffic crash-flow relationships for intersections: dispersion parameter, functional form, and Bayes versus empirical Bayes methods. Transport. Res. Rec. 1840, 31–40] challenged the fixed dispersion parameter assumption, and examined various dispersion parameter relationships when modeling urban signalized intersection accidents in Toronto. They suggested that further work is needed to determine the appropriateness of the findings for rural as well as other intersection types, to corroborate their findings, and to explore alternative dispersion functions. This study builds upon the work of Miaou and Lord, with exploration of additional dispersion functions, the use of an independent data set, and presents an opportunity to corroborate their findings. Data from Georgia are used in this study. A Bayesian modeling approach with non-informative priors is adopted, using sampling-based estimation via Markov Chain Monte Carlo (MCMC) and the Gibbs sampler. A total of eight model specifications were developed; four of them employed traffic flows as explanatory factors in mean structure while the remainder of them included geometric factors in addition to major and minor road traffic flows. The models were compared and contrasted using the significance of coefficients, standard deviance, chi-square goodness-of-fit, and deviance information criteria (DIC) statistics. The findings indicate that the modeling of the dispersion parameter, which essentially explains the extra-variance structure, depends greatly on how the mean structure is modeled. In the presence of a well-defined mean function, the extra-variance structure generally becomes insignificant, i.e. the variance structure is a simple function of the mean. It appears that extra-variation is a function of covariates when the mean structure (expected crash count) is poorly specified and suffers from omitted variables. In contrast, when sufficient explanatory variables are used to model the mean (expected crash count), extra-Poisson variation is not significantly related to these variables. If these results are generalizable, they suggest that model specification may be improved by testing extra-variation functions for significance. They also suggest that known influences of expected crash counts are likely to be different than factors that might help to explain unaccounted for variation in crashes across sites

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Predicting safety on roadways is standard practice for road safety professionals and has a corresponding extensive literature. The majority of safety prediction models are estimated using roadway segment and intersection (microscale) data, while more recently efforts have been undertaken to predict safety at the planning level (macroscale). Safety prediction models typically include roadway, operations, and exposure variables—factors known to affect safety in fundamental ways. Environmental variables, in particular variables attempting to capture the effect of rain on road safety, are difficult to obtain and have rarely been considered. In the few cases weather variables have been included, historical averages rather than actual weather conditions during which crashes are observed have been used. Without the inclusion of weather related variables researchers have had difficulty explaining regional differences in the safety performance of various entities (e.g. intersections, road segments, highways, etc.) As part of the NCHRP 8-44 research effort, researchers developed PLANSAFE, or planning level safety prediction models. These models make use of socio-economic, demographic, and roadway variables for predicting planning level safety. Accounting for regional differences - similar to the experience for microscale safety models - has been problematic during the development of planning level safety prediction models. More specifically, without weather related variables there is an insufficient set of variables for explaining safety differences across regions and states. Furthermore, omitted variable bias resulting from excluding these important variables may adversely impact the coefficients of included variables, thus contributing to difficulty in model interpretation and accuracy. This paper summarizes the results of an effort to include weather related variables, particularly various measures of rainfall, into accident frequency prediction and the prediction of the frequency of fatal and/or injury degree of severity crash models. The purpose of the study was to determine whether these variables do in fact improve overall goodness of fit of the models, whether these variables may explain some or all of observed regional differences, and identifying the estimated effects of rainfall on safety. The models are based on Traffic Analysis Zone level datasets from Michigan, and Pima and Maricopa Counties in Arizona. Numerous rain-related variables were found to be statistically significant, selected rain related variables improved the overall goodness of fit, and inclusion of these variables reduced the portion of the model explained by the constant in the base models without weather variables. Rain tends to diminish safety, as expected, in fairly complex ways, depending on rain frequency and intensity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A study was done to develop macrolevel crash prediction models that can be used to understand and identify effective countermeasures for improving signalized highway intersections and multilane stop-controlled highway intersections in rural areas. Poisson and negative binomial regression models were fit to intersection crash data from Georgia, California, and Michigan. To assess the suitability of the models, several goodness-of-fit measures were computed. The statistical models were then used to shed light on the relationships between crash occurrence and traffic and geometric features of the rural signalized intersections. The results revealed that traffic flow variables significantly affected the overall safety performance of the intersections regardless of intersection type and that the geometric features of intersections varied across intersection type and also influenced crash type.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Early models of bankruptcy prediction employed financial ratios drawn from pre-bankruptcy financial statements and performed well both in-sample and out-of-sample. Since then there has been an ongoing effort in the literature to develop models with even greater predictive performance. A significant innovation in the literature was the introduction into bankruptcy prediction models of capital market data such as excess stock returns and stock return volatility, along with the application of the Black–Scholes–Merton option-pricing model. In this note, we test five key bankruptcy models from the literature using an upto- date data set and find that they each contain unique information regarding the probability of bankruptcy but that their performance varies over time. We build a new model comprising key variables from each of the five models and add a new variable that proxies for the degree of diversification within the firm. The degree of diversification is shown to be negatively associated with the risk of bankruptcy. This more general model outperforms the existing models in a variety of in-sample and out-of-sample tests.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Traditional crash prediction models, such as generalized linear regression models, are incapable of taking into account the multilevel data structure, which extensively exists in crash data. Disregarding the possible within-group correlations can lead to the production of models giving unreliable and biased estimates of unknowns. This study innovatively proposes a -level hierarchy, viz. (Geographic region level – Traffic site level – Traffic crash level – Driver-vehicle unit level – Vehicle-occupant level) Time level, to establish a general form of multilevel data structure in traffic safety analysis. To properly model the potential cross-group heterogeneity due to the multilevel data structure, a framework of Bayesian hierarchical models that explicitly specify multilevel structure and correctly yield parameter estimates is introduced and recommended. The proposed method is illustrated in an individual-severity analysis of intersection crashes using the Singapore crash records. This study proved the importance of accounting for the within-group correlations and demonstrated the flexibilities and effectiveness of the Bayesian hierarchical method in modeling multilevel structure of traffic crash data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modern-day weather forecasting is highly dependent on Numerical Weather Prediction (NWP) models as the main data source. The evolving state of the atmosphere with time can be numerically predicted by solving a set of hydrodynamic equations, if the initial state is known. However, such a modelling approach always contains approximations that by and large depend on the purpose of use and resolution of the models. Present-day NWP systems operate with horizontal model resolutions in the range from about 40 km to 10 km. Recently, the aim has been to reach operationally to scales of 1 4 km. This requires less approximations in the model equations, more complex treatment of physical processes and, furthermore, more computing power. This thesis concentrates on the physical parameterization methods used in high-resolution NWP models. The main emphasis is on the validation of the grid-size-dependent convection parameterization in the High Resolution Limited Area Model (HIRLAM) and on a comprehensive intercomparison of radiative-flux parameterizations. In addition, the problems related to wind prediction near the coastline are addressed with high-resolution meso-scale models. The grid-size-dependent convection parameterization is clearly beneficial for NWP models operating with a dense grid. Results show that the current convection scheme in HIRLAM is still applicable down to a 5.6 km grid size. However, with further improved model resolution, the tendency of the model to overestimate strong precipitation intensities increases in all the experiment runs. For the clear-sky longwave radiation parameterization, schemes used in NWP-models provide much better results in comparison with simple empirical schemes. On the other hand, for the shortwave part of the spectrum, the empirical schemes are more competitive for producing fairly accurate surface fluxes. Overall, even the complex radiation parameterization schemes used in NWP-models seem to be slightly too transparent for both long- and shortwave radiation in clear-sky conditions. For cloudy conditions, simple cloud correction functions are tested. In case of longwave radiation, the empirical cloud correction methods provide rather accurate results, whereas for shortwave radiation the benefit is only marginal. Idealised high-resolution two-dimensional meso-scale model experiments suggest that the reason for the observed formation of the afternoon low level jet (LLJ) over the Gulf of Finland is an inertial oscillation mechanism, when the large-scale flow is from the south-east or west directions. The LLJ is further enhanced by the sea-breeze circulation. A three-dimensional HIRLAM experiment, with a 7.7 km grid size, is able to generate a similar LLJ flow structure as suggested by the 2D-experiments and observations. It is also pointed out that improved model resolution does not necessary lead to better wind forecasts in the statistical sense. In nested systems, the quality of the large-scale host model is really important, especially if the inner meso-scale model domain is small.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis report attempts to improve the models for predicting forest stand structure for practical use, e.g. forest management planning (FMP) purposes in Finland. Comparisons were made between Weibull and Johnson s SB distribution and alternative regression estimation methods. Data used for preliminary studies was local but the final models were based on representative data. Models were validated mainly in terms of bias and RMSE in the main stand characteristics (e.g. volume) using independent data. The bivariate SBB distribution model was used to mimic realistic variations in tree dimensions by including within-diameter-class height variation. Using the traditional method, diameter distribution with the expected height resulted in reduced height variation, whereas the alternative bivariate method utilized the error-term of the height model. The lack of models for FMP was covered to some extent by the models for peatland and juvenile stands. The validation of these models showed that the more sophisticated regression estimation methods provided slightly improved accuracy. A flexible prediction and application for stand structure consisted of seemingly unrelated regression models for eight stand characteristics, the parameters of three optional distributions and Näslund s height curve. The cross-model covariance structure was used for linear prediction application, in which the expected values of the models were calibrated with the known stand characteristics. This provided a framework to validate the optional distributions and the optional set of stand characteristics. Height distribution is recommended for the earliest state of stands because of its continuous feature. From the mean height of about 4 m, Weibull dbh-frequency distribution is recommended in young stands if the input variables consist of arithmetic stand characteristics. In advanced stands, basal area-dbh distribution models are recommended. Näslund s height curve proved useful. Some efficient transformations of stand characteristics are introduced, e.g. the shape index, which combined the basal area, the stem number and the median diameter. Shape index enabled SB model for peatland stands to detect large variation in stand densities. This model also demonstrated reasonable behaviour for stands in mineral soils.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The performance of prediction models is often based on ``abstract metrics'' that estimate the model's ability to limit residual errors between the observed and predicted values. However, meaningful evaluation and selection of prediction models for end-user domains requires holistic and application-sensitive performance measures. Inspired by energy consumption prediction models used in the emerging ``big data'' domain of Smart Power Grids, we propose a suite of performance measures to rationally compare models along the dimensions of scale independence, reliability, volatility and cost. We include both application independent and dependent measures, the latter parameterized to allow customization by domain experts to fit their scenario. While our measures are generalizable to other domains, we offer an empirical analysis using real energy use data for three Smart Grid applications: planning, customer education and demand response, which are relevant for energy sustainability. Our results underscore the value of the proposed measures to offer a deeper insight into models' behavior and their impact on real applications, which benefit both data mining researchers and practitioners.