374 results for Negative Binomial model
Abstract:
Advances in safety research, which seeks to improve the collective understanding of motor vehicle crash causes and contributing factors, rest upon the pursuit of numerous lines of research inquiry. The research community has focused considerable attention on analytical methods development (negative binomial models, simultaneous equations, etc.), on better experimental designs (before-after studies, comparison sites, etc.), on improving exposure measures, and on model specification improvements (additive terms, non-linear relations, etc.). One might logically seek to know which lines of inquiry might provide the most significant improvements in understanding crash causation and/or prediction. It is the contention of this paper that the exclusion of important variables (causal or surrogate measures of causal variables) causes omitted variable bias in model estimation and is an important and neglected line of inquiry in safety research. In particular, spatially related variables are often difficult to collect and are omitted from crash models, yet they offer significant opportunities to better understand contributing factors and/or causes of crashes. This study examines the role of important variables (other than Average Annual Daily Traffic (AADT)) that are generally omitted from intersection crash prediction models. In addition to the geometric and traffic regulatory information of intersections, the proposed model includes many spatial factors such as local influences of weather, sun glare, proximity to drinking establishments, and proximity to schools, representing a mix of potential environmental and human factors that are theoretically important but rarely used. Results suggest that these variables, in addition to AADT, have significant explanatory power, and that their exclusion leads to omitted variable bias. Evidence is provided that variable exclusion overstates the effect of minor road AADT by as much as 40% and major road AADT by 14%.
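The direction of such a bias can be illustrated with a stylized log-linear count model. The setup is an assumption made purely for illustration and is not the paper's specification: one included covariate x (for example log AADT), one omitted covariate z that is linearly related to x with normal noise, and the symbols β, γ and σ are introduced here.

```latex
\begin{align}
\mathbb{E}[y \mid x, z] &= \exp(\beta_0 + \beta_1 x + \beta_2 z),
  \qquad z = \gamma x + e,\quad e \sim \mathcal{N}(0, \sigma^2)\ \text{independent of } x, \\
\mathbb{E}[y \mid x] &= \exp(\beta_0 + \beta_1 x + \beta_2 \gamma x)\,
  \mathbb{E}\!\left[e^{\beta_2 e}\right]
  = \exp\!\Big(\beta_0 + \tfrac{1}{2}\beta_2^{2}\sigma^{2} + (\beta_1 + \beta_2\gamma)\,x\Big).
\end{align}
```

Under these assumptions, a model that leaves out z estimates the slope β1 + β2γ rather than β1, so the effect of x is overstated whenever β2 and γ share the same sign.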
Abstract:
Readily accepted knowledge regarding crash causation is consistently omitted from efforts to model and subsequently understand motor vehicle crash occurrence and its contributing factors. For instance, distracted and impaired driving accounts for a significant proportion of crash occurrence, yet is rarely modeled explicitly. In addition, spatially allocated influences such as local law enforcement efforts, proximity to bars and schools, and roadside chronic distractions (advertising, pedestrians, etc.) play a role in contributing to crash occurrence and yet are routinely absent from crash models. By and large, these well-established omitted effects are simply assumed to contribute to model error, with predominant focus on modeling the engineering and operational effects of transportation facilities (e.g. AADT, number of lanes, speed limits, width of lanes, etc.). The typical analytical approach, with a variety of statistical enhancements, has been to model crashes that occur at system locations as negative binomial (NB) distributed events that arise from a singular, underlying crash generating process. These models and their statistical kin dominate the literature; however, it is argued in this paper that these models fail to capture the underlying complexity of motor vehicle crash causes, and thus thwart deeper insights regarding crash causation and prevention. This paper first describes hypothetical scenarios that collectively illustrate why current models mislead highway safety researchers and engineers. It is argued that current model shortcomings are significant, and will lead to poor decision-making. Exploiting our current state of knowledge of crash causation, crash counts are postulated to arise from three processes: observed network features, unobserved spatial effects, and ‘apparent’ random influences that reflect largely behavioral influences of drivers.
It is argued, furthermore, that these three processes can in theory be modeled separately to gain deeper insight into crash causes, and that the model represents a more realistic depiction of reality than the state-of-practice NB regression. An admittedly imperfect empirical model that mixes three independent crash occurrence processes is shown to outperform the classical NB model. The questioning of current modeling assumptions and implications of the latent mixture model to current practice are the most important contributions of this paper, with an initial but rather vulnerable attempt to model the latent mixtures as a secondary contribution.
Abstract:
Objective: To examine the effects of extremely cold and hot temperatures on ischaemic heart disease (IHD) mortality in five cities (Beijing, Tianjin, Shanghai, Wuhan and Guangzhou) in China; and to examine the time relationships between cold and hot temperatures and IHD mortality for each city. Design: A negative binomial regression model combined with a distributed lag non-linear model was used to examine city-specific temperature effects on IHD mortality up to 20 lag days. A meta-analysis was used to pool the cold effects and hot effects across the five cities. Patients: 16 559 IHD deaths were monitored by a sentinel surveillance system in five cities during 2004–2008. Results: The relationships between temperature and IHD mortality were non-linear in all five cities. The minimum-mortality temperatures in northern cities were lower than in southern cities. In Beijing, Tianjin and Guangzhou, the effects of extremely cold temperatures were delayed, while Shanghai and Wuhan had immediate cold effects. The effects of extremely hot temperatures appeared immediately in all the cities except Wuhan. Meta-analysis showed that IHD mortality increased 48% at the 1st percentile of temperature (extremely cold temperature) compared with the 10th percentile, while IHD mortality increased 18% at the 99th percentile of temperature (extremely hot temperature) compared with the 90th percentile. Conclusions: Results indicate that both extremely cold and hot temperatures increase IHD mortality in China. Each city has its characteristics of heat effects on IHD mortality. The policy for response to climate change should consider local climate–IHD mortality relationships.
Abstract:
BACKGROUND Dengue fever (DF) outbreaks often arise from imported DF cases in Cairns, Australia. Few studies have incorporated imported DF cases in the estimation of the relationship between weather variability and incidence of autochthonous DF. The study aimed to examine the impact of weather variability on autochthonous DF infection after accounting for imported DF cases and then to explore the possibility of developing an empirical forecast system. METHODOLOGY/PRINCIPAL FINDINGS Data on weather variables, notified DF cases (including those acquired locally and overseas), and population size in Cairns were supplied by the Australian Bureau of Meteorology, Queensland Health, and the Australian Bureau of Statistics. A time-series negative-binomial hurdle model was used to assess the effects of imported DF cases and weather variability on autochthonous DF incidence. Our results showed that monthly autochthonous DF incidences were significantly associated with monthly imported DF cases (Relative Risk (RR): 1.52; 95% confidence interval (CI): 1.01-2.28), monthly minimum temperature (°C) (RR: 2.28; 95% CI: 1.77-2.93), monthly relative humidity (%) (RR: 1.21; 95% CI: 1.06-1.37), monthly rainfall (mm) (RR: 0.50; 95% CI: 0.31-0.81) and monthly standard deviation of daily relative humidity (%) (RR: 1.27; 95% CI: 1.08-1.50). In the zero hurdle component, the occurrence of monthly autochthonous DF cases was significantly associated with monthly minimum temperature (Odds Ratio (OR): 1.64; 95% CI: 1.01-2.67). CONCLUSIONS/SIGNIFICANCE Our research suggested that incidences of monthly autochthonous DF were strongly positively associated with monthly imported DF cases, local minimum temperature and inter-month relative humidity variability in Cairns. Moreover, DF outbreaks in Cairns were driven by imported DF cases only under favourable seasons and weather conditions in the study.
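The reported relative risks and intervals can be related back to the log scale on which count models are normally estimated. The sketch below assumes (this is not stated in the abstract) that the 95% intervals are Wald-type and therefore symmetric around log(RR); the function names are hypothetical. It uses only the figures quoted above for monthly imported DF cases.

```python
import math

def log_scale_se(lower, upper, z=1.96):
    """Standard error of log(RR) implied by a CI that is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def ci_midpoint(lower, upper):
    """Geometric mean of the bounds; close to the point estimate if the CI is Wald-type."""
    return math.sqrt(lower * upper)

# Figures quoted in the abstract for monthly imported DF cases.
reported_rr, lower, upper = 1.52, 1.01, 2.28

se = log_scale_se(lower, upper)          # implied SE of the log relative risk
midpoint = ci_midpoint(lower, upper)     # should sit near the reported RR
print(f"implied SE of log(RR): {se:.3f}, geometric midpoint: {midpoint:.2f}")
```

If the geometric midpoint sits close to the reported RR, as it does here, the symmetric-on-log-scale assumption is at least consistent with the published numbers.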
Abstract:
Hot spot identification (HSID) aims to identify potential sites (roadway segments, intersections, crosswalks, interchanges, ramps, etc.) with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high risk site as safe (false negative), and consequently lead to the misuse of the available public funds, to poor investment decisions, and to inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor injury and property damage only (PDO) crashes, challenges of incorporating crash severity into the methodology, and selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than on the population mean as most methods in practice do, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression.
Application of a quantile regression model on equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitation of the traditional NB model in dealing with a preponderance of zeros or a right-skewed dataset.
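Quantile regression targets a quantile rather than the mean by minimising the check (pinball) loss. The sketch below shows only the intercept-only case, where the minimiser is a sample quantile; it is not the paper's model, which includes covariates. The function names and the equivalent-PDO scores are hypothetical values chosen to mimic many zeros and a long right tail.

```python
def pinball_loss(residual, tau):
    """Check loss: tau * r for r >= 0 and (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def best_constant_quantile(values, tau):
    """Return the observed value that minimises the total pinball loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))

# Hypothetical equivalent-PDO scores for eight sites (illustration only).
epdo = [0, 0, 0, 1, 2, 3, 10, 25]

mean_value = sum(epdo) / len(epdo)        # pulled upward by the two large scores
q70 = best_constant_quantile(epdo, 0.7)   # upper-quantile summary
q90 = best_constant_quantile(epdo, 0.9)   # tail summary, where high-risk sites sit
print(mean_value, q70, q90)
```

Because the loss weights over- and under-prediction asymmetrically, raising tau moves the fitted value toward the tail of the distribution, which is the part of the data that hot spot screening is concerned with.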
Abstract:
The use of graphical processing unit (GPU) parallel processing is becoming a part of mainstream statistical practice. The reliance of Bayesian statistics on Markov Chain Monte Carlo (MCMC) methods makes the applicability of parallel processing not immediately obvious. It is illustrated that there are substantial gains in improved computational time for MCMC and other methods of evaluation by computing the likelihood using GPU parallel processing. Examples use data from the Global Terrorism Database to model terrorist activity in Colombia from 2000 through 2010 and a likelihood based on the explicit convolution of two negative-binomial processes. Results show decreases in computational time by a factor of over 200. Factors influencing these improvements and guidelines for programming parallel implementations of the likelihood are discussed.
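A likelihood built on the explicit convolution of two negative-binomial processes can be sketched as follows. This is a plain CPU illustration under assumptions, not the paper's GPU implementation: the two components are taken to be independent, the (r, p) parameterisation counts failures before the r-th success, and all function names and example values are hypothetical.

```python
import math

def nb_pmf(k, r, p):
    """P(K = k) for a negative binomial: k failures before the r-th success."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def nb_convolution_pmf(y, r1, p1, r2, p2):
    """P(Y = y) for Y = K1 + K2 with independent K1 ~ NB(r1, p1), K2 ~ NB(r2, p2)."""
    return sum(nb_pmf(j, r1, p1) * nb_pmf(y - j, r2, p2) for j in range(y + 1))

def log_likelihood(counts, r1, p1, r2, p2):
    """Sum of per-observation log-probabilities; each term is independent of the others."""
    return sum(math.log(nb_convolution_pmf(y, r1, p1, r2, p2)) for y in counts)

# Hypothetical monthly event counts and parameter values (illustration only).
counts = [0, 2, 5, 1, 9, 3]
print(log_likelihood(counts, r1=1.5, p1=0.4, r2=2.0, p2=0.7))
```

Each observation's term requires its own inner sum and does not depend on any other observation, which is the kind of structure that lends itself to evaluation in parallel.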
Abstract:
We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms using indirect inference. We embed this approach within a sequential Monte Carlo algorithm that is completely adaptive. This methodological development was motivated by an application involving data on macroparasite population evolution modelled with a trivariate Markov process. The main objective of the analysis is to compare inferences on the Markov process when considering two different indirect models. The two indirect models are based on a Beta-Binomial model and a three component mixture of Binomials, with the former providing a better fit to the observed data.
Abstract:
The primary aim of this descriptive exploration of scientists’ life cycle award patterns is to evaluate whether awards breed further awards and identify researcher experiences after reception of the Nobel Prize. To achieve this goal, we collected data on the number of awards received each year for 50 years before and after Nobel Prize reception by all 1901–2000 Nobel laureates in physics, chemistry, and medicine or physiology. Our results indicate an increasing rate of awards before Nobel reception, reaching the summit precisely in the year of the Nobel Prize. After this pinnacle year, awards drop sharply. This result is confirmed by separate analyses of three different disciplines and by a random-effects negative binomial regression model. Such an effect, however, does not emerge for more recent Nobel laureates (1971–2000). In addition, Nobelists in medicine or physiology generate more awards shortly before and after prize reception, whereas laureates in chemistry attract more awards as time progresses.
Abstract:
Pedestrian crashes are one of the major road safety problems in developing countries, representing about 40% of total fatal crashes in low-income countries. Despite the fact that many pedestrian crashes in these countries occur at unsignalized intersections such as roundabouts, studies focussing on this issue are limited, which represents a critical research gap. The objective of this study is to develop safety performance functions for pedestrian crashes at modern roundabouts to identify significant roadway geometric, traffic and land use characteristics related to pedestrian safety. To establish the relationship between pedestrian crashes and various causal factors, detailed data including various forms of exposure, geometric and traffic characteristics, and spatial factors such as proximity to schools and proximity to drinking establishments were collected from a sample of 22 modern roundabouts in Addis Ababa, Ethiopia, representing about 56% of such roundabouts in Addis Ababa. To account for spatial correlation resulting from multiple observations at a roundabout, both the random effect Poisson (REP) and random effect Negative Binomial (RENB) regression models were estimated and compared. Model goodness of fit statistics reveal a marginally superior fit of the REP model compared to the RENB model of pedestrian crashes at roundabouts. Pedestrian crossing volume and the product of traffic volumes along major and minor roads had significant and positive associations with pedestrian crashes at roundabouts. The presence of a public transport (bus/taxi) terminal beside a roundabout is associated with increased pedestrian crashes. While the maximum gradient of an approach road is negatively associated with pedestrian safety, the provision of a raised median along an approach appears to increase pedestrian safety at roundabouts. Remedial measures are identified for combating pedestrian safety problems at roundabouts in the context of a developing country.
Abstract:
Crashes at any particular transport network location consist of a chain of events arising from a multitude of potential causes and/or contributing factors whose nature is likely to reflect geometric characteristics of the road, spatial effects of the surrounding environment, and human behavioural factors. It is postulated that these potential contributing factors do not arise from the same underlying risk process, and thus should be explicitly modelled and understood. The state of the practice in road safety network management applies a safety performance function that represents a single risk process to explain crash variability across network sites. This study aims to elucidate the importance of differentiating among various underlying risk processes contributing to the observed crash count at any particular network location. To demonstrate the principle of this theoretical and corresponding methodological approach, the study explores engineering (e.g. segment length, speed limit) and unobserved spatial factors (e.g. climatic factors, presence of schools) as two explicit sources of crash contributing factors. A Bayesian Latent Class (BLC) analysis is used to explore these two sources and to incorporate prior information about their contribution to crash occurrence. The methodology is applied to the state-controlled roads in Queensland, Australia and the results are compared with the traditional Negative Binomial (NB) model. A comparison of goodness of fit measures indicates that the model with a double risk process outperforms the single risk process NB model, thus indicating the need for further research to capture all three crash generation processes in the SPFs.
Abstract:
The current state of the practice in Blackspot Identification (BSI) utilizes safety performance functions based on total crash counts to identify transport system sites with potentially high crash risk. This paper postulates that total crash count variation over a transport network is a result of multiple distinct crash generating processes including geometric characteristics of the road, spatial features of the surrounding environment, and driver behaviour factors. However, these multiple sources are ignored in current modelling methodologies, whether the aim is to explain or to predict crash frequencies across sites. Instead, current practice employs models that imply that a single underlying crash generating process exists. The model mis-specification may lead to correlating crashes with the incorrect sources of contributing factors (e.g. concluding a crash is predominantly caused by a geometric feature when it is a behavioural issue), which may ultimately lead to inefficient use of public funds and misidentification of true blackspots. This study aims to propose a latent class model consistent with a multiple crash process theory, and to investigate the influence this model has on correctly identifying crash blackspots. We first present the theoretical and corresponding methodological approach in which a Bayesian Latent Class (BLC) model is estimated assuming that crashes arise from two distinct risk generating processes including engineering and unobserved spatial factors. The Bayesian model is used to incorporate prior information about the contribution of each underlying process to the total crash count. The methodology is applied to the state-controlled roads in Queensland, Australia and the results are compared to an Empirical Bayesian Negative Binomial (EB-NB) model. A comparison of goodness of fit measures illustrates significantly improved performance of the proposed model compared to the NB model.
The detection of blackspots was also improved when compared to the EB-NB model. In addition, modelling crashes as the result of two fundamentally separate underlying processes reveals more detailed information about unobserved crash causes.
Abstract:
We used geographic information systems and a spatial analysis approach to explore the pattern of Ross River virus (RRV) incidence in Brisbane, Australia. Climate, vegetation and socioeconomic data in 2001 were obtained from the Australian Bureau of Meteorology, the Brisbane City Council and the Australian Bureau of Statistics, respectively. Information on the RRV cases was obtained from the Queensland Department of Health. Spatial and multiple negative binomial regression models were used to identify the socioeconomic and environmental determinants of RRV transmission. The results show that RRV activity was primarily concentrated in the northeastern, northwestern, and southeastern regions in Brisbane. Multiple negative binomial regression models showed that the spatial pattern of RRV disease in Brisbane seemed to be determined by a combination of local ecologic, socioeconomic, and environmental factors.
Abstract:
Objectives: Ecological studies support the hypothesis that there is an association between vitamin D and pancreatic cancer (PaCa) mortality, but observational studies are somewhat conflicting. We sought to contribute further data to this issue by analyzing the differences in PaCa mortality across the eastern states of Australia and investigating if there is a role of vitamin D-effective ultraviolet radiation (DUVR), which is related to latitude. Methods: Mortality data from 1968 to 2005 were sourced from the Australian General Record of Incidence and Mortality books. Negative binomial models were fitted to calculate the association between state and PaCa mortality. Clear sky monthly DUVR in each capital city was also modeled. Results: Mortality from PaCa was 10% higher in southern states than in Queensland, with those in Victoria recording the highest mortality risk (relative risk, 1.13; 95% confidence interval, 1.09-1.17). We found a highly significant association between DUVR and PaCa mortality, with an estimated 1.5% decrease in the risk per 10-kJ/m² increase in yearly DUVR. Conclusions: These data show an association between latitude, DUVR, and PaCa mortality. Although this study cannot be used to infer causality, it supports the need for further investigations of a possible role of vitamin D in PaCa etiology.
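In a log-linear count model, a percentage change per unit of exposure compounds multiplicatively rather than adding up. The sketch below applies that reading to the reported 1.5% decrease per 10 kJ/m²; it assumes the effect is constant on the log scale across the exposure range, which the abstract does not state. The function name and the 100 kJ/m² example are hypothetical.

```python
def relative_risk(delta_kj_per_m2, pct_decrease_per_10=1.5):
    """Relative risk for a DUVR increase, compounding the per-10-unit effect."""
    per_step = 1.0 - pct_decrease_per_10 / 100.0   # 0.985 per 10 kJ/m^2
    return per_step ** (delta_kj_per_m2 / 10.0)

# Hypothetical exposure contrast of 100 kJ/m^2 (ten steps of 10 kJ/m^2).
rr_100 = relative_risk(100.0)
print(f"RR for +100 kJ/m^2: {rr_100:.3f}")
```

Under this assumption, ten steps give a relative risk of about 0.86, slightly less of a reduction than the 15% that simple addition would suggest.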
Abstract:
A study was done to develop macrolevel crash prediction models that can be used to understand and identify effective countermeasures for improving signalized highway intersections and multilane stop-controlled highway intersections in rural areas. Poisson and negative binomial regression models were fit to intersection crash data from Georgia, California, and Michigan. To assess the suitability of the models, several goodness-of-fit measures were computed. The statistical models were then used to shed light on the relationships between crash occurrence and traffic and geometric features of the rural signalized intersections. The results revealed that traffic flow variables significantly affected the overall safety performance of the intersections regardless of intersection type and that the geometric features of intersections varied across intersection type and also influenced crash type.
Abstract:
The two adjacent genes of coat protein 1 and 2 of rice tungro spherical virus (RTSV) were amplified from total RNA extracts of serologically indistinguishable field isolates from the Philippines and Indonesia, using reverse transcriptase polymerase chain reaction (RT-PCR). Digestion with HindIII and BstYI restriction endonucleases differentiated the amplified DNA products into eight distinct coat protein genotypes. These genotypes were then used as indicators of virus diversity in the field. Inter- and intra-site diversities were determined over three cropping seasons. At each of the sites surveyed, one or two main genotypes prevailed together with other related minor or mixed genotypes that did not replace the main genotype over the sampling time. The cluster of genotypes found at the Philippines sites was significantly different from the one at the Indonesia sites, suggesting geographic isolation for virus populations. Phylogenetic studies based on the nucleotide sequences of 38 selected isolates confirm the spatial distribution of RTSV populations but show that gene flow may occur between populations. Under the present conditions, rice varieties do not seem to exert selective pressure on the virus populations. Based on the selective constraints in the coat protein amino acid sequences and the virus genetic composition per site, a negative selection model followed by random-sampling events due to vector transmissions is proposed to explain the inter-site diversity observed.