991 results for Binomial Model
Abstract:
2000 Mathematics Subject Classification: 62F15.
Abstract:
In the study of traffic safety, expected crash frequencies across sites are generally estimated via the negative binomial model, assuming time invariant safety. Since the time invariant safety assumption may be invalid, Hauer (1997) proposed a modified empirical Bayes (EB) method. Despite the modification, no attempts have been made to examine the generalisable form of the marginal distribution resulting from the modified EB framework. Because the hyper-parameters needed to apply the modified EB method are not readily available, an assessment is lacking on how accurately the modified EB method estimates safety in the presence of the time variant safety and regression-to-the-mean (RTM) effects. This study derives the closed form marginal distribution, and reveals that the marginal distribution in the modified EB method is equivalent to the negative multinomial (NM) distribution, which is essentially the same as the likelihood function used in the random effects Poisson model. As a result, this study shows that the gamma posterior distribution from the multivariate Poisson-gamma mixture can be estimated using the NM model or the random effects Poisson model. This study also shows that the estimation errors from the modified EB method are systematically smaller than those from the comparison group method by simultaneously accounting for the RTM and time variant safety effects. Hence, the modified EB method via the NM model is a generalisable method for estimating safety in the presence of the time variant safety and the RTM effects.
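The Poisson-gamma updating behind this abstract can be illustrated with a minimal sketch. It assumes the textbook conjugate setup, which may differ in detail from Hauer's formulation: crash counts in period t are Poisson with mean theta * mu_t, and the site effect theta has a gamma prior with shape k and rate k. The function name `modified_eb_estimates` and its parameters are hypothetical, chosen only for illustration.

```python
def modified_eb_estimates(predicted, observed, k):
    """Posterior-mean crash frequencies for one site over several periods.

    Assumption (not taken from the abstract): y_t ~ Poisson(theta * mu_t)
    with theta ~ Gamma(shape=k, rate=k). Conjugacy then gives
    theta | y ~ Gamma(k + sum(y), k + sum(mu)).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if k <= 0:
        raise ValueError("k must be positive")
    # Posterior mean of the site effect, shared by all periods.
    theta_hat = (k + sum(observed)) / (k + sum(predicted))
    # Time-variant safety enters through the period-specific predictions mu_t.
    return [theta_hat * mu for mu in predicted]


if __name__ == "__main__":
    # Illustrative numbers only: model predictions and observed counts.
    print(modified_eb_estimates([1.0, 3.0], [2, 4], k=2.0))
```

With a single period the result reduces to the familiar weighted average of the predicted and observed counts, with weight k / (k + mu) on the prediction.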
Abstract:
We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms using indirect inference. We embed this approach within a sequential Monte Carlo algorithm that is completely adaptive. This methodological development was motivated by an application involving data on macroparasite population evolution modelled with a trivariate Markov process. The main objective of the analysis is to compare inferences on the Markov process when considering two different indirect models. The two indirect models are based on a Beta-Binomial model and a three component mixture of Binomials, with the former providing a better fit to the observed data.
Abstract:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS), is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for the within-class variability by means of a mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant if one accounts for it. Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression to a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user friendly web-based on-line tool or as R language scripts at the supplemental web site.
Abstract:
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.
Abstract:
The Poisson distribution has often been used for count data such as accident counts. The Negative Binomial (NB) distribution has been adopted for count data to handle over-dispersion. However, neither the Poisson nor the NB distribution can account for unobserved heterogeneity arising from spatial and temporal effects in accident data. To overcome this problem, Random Effect models have been developed. A further challenge for existing traffic accident prediction models is the excess of zero-accident observations in some accident data. Although the Zero-Inflated Poisson (ZIP) model can handle the dual-state system in accident data with excess zero observations, it does not accommodate the within-location and between-location correlation heterogeneities that motivate Random Effect models. This paper proposes an effective way of fitting a ZIP model with location-specific random effects, and recommends Bayesian analysis for model calibration and assessment.
Abstract:
Extending recent research on the importance of specific resources and skills for the internationalization of start-ups, this article tests a negative binomial model on a sample of 520 recently created high technology firms from the UK and Germany. The results show that previous international experience of entrepreneurs facilitates the rapid penetration of foreign markets, especially when the company features a clear and deliberate strategic intent of internationalization from the outset. This research provides one of the first empirical studies linking the influence of entrepreneurial teams to a high probability of success in the internationalization of high-technology ventures.
Abstract:
Background: Developing sampling strategies to target biological pests such as insects in stored grain is inherently difficult owing to species biology and behavioural characteristics. The design of robust sampling programmes should be based on an underlying statistical distribution that is sufficiently flexible to capture variations in the spatial distribution of the target species. Results: Comparisons are made of the accuracy of four probability-of-detection sampling models - the negative binomial model [1], the Poisson model [1], the double logarithmic model [2] and the compound model [3] - for detection of insects over a broad range of insect densities. Although the double log and negative binomial models performed well under specific conditions, it is shown that, of the four models examined, the compound model performed the best over a broad range of insect spatial distributions and densities. In particular, this model predicted well the number of samples required when insect density was high and clumped within experimental storages. Conclusions: This paper reinforces the need for effective sampling programs designed to detect insects over a broad range of spatial distributions. The compound model is robust over a broad range of insect densities and leads to substantial improvement in detection probabilities within highly variable systems such as grain storage.
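For two of the four models named above, the probability of detection has a standard closed form, sketched below under the assumption that samples are independent and that "detection" means at least one insect in at least one sample. These are the textbook Poisson and negative binomial expressions, not necessarily the exact parameterisation used in the paper, and the double logarithmic and compound models are not covered. All function names are hypothetical.

```python
import math


def p_detect_poisson(mean_per_sample, n_samples):
    """P(at least one insect in n independent samples), Poisson counts."""
    return 1.0 - math.exp(-mean_per_sample * n_samples)


def p_detect_negbin(mean_per_sample, k, n_samples):
    """Same probability for negative binomial counts with aggregation
    parameter k (smaller k means a more clumped distribution)."""
    return 1.0 - (1.0 + mean_per_sample / k) ** (-k * n_samples)


def samples_needed_negbin(mean_per_sample, k, target):
    """Smallest n with detection probability >= target under the NB model."""
    n = -math.log(1.0 - target) / (k * math.log(1.0 + mean_per_sample / k))
    return math.ceil(n)


if __name__ == "__main__":
    # Illustrative values only: 0.1 insects per sample, clumped (k = 0.5).
    print(p_detect_poisson(0.1, 10))
    print(p_detect_negbin(0.1, 0.5, 10))
    print(samples_needed_negbin(0.1, 0.5, 0.95))
```

At the same mean density, clumping lowers the detection probability for a given number of samples, which is why the choice of distribution matters for sample-size planning.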
Abstract:
Understanding pedestrian crash causes and contributing factors in developing countries is critically important as they account for about 55% of all traffic crashes. Not surprisingly, considerable attention in the literature has been paid to road traffic crash prediction models and methodologies in developing countries of late. Despite this interest, there are significant challenges confronting safety managers in developing countries. For example, in spite of the prominence of pedestrian crashes occurring on two-way two-lane rural roads, it has proven difficult to develop pedestrian crash prediction models due to a lack of both traffic and pedestrian exposure data. This general lack of available data has further hampered identification of pedestrian crash causes and subsequent estimation of pedestrian safety performance functions. The challenges are similar across developing nations, where little is known about the relationship between pedestrian crashes, traffic flow, and road environment variables on rural two-way roads, and where unique predictor variables may be needed to capture the unique crash risk circumstances. This paper describes pedestrian crash safety performance functions for two-way two-lane rural roads in Ethiopia as a function of traffic flow, pedestrian flows, and road geometry characteristics. In particular, a random parameter negative binomial model was used to investigate pedestrian crashes. The models and their interpretations make important contributions to road crash analysis and prevention in developing countries. They also assist in the identification of the contributing factors to pedestrian crashes, with the intent to identify potential design and operational improvements.
Abstract:
The purpose of this thesis is to determine the recreational value of a summer cottage visit. The topic has not been studied before, even though summer cottage life is a significant part of Finnish life. The recreational value of a summer cottage visit is the benefit an individual gains from recreation at the cottage. Recreation at the cottage covers all leisure activities and relaxation taking place at the cottage and in its surroundings. Because the environment plays an important part in cottage recreation, the thesis also examines how the characteristics of the cottage environment affect the recreational value. The environmental characteristics considered are algal blooms that prevent recreation and the cottage's lack of a shoreline. Because cottage use also places a burden on the environment, the thesis further examines how electrification, a cottage characteristic that burdens the environment, affects the recreational value. Recreational value is a non-market benefit, so a valuation method for non-market goods must be used to determine it. In this work the valuation is carried out with the travel cost method, which is commonly used for the economic valuation of recreational services provided by the environment. The econometric modelling of the travel cost model describing the demand for summer cottage visits is carried out with a negative binomial model. According to the results, a visit of about four days to an electrified summer cottage that has a shoreline and where algae do not disturb recreation yields a recreational benefit of 167-205 euros. Algal blooms that prevent recreation lower the value by 40 percent and the lack of a shoreline by 45 percent. A visit to an electrified cottage yields a 3-5 percent higher recreational benefit than a visit to a non-electrified cottage. The combined recreational benefit of cottage visits made in Finland during the summer is 430-530 million if the cottage has a shoreline where algae cause no harm. Disturbing algal blooms lower the combined recreational benefit by 30 million and the lack of a shoreline by 10-20 million. Electrification raises the combined recreational benefit by 20-30 million euros.
Abstract:
Seasonal population dynamics of the digenean Phyllodistomum pawlovskii in the urinary bladder of the bullhead catfish, Pseudobagrus fulvidraco, were investigated in Liangzi Lake in the flood plain of the Yangtze River in China from February 2001 to July 2002. The overall prevalence of the parasite was high, 41.5% (n = 1,476), while the mean abundance was relatively low, 1.24 +/- 2.11. The parasite exhibited evident seasonality in changes of prevalence and abundance. In brief, prevalence and abundance were very low in midwinter (January), but increased and remained relatively high in other seasons and months. The distribution pattern of this parasite in the fish was overdispersed, with a variance to mean ratio > 1, but its frequency distribution could not be described by the negative binomial model. There were positive correlations between the number of the parasites per fish and the age and length of the fish; a peaked age-parasite abundance curve was not detected in the parasite-host association. It is suggested that the parasite P. pawlovskii has little effect on the population structure of the bullhead catfish.
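The overdispersion statement can be checked with a short calculation. The sketch below assumes that the reported "1.24 +/- 2.11" is a mean and a standard deviation (the abstract does not say whether it is a standard deviation or a standard error), and the function names are hypothetical. The method-of-moments value of k is shown only to illustrate the arithmetic: the abstract itself reports that the negative binomial model did not describe the frequency distribution, so a variance-to-mean ratio above 1 should not be read as evidence of a negative binomial fit.

```python
def variance_to_mean_ratio(mean, sd):
    """Index of dispersion; values above 1 indicate overdispersion."""
    return sd ** 2 / mean


def moment_estimate_k(mean, sd):
    """Method-of-moments negative binomial k, from var = mean + mean**2 / k.

    Only defined when the variance exceeds the mean.
    """
    variance = sd ** 2
    if variance <= mean:
        raise ValueError("variance must exceed the mean for a finite k")
    return mean ** 2 / (variance - mean)


if __name__ == "__main__":
    # Values quoted in the abstract, with 2.11 ASSUMED to be a standard deviation.
    print(variance_to_mean_ratio(1.24, 2.11))
    print(moment_estimate_k(1.24, 2.11))
```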
Abstract:
Global amphibian declines are a major element of the current biodiversity crisis. Monitoring changes in the distribution and abundance of target species is a basic component in conservation decision making and requires robust and repeatable sampling. For EU member states, surveillance of designated species, including the common frog Rana temporaria, is a formal requirement of the 'EC Habitats & Species Directive'. We deployed established methods for estimating frog population density at local water bodies and extrapolated these to the national and ecoregion scale. Spawn occurred at 49.4% of water bodies and 70.1% of independent 500-m survey squares. Using spawn mat area, we estimated the number of adult breeding females and subsequently the total population assuming a sex ratio of 1:1. A negative binomial model suggested that mean frog density was 23.5 frogs per hectare [95% confidence interval (CI) 14.9-44.0], equating to 196M frogs (95% CI 124M-367M) throughout Ireland. A total of 86% of frogs bred in drainage ditches, which were a notably common feature of the landscape. The recorded distribution of the species did not change significantly between the last Article 17 reporting period (1993-2006) and the current period (2007-2011) throughout the Republic of Ireland. Recording effort was markedly lower in Northern Ireland, which led to an apparent decline in the recorded distribution. We highlight the need to coordinate biological surveys between adjacent political jurisdictions that share a common ecoregion to avoid apparent disparities in the quality of distributional information. Power analysis suggested that a reduced sample of 40-50 survey squares is sufficient to detect a 30% decline (consistent with the International Union for Conservation of Nature Category of 'Vulnerable') at 80% power, providing guidance for minimizing future survey effort. Our results provin assessments for R. temporaria and other clump-spawning amphibians. 2013 The Zoological Society of London.
Abstract:
Count data have distributions with particular characteristics such as non-normality, heterogeneity of variances, and a large number of zeros. It is therefore necessary to use appropriate models in order to obtain unbiased results. This thesis compares four analytical models that can be used for count data: the Poisson model, the negative binomial model, the zero-inflated Poisson model, and the zero-inflated negative binomial model. For comparison purposes, the prediction of the proportion of zeros, the confirmation or refutation of the various hypotheses, and the prediction of the means were used to determine the adequacy of the different models. To do so, the number of arrests of street gang members in the territory of Montréal was used for the period 2005 to 2007. The sample consists of 470 men aged 18 to 59. Following the analyses, the most adequate model is the negative binomial model, since it produces significant results, fits the observed data well, and yields a proportion of zeros very similar to the one observed.
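One of the comparison criteria mentioned above, the predicted proportion of zeros, has a closed form for each of the four models. The sketch below uses the standard parameterisations (mean mu, negative binomial dispersion k with variance mu + mu**2 / k, and zero-inflation probability pi); the function names and the numbers in the usage example are illustrative assumptions, not values from the thesis.

```python
import math


def p_zero_poisson(mu):
    """P(Y = 0) for a Poisson count with mean mu."""
    return math.exp(-mu)


def p_zero_negbin(mu, k):
    """P(Y = 0) for a negative binomial count with mean mu and dispersion k."""
    return (k / (k + mu)) ** k


def p_zero_zip(mu, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise a Poisson count."""
    return pi + (1.0 - pi) * p_zero_poisson(mu)


def p_zero_zinb(mu, k, pi):
    """Zero-inflated negative binomial, built the same way."""
    return pi + (1.0 - pi) * p_zero_negbin(mu, k)


if __name__ == "__main__":
    # Illustrative parameters only.
    mu, k, pi = 2.0, 0.5, 0.2
    for name, value in [
        ("Poisson", p_zero_poisson(mu)),
        ("NB", p_zero_negbin(mu, k)),
        ("ZIP", p_zero_zip(mu, pi)),
        ("ZINB", p_zero_zinb(mu, k, pi)),
    ]:
        print(f"{name}: {value:.3f}")
```

Comparing these predicted proportions with the observed share of zeros in a sample is one simple way to see whether a plain Poisson model underestimates the zeros.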
Abstract:
Exam and solutions in LaTeX