982 resultados para Poisson model
Resumo:
In this paper, we consider the problem of estimating the number of times an air quality standard is exceeded in a given period of time. A non-homogeneous Poisson model is proposed to analyse this issue. The rate at which the Poisson events occur is given by a rate function lambda(t), t >= 0. This rate function also depends on some parameters that need to be estimated. Two forms of lambda(t), t >= 0 are considered. One of them is of the Weibull form and the other is of the exponentiated-Weibull form. The parameters estimation is made using a Bayesian formulation based on the Gibbs sampling algorithm. The assignation of the prior distributions for the parameters is made in two stages. In the first stage, non-informative prior distributions are considered. Using the information provided by the first stage, more informative prior distributions are used in the second one. The theoretical development is applied to data provided by the monitoring network of Mexico City. The rate function that best fit the data varies according to the region of the city and/or threshold that is considered. In some cases the best fit is the Weibull form and in other cases the best option is the exponentiated-Weibull. Copyright (C) 2007 John Wiley & Sons, Ltd.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
In the study of traffic safety, expected crash frequencies across sites are generally estimated via the negative binomial model, assuming time invariant safety. Since the time invariant safety assumption may be invalid, Hauer (1997) proposed a modified empirical Bayes (EB) method. Despite the modification, no attempts have been made to examine the generalisable form of the marginal distribution resulting from the modified EB framework. Because the hyper-parameters needed to apply the modified EB method are not readily available, an assessment is lacking on how accurately the modified EB method estimates safety in the presence of the time variant safety and regression-to-the-mean (RTM) effects. This study derives the closed form marginal distribution, and reveals that the marginal distribution in the modified EB method is equivalent to the negative multinomial (NM) distribution, which is essentially the same as the likelihood function used in the random effects Poisson model. As a result, this study shows that the gamma posterior distribution from the multivariate Poisson-gamma mixture can be estimated using the NM model or the random effects Poisson model. This study also shows that the estimation errors from the modified EB method are systematically smaller than those from the comparison group method by simultaneously accounting for the RTM and time variant safety effects. Hence, the modified EB method via the NM model is a generalisable method for estimating safety in the presence of the time variant safety and the RTM effects.
Resumo:
The thermodynamical model of intermittency in fully developed turbulence due to Castaing (B. Castaing, J. Phys. II France 6 (1996) 105) is investigated and compared with the log-Poisson model (Z-S, She, E. Leveque, Phys. Rev. Lett. 72 (1994) 336). It is shown that the thermodynamical model obeys general scaling laws and corresponds to the degenerate class of scale-invariant statistics. We also find that its structure function shapes have physical behaviors similar to the log-Poisson's one. The only difference between them lies in the convergence of the log-Poisson's structure functions and divergence of the thermodynamical one. As far as the comparison with experiments on intermittency is concerned, they are indifferent.
Resumo:
Les données comptées (count data) possèdent des distributions ayant des caractéristiques particulières comme la non-normalité, l’hétérogénéité des variances ainsi qu’un nombre important de zéros. Il est donc nécessaire d’utiliser les modèles appropriés afin d’obtenir des résultats non biaisés. Ce mémoire compare quatre modèles d’analyse pouvant être utilisés pour les données comptées : le modèle de Poisson, le modèle binomial négatif, le modèle de Poisson avec inflation du zéro et le modèle binomial négatif avec inflation du zéro. À des fins de comparaisons, la prédiction de la proportion du zéro, la confirmation ou l’infirmation des différentes hypothèses ainsi que la prédiction des moyennes furent utilisées afin de déterminer l’adéquation des différents modèles. Pour ce faire, le nombre d’arrestations des membres de gangs de rue sur le territoire de Montréal fut utilisé pour la période de 2005 à 2007. L’échantillon est composé de 470 hommes, âgés de 18 à 59 ans. Au terme des analyses, le modèle le plus adéquat est le modèle binomial négatif puisque celui-ci produit des résultats significatifs, s’adapte bien aux données observées et produit une proportion de zéro très similaire à celle observée.
Resumo:
Les simulations et figures ont été réalisées avec le logiciel R.
Resumo:
In this paper, we consider some non-homogeneous Poisson models to estimate the probability that an air quality standard is exceeded a given number of times in a time interval of interest. We assume that the number of exceedances occurs according to a non-homogeneous Poisson process (NHPP). This Poisson process has rate function lambda(t), t >= 0, which depends on some parameters that must be estimated. We take into account two cases of rate functions: the Weibull and the Goel-Okumoto. We consider models with and without change-points. When the presence of change-points is assumed, we may have the presence of either one, two or three change-points, depending of the data set. The parameters of the rate functions are estimated using a Gibbs sampling algorithm. Results are applied to ozone data provided by the Mexico City monitoring network. In a first instance, we assume that there are no change-points present. Depending on the adjustment of the model, we assume the presence of either one, two or three change-points. Copyright (C) 2009 John Wiley & Sons, Ltd.
Resumo:
Cattle resistance to ticks is measured by the number of ticks infesting the animal. The model used for the genetic analysis of cattle resistance to ticks frequently requires logarithmic transformation of the observations. The objective of this study was to evaluate the predictive ability and goodness of fit of different models for the analysis of this trait in cross-bred Hereford x Nellore cattle. Three models were tested: a linear model using logarithmic transformation of the observations (MLOG); a linear model without transformation of the observations (MLIN); and a generalized linear Poisson model with residual term (MPOI). All models included the classificatory effects of contemporary group and genetic group and the covariates age of animal at the time of recording and individual heterozygosis, as well as additive genetic effects as random effects. Heritability estimates were 0.08 ± 0.02, 0.10 ± 0.02 and 0.14 ± 0.04 for MLIN, MLOG and MPOI models, respectively. The model fit quality, verified by deviance information criterion (DIC) and residual mean square, indicated fit superiority of MPOI model. The predictive ability of the models was compared by validation test in independent sample. The MPOI model was slightly superior in terms of goodness of fit and predictive ability, whereas the correlations between observed and predicted tick counts were practically the same for all models. A higher rank correlation between breeding values was observed between models MLOG and MPOI. Poisson model can be used for the selection of tick-resistant animals. © 2013 Blackwell Verlag GmbH.
Resumo:
Suicide has drawn much attention from both the scientific community and the public. Examining the impact of socio-environmental factors on suicide is essential in developing suicide prevention strategies and interventions, because it will provide health authorities with important information for their decision-making. However, previous studies did not examine the impact of socio-environmental factors on suicide using a spatial analysis approach. The purpose of this study was to identify the patterns of suicide and to examine how socio-environmental factors impact on suicide over time and space at the Local Governmental Area (LGA) level in Queensland. The suicide data between 1999 and 2003 were collected from the Australian Bureau of Statistics (ABS). Socio-environmental variables at the LGA level included climate (rainfall, maximum and minimum temperature), Socioeconomic Indexes for Areas (SEIFA) and demographic variables (proportion of Indigenous population, unemployment rate, proportion of population with low income and low education level). Climate data were obtained from Australian Bureau of Meteorology. SEIFA and demographic variables were acquired from ABS. A series of statistical and geographical information system (GIS) approaches were applied in the analysis. This study included two stages. The first stage used average annual data to view the spatial pattern of suicide and to examine the association between socio-environmental factors and suicide over space. The second stage examined the spatiotemporal pattern of suicide and assessed the socio-environmental determinants of suicide, using more detailed seasonal data. In this research, 2,445 suicide cases were included, with 1,957 males (80.0%) and 488 females (20.0%). In the first stage, we examined the spatial pattern and the determinants of suicide using 5-year aggregated data. Spearman correlations were used to assess associations between variables. Then a Poisson regression model was applied in the multivariable analysis, as the occurrence of suicide is a small probability event and this model fitted the data quite well. Suicide mortality varied across LGAs and was associated with a range of socio-environmental factors. The multivariable analysis showed that maximum temperature was significantly and positively associated with male suicide (relative risk [RR] = 1.03, 95% CI: 1.00 to 1.07). Higher proportion of Indigenous population was accompanied with more suicide in male population (male: RR = 1.02, 95% CI: 1.01 to 1.03). There was a positive association between unemployment rate and suicide in both genders (male: RR = 1.04, 95% CI: 1.02 to 1.06; female: RR = 1.07, 95% CI: 1.00 to 1.16). No significant association was observed for rainfall, minimum temperature, SEIFA, proportion of population with low individual income and low educational attainment. In the second stage of this study, we undertook a preliminary spatiotemporal analysis of suicide using seasonal data. Firstly, we assessed the interrelations between variables. Secondly, a generalised estimating equations (GEE) model was used to examine the socio-environmental impact on suicide over time and space, as this model is well suited to analyze repeated longitudinal data (e.g., seasonal suicide mortality in a certain LGA) and it fitted the data better than other models (e.g., Poisson model). The suicide pattern varied with season and LGA. The north of Queensland had the highest suicide mortality rate in all the seasons, while there was no suicide case occurred in the southwest. Northwest had consistently higher suicide mortality in spring, autumn and winter. In other areas, suicide mortality varied between seasons. This analysis showed that maximum temperature was positively associated with suicide among male population (RR = 1.24, 95% CI: 1.04 to 1.47) and total population (RR = 1.15, 95% CI: 1.00 to 1.32). Higher proportion of Indigenous population was accompanied with more suicide among total population (RR = 1.16, 95% CI: 1.13 to 1.19) and by gender (male: RR = 1.07, 95% CI: 1.01 to 1.13; female: RR = 1.23, 95% CI: 1.03 to 1.48). Unemployment rate was positively associated with total (RR = 1.40, 95% CI: 1.24 to 1.59) and female (RR=1.09, 95% CI: 1.01 to 1.18) suicide. There was also a positive association between proportion of population with low individual income and suicide in total (RR = 1.28, 95% CI: 1.10 to 1.48) and male (RR = 1.45, 95% CI: 1.23 to 1.72) population. Rainfall was only positively associated with suicide in total population (RR = 1.11, 95% CI: 1.04 to 1.19). There was no significant association for rainfall, minimum temperature, SEIFA, proportion of population with low educational attainment. The second stage is the extension of the first stage. Different spatial scales of dataset were used between the two stages (i.e., mean yearly data in the first stage, and seasonal data in the second stage), but the results are generally consistent with each other. Compared with other studies, this research explored the variety of the impact of a wide range of socio-environmental factors on suicide in different geographical units. Maximum temperature, proportion of Indigenous population, unemployment rate and proportion of population with low individual income were among the major determinants of suicide in Queensland. However, the influence from other factors (e.g. socio-culture background, alcohol and drug use) influencing suicide cannot be ignored. An in-depth understanding of these factors is vital in planning and implementing suicide prevention strategies. Five recommendations for future research are derived from this study: (1) It is vital to acquire detailed personal information on each suicide case and relevant information among the population in assessing the key socio-environmental determinants of suicide; (2) Bayesian model could be applied to compare mortality rates and their socio-environmental determinants across LGAs in future research; (3) In the LGAs with warm weather, high proportion of Indigenous population and/or unemployment rate, concerted efforts need to be made to control and prevent suicide and other mental health problems; (4) The current surveillance, forecasting and early warning system needs to be strengthened, to trace the climate and socioeconomic change over time and space and its impact on population health; (5) It is necessary to evaluate and improve the facilities of mental health care, psychological consultation, suicide prevention and control programs; especially in the areas with low socio-economic status, high unemployment rate, extreme weather events and natural disasters.
Resumo:
The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criteria (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken. By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.
Resumo:
This study proposes a framework of a model-based hot spot identification method by applying full Bayes (FB) technique. In comparison with the state-of-the-art approach [i.e., empirical Bayes method (EB)], the advantage of the FB method is the capability to seamlessly integrate prior information and all available data into posterior distributions on which various ranking criteria could be based. With intersection crash data collected in Singapore, an empirical analysis was conducted to evaluate the following six approaches for hot spot identification: (a) naive ranking using raw crash data, (b) standard EB ranking, (c) FB ranking using a Poisson-gamma model, (d) FB ranking using a Poisson-lognormal model, (e) FB ranking using a hierarchical Poisson model, and (f) FB ranking using a hierarchical Poisson (AR-1) model. The results show that (a) when using the expected crash rate-related decision parameters, all model-based approaches perform significantly better in safety ranking than does the naive ranking method, and (b) the FB approach using hierarchical models significantly outperforms the standard EB approach in correctly identifying hazardous sites.
Resumo:
Background: Developing sampling strategies to target biological pests such as insects in stored grain is inherently difficult owing to species biology and behavioural characteristics. The design of robust sampling programmes should be based on an underlying statistical distribution that is sufficiently flexible to capture variations in the spatial distribution of the target species. Results: Comparisons are made of the accuracy of four probability-of-detection sampling models - the negative binomial model,1 the Poisson model,1 the double logarithmic model2 and the compound model3 - for detection of insects over a broad range of insect densities. Although the double log and negative binomial models performed well under specific conditions, it is shown that, of the four models examined, the compound model performed the best over a broad range of insect spatial distributions and densities. In particular, this model predicted well the number of samples required when insect density was high and clumped within experimental storages. Conclusions: This paper reinforces the need for effective sampling programs designed to detect insects over a broad range of spatial distributions. The compound model is robust over a broad range of insect densities and leads to substantial improvement in detection probabilities within highly variable systems such as grain storage.