36 results for Overdispersion
Abstract:
Quasi-likelihood (QL) methods are often used to account for overdispersion in categorical data. This paper proposes a new way of constructing a QL function that stems from the conditional mean-variance relationship. Unlike traditional QL approaches to categorical data, this QL function is, in general, not a scaled version of the ordinary log-likelihood function. A simulation study is carried out to examine the performance of the proposed QL method. Fish mortality data from quantal response experiments are used for illustration.
Weibull and generalised exponential overdispersion models with an application to ozone air pollution
Abstract:
We consider the problem of estimating the mean and variance of the time between occurrences of an event of interest (inter-occurrence times) where some forms of dependence between two consecutive time intervals are allowed. Two basic density functions are considered: the Weibull and the generalised exponential. To capture the dependence between two consecutive inter-occurrence times, we assume that the shape and/or scale parameters of the two density functions are given by auto-regressive models. Expressions for the mean and variance of the inter-occurrence times are presented. The models are applied to ozone data from two regions of Mexico City. The parameters are estimated from a Bayesian point of view via Markov chain Monte Carlo (MCMC) methods.
Abstract:
At least two important transportation planning activities rely on planning-level crash prediction models. One is motivated by the Transportation Equity Act for the 21st Century, which requires departments of transportation and metropolitan planning organizations to consider safety explicitly in the transportation planning process. The second could arise from a need for state agencies to establish incentive programs to reduce injuries and save lives. Both applications require a forecast of safety for a future period. Planning-level crash prediction models for the Tucson, Arizona, metropolitan region are presented to demonstrate the feasibility of such models. Data were separated into fatal, injury, and property-damage crashes. To accommodate overdispersion in the data, negative binomial regression models were applied. To accommodate the simultaneity of fatality and injury crash outcomes, simultaneous estimation of the models was conducted. All models produce crash forecasts at the traffic analysis zone level. Statistically significant (p-values < 0.05) and theoretically meaningful variables for the fatal crash model included population density, persons 17 years old or younger as a percentage of the total population, and intersection density. Significant variables for the injury and property-damage crash models were population density, number of employees, intersection density, percentage of miles of principal arterials, percentage of miles of minor arterials, and percentage of miles of urban collectors. Among several conclusions, it is suggested that planning-level safety models are feasible and may play a role in future planning activities. However, caution must be exercised with such models.
Abstract:
This paper develops a semiparametric estimation approach for mixed count regression models based on series expansion for the unknown density of the unobserved heterogeneity. We use the generalized Laguerre series expansion around a gamma baseline density to model unobserved heterogeneity in a Poisson mixture model. We establish the consistency of the estimator and present a computational strategy to implement the proposed estimation techniques in the standard count model as well as in truncated, censored, and zero-inflated count regression models. Monte Carlo evidence shows that the finite sample behavior of the estimator is quite good. The paper applies the method to a model of individual shopping behavior. © 1999 Elsevier Science S.A. All rights reserved.
Abstract:
Background: Studies have examined the effects of temperature on mortality in a single city, country, or region. However, less evidence is available on the variation in the associations between temperature and mortality in multiple countries, analyzed simultaneously. Methods: We obtained daily data on temperature and mortality in 306 communities from 12 countries/regions (Australia, Brazil, Thailand, China, Taiwan, Korea, Japan, Italy, Spain, United Kingdom, United States, and Canada). Two-stage analyses were used to assess the nonlinear and delayed relation between temperature and mortality. In the first stage, a Poisson regression allowing overdispersion with distributed lag nonlinear model was used to estimate the community-specific temperature-mortality relation. In the second stage, a multivariate meta-analysis was used to pool the nonlinear and delayed effects of ambient temperature at the national level, in each country. Results: The temperatures associated with the lowest mortality were around the 75th percentile of temperature in all the countries/regions, ranging from 66th (Taiwan) to 80th (UK) percentiles. The estimated effects of cold and hot temperatures on mortality varied by community and country. Meta-analysis results show that both cold and hot temperatures increased the risk of mortality in all the countries/regions. Cold effects were delayed and lasted for many days, whereas heat effects appeared quickly and did not last long. Conclusions: People have some ability to adapt to their local climate type, but both cold and hot temperatures are still associated with increased risk of mortality. Public health strategies to alleviate the impact of ambient temperatures are important, in particular in the context of climate change.
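The abstract above does not say how overdispersion was allowed for in the first-stage Poisson regression. One common approach, assumed here purely for illustration, is a quasi-Poisson fit in which a dispersion parameter is estimated as the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below computes that estimate from observed counts and already-fitted means; the function name and the numbers are hypothetical and do not come from the study.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Estimate the quasi-Poisson dispersion as Pearson chi-square / residual df.

    observed: observed counts (e.g. daily deaths)
    fitted:   fitted Poisson means from some regression, all strictly positive
    n_params: number of regression parameters used to obtain the fitted means
    """
    n = len(observed)
    if len(fitted) != n:
        raise ValueError("observed and fitted must have the same length")
    df = n - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    # For a Poisson model the variance function is V(mu) = mu.
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / df


# Hypothetical daily counts and fitted means, for illustration only.
observed = [12, 8, 15, 30, 9]
fitted = [10.0, 10.0, 12.0, 20.0, 10.0]
phi = pearson_dispersion(observed, fitted, n_params=2)
print(f"estimated dispersion: {phi:.3f}")
```

A value well above 1 suggests more variation than a Poisson model predicts; under a quasi-Poisson fit the standard errors are scaled by the square root of this estimate.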
Abstract:
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.
Abstract:
We propose a new model for estimating the size of a population from successive catches taken during a removal experiment. The data from these experiments often have excessive variation, known as overdispersion, as compared with that predicted by the multinomial model. The new model allows catchability to vary randomly among samplings, which accounts for overdispersion. When the catchability is assumed to have a beta distribution, the likelihood function, which is referred to as beta-multinomial, is derived, and hence the maximum likelihood estimates can be evaluated. Simulations show that in the presence of extravariation in the data, the confidence intervals have been substantially underestimated in previous models (Leslie-DeLury, Moran) and that the new model provides more reliable confidence intervals. The performance of these methods was also demonstrated using two real data sets: one with overdispersion, from smallmouth bass (Micropterus dolomieu), and the other without overdispersion, from rat (Rattus rattus).
Abstract:
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988-96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, that may have had complex within-cluster correlation structures and that had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and utility of the model, and the potential for this modelling approach to apply in similar statistical situations.
Abstract:
We consider the problem of estimating a population size from successive catches taken during a removal experiment and propose two estimating-function approaches: the traditional quasi-likelihood (TQL) approach for dependent observations and the conditional quasi-likelihood (CQL) approach, which uses the conditional mean and conditional variance of the catch given previous catches. The asymptotic covariance of the estimates and the relationship between the two methods are derived. Simulation results and an application to the catch data from smallmouth bass show that the proposed estimating functions perform better than other existing methods, especially in the presence of overdispersion.
Abstract:
The frequency distributions of the parasitic copepod Sinergasilus polycolpus were examined in silver carp Hypophthalmichthys molitrix and bighead carp Aristichthys nobilis during a disease outbreak of the 2 species of fish in a reservoir in China. The mean abundance of the copepod was positively related to host length and age, and the overdispersion of the copepod in both silver and bighead carp was well fitted by the negative binomial distribution. Although parasite-induced host mortality was observed, a peaked age-parasite abundance curve was not detected in the present parasite-host system. It is also proposed that this peaked age-abundance curve is unlikely to be observed in its natural host populations.
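As a rough illustration of how overdispersed parasite counts relate to the negative binomial distribution, the sketch below uses the moment relation variance = mean + mean^2 / k to estimate the aggregation parameter k. This is only a simple moment-based sketch: the abstract does not state which fitting method the authors used, and the counts are hypothetical.

```python
from statistics import mean, variance


def nb_moment_k(counts):
    """Moment estimate of the negative binomial aggregation parameter k.

    Uses variance = mean + mean**2 / k, so k = mean**2 / (variance - mean).
    Smaller k means stronger aggregation (more overdispersion).
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if v <= m:
        raise ValueError("no overdispersion: sample variance does not exceed the mean")
    return m * m / (v - m)


# Hypothetical parasite burdens per host, for illustration only.
burdens = [0, 0, 1, 0, 3, 0, 12, 2, 0, 7]
k_hat = nb_moment_k(burdens)
print(f"mean = {mean(burdens):.2f}, variance = {variance(burdens):.2f}, k = {k_hat:.3f}")
```

The moment estimate is quick to compute but can be imprecise in small samples; maximum likelihood is the usual alternative.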
Abstract:
It is increasingly evident that evolutionary processes play a role in how ecological communities are assembled. However, the extent to which evolution influences how plants respond to spatial and environmental gradients and interact with each other is less clear. In this dissertation I leverage evolutionary tools and thinking to understand how space and environment affect community composition and patterns of gene flow in a unique system of Atlantic rainforest and restinga (sandy coastal plains) habitats in Southeastern Brazil.
In chapter one I investigate how space and environment affect the population genetic structure and gene flow of Aechmea nudicaulis, a bromeliad species that co-occurs in forest and restinga habitats. I genotyped seven microsatellite loci and sequenced one chloroplast DNA region for individuals collected in 7 pairs of forest / restinga sites. Bayesian genetic clustering analyses show that populations of A. nudicaulis are geographically structured in northern and southern populations, a pattern consistent with broader scale phylogeographic dynamics of the Atlantic rainforest. On the other hand, explicit migration models based on the coalescent estimate that inter-habitat gene flow is less common than gene flow between populations in the same habitat type, despite their geographic discontinuity. I conclude that there is evidence for repeated colonization of the restingas from forest populations even though the steep environmental gradient between habitats is a stronger barrier to gene flow than geographic distance.
In chapter two I use data on 2800 individual plants finely mapped in a restinga plot and on first-year survival of 500 seedlings to understand the roles of phylogeny, functional traits and abiotic conditions in the spatial structuring of that community. I demonstrate that phylogeny is a poor predictor of functional traits and that convergence in these traits is pervasive. In general, the community is not phylogenetically structured, with at best 14% of the plots deviating significantly from the null model. The functional traits SLA, leaf dry matter content (LDMC), and maximum height also showed no clear pattern of spatial structuring. On the other hand, leaf area is strongly overdispersed across all spatial scales. Although leaf area overdispersion would generally be taken as evidence of competition, I argue that this interpretation is probably misleading. Finally, I show that seedling survival is dramatically increased when seedlings grow shaded by an adult individual, suggesting that they are being facilitated. Phylogenetic distance to their adult neighbor has no influence on rates of survival, though. Taken together, these results indicate that phylogeny has very limited influence on the fine-scale assembly of restinga communities.
Abstract:
This thesis focuses on the application of optimal alarm systems to nonlinear time series models. The most common classes of models in the analysis of real-valued and integer-valued time series are described. The construction of optimal alarm systems is covered and its applications explored. Considering models with conditional heteroscedasticity, particular attention is given to the Fractionally Integrated Asymmetric Power ARCH, FIAPARCH(p; d; q) model and an optimal alarm system is implemented, following both classical and Bayesian methodologies. Taking into consideration the particular characteristics of the APARCH(p; q) representation for financial time series, the introduction of a possible counterpart for modelling time series of counts is proposed: the INteger-valued Asymmetric Power ARCH, INAPARCH(p; q). The probabilistic properties of the INAPARCH(1; 1) model are comprehensively studied, the conditional maximum likelihood (ML) estimation method is applied and the asymptotic properties of the conditional ML estimator are obtained. The final part of the work consists of the implementation of an optimal alarm system for the INAPARCH(1; 1) model. An application to real data series is presented.
Abstract:
The Asymmetric Power ARCH representation for the volatility was introduced by Ding et al. (1993) in order to account for asymmetric responses in the volatility in the analysis of continuous-valued financial time series such as the log-return series of foreign exchange rates, stock indices or share prices. As reported by Brannas and Quoreshi (2010), asymmetric responses in volatility are also observed in time series of counts such as the number of intra-day transactions in stocks. In this work, an asymmetric power autoregressive conditional Poisson model is introduced for the analysis of time series of counts exhibiting asymmetric overdispersion. Basic probabilistic and statistical properties are summarized and parameter estimation is discussed. A simulation study is presented to illustrate the proposed model. Finally, an empirical application to a set of data concerning the daily number of stock transactions is also presented to attest to its practical applicability in data analysis.
Abstract:
The thesis entitled "Studies on the eco-physiology of heterotrophic and indicator bacteria in the marine environments of Kerala" embodies the results of an investigation carried out by the candidate at the Central Marine Fisheries Research Institute, Cochin. It is presented under 4 chapters in two parts (Parts A & B) and includes 6 sections. The material for the study was collected in the Cochin backwater during April 1972 to February 1973, March 1974 to February 1975, July 1975 to June 1976 and in the inshore area during January to October 1978, and an account of the heterotrophic and indicator bacteria is given with intensity charts and tables. Samples from all the stations contained significant quantities of heterotrophs (Part A, Section I) and faecal pollution indicators (Section II). The maximum number of heterotrophic bacteria was observed during the postmonsoon period. The total counts between one station and the other did not vary as much as the counts between months did. The distribution was characterised by overdispersion. During 1972-73 in all the stations except the fourth the minimum heterotrophs (Total counts) were recorded during the monsoon period. Minimum counts were observed during the premonsoon period, with an increasing trend from the premonsoon to postmonsoon seasons. Maximum counts were recorded during monsoon months during 1974-75. No significant difference was noted in the total plate count between stations, months and regions. Seasonal variation in sea water was meagre during 1975-76, whereas in sediments variations were prominent during monsoon in Station I - near the mouth of the sewage effluent of Cochin City and in postmonsoon at Station II in the Mattancherry Channel and Station III near barmouth
Abstract:
This study has investigated serial (temporal) clustering of extra-tropical cyclones simulated by 17 climate models that participated in CMIP5. Clustering was estimated by calculating the dispersion (ratio of variance to mean) of 30 December-February counts of Atlantic storm tracks passing near each grid point. Results from single historical simulations of 1975-2005 were compared to those from historical ERA40 reanalyses for 1958-2001 and single future model projections of 2069-2099 under the RCP4.5 climate change scenario. Models were generally able to capture the broad features in reanalyses reported previously: underdispersion/regularity (i.e. variance less than mean) in the western core of the Atlantic storm track surrounded by overdispersion/clustering (i.e. variance greater than mean) to the north and south and over western Europe. Regression of counts onto North Atlantic Oscillation (NAO) indices revealed that much of the overdispersion in the historical reanalyses and model simulations can be accounted for by NAO variability. Future changes in dispersion were generally found to be small and not consistent across models. The overdispersion statistic, for any 30 year sample, is prone to large amounts of sampling uncertainty that obscures the climate change signal. For example, the projected increase in dispersion for storm counts near London in the CNRMCM5 model is 0.1 compared to a standard deviation of 0.25. Projected changes in the mean and variance of NAO are insufficient to create changes in overdispersion that are discernible above natural sampling variations.
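The dispersion measure described in this abstract, the ratio of variance to mean of seasonal counts, can be sketched in a few lines. The example below is a minimal illustration with made-up counts. It assumes the sample variance with an n - 1 denominator, which the abstract does not specify, and it is not the study's processing chain.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the sample variance divided by the sample mean of a sequence of counts.

    Values below 1 indicate underdispersion (regularity); values above 1
    indicate overdispersion (clustering) relative to a Poisson process.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion is undefined when the mean count is zero")
    return variance(counts) / m


# Hypothetical seasonal storm counts at two grid points, for illustration only.
regular = [5, 6, 5, 6, 5, 6]
clustered = [1, 12, 2, 11, 0, 7]

print(f"regular:   {dispersion_index(regular):.3f}")
print(f"clustered: {dispersion_index(clustered):.3f}")
```

Both series have the same mean, so the difference in the index comes entirely from how unevenly the counts are spread across seasons.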