54 results for Truncated negative binomial model
in CentAUR: Central Archive University of Reading - UK
Abstract:
We propose a geoadditive negative binomial model (Geo-NB-GAM) for regional count data that allows us to address simultaneously some important methodological issues, such as spatial clustering, nonlinearities, and overdispersion. This model is applied to the study of location determinants of inward greenfield investments that occurred during 2003–2007 in 249 European regions. After presenting the data set and showing the presence of overdispersion and spatial clustering, we review the theoretical framework that motivates the choice of the location determinants included in the empirical model, and we highlight some reasons why the relationship between some of the covariates and the dependent variable might be nonlinear. The subsequent section first describes the solutions proposed by previous literature to tackle spatial clustering, nonlinearities, and overdispersion, and then presents the Geo-NB-GAM. The empirical analysis shows the good performance of Geo-NB-GAM. Notably, the inclusion of a geoadditive component (a smooth spatial trend surface) permits us to control for spatial unobserved heterogeneity that induces spatial clustering. Allowing for nonlinearities reveals, in keeping with theoretical predictions, that the positive effect of agglomeration economies fades as the density of economic activities reaches some threshold value. However, no matter how dense the economic activity becomes, our results suggest that congestion costs never overcome positive agglomeration externalities.
Abstract:
Previous assessments of the impacts of climate change on heat-related mortality use the "delta method" to create temperature projection time series that are applied to temperature-mortality models to estimate future mortality impacts. The delta method means that climate model bias in the modelled present does not influence the temperature projection time series and impacts. However, the delta method assumes that climate change will result only in a change in the mean temperature, but there is evidence that there will also be changes in the variability of temperature with climate change. The aim of this paper is to demonstrate the importance of considering changes in temperature variability with climate change in impacts assessments of future heat-related mortality. We investigate future heat-related mortality impacts in six cities (Boston, Budapest, Dallas, Lisbon, London and Sydney) by applying temperature projections from the UK Meteorological Office HadCM3 climate model to the temperature-mortality models constructed and validated in Part 1. We investigate the impacts for four cases based on various combinations of mean and variability changes in temperature with climate change. The results demonstrate that higher mortality is attributed to increases in the mean and variability of temperature with climate change rather than with the change in mean temperature alone. This has implications for interpreting existing impacts estimates that have used the delta method. We present a novel method for the creation of temperature projection time series that includes changes in the mean and variability of temperature with climate change and is not influenced by climate model bias in the modelled present. The method should be useful for future impacts assessments. Few studies consider the implications that the limitations of the climate model may have on the heat-related mortality impacts.
Here, we demonstrate the importance of considering this by conducting an evaluation of the daily and extreme temperatures from HadCM3, which demonstrates that the estimates of future heat-related mortality for Dallas and Lisbon may be overestimated due to positive climate model bias. Likewise, estimates for Boston and London may be underestimated due to negative climate model bias. Finally, we briefly consider uncertainties in the impacts associated with greenhouse gas emissions and acclimatisation. The uncertainties in the mortality impacts due to different emissions scenarios of greenhouse gases in the future varied considerably by location. Allowing for acclimatisation to an extra 2°C in mean temperatures reduced future heat-related mortality by approximately half that of no acclimatisation in each city.
Abstract:
Estimation of population size with missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimation of the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and thereby using a Horvitz-Thompson estimator of population size. This was found to work well, when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In the search of a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. Loss in efficiency associated with gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
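As a rough illustration of the baseline approach this abstract describes (a zero-truncated homogeneous Poisson fit followed by a Horvitz-Thompson estimate), the sketch below handles only the simple unclustered case. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
import math

def fit_zero_truncated_poisson(mean_count, iterations=200):
    """Maximum likelihood estimate of lam for a zero-truncated Poisson.

    The MLE solves lam / (1 - exp(-lam)) = mean_count, where mean_count is
    the mean of the observed (strictly positive) counts. Solved by bisection.
    """
    if mean_count <= 1.0:
        raise ValueError("the mean of zero-truncated counts must exceed 1")
    lo, hi = 1e-12, mean_count  # the truncated mean exceeds lam, so lam < mean_count
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid / (1.0 - math.exp(-mid)) < mean_count:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def horvitz_thompson_size(counts):
    """Estimate the population size as n / P(count > 0) under the fitted model."""
    n = len(counts)
    lam = fit_zero_truncated_poisson(sum(counts) / n)
    p_observed = 1.0 - math.exp(-lam)  # probability that a unit is seen at least once
    return n / p_observed, lam

# Hypothetical observed counts (no zeros, by construction of the data).
size_estimate, lam_hat = horvitz_thompson_size([1, 1, 2, 1, 3, 1, 2])
```

The estimate always exceeds the number of observed units, because every observed unit is weighted by the inverse of its probability of being observed.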
Abstract:
Geophysical fluid models often support both fast and slow motions. As the dynamics are often dominated by the slow motions, it is desirable to filter out the fast motions by constructing balance models. An example is the quasi geostrophic (QG) model, which is used widely in meteorology and oceanography for theoretical studies, in addition to practical applications such as model initialization and data assimilation. Although the QG model works quite well in the mid-latitudes, its usefulness diminishes as one approaches the equator. Thus far, attempts to derive similar balance models for the tropics have not been entirely successful as the models generally filter out Kelvin waves, which contribute significantly to tropical low-frequency variability. There is much theoretical interest in the dynamics of planetary-scale Kelvin waves, especially for atmospheric and oceanic data assimilation where observations are generally only of the mass field and thus do not constrain the wind field without some kind of diagnostic balance relation. As a result, estimates of Kelvin wave amplitudes can be poor. Our goal is to find a balance model that includes Kelvin waves for planetary-scale motions. Using asymptotic methods, we derive a balance model for the weakly nonlinear equatorial shallow-water equations. Specifically we adopt the ‘slaving’ method proposed by Warn et al. (Q. J. R. Meteorol. Soc., vol. 121, 1995, pp. 723–739), which avoids secular terms in the expansion and thus can in principle be carried out to any order. Different from previous approaches, our expansion is based on a long-wave scaling and the slow dynamics is described using the height field instead of potential vorticity. The leading-order model is equivalent to the truncated long-wave model considered previously (e.g. Heckley & Gill, Q. J. R. Meteorol. Soc., vol. 110, 1984, pp. 203–217), which retains Kelvin waves in addition to equatorial Rossby waves. 
Our method allows for the derivation of higher-order models which significantly improve the representation of Rossby waves in the isotropic limit. In addition, the ‘slaving’ method is applicable even when the weakly nonlinear assumption is relaxed, and the resulting nonlinear model encompasses the weakly nonlinear model. We also demonstrate that the method can be applied to more realistic stratified models, such as the Boussinesq model.
Abstract:
During the last decades, several windstorm series hit Europe leading to large aggregated losses. Such storm series are examples of serial clustering of extreme cyclones, presenting a considerable risk for the insurance industry. Clustering of events and return periods of storm series for Germany are quantified based on potential losses using empirical models. Two reanalysis data sets and observations from German weather stations are considered for 30 winters. Histograms of events exceeding selected return levels (1-, 2- and 5-year) are derived. Return periods of historical storm series are estimated based on the Poisson and the negative binomial distributions. Over 4000 years of general circulation model (GCM) simulations forced with current climate conditions are analysed to provide a better assessment of historical return periods. Estimations differ between distributions, for example 40 to 65 years for the 1990 series. For such less frequent series, estimates obtained with the Poisson distribution clearly deviate from empirical data. The negative binomial distribution provides better estimates, even though a sensitivity to return level and data set is identified. The consideration of GCM data permits a strong reduction of uncertainties. The present results support the importance of considering explicitly clustering of losses for an adequate risk assessment for economical applications.
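To make the Poisson versus negative binomial comparison concrete, the sketch below computes the return period of a winter with at least k events as the reciprocal of the tail probability. The parameterisation (mean and a `size` dispersion parameter, with variance mean + mean**2 / size) and the numbers used are assumptions for illustration, not values from the study.

```python
import math

def poisson_tail(k, mean):
    """P(N >= k) for a Poisson count with the given mean."""
    term, cdf = math.exp(-mean), 0.0
    for i in range(k):
        cdf += term              # term equals P(N = i) here
        term *= mean / (i + 1)   # advance to P(N = i + 1)
    return 1.0 - cdf

def negbin_tail(k, mean, size):
    """P(N >= k) for a negative binomial with variance mean + mean**2 / size."""
    q = mean / (size + mean)
    term, cdf = (1.0 - q) ** size, 0.0
    for i in range(k):
        cdf += term
        term *= (i + size) / (i + 1) * q
    return 1.0 - cdf

def return_period(tail_probability):
    """Return period in winters for an event with the given per-winter probability."""
    return 1.0 / tail_probability

# Hypothetical setting: on average one event per winter, series of four or more.
poisson_years = return_period(poisson_tail(4, 1.0))
negbin_years = return_period(negbin_tail(4, 1.0, 0.5))
```

With the same mean, the overdispersed negative binomial puts more probability on long series, so it yields a shorter return period than the Poisson model.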
Abstract:
The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.
Abstract:
The Lincoln–Petersen estimator is one of the most popular estimators used in capture–recapture studies. It was developed for a sampling situation in which two sources independently identify members of a target population. For each of the two sources, it is determined if a unit of the target population is identified or not. This leads to a 2 × 2 table with frequencies f11, f10, f01, f00 indicating the number of units identified by both sources, by the first but not the second source, by the second but not the first source and not identified by any of the two sources, respectively. However, f00 is unobserved so that the 2 × 2 table is incomplete and the Lincoln–Petersen estimator provides an estimate for f00. In this paper, we consider a generalization of this situation for which one source provides not only a binary identification outcome but also a count outcome of how many times a unit has been identified. Using a truncated Poisson count model, truncating multiple identifications larger than two, we propose a maximum likelihood estimator of the Poisson parameter and, ultimately, of the population size. This estimator shows benefits, in comparison with Lincoln–Petersen’s, in terms of bias and efficiency. It is possible to test the homogeneity assumption that is not testable in the Lincoln–Petersen framework. The approach is applied to surveillance data on syphilis from Izmir, Turkey.
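For reference, the classical two-source Lincoln-Petersen calculation on the 2 × 2 table described above can be sketched as follows. This shows only the standard estimator, not the truncated Poisson extension the paper proposes, and the frequencies in the example are hypothetical.

```python
def lincoln_petersen(f11, f10, f01):
    """Estimate the unobserved cell f00 and the total population size.

    Under independence of the two sources, f00 is estimated by
    f10 * f01 / f11, and the population size is the sum of all four cells.
    """
    if f11 <= 0:
        raise ValueError("at least one unit must be identified by both sources")
    f00_hat = f10 * f01 / f11
    return f00_hat, f11 + f10 + f01 + f00_hat

# Hypothetical frequencies: 20 units seen by both sources, 30 only by the
# first source, 40 only by the second.
f00_hat, n_hat = lincoln_petersen(20, 30, 40)
```

The population estimate equals (f11 + f10)(f11 + f01) / f11, the familiar "first-source total times second-source total over the overlap" form.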
Abstract:
None of the current surveillance streams monitoring the presence of scrapie in Great Britain provide a comprehensive and unbiased estimate of the prevalence of the disease at the holding level. Previous work to estimate the under-ascertainment adjusted prevalence of scrapie in Great Britain applied multiple-list capture–recapture methods. The enforcement of new control measures on scrapie-affected holdings in 2004 has stopped the overlapping between surveillance sources and, hence, the application of multiple-list capture–recapture models. Alternative methods, still under the capture–recapture methodology, relying on repeated entries in one single list have been suggested in these situations. In this article, we apply one-list capture–recapture approaches to data held on the Scrapie Notifications Database to estimate the undetected population of scrapie-affected holdings with clinical disease in Great Britain for the years 2002, 2003, and 2004. To do so, we develop a new diagnostic tool for indication of heterogeneity as well as a new understanding of the Zelterman and Chao’s lower bound estimators to account for potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a special, locally truncated Poisson likelihood equivalent to a binomial likelihood. This understanding allows the extension of the Zelterman approach by means of logistic regression to include observed heterogeneity in the form of covariates (in the case studied here, the holding size and country of origin). Our results confirm the presence of substantial unobserved heterogeneity supporting the application of our two estimators. The total scrapie-affected holding population in Great Britain is around 300 holdings per year. None of the covariates appear to inform the model significantly.
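The two one-list estimators named in this abstract have simple closed forms in their basic, covariate-free versions. The sketch below assumes the standard textbook formulas, with f1 and f2 the numbers of units listed exactly once and exactly twice and n the number of distinct units listed. The helper names and the example frequencies are hypothetical.

```python
import math
from collections import Counter

def frequency_counts(counts):
    """Return (f1, f2, n) from a list of per-unit identification counts."""
    freq = Counter(counts)
    return freq[1], freq[2], len(counts)

def zelterman(f1, f2, n):
    """Zelterman estimator: lam = 2 * f2 / f1, then N = n / (1 - exp(-lam))."""
    lam = 2.0 * f2 / f1
    return n / (1.0 - math.exp(-lam))

def chao_lower_bound(f1, f2, n):
    """Chao's lower bound: N = n + f1**2 / (2 * f2)."""
    return n + f1 ** 2 / (2.0 * f2)

# Hypothetical frequencies: 100 units listed once, 25 listed twice, 140 in total.
n_zelterman = zelterman(100, 25, 140)
n_chao = chao_lower_bound(100, 25, 140)
```

Both estimators use only the singletons and doubletons to estimate the hidden zero class, which is what makes them less sensitive to heterogeneity in the higher counts.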
Abstract:
Modern neuroimaging techniques rely on neurovascular coupling to show regions of increased brain activation. However, little is known of the neurovascular coupling relationships that exist for inhibitory signals. To address this issue directly we developed a preparation to investigate the signal sources of one of these proposed inhibitory neurovascular signals, the negative blood oxygen level-dependent (BOLD) response (NBR), in rat somatosensory cortex. We found a reliable NBR measured in rat somatosensory cortex in response to unilateral electrical whisker stimulation, which was located in deeper cortical layers relative to the positive BOLD response. Separate optical measurements (two-dimensional optical imaging spectroscopy and laser Doppler flowmetry) revealed that the NBR was a result of decreased blood volume and flow and increased levels of deoxyhemoglobin. Neural activity in the NBR region, measured by multichannel electrodes, varied considerably as a function of cortical depth. There was a decrease in neuronal activity in deep cortical laminae. After cessation of whisker stimulation there was a large increase in neural activity above baseline. Both the decrease in neuronal activity and increase above baseline after stimulation cessation correlated well with the simultaneous measurement of blood flow suggesting that the NBR is related to decreases in neural activity in deep cortical layers. Interestingly, the magnitude of the neural decrease was largest in regions showing stimulus-evoked positive BOLD responses. Since a similar type of neural suppression in surround regions was associated with a negative BOLD signal, the increased levels of suppression in positive BOLD regions could importantly moderate the size of the observed BOLD response.
Abstract:
The constant-density Charney model describes the simplest unstable basic state with a planetary-vorticity gradient, which is uniform and positive, and baroclinicity that is manifest as a negative contribution to the potential-vorticity (PV) gradient at the ground and positive vertical wind shear. Together, these ingredients satisfy the necessary conditions for baroclinic instability. In Part I it was shown how baroclinic growth on a general zonal basic state can be viewed as the interaction of pairs of ‘counter-propagating Rossby waves’ (CRWs) that can be constructed from a growing normal mode and its decaying complex conjugate. In this paper the normal-mode solutions for the Charney model are studied from the CRW perspective.
Clear parallels can be drawn between the most unstable modes of the Charney model and the Eady model, in which the CRWs can be derived independently of the normal modes. However, the dispersion curves for the two models are very different; the Eady model has a short-wave cut-off, while the Charney model is unstable at short wavelengths. Beyond its maximum growth rate the Charney model has a neutral point at finite wavelength (r=1). Thereafter follows a succession of unstable branches, each with weaker growth than the last, separated by neutral points at integer r—the so-called ‘Green branches’. A separate branch of westward-propagating neutral modes also originates from each neutral point. By approximating the lower CRW as a Rossby edge wave and the upper CRW structure as a single PV peak with a spread proportional to the Rossby scale height, the main features of the ‘Charney branch’ (0
Abstract:
A physically motivated statistical model is used to diagnose variability and trends in wintertime (October-March) Global Precipitation Climatology Project (GPCP) pentad (5-day mean) precipitation. Quasi-geostrophic theory suggests that extratropical precipitation amounts should depend multiplicatively on the pressure gradient, saturation specific humidity, and the meridional temperature gradient. This physical insight has been used to guide the development of a suitable statistical model for precipitation using a mixture of generalized linear models: a logistic model for the binary occurrence of precipitation and a Gamma distribution model for the wet day precipitation amount. The statistical model allows for the investigation of the role of each factor in determining variations and long-term trends. Saturation specific humidity q_s has a generally negative relationship with global precipitation occurrence and with the tropical wet pentad precipitation amount, but has a positive relationship with the pentad precipitation amount at mid- and high latitudes. The North Atlantic Oscillation, a proxy for the meridional temperature gradient, is also found to have a statistically significant positive effect on precipitation over much of the Atlantic region. Residual time trends in wet pentad precipitation are extremely sensitive to the choice of the wet pentad threshold because of increasing trends in low-amplitude precipitation pentads; too low a choice of threshold can lead to a spurious decreasing trend in wet pentad precipitation amounts. However, for thresholds that are not too small, it is found that the meridional temperature gradient is an important factor for explaining part of the long-term trend in Atlantic precipitation.
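The two-part structure described above (a logistic model for occurrence combined with a Gamma model for the wet amount) can be sketched as below. The log link for the Gamma part is an assumption, chosen because it makes the covariate effects multiplicative. The covariate vector and coefficients are hypothetical placeholders, not fitted values from the paper.

```python
import math

def occurrence_probability(x, beta):
    """Logistic model: probability that the pentad is wet."""
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))

def wet_amount_mean(x, gamma):
    """Gamma model with an assumed log link: mean amount given a wet pentad."""
    return math.exp(sum(g * v for g, v in zip(gamma, x)))

def expected_precipitation(x, beta, gamma):
    """Unconditional mean: P(wet) times E[amount | wet]."""
    return occurrence_probability(x, beta) * wet_amount_mean(x, gamma)

# Hypothetical covariates: an intercept term and one standardised predictor.
x = [1.0, 0.0]
mean_precip = expected_precipitation(x, beta=[0.0, 1.0], gamma=[0.0, 1.0])
```

Under the log link, a one-unit change in a predictor scales the mean wet amount by a constant factor, which matches the multiplicative dependence suggested by the quasi-geostrophic argument.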
Abstract:
Global hydrological models (GHMs) model the land surface hydrologic dynamics of continental-scale river basins. Here we describe one such GHM, the Macro-scale - Probability-Distributed Moisture model.09 (Mac-PDM.09). The model has undergone a number of revisions since it was last applied in the hydrological literature. This paper serves to provide a detailed description of the latest version of the model. The main revisions include the following: (1) the ability for the model to be run for n repetitions, which provides more robust estimates of extreme hydrological behaviour, (2) the ability of the model to use a gridded field of coefficient of variation (CV) of daily rainfall for the stochastic disaggregation of monthly precipitation to daily precipitation, and (3) the model can now be forced with daily input climate data as well as monthly input climate data. We demonstrate the effects that each of these three revisions has on simulated runoff relative to before the revisions were applied. Importantly, we show that when Mac-PDM.09 is forced with monthly input data, it results in a negative runoff bias relative to when daily forcings are applied, for regions of the globe where the day-to-day variability in relative humidity is high. The runoff bias can be up to -80% for a small selection of catchments but the absolute magnitude of the bias may be small. As such, we recommend future applications of Mac-PDM.09 that use monthly climate forcings acknowledge the bias as a limitation of the model. The performance of Mac-PDM.09 is evaluated by validating simulated runoff against observed runoff for 50 catchments. We also present a sensitivity analysis that demonstrates that simulated runoff is considerably more sensitive to method of PE calculation than to perturbations in soil moisture and field capacity parameters.
Abstract:
In this paper we consider the estimation of population size from one-source capture–recapture data, that is, a list in which individuals can potentially be found repeatedly and where the question is how many individuals are missed by the list. As a typical example, we provide data from a drug user study in Bangkok from 2001 where the list consists of drug users who repeatedly contact treatment institutions. Drug users with 1, 2, 3, ... contacts occur, but drug users with zero contacts are not present, requiring the size of this group to be estimated. Statistically, these data can be considered as stemming from a zero-truncated count distribution. We revisit an estimator for the population size suggested by Zelterman that is known to be robust under potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a locally truncated Poisson likelihood which is equivalent to a binomial likelihood. This result allows the extension of the Zelterman estimator by means of logistic regression to include observed heterogeneity in the form of covariates. We also review an estimator proposed by Chao and explain why we are not able to obtain similar results for this estimator. The Zelterman estimator is applied in two case studies, the first a drug user study from Bangkok, the second an illegal immigrant study in the Netherlands. Our results suggest the new estimator should be used, in particular, if substantial unobserved heterogeneity is present.
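The equivalence stated in this abstract can be outlined in a few lines. The derivation below is a sketch under the usual notation, which is assumed here and not quoted from the paper: f_1 and f_2 are the numbers of individuals seen exactly once and exactly twice, n is the number of observed individuals, and x_i is a covariate vector for individual i.

```latex
% Restrict a Poisson(\lambda) count Y to the values 1 and 2.
p \;=\; P(Y = 2 \mid Y \in \{1, 2\})
  \;=\; \frac{e^{-\lambda}\lambda^{2}/2}{e^{-\lambda}\lambda + e^{-\lambda}\lambda^{2}/2}
  \;=\; \frac{\lambda}{2 + \lambda}

% The truncated likelihood is therefore binomial in p.
L(p) \;=\; (1 - p)^{f_1}\, p^{f_2},
\qquad
\hat{p} \;=\; \frac{f_2}{f_1 + f_2}

% Inverting p = \lambda / (2 + \lambda) recovers the Zelterman estimate.
\hat{\lambda} \;=\; \frac{2\hat{p}}{1 - \hat{p}} \;=\; \frac{2 f_2}{f_1},
\qquad
\hat{N} \;=\; \frac{n}{1 - e^{-\hat{\lambda}}}

% With covariates, a logistic model for p_i gives individual rates.
\operatorname{logit}(p_i) \;=\; x_i^{\top}\beta
\;\;\Longrightarrow\;\;
\lambda_i \;=\; 2\exp\!\left(x_i^{\top}\beta\right),
\qquad
\hat{N} \;=\; \sum_{i=1}^{n} \frac{1}{1 - e^{-\hat{\lambda}_i}}
```

The last line is one natural reading of the logistic regression extension: because p / (1 - p) = λ / 2, the fitted odds translate directly into individual Poisson rates, and each observed individual is weighted by the inverse of its own probability of being observed.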
Abstract:
Disease-weather relationships influencing Septoria leaf blotch (SLB) preceding growth stage (GS) 31 were identified using data from 12 sites in the UK covering 8 years. Based on these relationships, an early-warning predictive model for SLB on winter wheat was formulated to predict the occurrence of a damaging epidemic (defined as disease severity of 5% or more on the top three leaf layers). The final model was based on accumulated rain > 3 mm in the 80-day period preceding GS 31 (roughly from early February to the end of April) and accumulated minimum temperature with a 0 °C base in the 50-day period starting from 120 days preceding GS 31 (approximately January and February). The model was validated on an independent data set on which the prediction accuracy was influenced by cultivar resistance. Over all observations, the model had a true positive proportion of 0.61, a true negative proportion of 0.73, a sensitivity of 0.83, and a specificity of 0.18. True negative proportion increased to 0.85 for resistant cultivars and decreased to 0.50 for susceptible cultivars. Potential fungicide savings are most likely to be made with resistant cultivars, but such benefits would need to be identified with an in-depth evaluation.
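The two weather predictors named in this abstract can be computed from daily series roughly as follows. The abstract does not define them precisely, so this sketch assumes that "accumulated rain > 3 mm" means the sum of daily rainfall on days exceeding 3 mm, and that "accumulated minimum temperature with a 0 °C base" means the sum of daily minimum temperatures above 0 °C. The function names, the indexing convention and the sample data are hypothetical.

```python
def accumulated_rain(daily_rain_mm, gs31_index, window=80, threshold=3.0):
    """Sum rainfall on days exceeding the threshold in the window before GS 31.

    gs31_index is the position of the GS 31 date in the daily series; the
    window covers the `window` days immediately preceding it.
    """
    start = max(gs31_index - window, 0)
    return sum(r for r in daily_rain_mm[start:gs31_index] if r > threshold)

def accumulated_min_temp(daily_tmin_c, gs31_index, lead=120, window=50, base=0.0):
    """Sum minimum temperature above the base over a window starting `lead` days before GS 31."""
    start = gs31_index - lead
    if start < 0:
        raise ValueError("the series does not reach back far enough before GS 31")
    return sum(max(t - base, 0.0) for t in daily_tmin_c[start:start + window])

# Tiny hypothetical series with shortened windows, just to show the calls.
rain_total = accumulated_rain([0.0, 5.0, 2.0, 4.0], gs31_index=4, window=3)
temp_total = accumulated_min_temp([-1.0, 2.0, 3.0, -2.0, 1.0], gs31_index=5, lead=5, window=3)
```

The default arguments mirror the 80-day, 120-day and 50-day periods given in the abstract; the decision threshold that turns these two predictors into an epidemic warning is not stated there and is left out.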