970 results for Zero-inflated Count Data
Abstract:
In a recent paper Bermúdez [2009] used bivariate Poisson regression models for ratemaking in car insurance, and included zero-inflated models to account for the excess of zeros and the overdispersion in the data set. In the present paper, we revisit this model in order to consider alternatives. We propose a two-component finite mixture of bivariate Poisson regression models to demonstrate that the overdispersion in the data requires more structure if it is to be taken into account, and that a simple zero-inflated bivariate Poisson model does not suffice. At the same time, we show that a finite mixture of bivariate Poisson regression models embraces zero-inflated bivariate Poisson regression models as a special case. Additionally, we describe a model in which the mixing proportions depend on covariates, thereby modelling the way in which each individual belongs to a separate cluster. Finally, an EM algorithm is provided so that the models can be fitted with ease. These models are applied to the same automobile insurance claims data set as used in Bermúdez [2009], and it is shown that the modelling of the data set can be improved considerably.
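As a rough illustration of the estimation strategy, the sketch below runs an EM algorithm for a two-component mixture of Poisson regressions on simulated data. It deliberately simplifies the paper's setting: the response is univariate rather than bivariate, the mixing proportion is constant rather than covariate-dependent, and all variable names and data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])      # intercept + one covariate
true_betas = [np.array([0.2, 0.5]), np.array([1.5, -0.3])]
z = rng.random(n) < 0.4                                     # latent component labels
mu = np.where(z, np.exp(X @ true_betas[0]), np.exp(X @ true_betas[1]))
y = rng.poisson(mu)

def pois_logpmf(y, mu):
    return y * np.log(mu) - mu - gammaln(y + 1)

def weighted_nll(beta, X, y, w):
    # negative expected complete-data log-likelihood for one component
    return -np.sum(w * pois_logpmf(y, np.exp(X @ beta)))

pi, betas = 0.5, [np.zeros(2), np.array([1.0, 0.0])]        # crude starting values
for _ in range(100):
    # E-step: posterior probability of component 1 for each observation
    log1 = np.log(pi) + pois_logpmf(y, np.exp(X @ betas[0]))
    log2 = np.log(1 - pi) + pois_logpmf(y, np.exp(X @ betas[1]))
    m = np.maximum(log1, log2)
    r = np.exp(log1 - m) / (np.exp(log1 - m) + np.exp(log2 - m))
    # M-step: update mixing proportion and component-specific regressions
    pi = r.mean()
    betas[0] = minimize(weighted_nll, betas[0], args=(X, y, r)).x
    betas[1] = minimize(weighted_nll, betas[1], args=(X, y, 1 - r)).x

print(pi, betas[0], betas[1])
```

The E-step computes each observation's posterior probability of belonging to the first component; the M-step refits each component's Poisson regression with those probabilities as weights, which is the structure that makes EM convenient for models of this kind.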
Abstract:
Bodily injury claims have the greatest impact on the claim costs of motor insurance companies. The disability severity of motor claims is assessed in numerous European countries by means of score systems. In this paper a zero-inflated generalized Poisson regression model is implemented to estimate the disability severity score of victims involved in motor accidents on Spanish roads. We show that the injury severity estimates may be automatically converted into financial terms by insurers at any point of the claim handling process. As such, the methodology described may be used by motor insurers operating in the Spanish market to monitor the size of bodily injury claims. By using insurance data, various applications are presented in which the score estimate of disability severity is of value to insurers, either for computing the claim compensation or for claim reserve purposes.
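For readers who want to experiment with this model class, the snippet below fits a zero-inflated generalized Poisson regression on simulated data using statsmodels' ZeroInflatedGeneralizedPoisson; this is a generic sketch, not the authors' implementation, and the covariates and parameter values are placeholders rather than the Spanish disability-score data.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedGeneralizedPoisson

rng = np.random.default_rng(1)
n = 1000
X = sm.add_constant(rng.normal(size=(n, 2)))          # hypothetical rating covariates
mu = np.exp(X @ np.array([0.5, 0.3, -0.2]))
y = rng.poisson(mu) * (rng.random(n) > 0.35)          # crude excess of zeros

model = ZeroInflatedGeneralizedPoisson(y, X, exog_infl=X, inflation='logit')
result = model.fit(method='bfgs', maxiter=500, disp=False)
print(result.params)                                  # count and inflation coefficients
```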
Abstract:
The purpose of the work was to realize a high-speed digital data transfer system for RPC muon chambers in the CMS experiment on CERN’s new LHC accelerator. This large scale system took many years and many stages of prototyping to develop, and required the participation of tens of people. The system interfaces to Frontend Boards (FEB) at the 200,000-channel detector and to the trigger and readout electronics in the control room of the experiment. The distance between these two is about 80 metres and the speed required for the optic links was pushing the limits of available technology when the project was started. Here, as in many other aspects of the design, it was assumed that the features of readily available commercial components would develop in the course of the design work, just as they did. By choosing a high speed it was possible to multiplex the data from some of the chambers into the same fibres to reduce the number of links needed. Further reduction was achieved by employing zero suppression and data compression, and a total of only 660 optical links were needed. Another requirement, which conflicted somewhat with choosing the components as late as possible, was that the design needed to be radiation tolerant to an ionizing dose of 100 Gy and to have a moderate tolerance to Single Event Effects (SEEs). This required some radiation test campaigns, and eventually led to ASICs being chosen for some of the critical parts. The system was made to be as reconfigurable as possible. The reconfiguration needs to be done from a distance as the electronics is not accessible except for some short and rare service breaks once the accelerator starts running. Therefore reconfigurable logic is extensively used, and the firmware development for the FPGAs constituted a sizable part of the work. Some special techniques needed to be used there too, to achieve the required radiation tolerance. The system has been demonstrated to work in several laboratory and beam tests, and now we are waiting to see it in action when the LHC starts running in the autumn of 2008.
Abstract:
Zero-inflated models, both discrete and continuous, have a wide range of applications and their properties are well known. Although there is existing work on zero-deflated and zero-modified discrete models, the usual formulation of zero-inflated continuous models (a mixture of a continuous density and a Dirac point mass at zero) prevents them from being generalized to cover zero deflation. An alternative formulation of zero-inflated continuous models, which can easily be generalized to the zero-deflated case, is presented here. Estimation is first addressed under the classical paradigm, and several methods for obtaining maximum likelihood estimators are proposed. The point estimation problem is also considered from the Bayesian point of view. Classical and Bayesian hypothesis tests for determining whether data are zero-inflated or zero-deflated are presented. The estimation and testing methods are evaluated by means of simulation studies and applied to aggregated precipitation data. The various methods agree that the data are zero-deflated, demonstrating the relevance of the proposed model. We then consider the clustering of zero-deflated data samples. Since such data are strongly non-normal, standard methods for determining the number of clusters can be expected to perform poorly. We argue that Bayesian clustering, based on the marginal distribution of the observations, accounts for the particularities of the model and should therefore perform better. Several clustering methods are compared by means of a simulation study, and the proposed method is applied to aggregated precipitation data from 28 measurement stations in British Columbia.
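For context, here is a minimal maximum-likelihood sketch of the conventional zero-inflated continuous formulation discussed above (a point mass at zero mixed with a gamma density); the thesis's alternative parameterization, which also covers zero deflation, is not reproduced here, and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n = 500
p_zero, shape, scale = 0.3, 2.0, 1.5                     # hypothetical true values
y = np.where(rng.random(n) < p_zero, 0.0, rng.gamma(shape, scale, size=n))

def nll(theta):
    logit_p, log_shape, log_scale = theta
    p = 1 / (1 + np.exp(-logit_p))                       # probability of an exact zero
    k, s = np.exp(log_shape), np.exp(log_scale)
    ll = (y == 0).sum() * np.log(p)                      # point-mass contribution
    ll += np.sum(np.log1p(-p) + stats.gamma.logpdf(y[y > 0], k, scale=s))
    return -ll

fit = minimize(nll, x0=np.zeros(3))
print(1 / (1 + np.exp(-fit.x[0])), np.exp(fit.x[1:]))    # estimated p_zero, shape, scale
```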
Abstract:
The objective of the present study was to investigate the effect of data structure on estimated genetic parameters and predicted breeding values of direct and maternal genetic effects for weaning weight (WW) and weight gain from birth to weaning (BWG), including or excluding the genetic covariance between direct and maternal effects. Records of 97,490 Nellore animals born between 1993 and 2006, from the Jacarezinho cattle raising farm, were used. Two different data sets were analyzed: DI_all, which included all available progenies of dams without their own performance; DII_all, which included DI_all + 20% of recorded progenies with maternal phenotypes. Two subsets were obtained from each data set (DI_all and DII_all): DI_1 and DII_1, which included only dams with three or fewer progenies; DI_5 and DII_5, which included only dams with five or more progenies. (Co)variance components and heritabilities were estimated by Bayesian inference through Gibbs sampling using univariate animal models. In general, for the population and traits studied, the proportion of dams with known phenotypic information and the number of progenies per dam influenced direct and maternal heritabilities, as well as the contribution of maternal permanent environmental variance to phenotypic variance. Only small differences were observed in the genetic and environmental parameters when the genetic covariance between direct and maternal effects was set to zero in the data sets studied. Thus, the inclusion or exclusion of the genetic covariance between direct and maternal effects had little effect on the ranking of animals according to their breeding values for WW and BWG. Accurate estimation of genetic correlations between direct and maternal genetic effects depends on the data structure. Thus, this covariance should be set to zero in Nellore data sets in which the proportion of dams with phenotypic information is low, the number of progenies per dam is small, and pedigree relationships are poorly known.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Graduate Program in Production Engineering - FEB
Abstract:
Experimental models are necessary to elucidate pathophysiological mechanisms not yet understood in humans. The aim was to evaluate the repercussions of diabetes, considering two induction methodologies, on the pregnancy of Wistar rats and on the development of their offspring. In the 1st induction, female offspring were distributed into two experimental groups: Group streptozotocin (STZ, n=67), which received the β-cytotoxic agent (100 mg STZ/kg body weight, sc) on the 1st day of life; and Non-diabetic Group (ND, n=14), which received the vehicle at a similar time. In adult life, the animals were mated. After a positive diagnosis of pregnancy (day 0), female rats from group STZ presenting glycemia lower than 120 mg/dL received an additional 20 mg STZ/kg (ip) on day 7 of pregnancy (2nd induction). The female rats with glycemia higher than 120 mg/dL were discarded because they reproduced results already found in the literature. Glycemia was determined in the mornings of days 0, 7, 14 and 21 of pregnancy. On day 21 of pregnancy (at term), the female rats were anesthetized and killed for analysis of maternal reproductive performance and fetal development. The data were analyzed using Student-Newman-Keuls, Chi-square and Zero-inflated Poisson (ZIP) tests (p<0.05). STZ rats presented increased rates of pre-implantation (STZ=22.0%; ND=5.1%) and post-implantation losses (STZ=26.1%; ND=5.7%), reduced rates of fetuses with appropriate weight for gestational age (STZ=66%; ND=93%) and a reduced degree of development (ossification sites). Conclusion: Mild diabetes had a negative impact on maternal reproductive performance and caused intrauterine growth restriction and impaired fetal development.
Abstract:
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.
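The identifiability device described above can be sketched in a few lines: build a spline basis in time, project out the column space of the time-dependent covariates, and fit a negative binomial regression on the combined design. The sketch below omits the penalization (PQL) and the autocorrelation structure of the full model, and all data and names are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from patsy import dmatrix

rng = np.random.default_rng(3)
n = 300
t = np.linspace(0.0, 1.0, n)                            # scaled sampling dates
Z = sm.add_constant(rng.normal(size=(n, 2)))            # time-dependent covariates (hypothetical)
B = np.asarray(dmatrix("bs(t, df=8) - 1", {"t": t}))    # B-spline basis for the smooth in time

# Orthogonalize the spline basis against the covariate space so the smooth
# of time is identifiable in the presence of time-dependent covariates.
B_orth = B - Z @ np.linalg.lstsq(Z, B, rcond=None)[0]

mu = np.exp(0.5 + 0.3 * Z[:, 1] + np.sin(2 * np.pi * t))
y = rng.negative_binomial(5, 5 / (5 + mu))              # overdispersed counts

X = np.column_stack([Z, B_orth])
fit = sm.GLM(y, X, family=sm.families.NegativeBinomial(alpha=0.2)).fit()
print(fit.params[:Z.shape[1]])                          # covariate effects
```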
Abstract:
BACKGROUND Inflammatory bowel disease can decrease the quality of life and induce work disability. We sought to (1) identify and quantify the predictors of disease-specific work disability in patients with inflammatory bowel disease and (2) assess the suitability of using cross-sectional data to predict future outcomes, using the Swiss Inflammatory Bowel Disease Cohort Study data. METHODS A total of 1187 patients were enrolled and followed up for an average of 13 months. Predictors included patient and disease characteristics and drug utilization. Potential predictors were identified through an expert panel and published literature. We obtained adjusted effect estimates with 95% confidence intervals using logistic and zero-inflated Poisson regression. RESULTS Overall, 699 patients (58.9%) had Crohn's disease and 488 (41.1%) had ulcerative colitis. The most important predictors of temporary work disability in patients with Crohn's disease included gender, disease duration, disease activity, C-reactive protein level, smoking, depressive symptoms, fistulas, extraintestinal manifestations, and the use of immunosuppressants/steroids. Temporary work disability in patients with ulcerative colitis was associated with age, disease duration, disease activity, and the use of steroids/antibiotics. In all patients, disease activity emerged as the only predictor of permanent work disability. Comparing data at enrollment versus follow-up yielded substantial differences regarding disability and predictors, with follow-up data showing greater predictor effects. CONCLUSIONS We identified predictors of work disability in patients with Crohn's disease and ulcerative colitis. Our findings can help in forecasting disease course and guide the choice of appropriate measures to prevent adverse outcomes. Comparing cross-sectional and longitudinal data showed that cohort studies are indispensable for the examination of disability.
Abstract:
Objectives: To compare mental health care utilization regarding the source, types, and intensity of mental health services received, unmet need for services, and out-of-pocket cost among non-institutionalized psychologically distressed women and men. Method: Cross-sectional data for 19,325 non-institutionalized mentally distressed adult respondents to the National Survey on Drug Use and Health (NSDUH) for the years 2006-2008, representing over twenty-nine million U.S. adults, were analyzed. To assess the relative odds for women compared to men, logistic regression analysis was used for source of service, types of barriers, unmet need, and cost; zero-inflated negative binomial regression for intensity of utilization; and ordinal logistic regression analysis for quantifying out-of-pocket expenditure. Results: Overall, 43% of mentally distressed adults utilized some form of mental health treatment, representing 12.6 million U.S. psychologically distressed adults. Females utilized more mental health care than males in the previous 12 months (OR: 1.70; 95% CI: 1.54, 1.83). Similarly, females were 54% more likely to get help for psychological distress in an outpatient setting, and females were associated with an increased probability of using medication for mental distress (OR: 1.72; 95% CI: 1.63, 1.98). Women were 1.25 times more likely to visit a mental health center (specialty care) than men. Females were positively associated with unmet needs (OR: 1.50; 95% CI: 1.29, 1.75) after taking into account predisposing, enabling, and need (PEN) characteristics. Women with perceived unmet needs were 23% (OR: 0.77; 95% CI: 0.59, 0.99) less likely than men to report societal accommodation (stigma) as a barrier to mental health care. At any given cutoff point, women were 1.74 times more likely to be in the higher payment categories for inpatient out-of-pocket cost when the other variables in the model are held constant. Conclusions: Women utilize more specialty mental health care, report more unmet need, and pay more inpatient out-of-pocket costs than men. These gender disparities exist even after controlling for predisposing, enabling, and need variables. Creating policies that not only provide mental health care access but also de-stigmatize mental illness will bring us one step closer to eliminating gender disparities in mental health care.
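As a generic illustration of the intensity-of-utilization model, the snippet below fits a zero-inflated negative binomial regression on simulated counts using statsmodels' ZeroInflatedNegativeBinomialP; it is a sketch of the model class only, not the NSDUH analysis, and the covariates, sample size, and parameter values are placeholders.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(5)
n = 2000
X = sm.add_constant(rng.normal(size=(n, 2)))            # hypothetical PEN-style covariates
mu = np.exp(X @ np.array([1.0, 0.4, -0.3]))
counts = rng.negative_binomial(2, 2 / (2 + mu))         # overdispersed visit counts
y = counts * (rng.random(n) > 0.4)                      # structural zeros (non-users)

model = ZeroInflatedNegativeBinomialP(y, X, exog_infl=X, inflation='logit')
result = model.fit(method='bfgs', maxiter=500, disp=False)
print(result.params)                                    # count and inflation coefficients
```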
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
The use of presence/absence data in wildlife management and biological surveys is widespread. There is a growing interest in quantifying the sources of error associated with these data. We show that false-negative errors (failure to record a species when in fact it is present) can have a significant impact on statistical estimation of habitat models using simulated data. We then introduce an extension of logistic modeling, the zero-inflated binomial (ZIB) model, that permits estimation of the rate of false-negative errors and correction of estimates of the probability of occurrence for false-negative errors by using repeated visits to the same site. Our simulations show that even relatively low rates of false negatives bias statistical estimates of habitat effects. The method with three repeated visits eliminates the bias, but estimates are relatively imprecise. Six repeated visits improve the precision of estimates to levels comparable to those achieved with conventional statistics in the absence of false-negative errors. In general, when error rates are less than or equal to 50%, greater efficiency is gained by adding more sites, whereas when error rates are >50% it is better to increase the number of repeated visits. We highlight the flexibility of the method with three case studies, clearly demonstrating the effect of false-negative errors for a range of commonly used survey methods.
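A minimal sketch of the zero-inflated binomial likelihood with J repeated visits per site, maximized with scipy, is given below; the parameterization follows the standard occupancy-detection form implied by the description above, and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

rng = np.random.default_rng(4)
n_sites, J = 500, 3                                      # sites and repeated visits
psi_true, p_true = 0.6, 0.3                              # occupancy and detection probabilities
occupied = rng.random(n_sites) < psi_true
y = np.where(occupied, rng.binomial(J, p_true, size=n_sites), 0)   # detections per site

def nll(theta):
    psi = 1 / (1 + np.exp(-theta[0]))                    # occupancy probability
    p = 1 / (1 + np.exp(-theta[1]))                      # per-visit detection probability
    lik = psi * binom.pmf(y, J, p) + (1 - psi) * (y == 0)
    return -np.sum(np.log(lik))

fit = minimize(nll, x0=np.zeros(2))
psi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
print(psi_hat, p_hat)
```

Sites with zero detections contribute both terms of the mixture, which is exactly how repeated visits separate true absence from non-detection.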
Abstract:
Demonstrating the existence of trends in monitoring data is of increasing practical importance to conservation managers wishing to preserve threatened species or reduce the impact of pest species. However, the ability to do so can be compromised if the species in question has low detectability and the true occupancy level or abundance of the species is thus obscured. Zero-inflated models that explicitly model detectability improve the ability to make sound ecological inference in such situations. In this paper we apply an occupancy model including detectability to data from the initial stages of a fox-monitoring program on the Eyre Peninsula, South Australia. We find that detectability is extremely low (< 18%) and varies according to season and the presence or absence of roadside vegetation. We show that simple methods of using monitoring data to inform management, such as plotting the raw data or performing logistic regression, fail to accurately diagnose either the status of the fox population or its trajectory over time. We use the results of the detectability model to consider how future monitoring could be redesigned to achieve efficiency gains. A wide range of monitoring programs could benefit from similar analyses, as part of an active adaptive approach to improving monitoring and management.