970 results for Zero-inflated Count Data
Abstract:
Urbanization has occasionally been linked to negative consequences. Traffic light systems in urban arterial networks play an essential role in the operation of transport systems. The availability of new Intelligent Transportation System innovations has paved the way for connecting vehicles and road infrastructure. GLOSA, the Green Light Optimal Speed Advisory, is a recent application of vehicle-to-everything (V2X) technology. This thesis examined the GLOSA system's potential as a tool for traffic signal optimization. GLOSA serves as an advisory to drivers, informing them of the speed they should maintain to reduce waiting time. The study area considered in this thesis is the Via Aurelio Saffi – Via Emilia Ponente corridor in the Metropolitan City of Bologna, which has several signalized intersections. Several simulation runs were performed in the SUMOPy software for each peak-hour period (morning and afternoon) using recent actual traffic count data. GLOSA devices were placed at a GLOSA distance of 300 m. In the morning peak hour, GLOSA outperformed actuated traffic signal control, the baseline scenario, in terms of average waiting time, average speed, average fuel consumption per vehicle, and average CO2 emissions. A remarkable 97% reduction in both fuel consumption and CO2 emissions was obtained; average speed increased by 7%, and a time saving of 25% was achieved. Similar results were obtained for the afternoon peak hour: a 98% decrease in both fuel consumption and CO2 emissions, a 20% decrease in average waiting time, and a 2% increase in average speed. In addition to the previously mentioned benefits of GLOSA, a 15% and a 13% decrease in time loss were obtained during the morning and afternoon peak hours, respectively. Towards the goal of sustainability, GLOSA shows promising results, significantly lowering fuel consumption and CO2 emissions per vehicle.
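As a concrete illustration of the advisory logic described above, the sketch below computes a speed that lets a vehicle arrive during the green window instead of stopping at the light. It is a minimal, hypothetical reconstruction: the function name, inputs, and default speed bounds (roughly a 50 km/h urban limit) are assumptions for illustration, not the SUMOPy implementation used in the thesis.

```python
def advisory_speed(distance_m, green_start_s, green_end_s,
                   v_min=5.0, v_max=13.9):
    """Return a speed (m/s) that gets the vehicle to the stop line
    within the green window [green_start_s, green_end_s] (seconds
    from now), or None if no speed within the limits can do it."""
    # Arriving as the green starts needs the highest speed;
    # arriving just before it ends needs the lowest.
    v_high = float("inf") if green_start_s <= 0 else distance_m / green_start_s
    v_low = distance_m / green_end_s
    lo, hi = max(v_low, v_min), min(v_high, v_max)
    if lo > hi:
        return None          # green window unreachable at legal speeds
    return hi                # fastest feasible speed minimizes travel time

# 300 m from the signal, green starts in 12 s and ends in 30 s:
print(advisory_speed(300.0, 12.0, 30.0))   # -> 13.9 (clipped to v_max)
```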
Abstract:
This paper proposes a general class of regression models for continuous proportions when the data contain zeros or ones. The proposed class of models assumes that the response variable has a mixed continuous-discrete distribution with probability mass at zero or one. The beta distribution is used to describe the continuous component of the model, since its density has a wide range of different shapes depending on the values of the two parameters that index the distribution. We use a suitable parameterization of the beta law in terms of its mean and a precision parameter. The parameters of the mixture distribution are modeled as functions of regression parameters. We provide inference, diagnostic, and model selection tools for this class of models. A practical application that employs real data is presented.
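A minimal sketch of the mixed continuous-discrete likelihood this class of models builds on, assuming point masses p0 = P(Y = 0) and p1 = P(Y = 1) and a beta density on (0, 1) in the mean/precision parameterization (shape parameters mu*phi and (1 - mu)*phi). The function and parameter names are illustrative; in the full regression model p0, p1, and mu would be tied to covariates through link functions.

```python
import numpy as np
from scipy.stats import beta

def zoib_loglik(y, p0, p1, mu, phi):
    """Log-likelihood of a zero/one-inflated beta mixture:
    point masses p0 at 0 and p1 at 1, and a Beta(mu*phi, (1-mu)*phi)
    density (mean mu, precision phi) on the open interval (0, 1)."""
    y = np.asarray(y, dtype=float)
    n0 = np.sum(y == 0.0)                    # exact zeros
    n1 = np.sum(y == 1.0)                    # exact ones
    mid = y[(y > 0.0) & (y < 1.0)]           # continuous component
    return (n0 * np.log(p0) + n1 * np.log(p1)
            + mid.size * np.log(1.0 - p0 - p1)
            + beta.logpdf(mid, mu * phi, (1.0 - mu) * phi).sum())

y = np.array([0.0, 0.12, 0.47, 0.90, 1.0])
print(zoib_loglik(y, p0=0.15, p1=0.10, mu=0.5, phi=5.0))
```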
Abstract:
In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
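Because the likelihood factorizes into an incidence component and a magnitude component, the two pieces can be fitted separately. The sketch below does this with plain GLMs on simulated data, leaving out the random effects for brevity (the paper's GLMM approach adds them to both parts); the simulated data and all variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated claims data: many zero costs (no claim),
# gamma-distributed positive costs when a claim occurs.
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)                                   # a covariate
claim = rng.binomial(1, 1.0 / (1.0 + np.exp(1.0 - 0.8 * x)))
cost = np.where(claim == 1,
                rng.gamma(shape=2.0, scale=np.exp(0.5 * x) / 2.0), 0.0)
df = pd.DataFrame({"cost": cost, "claim": claim, "x": x})

# Part 1: logistic GLM for whether a claim is made at all.
incidence = smf.glm("claim ~ x", data=df,
                    family=sm.families.Binomial()).fit()

# Part 2: gamma GLM (log link) for the cost, given that a claim occurred.
magnitude = smf.glm("cost ~ x", data=df[df["cost"] > 0],
                    family=sm.families.Gamma(link=sm.families.links.Log())).fit()

print(incidence.params)
print(magnitude.params)
```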
Abstract:
The application of compositional data analysis through log ratio transformations corresponds to a multinomial logit model for the shares themselves. This model is characterized by the property of Independence of Irrelevant Alternatives (IIA). IIA states that the odds ratio (in this case the ratio of shares) is invariant to the addition or deletion of outcomes to the problem. It is exactly this invariance of the ratio that underlies the commonly used zero replacement procedure in compositional data analysis. In this paper we investigate using the nested logit model, which does not embody IIA, and an associated zero replacement procedure, and compare its performance with that of the more usual approach of using the multinomial logit model. Our comparisons exploit a data set that combines voting data by electoral division with corresponding census data for each division for the 2001 Federal election in Australia.
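The ratio invariance the abstract refers to is what makes the standard multiplicative zero replacement coherent: zeros are set to a small delta and the non-zero shares are shrunk proportionally, leaving every ratio between non-zero parts untouched. A minimal numpy sketch follows (the function name and the delta default are assumptions; the paper's nested-logit alternative is not shown):

```python
import numpy as np

def multiplicative_zero_replacement(x, delta=1e-5):
    """Replace zeros in a composition (rows summing to 1) by delta and
    rescale the non-zero parts so each row still sums to 1. Ratios
    between non-zero parts are preserved, which is the IIA-style
    invariance the multinomial logit view relies on."""
    x = np.asarray(x, dtype=float)
    zeros = x == 0.0
    shrink = 1.0 - delta * zeros.sum(axis=-1, keepdims=True)
    return np.where(zeros, delta, x * shrink)

print(multiplicative_zero_replacement([0.5, 0.3, 0.2, 0.0]))
# -> [0.499995, 0.299997, 0.199998, 1e-05]; the 0.5/0.3 ratio is unchanged
```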
Abstract:
One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies, and ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models: an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential-zero compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
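To make the two-stage idea concrete, the sketch below simulates compositions with essential zeros: an incidence stage decides which parts are present, and a second stage distributes the unit over the present parts. It uses independent Bernoulli incidences and simple logistic-normal-style weights, so it corresponds loosely to the independent binomial variant only; the hierarchical dependence structure of the second model is not captured, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_essential_zeros(n, incidence_p, sigma=1.0):
    """Stage 1: each part is present with its own probability
    (independent Bernoulli incidence). Stage 2: distribute the unit
    over the present parts with logistic-normal-style weights."""
    incidence_p = np.asarray(incidence_p, dtype=float)
    D = incidence_p.size
    out = np.zeros((n, D))
    for i in range(n):
        present = rng.random(D) < incidence_p
        if not present.any():                 # force at least one part
            present[rng.integers(D)] = True
        w = np.where(present, np.exp(rng.normal(0.0, sigma, size=D)), 0.0)
        out[i] = w / w.sum()                  # composition sums to 1
    return out

comp = simulate_essential_zeros(5, incidence_p=[0.9, 0.7, 0.4])
print(comp)   # rows sum to 1, with structural zeros where parts are absent
```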
Abstract:
Background: The present paper investigates the question of a suitable basic model for the number of scrapie cases in a holding, and applications of this knowledge to the estimation of scrapie-affected holding population sizes and the adequacy of control measures within holdings. Is the number of scrapie cases proportional to the size of the holding, in which case it should be incorporated into the parameter of the error distribution for the scrapie counts? Or is there a different, potentially more complex, relationship between case count and holding size, in which case the information about the size of the holding would be better incorporated as a covariate in the modelling? Methods: We show that this question can be appropriately addressed via a simple zero-truncated Poisson model in which the hypothesis of proportionality enters as a special offset model. Model comparisons can be achieved by means of likelihood ratio testing. The procedure is illustrated by means of surveillance data on classical scrapie in Great Britain. Furthermore, the model with the best fit is used to estimate the size of the scrapie-affected holding population in Great Britain by means of two capture-recapture estimators: the Poisson estimator and the generalized Zelterman estimator. Results: No evidence could be found for the hypothesis of proportionality. In fact, there is some evidence that this relationship follows a curved line which increases for small holdings up to a maximum, after which it declines again. Furthermore, it is pointed out how crucial the correct model choice is when applied to capture-recapture estimation on the basis of zero-truncated Poisson models as well as the generalized Zelterman estimator. Estimators based on the proportionality model return very different and unreasonable estimates of the population sizes. Conclusion: Our results stress the importance of an adequate modelling approach to the association between holding size and the number of cases of classical scrapie within a holding. Reporting artefacts and speculative biological effects are hypothesized as the underlying causes of the observed curved relationship. The lack of adjustment for these artefacts might well render ineffective the current strategies for the control of the disease.
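A sketch of the capture-recapture machinery the abstract applies, under the simplest (homogeneous, no-offset) zero-truncated Poisson: fit lambda by maximum likelihood on the observed, all-positive case counts, then inflate the observed number of holdings by the estimated probability of being observed at all. The Zelterman variant uses only the frequencies of ones and twos. The toy data and function names are illustrative; the holding-size offset model that is the paper's focus is omitted.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_ztp(counts):
    """MLE of lambda under a zero-truncated Poisson: observed holdings
    all have at least one case, so the zero class is unobservable."""
    counts = np.asarray(counts)
    n, ybar = counts.size, counts.mean()
    # per-observation log-likelihood (dropping the log y! constant):
    # y*log(lam) - lam - log(1 - exp(-lam))
    nll = lambda lam: -n * (ybar * np.log(lam) - lam
                            - np.log1p(-np.exp(-lam)))
    return minimize_scalar(nll, bounds=(1e-6, 50.0), method="bounded").x

def poisson_N(counts, lam):
    """Estimated total population size, including unobserved zeros:
    N_hat = n / (1 - exp(-lambda))."""
    return counts.size / (1.0 - np.exp(-lam))

def zelterman_N(counts):
    """Zelterman's robust estimator: lambda_hat = 2*f2/f1 from the
    frequencies of ones and twos, then the same zero correction."""
    counts = np.asarray(counts)
    f1, f2 = np.sum(counts == 1), np.sum(counts == 2)
    lam = 2.0 * f2 / f1
    return counts.size / (1.0 - np.exp(-lam))

cases = np.array([1, 1, 2, 1, 3, 1, 2, 1, 1, 4])   # toy case counts
lam_hat = fit_ztp(cases)
print(lam_hat, poisson_N(cases, lam_hat), zelterman_N(cases))
```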
Abstract:
Polymorphonuclear leukocyte (PMNL) apoptosis is central to the successful resolution of inflammation. Since the Somatic Cell Count (SCC) is an indicator of the mammary gland's immune status, this study sought to clarify the influence that these factors have on each other and on the evolution of the inflammatory process. Milk samples were stained with annexin-V, propidium iodide (PI), and the primary antibody anti-CH138A. A negative correlation between SCC and PMNL apoptosis was found, and a statistical difference between the high-SCC and low-SCC groups was observed in the rates of viable, apoptotic, necrotic, and necrotic and/or apoptotic PMNL. Overall, the high-cellularity group presented lower proportions of CH138+ cells undergoing apoptosis and higher proportions of viable and necrotic CH138+ cells. Thus, it can be concluded that PMNL apoptosis and SCC are related factors, and that in high-SCC milk apoptosis is delayed. Although there is a greater number of active phagocytes in this situation, the anti-inflammatory effects of apoptosis are decreased, while the pro-inflammatory effects of necrosis are increased, which can contribute to chronic inflammation.
Abstract:
Notes from Henrik de Nie: The project started as a phenological study in cooperation with the Dutch meteorological institute (KNMI) to register the arrival times of the Fitis (Willow Warbler) and Tjiftjaf (Chiffchaff). From 1951 to 1969 he went to the wood every day (except in 1966, the year his wife died). Thereafter he no longer went daily, but because he knew the wood very well and was free to choose the day on which he did a survey, he chose days with relatively good weather. He did not record very common bird species, perhaps because they depend on nest boxes and he did not want to be dependent on the management of the nest-box people (in fact, I have forgotten his precise arguments and can no longer ask him): Common Starling; Eurasian Tree Sparrow (not common); Great Tit; Eurasian Blue Tit. Pieter mentioned 14 species that scored many zero values or only one observation: Stock Dove; Common Cuckoo; Lesser Spotted Woodpecker; Eurasian Golden Oriole; Eurasian Nuthatch; Short-toed Treecreeper; Common Nightingale; Marsh Warbler; Lesser Whitethroat; Goldcrest; Common Firecrest (after 1970 he had difficulty hearing these last two species); Spotted Flycatcher; Eurasian Bullfinch; Black Woodpecker. He also mentioned species that he found much less often, such as: European Greenfinch; European Pied Flycatcher; Long-eared Owl; Red Crossbill; Sedge Warbler; Icterine Warbler; Eurasian Woodcock; Eurasian Siskin; European Green Woodpecker; Great Spotted Woodpecker; Eurasian Hobby; Western Barn Owl; Woodlark; Common Wood Pigeon; Little Owl; European Crested Tit; Hawfinch. But for these species I think the observations depend strongly on the number of visits to the wood. Here too, there are many zeros and only the occasional single observation over the whole series of visits.
Abstract:
Federal Highway Administration, Washington, D.C.
Abstract:
Federal Highway Administration, Washington, D.C.
Abstract:
Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one that quantify the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing proportion values in the closed interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, these are deemed inappropriate due to the presence of zeros and/or ones. To circumvent this, we introduce a class of general proportion densities, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and an application to a real dataset from a clinical periodontology study.
Abstract:
Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage are still analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer poorly genotyped or missing markers, and are considered a near-zero-cost approach to allow the comparison and combination of data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10^-5 for type 2 diabetes mellitus and compared them with results obtained from empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant for 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers, and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers within specific MAF (minor allele frequency) ranges, located in weak linkage disequilibrium blocks, or strongly deviating from local patterns of association are prone to inflated false positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.