939 results for Zero-inflated models, Statistical models, Poisson, Negative binomial, Statistical methods
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
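The abstract above contrasts the mixture-model local FDR with the Benjamini-Hochberg tail-area methodology. As a point of reference, here is a minimal pure-Python sketch of the classical BH step-up rule (the baseline procedure the paper compares against, not the mixture-model estimator it develops):

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns the (sorted) indices of the hypotheses rejected while
    controlling the global false discovery rate at level q.
    """
    m = len(pvalues)
    # Sort p-values while remembering their original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])

# Ten p-values; only the two smallest survive the step-up thresholds.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(pvals, q=0.05)
```

The "too conservative" point in the abstract corresponds to BH controlling the FDR at q times the proportion of true nulls, which is why estimating that proportion (as the mixture model does) allows a less conservative rule.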
Abstract:
Traditional vegetation mapping methods use high-cost, labour-intensive aerial photography interpretation. This approach can be subjective and is limited by factors such as the extent of remnant vegetation, and the differing scale and quality of aerial photography over time. An alternative approach is proposed which integrates a data model, a statistical model and an ecological model, using sophisticated Geographic Information Systems (GIS) techniques and rule-based systems to support fine-scale vegetation community modelling. This approach is based on a more realistic representation of vegetation patterns, with transitional gradients from one vegetation community to another. Arbitrary, and often unrealistic, sharp boundaries can otherwise be imposed on the model by the application of statistical methods. This GIS-integrated multivariate approach is applied to the problem of vegetation mapping in the complex vegetation communities of the Innisfail Lowlands in the Wet Tropics bioregion of northeastern Australia. The paper presents the full cycle of this vegetation modelling approach, including site sampling, variable selection, model selection, model implementation, internal model assessment, model prediction assessments, integration of discrete vegetation community models to generate a composite pre-clearing vegetation map, model validation on an independent data set, and scale assessments of the model predictions. An accurate pre-clearing vegetation map of the Innisfail Lowlands was generated (r² = 0.83) through GIS integration of 28 separate statistical models. This modelling approach has good potential for wider application, including provision of vital information for conservation planning and management; a scientific basis for rehabilitation of disturbed and cleared areas; and a viable method for the production of adequate vegetation maps for conservation and forestry planning of poorly studied areas.
Abstract:
We study a class of models that have been used successfully in the modelling of climatological sequences. These models are based on the notion of renewal. We first examine the probabilistic aspects of these models, and then study the estimation of their parameters and their asymptotic properties, in particular consistency and asymptotic normality. As applications, we discuss two particular classes of alternating renewal processes in discrete time. The first class is defined by sojourn-time laws that are translated (shifted) negative binomial laws; the second class, suggested by Green, is derived from an alternating renewal process in continuous time whose sojourn-time laws are exponential with parameters α₀ and α₁, respectively.
Abstract:
This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
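The overdispersion problem described above can be made concrete with a quick diagnostic: under a Poisson model the variance equals the mean, so a variance-to-mean ratio well above 1 signals overdispersion, and the negative binomial (NB2) variance μ + αμ² adds a free dispersion parameter α to absorb it. A minimal pure-Python sketch of this check (an illustration, not the authors' suicide analysis):

```python
import statistics

def dispersion_index(counts):
    """Variance-to-mean ratio: ~1 under a Poisson model, >1 under overdispersion."""
    mean = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    return var / mean

def nb_variance(mu, alpha):
    """Negative binomial (NB2) variance function: mu + alpha * mu**2."""
    return mu + alpha * mu * mu

# Counts with many zeros and a few large values: clearly overdispersed.
overdispersed = [0, 0, 1, 0, 12, 0, 9, 0]
ratio = dispersion_index(overdispersed)  # well above 1
```

In practice the same idea underlies the standard-error corrections mentioned in the abstract: when the ratio is far from 1, Poisson standard errors are too small and a negative binomial (or quasi-Poisson) fit is more appropriate.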
Abstract:
This study examines the influence of acculturative stress on substance use and HIV risk behaviors among recent Latino immigrants. The central hypothesis of the study is that specific religious coping mechanisms moderate the influence of acculturative stress on the substance use and HIV-risk behaviors of recent Latino immigrants. Within the Latino culture, religiosity is a pervasive force, guiding attitudes, behaviors, and even social interactions. When controlling for education and socioeconomic status, Latinos have been found to use religious coping mechanisms more frequently than their Non-Latino White counterparts. In addition, less acculturated Latinos use religious coping strategies more frequently than those with higher levels of acculturation. Given its prominent role in Latino culture, religious coping may well prove influential during difficult life transitions, such as those experienced during the immigration process. This study examines the moderating influence of specific religious coping mechanisms on the relationship between acculturative stress and the substance use/HIV risk behaviors of recent Latino immigrants. Analyses for the present study were conducted with wave 2 data from an ongoing longitudinal study investigating associations between pre-immigration factors and the health behavior trajectories of recent Latino immigrants. Structural equation and zero-inflated Poisson modeling were implemented to test the specified models and examine the nature of the relationships among the variables. Moderating effects were found for negative religious coping: higher levels of negative religious coping strengthened an inverse relationship between acculturative stress and substance use. Results also indicated direct relationships between religious coping mechanisms and substance use. External and positive religious coping were inversely related to substance use; negative religious coping was positively related to substance use.
This study aims to contribute knowledge of how religious coping influences the adaptation process of recent Latino immigrants. Expanding scientific understanding of the function and effect of these coping mechanisms could lead to enhanced culturally relevant approaches to service delivery among Latino populations. Furthermore, this knowledge could inform research about specific cognitions and behaviors that need to be targeted in prevention and treatment programs with this population.
Abstract:
My thesis examines health policies designed to encourage the supply of health care services. Access to health care is a major problem undermining the health systems of most industrialized countries. In Quebec, the median wait between a referral from a general practitioner and an appointment with a specialist was 7.3 weeks in 2012, up from 2.9 weeks in 1993, despite an increase in the number of physicians over the same period. For policy makers observing rising wait times for health care, it is important to understand the structure of physicians' labour supply and how it affects the supply of health services. In this context, I consider two main policies. First, I estimate how physicians respond to monetary incentives and use the estimated parameters to examine how compensation policies can be used to determine the short-run supply of health services. Second, I examine how physician productivity is affected by experience, through learning-by-doing, and use the estimated parameters to find the number of inexperienced physicians that must be recruited to replace an experienced physician who retires, in order to keep the supply of health services constant. My thesis develops and applies economic and statistical methods to measure physicians' responses to monetary incentives and to estimate their productivity profile (measuring how physician productivity varies over the career), using panel data on Quebec physicians drawn from surveys and administrative records. The data contain information on each physician's labour supply, the different types of services provided, and their prices.
These data cover a period during which the Quebec government changed the relative prices of health services. I develop and estimate a structural labour-supply model that allows physicians to multitask. In my model, physicians choose their hours of work and the allocation of those hours across the different services they provide, with service prices set by the government. The model generates an earnings equation that depends on hours worked and on a price index representing the marginal return to hours when they are allocated optimally across services. The price index depends on the prices of the services provided and on the parameters of the service production technology, which determine how physicians respond to changes in relative prices. I apply the model to panel data on the earnings of Quebec physicians merged with data on their time use. I use the model to examine two dimensions of the supply of health services. First, I analyze the use of monetary incentives to induce physicians to modify their production of the various services. While earlier studies have often compared physician behaviour across compensation systems, relatively little is known about how physicians respond to changes in the prices of health services. Current debates in Canadian health policy circles have focused on the importance of income effects in determining physicians' responses to increases in the prices of health services.
My work contributes to this debate by identifying and estimating the substitution and income effects resulting from changes in the relative prices of health services. Second, I analyze how experience affects physician productivity. This has important implications for recruiting physicians to meet the growing demand of an aging population, particularly as the most experienced (most productive) physicians retire. In the first essay, I estimate the earnings function conditional on hours worked, using instrumental variables to control for possible endogeneity of hours. As instruments I use indicator variables for physician age, the marginal tax rate, the stock-market return, and the square and cube of that return. I show that this yields a lower bound on the own-price elasticity, making it possible to test whether physicians respond to monetary incentives. The results show that the lower bounds on the price elasticities of service supply are significantly positive, suggesting that physicians do respond to incentives. A change in relative prices leads physicians to allocate more hours to the service whose price has increased. In the second essay, I estimate the full model, unconditional on hours worked, analyzing variation in physicians' hours, the volume of services provided, and physician income. To do so, I use the simulated method of moments estimator. The results show that the own-price substitution elasticities are large and significantly positive, reflecting a tendency of physicians to increase the volume of the service whose price has risen the most.
The cross-price substitution elasticities are also large but negative. There is, moreover, an income effect associated with fee increases. I use the estimated parameters of the structural model to simulate a general 32% increase in service prices. The results show that physicians would reduce their total hours worked (mean elasticity of -0.02) as well as their clinical hours (mean elasticity of -0.07). They would also reduce the volume of services provided (mean elasticity of -0.05). Third, I exploit the natural link between the income of a fee-for-service physician and his productivity to establish the productivity profile of physicians. To do so, I modify the model specification to account for the relationship between a physician's productivity and his experience. I estimate the earnings equation using unbalanced panel data, correcting for the non-random pattern of missing observations with a selection model. The results suggest that the productivity profile is an increasing, concave function of experience. This profile is robust to using effective experience (the quantity of services produced) as a control variable and to dropping the parametric assumptions. Moreover, one additional year of experience raises a physician's production of services by CAD 1,003. I use the estimated parameters of the model to compute the replacement ratio: the number of inexperienced physicians needed to replace one experienced physician. This replacement ratio is 1.2.
Abstract:
We report the first measurements of the moments (mean M, variance σ², skewness S, and kurtosis κ) of the net-charge multiplicity distributions at midrapidity in Au+Au collisions at seven energies, ranging from √sNN = 7.7 to 200 GeV, as part of the Beam Energy Scan program at RHIC. The moments are related to the thermodynamic susceptibilities of net charge, and are sensitive to the location of the QCD critical point. We compare the products of the moments, σ²/M, Sσ, and κσ², with the expectations from Poisson and negative binomial distributions (NBDs). The Sσ values deviate from the Poisson baseline and are close to the NBD baseline, while the κσ² values tend to lie between the two. Within the present uncertainties, our data do not show nonmonotonic behavior as a function of collision energy. These measurements provide a valuable tool to extract the freeze-out parameters in heavy-ion collisions by comparing with theoretical models.
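The Poisson and NBD baselines for the moment products quoted above have closed forms: for a Poisson distribution all three products equal 1, while for an NBD with success probability p they are σ²/M = 1/p, Sσ = (2 − p)/p, and κσ² = 1 + 6(1 − p)/p². A pure-Python sketch under the standard parameterizations (κ taken as excess kurtosis, NBD counting failures before the r-th success; an illustration, not the STAR analysis code):

```python
import math

def poisson_products(lam):
    """(sigma^2/M, S*sigma, kappa*sigma^2) for a Poisson(lam) distribution."""
    mean, var = lam, lam
    skew = 1.0 / math.sqrt(lam)
    exkurt = 1.0 / lam  # excess kurtosis
    sigma = math.sqrt(var)
    return var / mean, skew * sigma, exkurt * var

def nbd_products(r, p):
    """Same products for an NBD counting failures before the r-th success."""
    mean = r * (1 - p) / p
    var = r * (1 - p) / p ** 2
    skew = (2 - p) / math.sqrt(r * (1 - p))
    exkurt = 6 / r + p ** 2 / (r * (1 - p))
    sigma = math.sqrt(var)
    return var / mean, skew * sigma, exkurt * var
```

Note that all three Poisson products are identically 1 regardless of λ, while the NBD products depend only on p, which is why they serve as clean reference baselines for the measured ratios.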
Abstract:
In acquired immunodeficiency syndrome (AIDS) studies it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits, depending on the quantification assays. A complication arises when these continuous repeated measures have heavy-tailed behavior. For such data structures, we propose a robust censored linear model based on the multivariate Student's t-distribution. To account for the autocorrelation among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation-maximization type algorithm is developed for computing the maximum likelihood estimates, obtaining as a by-product the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student's t-distribution. The methodology is illustrated through an application to a Human Immunodeficiency Virus-AIDS (HIV-AIDS) study and several simulation studies.
Abstract:
In this paper, we present various diagnostic methods for polyhazard models. Polyhazard models are a flexible family for fitting lifetime data. Their main advantage over single hazard models, such as the Weibull and log-logistic models, is that they accommodate a large variety of nonmonotone hazard shapes, such as bathtub and multimodal curves. Some influence methods, such as the local influence and the total local influence of an individual, are derived, analyzed and discussed. The computation of the likelihood displacement, as well as of the normal curvature in the local influence method, is also discussed. Finally, an example with real data is given for illustration.
Abstract:
Academics are often ranked on citation counts, which are considered an adequate proxy for an author's quality and reputation. This paper seeks to find what lies behind a cited academic or a cited article. We construct a rich dataset of Portuguese-affiliated economists and use a zero-inflated negative binomial model. This procedure is appropriate for count outcomes, correcting for overdispersion and excess zeros. We also use a fixed-effects Poisson model to accommodate authors' unobserved heterogeneity. We analyze the results in detail, comparing them with the existing literature and offering some theoretical considerations.
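The "excess zeros" that motivate the zero-inflated specification above can be made concrete: a zero-inflated model mixes a point mass at zero (probability pi, e.g. papers that are structurally never cited) with an ordinary count distribution, so the predicted zero probability becomes pi + (1 − pi)·P(0). A minimal pure-Python sketch with illustrative parameter names (not the authors' estimation code):

```python
import math

def poisson_zero_prob(lam):
    """P(Y = 0) under a Poisson(lam) distribution."""
    return math.exp(-lam)

def zip_zero_prob(pi, lam):
    """P(Y = 0) under a zero-inflated Poisson: point mass pi plus Poisson zeros."""
    return pi + (1 - pi) * math.exp(-lam)

def zinb_zero_prob(pi, r, p):
    """P(Y = 0) under a zero-inflated negative binomial (NB zero prob is p**r)."""
    return pi + (1 - pi) * p ** r

# With lam = 2, a plain Poisson predicts ~13.5% zeros; adding a 30% point
# mass at zero raises that to ~39.5%, matching data with many never-cited items.
```

When the observed share of zeros exceeds what the plain count model can produce, the zero-inflated variant fits the zero mass and the positive counts separately, which is exactly the correction the abstract describes.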
Abstract:
We review recent likelihood-based approaches to modeling demand for medical care. A semi-nonparametric model along the lines of Cameron and Johansson's Poisson polynomial model, but using a negative binomial baseline model, is introduced. We apply these models, as well as a semiparametric Poisson, a hurdle semiparametric Poisson, and finite mixtures of negative binomial models, to six measures of health care usage taken from the Medical Expenditure Panel Survey. We conclude that most of the models lead to statistically similar results, both in terms of information criteria and conditional and unconditional prediction. This suggests that applied researchers may not need to be overly concerned with the choice of which of these models they use to analyze data on health care demand.
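Comparing competing count models "in terms of information criteria", as above, usually means AIC and BIC: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, with the lowest value preferred. A small generic pure-Python sketch (the log-likelihood numbers below are made up for illustration, not taken from the paper):

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2*ln L (lower is better)."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k*ln n - 2*ln L (lower is better)."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits (log-likelihood, number of parameters): the negative
# binomial spends one extra parameter but fits much better.
fits = {"poisson": (-1203.4, 5), "negbin": (-1101.8, 6)}
n = 500
best = min(fits, key=lambda name: aic(*fits[name]))
```

BIC penalizes parameters more heavily than AIC for n > 7 or so, which is why the two criteria can occasionally disagree on models of similar fit.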
Abstract:
The phenomenon of human migration is certainly not new, and it has been studied from a variety of perspectives. Yet attention to human migration and its determinants has not faded over time, as confirmed by recent contributions (see for instance Cushing and Poot 2004 and Rebhun and Raveh 2006). In this paper we combine the recent theoretical contributions of Douglas (1997) and Wall (2001) with the methodological advancements of Guimarães et al. (2000, 2003) to model inter-municipal migration flows in the Barcelona area. To do so, we employ two different types of count models, namely Poisson and negative binomial, and compare the resulting estimates. Our results show that, even after controlling for the traditional migration factors, quality of life (measured with a composite index which includes numerous aspects, and also using a list of individual variables) is an important determinant of short-distance migration movements in the Barcelona area.
Abstract:
It is generally accepted that between 70 and 80% of manufacturing costs can be attributed to design. Nevertheless, it is difficult for the designer to estimate manufacturing costs accurately, especially when alternative constructions are compared at the conceptual design phase, because of the lack of cost information and appropriate tools. In general, previous reports concerning optimisation of a welded structure have used the mass of the product as the basis for the cost comparison. However, a simple example easily shows that using product mass as the sole manufacturing cost estimator is unsatisfactory. This study describes a method of formulating welding time models for cost calculation, and presents the results of the models for particular sections, based on typical costs in Finland. This was achieved by collecting information concerning welded products from different companies. The data included 71 different welded assemblies taken from the mechanical engineering and construction industries. The welded assemblies contained in total 1,589 welded parts, 4,257 separate welds, and a total welded length of 3,188 metres. The data were modelled for statistical calculations, and models of welding time were derived by using linear regression analysis. The models were tested by using appropriate statistical methods, and were found to be accurate. General welding time models have been developed, valid for welding in Finland, as well as specific, more accurate models for particular companies. The models are presented in such a form that they can be used easily by a designer, enabling the cost calculation to be automated.
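A welding-time model of the kind described, with time as a linear function of a cost driver such as total weld length, can be fitted by ordinary least squares. A minimal pure-Python sketch with made-up data (the paper's actual regressors and coefficients are not reproduced here):

```python
def ols_fit(x, y):
    """Least-squares slope and intercept for the simple model y ≈ a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Slope is the ratio of the covariance to the variance of x.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Made-up example: welding time (min) against weld length (m) for five assemblies.
lengths = [1.0, 2.0, 3.0, 4.0, 5.0]
times = [12.1, 21.9, 32.0, 41.8, 52.2]
intercept, slope = ols_fit(lengths, times)
```

In this toy data the slope plays the role of a time-per-metre rate, which is the kind of coefficient a designer could apply directly at the conceptual design phase.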