913 results for Poisson regression model
Abstract:
The objective of the present study was to estimate the contribution of environmental pollutants to hospital admissions for cardiovascular disease. A time series ecological study was conducted on subjects aged over 60 years and living in São José dos Campos, Brazil, a city with a population of nearly 700,000 inhabitants. Hospital admission data of public health patients (SUS) were obtained from DATASUS for the period between January 1, 2004 and December 31, 2006, according to the ICD-10 diagnoses I20 to I22 and I24. Particulate matter with less than 10 µm in aerodynamic diameter, sulfur dioxide and ozone were the pollutants examined, and the control variables were mean temperature and relative humidity. Data on pollutants were obtained from the São Paulo State Sanitary Agency. The generalized linear model Poisson regression with lags of up to 5 days was used. There were 1303 hospital admissions during the period. Exposure to particulate matter was significantly associated with hospitalization for cardiovascular disease 3 days after exposure (RR = 1.006; 95%CI = 1.000 to 1.010) and an increase of 16 µg/m³ was associated with a 10% increase in risk of hospitalization; other pollutants were not associated with hospitalization. Thus, it was possible to identify the role of exposure to particulate matter as an environmental pollutant in hospitalization for cardiovascular disease in a medium-sized city in Southeastern Brazil.
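The link between the reported relative risk and the 10% figure can be checked with a short calculation. The sketch below assumes that RR = 1.006 refers to a 1 µg/m³ increase in particulate matter, which the abstract does not state explicitly, and uses the log-linear form of a Poisson model, in which a change of `delta` units multiplies the expected count by RR raised to the power `delta`. The function name is illustrative only.

```python
import math


def scale_relative_risk(rr_per_unit: float, delta: float) -> float:
    """Scale a per-unit relative risk to a change of `delta` units.

    In a log-linear Poisson model RR = exp(beta), so a change of
    `delta` units multiplies the expected count by exp(beta * delta).
    """
    beta = math.log(rr_per_unit)  # regression coefficient on the log scale
    return math.exp(beta * delta)


# Assumption: RR = 1.006 is per 1 ug/m3 of particulate matter.
rr_16 = scale_relative_risk(1.006, 16.0)
percent_increase = (rr_16 - 1.0) * 100.0
print(f"RR for +16 ug/m3: {rr_16:.3f} ({percent_increase:.1f}% increase)")
```

Under that assumption the scaled relative risk is about 1.10, which agrees with the 10% increase quoted in the abstract.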
Abstract:
There is a demonstrable association between exposure to air pollutants and deaths due to cardiovascular diseases. The objective of this study was to estimate the effects of exposure to sulfur dioxide on mortality due to circulatory diseases in individuals 50 years of age or older residing in São José dos Campos, SP. This was a time-series ecological study for the years 2003 to 2007 using information on deaths due to circulatory disease obtained from Datasus reports. Data on daily levels of pollutants, particulate matter, sulfur dioxide (SO2), ozone, temperature, and humidity were obtained from the São Paulo State Environmental Agency. Moving average models for 2 to 7 days were calculated by Poisson regression using the R software. Exposure to SO2 was analyzed using a unipollutant, bipollutant or multipollutant model adjusted for mean temperature and humidity. The relative risks with 95%CI were obtained and the percent decrease in risk was calculated. There were 1928 deaths with a daily mean (± SD) of 1.05 ± 1.03 (range: 0-6). Exposure to SO2 was significantly associated with mortality due to circulatory disease: RR = 1.04 (95%CI = 1.01 to 1.06) in the 7-day moving average, after adjusting for ozone. There was an 8.5% decrease in risk in the multipollutant model, proportional to a decrease of SO2 concentrations. The results of this study suggest that residents of medium-sized Brazilian cities with characteristics similar to those of São José dos Campos probably have health problems due to exposure to air pollutants.
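The moving-average exposures for 2 to 7 days mentioned above can be built as follows. This is a minimal sketch, not the authors' R code: it assumes a trailing window that includes the current day, and the SO2 values are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average over `window` days, including the current day.

    The first `window - 1` positions have no full window and are None.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averaged = []
    for i in range(len(series)):
        if i + 1 < window:
            averaged.append(None)  # not enough history yet
        else:
            chunk = series[i + 1 - window : i + 1]
            averaged.append(sum(chunk) / window)
    return averaged


# Hypothetical daily SO2 concentrations (not data from the study).
so2 = [4.0, 6.0, 5.0, 7.0, 3.0, 8.0, 9.0, 2.0]

# One exposure series per window length, 2 to 7 days.
so2_moving_averages = {w: moving_average(so2, w) for w in range(2, 8)}
print(so2_moving_averages[7])
```

Each averaged series would then enter the Poisson regression as the exposure term, alongside temperature and humidity.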
Abstract:
Exposure to nitrogen oxides (NOx) emitted by burning fossil fuels has been associated with respiratory diseases. We aimed to estimate the effects of NOx exposure on mortality owing to respiratory diseases in residents of Taubaté, São Paulo, Brazil, of all ages and both sexes. This time-series ecological study from August 1, 2011 to July 31, 2012 used information on deaths caused by respiratory diseases obtained from the Health Department of Taubaté. Estimated daily levels of pollutants (NOx, particulate matter, ozone, carbon monoxide) were obtained from the Centro de Previsão de Tempo e Estudos Climáticos Coupled Aerosol and Tracer Transport model to the Brazilian developments on the Regional Atmospheric Modeling System. These environmental variables were used to adjust the multipollutant model for apparent temperature. To estimate association between hospitalizations owing to asthma and air pollutants, generalized additive Poisson regression models were developed, with lags of up to 5 days. There were 385 deaths with a daily mean (±SD) of 1.05±1.03 (range: 0-5). Exposure to NOx was significantly associated with mortality owing to respiratory diseases: relative risk (RR)=1.035 (95% confidence interval [CI]: 1.008-1.063) for lag 2, RR=1.064 (95%CI: 1.017-1.112) for lag 3, RR=1.055 (95%CI: 1.025-1.085) for lag 4, and RR=1.042 (95%CI: 1.010-1.076) for lag 5. A 3 µg/m³ reduction in NOx concentration resulted in a decrease of 10-18 percentage points in risk of death caused by respiratory diseases. Even at NOx concentrations below the acceptable standard, there is an association with deaths caused by respiratory diseases.
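Lagged models of this kind pair each day's outcome with the exposure measured a fixed number of days earlier. The sketch below shows one way to build those pairs for lags 0 to 5; the helper names and the daily values are hypothetical, and the regression fit itself is left out.

```python
def lag_series(series, lag):
    """Shift a daily series so that position t holds the value from day t - lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    n = len(series)
    if lag >= n:
        return [None] * n
    return [None] * lag + list(series[: n - lag])


def align(outcome, exposure, lag):
    """Pair each day's outcome with the exposure `lag` days earlier.

    Days without a lagged exposure value are dropped.
    """
    lagged = lag_series(exposure, lag)
    return [(y, x) for y, x in zip(outcome, lagged) if x is not None]


# Hypothetical daily death counts and NOx levels (not data from the study).
deaths = [1, 0, 2, 1, 3, 0, 1, 2]
nox = [10.0, 12.0, 9.0, 15.0, 11.0, 13.0, 8.0, 14.0]

# One aligned data set per lag, 0 to 5 days.
pairs_by_lag = {lag: align(deaths, nox, lag) for lag in range(0, 6)}
print(pairs_by_lag[2])
```

A separate model is typically fitted to each aligned data set, which is how lag-specific relative risks such as those listed above arise.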
Abstract:
Several authors have recently discussed the limited dependent variable regression model with serial correlation between residuals. The pseudo-maximum likelihood estimators obtained by ignoring serial correlation altogether have been shown to be consistent. We present alternative pseudo-maximum likelihood estimators which are obtained by ignoring serial correlation only selectively. Monte Carlo experiments on a model with first-order serial correlation suggest that our alternative estimators have substantially lower mean-squared errors in medium-sized and small samples, especially when the serial correlation coefficient is high. The same experiments also suggest that the true level of the confidence intervals established with our estimators by assuming asymptotic normality is somewhat lower than the intended level. Although the paper focuses on models with only first-order serial correlation, the generalization of the proposed approach to serial correlation of higher order is also discussed briefly.
Abstract:
This thesis presents methods for handling count data in particular and discrete data in general. It is part of an NSERC (CRSNG) strategic project named CC-Bio, whose objective is to assess the impact of climate change on the distribution of animal and plant species. After a brief introduction to the notions of biogeography and to generalized linear mixed models in Chapters 1 and 2 respectively, the thesis is organized around three main ideas. First, in Chapter 3 we introduce a new form of distribution whose components have Poisson or Skellam marginal distributions. This new specification makes it possible to incorporate relevant information about the nature of the correlations among all components. We also present some properties of this distribution. Unlike the multivariate Poisson distribution that it generalizes, it can handle variables with positive and/or negative correlations. A simulation illustrates the estimation methods in the bivariate case. The results obtained with Bayesian Markov chain Monte Carlo (MCMC) methods indicate a fairly small relative bias, below 5%, for the regression coefficients of the means, in contrast to those of the covariance term, which appear somewhat more volatile. Second, Chapter 4 presents an extension of multivariate Poisson regression with random effects having a gamma density. Because species abundance data are highly dispersed, which would make the resulting estimators and standard errors misleading, we favour an approach based on Monte Carlo integration through importance sampling.
The approach remains the same as in the previous chapter: the idea is to simulate independent latent variables and thereby return to the framework of a conventional generalized linear mixed model (GLMM) with gamma-distributed random effects. Although the assumption of a priori knowledge of the dispersion parameters seems too strong, a sensitivity analysis based on goodness of fit demonstrates the robustness of our method. Third, in the last chapter, we address the definition and construction of a measure of concordance, and hence of correlation, for zero-inflated data through Gaussian copula modelling. Unlike Kendall's tau, whose values lie in an interval whose bounds vary with the frequency of ties between pairs, this measure has the advantage of taking its values in (-1, 1). Originally introduced to model correlations between continuous variables, its extension to the discrete case entails certain restrictions. Indeed, the new measure can be interpreted as the correlation between the continuous random variables whose discretization yields our non-negative discrete observations. Two estimation methods for zero-inflated models are presented, in the frequentist and Bayesian settings, based respectively on maximum likelihood and Gauss-Hermite integration. Finally, a simulation study shows the robustness and the limits of our approach.
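For readers unfamiliar with the Skellam distribution named in this abstract: it is the distribution of the difference of two independent Poisson variables, so it can take negative values, with mean λ1 − λ2 and variance λ1 + λ2. The simulation below illustrates only that standard fact. It is not the thesis's multivariate construction, and the rates and sample size are arbitrary choices.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_skellam(lam1, lam2, rng):
    """Skellam variate: difference of two independent Poisson variates."""
    return sample_poisson(lam1, rng) - sample_poisson(lam2, rng)


rng = random.Random(12345)  # fixed seed so the run is repeatable
lam1, lam2 = 3.0, 2.0       # arbitrary illustrative rates
draws = [sample_skellam(lam1, lam2, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"mean ~ {sample_mean:.2f} (theory {lam1 - lam2}), "
      f"variance ~ {sample_var:.2f} (theory {lam1 + lam2})")
```

Knuth's sampler is adequate for small rates such as these; for large rates a library routine would be a better choice.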
Abstract:
Objective: To estimate the association between socioeconomic position and the use of psychotropic medications among people aged 65-74 years in five different populations. Methods: The study sample comprised 1995 people, with data from the first wave of data collection carried out in 2012 by the International Mobility in Aging Study (IMIAS). It consisted of 401 participants from Saint-Hyacinthe (Québec), 398 from Kingston (Ontario), 394 older adults from Tirana (Albania), 400 from Manizales (Colombia) and 402 from Natal (Brazil). All psychotropic medications taken during the previous 15 days were identified during a home visit and coded according to the ATC classification. The psychotropic medications included were anxiolytics, sedatives and hypnotics (ASH), antidepressants (ADP) and analgesics/antiepileptics/anti-Parkinson drugs (AEP). Associations between the prevalence of psychotropic medication use and education, income and occupation were estimated as prevalence ratios (PR) obtained by fitting a Poisson regression, using the Andersen and Newman behavioural model of health services use and controlling for need factors (chronic diseases and depression), predisposing factors (age and sex) and enabling factors (using study site as a proxy for health system and environmental factors). Results: Older adults living at the Canadian sites used more psychotropic medications than those living at sites outside Canada; older adults used fewer ASH in Manizales and used no ADP in Albania. Socioeconomic inequalities vary across sites.
At the Canadian sites, low socioeconomic status was associated with greater use of psychotropic medications: in particular, people with a low level of education used more antidepressants and those with low income used more AEP. At the Latin American research sites, older adults with high education and income used more antidepressants, and those with manual occupations used more analgesics/antiepileptics/anti-Parkinson drugs. In Tirana (Albania), there was no antidepressant use, but ASH use was higher among people with low income. The multivariate analyses of the final model mask the between-site differences that emerged in the analyses specific to Canada, Latin America and Albania. Conclusion: There are socioeconomic inequalities in the use of psychotropic medications among older adults. These inequalities vary across sites. Psychotropic medication use was more frequent among the least educated and poorest people in Canada, whereas the opposite was true at the Latin American sites. Albania was characterized by an absence of antidepressant use, whereas use of anxiolytics, sedatives and hypnotics was greater in low-income groups.
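A prevalence ratio compares the proportion of users in two groups. The sketch below computes a crude (unadjusted) ratio with a Wald interval on the log scale from a 2×2 table. The counts are hypothetical, and the study itself reports ratios adjusted through Poisson regression, which this example does not reproduce.

```python
import math


def prevalence_ratio(cases_exposed, n_exposed, cases_reference, n_reference, z=1.96):
    """Crude prevalence ratio with a Wald confidence interval on the log scale."""
    p_exposed = cases_exposed / n_exposed
    p_reference = cases_reference / n_reference
    ratio = p_exposed / p_reference
    # Standard error of log(ratio) for two independent proportions.
    se_log = math.sqrt(
        1.0 / cases_exposed - 1.0 / n_exposed
        + 1.0 / cases_reference - 1.0 / n_reference
    )
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: 30 of 100 users in one group, 20 of 100 in the other.
pr, pr_lower, pr_upper = prevalence_ratio(30, 100, 20, 100)
print(f"PR = {pr:.2f} (95% CI {pr_lower:.2f} to {pr_upper:.2f})")
```

Poisson regression is often preferred over logistic regression for this purpose because its exponentiated coefficients are ratios of prevalences, not odds ratios.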
Abstract:
Multivariate lifetime data arise in various forms, including recurrent event data, when individuals are followed to observe the sequence of occurrences of a certain type of event, and correlated lifetimes, when an individual is followed for the occurrence of two or more types of events or when distinct individuals have dependent event times. In most studies there are covariates such as treatments, group indicators, individual characteristics, or environmental conditions, whose relationship to lifetime is of interest. This leads to a consideration of regression models. The well-known Cox proportional hazards model and its variations, using the marginal hazard functions employed for the analysis of multivariate survival data in the literature, are not sufficient to explain the complete dependence structure of a pair of lifetimes on the covariate vector. Motivated by this, in Chapter 2, we introduced a bivariate proportional hazards model using the vector hazard function of Johnson and Kotz (1975), in which the covariates under study have different effects on the two components of the vector hazard function. The proposed model is useful in real-life situations to study the dependence structure of a pair of lifetimes on the covariate vector. The well-known partial likelihood approach is used for the estimation of parameter vectors. We then introduced a bivariate proportional hazards model for gap times of recurrent events in Chapter 3. The model incorporates both marginal and joint dependence of the distribution of gap times on the covariate vector. In many fields of application, the mean residual life function is considered a superior concept to the hazard function. Motivated by this, in Chapter 4, we considered a new semi-parametric model, the bivariate proportional mean residual life time model, to assess the relationship between mean residual life and covariates for gap times of recurrent events.
The counting process approach is used for the inference procedures of the gap time of recurrent events. In many survival studies, the distribution of lifetime may depend on the distribution of censoring time. In Chapter 5, we introduced a proportional hazards model for duration times and developed inference procedures under dependent (informative) censoring. In Chapter 6, we introduced a bivariate proportional hazards model for competing risks data under right censoring. The asymptotic properties of the estimators of the parameters of different models developed in previous chapters, were studied. The proposed models were applied to various real life situations.
Abstract:
Survival times for the Acacia mangium plantation in the Segaliud Lokan Project, Sabah, East Malaysia were analysed based on 20 permanent sample plots (PSPs) established in 1988 as a spacing experiment. The PSPs were established following a complete randomized block design with five levels of spacing randomly assigned to units within four blocks at different sites. The survival times of trees in years are of interest. Since the inventories were only conducted annually, the actual survival time for each tree was not observed. Hence, the data set comprises censored survival times. Initial analysis of the survival of the Acacia mangium plantation suggested there is block by spacing interaction; a Weibull model gives a reasonable fit to the replicate survival times within each PSP; but a standard Weibull regression model is inappropriate because the shape parameter differs between PSPs. In this paper we investigate the form of the non-constant Weibull shape parameter. Parsimonious models for the Weibull survival times have been derived using maximum likelihood methods. The factor selection for the parameters is based on a backward elimination procedure. The models are compared using likelihood ratio statistics. The results suggest that both Weibull parameters depend on spacing and block.
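Because trees are only inspected at annual inventories, a death is known to lie between two inspection times, and a tree still alive at the last inventory is right-censored. A minimal sketch of the corresponding Weibull log-likelihood follows. The parameterisation S(t) = exp(-(t/scale)^shape), the data and the function names are assumptions for illustration; the paper's models, in which both parameters depend on spacing and block, are not reproduced.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def censored_log_likelihood(intervals, shape, scale):
    """Log-likelihood for interval-censored and right-censored survival times.

    Each item is (lower, upper): death occurred in (lower, upper].
    upper=None means the tree was still alive at `lower` (right-censored).
    """
    total = 0.0
    for lower, upper in intervals:
        s_lower = weibull_survival(lower, shape, scale)
        s_upper = 0.0 if upper is None else weibull_survival(upper, shape, scale)
        total += math.log(s_lower - s_upper)
    return total


# Hypothetical plot: deaths found at the inventories of years 1, 3 and 3,
# plus two trees still alive at the year-5 inventory.
observations = [(0, 1), (2, 3), (2, 3), (5, None), (5, None)]
loglik = censored_log_likelihood(observations, shape=1.2, scale=6.0)
print(f"log-likelihood = {loglik:.3f}")
```

Maximising this function over `shape` and `scale`, separately or with covariates, gives maximum likelihood estimates, and nested fits can then be compared with likelihood ratio statistics as described in the abstract.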
Abstract:
This paper derives some exact power properties of tests for spatial autocorrelation in the context of a linear regression model. In particular, we characterize the circumstances in which the power vanishes as the autocorrelation increases, thus extending the work of Krämer (2005). More generally, the analysis in the paper sheds new light on how the power of tests for spatial autocorrelation is affected by the matrix of regressors and by the spatial structure. We mainly focus on the problem of residual spatial autocorrelation, in which case it is appropriate to restrict attention to the class of invariant tests, but we also consider the case when the autocorrelation is due to the presence of a spatially lagged dependent variable among the regressors. A numerical study aimed at assessing the practical relevance of the theoretical results is included.
Abstract:
The survival of Bifidobacterium longum NCIMB 8809 was studied during refrigerated storage for 6 weeks in model solutions, based on which a mathematical model was constructed describing cell survival as a function of pH, citric acid, protein and dietary fibre. A Central Composite Design (CCD) was developed studying the influence of four factors at three levels, i.e., pH (3.2–4), citric acid (2–15 g/l), protein (0–10 g/l), and dietary fibre (0–8 g/l). In total, 31 experimental runs were carried out. Analysis of variance (ANOVA) of the regression model demonstrated that the model fitted the data well. From the regression coefficients it was deduced that all four factors had a statistically significant (P < 0.05) negative effect on the log decrease [log10 N(week 0) − log10 N(week 6)], with the pH and citric acid being the most influential ones. Cell survival during storage was also investigated in various types of juices, including orange, grapefruit, blackcurrant, pineapple, pomegranate and strawberry. The highest cell survival (less than 0.4 log decrease) after 6 weeks of storage was observed in orange and pineapple, both of which had a pH of about 3.8. Although the pH of grapefruit and blackcurrant was similar (pH ∼3.2), the log decrease of the former was ∼0.5 log, whereas that of the latter was ∼0.7 log. One reason for this could be the fact that grapefruit contained a high amount of citric acid (15.3 g/l). The log decrease in pomegranate and strawberry juices was extremely high (∼8 logs). The mathematical model was able to predict adequately the cell survival in orange, grapefruit, blackcurrant, and pineapple juices. However, the model failed to predict the cell survival in pomegranate and strawberry, most likely due to the very high levels of phenolic compounds in these two juices.
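The run count in this design can be reconstructed under one plausible reading. The sketch below assumes a face-centred central composite design (which keeps every factor at three levels) made of 16 factorial corners, 8 axial points and 7 centre replicates. That split is an assumption chosen because it totals the 31 runs reported; the abstract does not give it. The factor ranges are taken from the abstract, and the function names are illustrative.

```python
from itertools import product

# Factor ranges as reported in the abstract (low, high).
RANGES = {
    "pH": (3.2, 4.0),
    "citric_acid_g_per_l": (2.0, 15.0),
    "protein_g_per_l": (0.0, 10.0),
    "fibre_g_per_l": (0.0, 8.0),
}


def face_centred_ccd(n_factors, n_center):
    """Coded design points (-1, 0, +1) of a face-centred central composite design."""
    corners = [list(p) for p in product((-1, 1), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level  # one factor at an extreme, the rest at the centre
            axial.append(point)
    center = [[0] * n_factors for _ in range(n_center)]
    return corners + axial + center


def decode(coded_point):
    """Convert coded levels to real factor values using RANGES."""
    values = {}
    for code, (name, (low, high)) in zip(coded_point, RANGES.items()):
        midpoint = (low + high) / 2.0
        half_range = (high - low) / 2.0
        values[name] = midpoint + code * half_range
    return values


# Assumption: 7 centre replicates, which gives 16 + 8 + 7 = 31 runs.
design = face_centred_ccd(n_factors=4, n_center=7)
print(len(design), decode(design[0]))
```

A second-order regression model in the four factors would then be fitted to the log decrease measured at each run.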
Abstract:
This work proposes a unified neurofuzzy modelling scheme. To begin with, the initial fuzzy base construction method is based on fuzzy clustering utilising a Gaussian mixture model (GMM) combined with the analysis of variance (ANOVA) decomposition in order to obtain more compact univariate and bivariate membership functions over the subspaces of the input features. The mean and covariance of the Gaussian membership functions are found by the expectation maximisation (EM) algorithm with the merit of revealing the underlying density distribution of system inputs. The resultant set of membership functions forms the basis of the generalised fuzzy model (GFM) inference engine. The model structure and parameters of this neurofuzzy model are identified via the supervised subspace orthogonal least square (OLS) learning. Finally, instead of providing a deterministic class label as model output by convention, a logistic regression model is applied to present the classifier’s output, in which the sigmoid type of logistic transfer function scales the outputs of the neurofuzzy model to the class probability. Experimental validation results are presented to demonstrate the effectiveness of the proposed neurofuzzy modelling scheme.
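The last step described above, mapping a raw model output to a class probability through a logistic transfer function, can be sketched in a few lines. The raw scores and the 0.5 decision threshold are hypothetical; the neurofuzzy model that would produce the scores is not shown.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical raw outputs of an upstream model for three inputs.
raw_outputs = [-2.0, 0.0, 3.0]
probabilities = [sigmoid(z) for z in raw_outputs]

# A deterministic label can still be derived by thresholding if needed.
labels = [1 if p >= 0.5 else 0 for p in probabilities]
print(probabilities, labels)
```

Reporting the probability keeps information that a hard label discards, namely how close each case lies to the decision boundary.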
Abstract:
Classical regression methods take vectors as covariates and estimate the corresponding vectors of regression parameters. When addressing regression problems on covariates of more complex form such as multi-dimensional arrays (i.e. tensors), traditional computational models can be severely compromised by ultrahigh dimensionality as well as complex structure. By exploiting the special structure of tensor covariates, the tensor regression model provides a promising solution to reduce the model’s dimensionality to a manageable level, thus leading to efficient estimation. Most of the existing tensor-based methods independently estimate each individual regression problem based on tensor decomposition which allows the simultaneous projections of an input tensor to more than one direction along each mode. As a matter of fact, multi-dimensional data are collected under the same or very similar conditions, so that data share some common latent components but can also have their own independent parameters for each regression task. Therefore, it is beneficial to analyse regression parameters among all the regressions in a linked way. In this paper, we propose a tensor regression model based on Tucker Decomposition, which identifies not only the common components of parameters across all the regression tasks, but also independent factors contributing to each particular regression task simultaneously. Under this paradigm, the number of independent parameters along each mode is constrained by a sparsity-preserving regulariser. Linked multiway parameter analysis and sparsity modeling further reduce the total number of parameters, with lower memory cost than their tensor-based counterparts. The effectiveness of the new method is demonstrated on real data sets.
Abstract:
In this article, we evaluate different parameter estimation strategies for a multiple linear regression model. To estimate the model parameters, data were used from a clinical trial whose aim was to verify whether the mechanical test of the maximum force property (EM-FM) is associated with femoral mass, femoral diameter and the experimental group of ovariectomized rats of the species Rattus norvegicus albinus, Wistar strain. Three methodologies for estimating the model parameters are compared: the classical methodology, based on the least squares method; the Bayesian methodology, based on Bayes' theorem; and the bootstrap method, based on resampling procedures.
Abstract:
In this paper, the generalized log-gamma regression model is modified to allow the possibility that long-term survivors may be present in the data. This modification leads to a generalized log-gamma regression model with a cure rate, encompassing, as special cases, the log-exponential, log-Weibull and log-normal regression models with a cure rate typically used to model such data. The models attempt to simultaneously estimate the effects of explanatory variables on the timing acceleration/deceleration of a given event and the surviving fraction, that is, the proportion of the population for which the event never occurs. The normal curvatures of local influence are derived under some usual perturbation schemes and two martingale-type residuals are proposed to assess departures from the generalized log-gamma error assumption as well as to detect outlying observations. Finally, a data set from the medical area is analyzed.
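The idea of a surviving fraction can be illustrated with the common mixture form of a cure rate model, in which the population survival function is S_pop(t) = p + (1 - p) S_0(t), with p the proportion for whom the event never occurs. The abstract does not state which formulation the paper uses, so the mixture form is an assumption here. A Weibull survival function stands in for S_0 because the log-Weibull regression is one of the special cases mentioned, and the parameter values are arbitrary.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of the susceptible group: exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def population_survival(t, cure_fraction, shape, scale):
    """Mixture cure model: S_pop(t) = p + (1 - p) * S_0(t)."""
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    susceptible = weibull_survival(t, shape, scale)
    return cure_fraction + (1.0 - cure_fraction) * susceptible


# Arbitrary illustrative parameters: 30% long-term survivors.
times = [0.0, 1.0, 5.0, 20.0, 1000.0]
curve = [population_survival(t, 0.3, shape=1.5, scale=2.0) for t in times]
print([round(s, 3) for s in curve])
```

The curve starts at 1 and levels off at the cure fraction and not at zero, which is the plateau that signals long-term survivors in the data.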
Abstract:
Considering the Wald, score, and likelihood ratio asymptotic test statistics, we analyze a multivariate null intercept errors-in-variables regression model, where the explanatory and the response variables are subject to measurement errors, and a possible structure of dependency between the measurements taken within the same individual is incorporated, representing a longitudinal structure. This model was proposed by Aoki et al. (2003b) and analyzed under the Bayesian approach. In this article, considering the classical approach, we analyze asymptotic test statistics and present a simulation study to compare the behavior of the three test statistics for different sample sizes, parameter values and nominal levels of the test. Also, closed form expressions for the score function and the Fisher information matrix are presented. We consider two real numerical illustrations, the odontological data set from Hadgu and Koch (1999), and a quality control data set.