961 results for Generalized Additive Models
Abstract:
With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models are not as good as those obtained from frequentist estimation regardless of which prior distribution was used for the covariance matrix. 
The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction.
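To make the point about sparse data concrete, the following is a minimal sketch, not the authors' implementation. It shows why a model fitted to logit-transformed sensitivity and specificity (as a bivariate normal model typically is) needs an ad hoc continuity correction when a 2x2 cell is zero. The function name `logit_sens_spec` and the default correction of 0.5 are assumptions made for illustration.

```python
import math

def logit_sens_spec(tp, fn, tn, fp, correction=0.5):
    """Return (logit sensitivity, logit specificity) for one study's 2x2 table.

    Hypothetical helper: the correction is added to every cell only when
    some cell is zero, because the logit of 0 or 1 is undefined. A binomial
    likelihood on the raw counts does not need this adjustment.
    """
    if min(tp, fn, tn, fp) == 0:
        tp, fn, tn, fp = (x + correction for x in (tp, fn, tn, fp))
    sens = tp / (tp + fn)  # true positives among the diseased
    spec = tn / (tn + fp)  # true negatives among the non-diseased
    logit = lambda p: math.log(p / (1 - p))
    return logit(sens), logit(spec)
```

For a study with no false negatives, for example `logit_sens_spec(10, 0, 5, 5)`, the uncorrected sensitivity would be exactly 1 and its logit infinite; the corrected counts give a finite value, at the cost of an arbitrary choice of correction.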
Abstract:
Complex diseases, such as cancer, are caused by various genetic and environmental factors, and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear models using Student-t prior distributions on coefficients are a novel method to simultaneously analyze genetic factors, environmental factors, and interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, an improved method for variable selection when compared to standard methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provided a guideline for the sample size required to achieve a high power of variable selection using Bayesian generalized linear models, considering different scales of number of candidate covariates. Polymorphisms in folate metabolism genes and nutritional factors have been previously associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible genetic-nutritional interactions from 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions, including an interaction between SHMT1 and vitamin B12, an interaction between MTRR and total fat intake, and an interaction between MTR and alcohol use, were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.
Abstract:
Standard factorial designs sometimes may be inadequate for experiments that aim to estimate a generalized linear model, for example, for describing a binary response in terms of several variables. A method is proposed for finding exact designs for such experiments that uses a criterion allowing for uncertainty in the link function, the linear predictor, or the model parameters, together with a design search. Designs are assessed and compared by simulation of the distribution of efficiencies relative to locally optimal designs over a space of possible models. Exact designs are investigated for two applications, and their advantages over factorial and central composite designs are demonstrated.
Abstract:
Magdalina Vasileva Todorova - The article describes an approach to verifying procedural programs by building models of them defined by generalized nets. The approach integrates the "design by contract" concept with verification approaches of the theorem-proving and model-checking types. To this end, the functions that make up the program are verified separately against specifications that reflect their purpose. A generalized net model is built that specifies the relationships between the functions in the form of correct sequences of calls. For the main function of the program, a generalized net model is constructed and checked for conformance with the net model of the relationships between the program's functions. Each function of the program that uses other functions is also verified against the specification given by the net model of the relationships between the program's functions.
Abstract:
Mixtures of Zellner's g-priors have been studied extensively in linear models and have been shown to have numerous desirable properties for Bayesian variable selection and model averaging. Several extensions of g-priors to Generalized Linear Models (GLMs) have been proposed in the literature; however, the choice of prior distribution of g and resulting properties for inference have received considerably less attention. In this paper, we extend mixtures of g-priors to GLMs by assigning the truncated Compound Confluent Hypergeometric (tCCH) distribution to 1/(1+g) and illustrate how this prior distribution encompasses several special cases of mixtures of g-priors in the literature, such as the Hyper-g, truncated Gamma, Beta-prime, and the Robust prior. Under an integrated Laplace approximation to the likelihood, the posterior distribution of 1/(1+g) is in turn a tCCH distribution, and approximate marginal likelihoods are thus available analytically. We discuss the local geometric properties of the g-prior in GLMs and show that specific choices of the hyper-parameters satisfy the various desiderata for model selection proposed by Bayarri et al., such as asymptotic model selection consistency, information consistency, intrinsic consistency, and measurement invariance. We also illustrate inference using these priors and contrast them to others in the literature via simulation and real examples.
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, where the use of non-parametric methods such as decision trees and generalized additive models is promoted to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, these methods can be applied across any field, irrespective of the type of response.
Abstract:
The aim of this article is to assess the effects of several territorial characteristics, specifically agglomeration economies, on industrial location processes in the Spanish region of Catalonia. Theoretically, the level of agglomeration causes economies which favour the location of new establishments, but an excessive level of agglomeration might cause diseconomies, since congestion effects arise. The empirical evidence on this matter is inconclusive, probably because the models used so far are not suitable enough. We use a more flexible semiparametric specification, which allows us to study the nonlinear relationship between the different types of agglomeration levels and location processes. Our main statistical source is the REIC (Catalan Manufacturing Establishments Register), which has plant-level microdata on location of new industrial establishments. Keywords: agglomeration economies, industrial location, Generalized Additive Models, nonparametric estimation, count data models.
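As a hedged illustration of what such a semiparametric count specification can look like, one common form is an additive Poisson model. The abstract does not give the article's exact model, so all symbols here are assumptions: $y_i$ is the number of new establishments in area $i$, $\mathbf{x}_i$ collects covariates entering linearly, $a_{ij}$ is the $j$-th agglomeration measure, and $f_j$ is an unspecified smooth function estimated from the data.

```latex
y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta} + \sum_{j=1}^{J} f_j(a_{ij})
```

Under this form, agglomeration economies followed by congestion diseconomies would appear as an estimated $f_j$ that rises and then falls, a shape that a single linear coefficient cannot represent.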
Abstract:
Background Maternal exposure to air pollution has been related to fetal growth in a number of recent scientific studies. The objective of this study was to assess the association between exposure to air pollution during pregnancy and anthropometric measures at birth in a cohort in Valencia, Spain. Methods Seven hundred and eighty-five pregnant women and their singleton newborns participated in the study. Exposure to ambient nitrogen dioxide (NO2) was estimated by means of land use regression. NO2 spatial estimations were adjusted to correspond to relevant pregnancy periods (whole pregnancy and trimesters) for each woman. Outcome variables were birth weight, length, and head circumference (HC), along with being small for gestational age (SGA). The association between exposure to residential outdoor NO2 and outcomes was assessed controlling for potential confounders and examining the shape of the relationship using generalized additive models (GAM). Results For continuous anthropometric measures, GAM indicated a change in slope at NO2 concentrations of around 40 μg/m3. NO2 exposure >40 μg/m3 during the first trimester was associated with a change in birth length of -0.27 cm (95% CI: -0.51 to -0.03) and with a change in birth weight of -40.3 grams (-96.3 to 15.6); the same exposure throughout the whole pregnancy was associated with a change in birth HC of -0.17 cm (-0.34 to -0.003). The shape of the relation was seen to be roughly linear for the risk of being SGA. A 10 μg/m3 increase in NO2 during the second trimester was associated with being SGA-weight, odds ratio (OR): 1.37 (1.01-1.85). For SGA-length the estimate for the same comparison was OR: 1.42 (0.89-2.25). Conclusions Prenatal exposure to traffic-related air pollution may reduce fetal growth. Findings from this study provide further evidence of the need for developing strategies to reduce air pollution in order to prevent risks to fetal health and development.
Abstract:
In recent years, some epidemiologic studies have attributed adverse effects of air pollutants on health not only to particles and sulfur dioxide but also to photochemical air pollutants (nitrogen dioxide and ozone). The effects are usually small, leading to some inconsistencies in the results of the studies. Furthermore, the different methodologic approaches used in the studies have made it difficult to derive generic conclusions. We provide here a quantitative summary of the short-term effects of photochemical air pollutants on mortality in seven Spanish cities involved in the EMECAM project, using generalized additive models from analyses of single and multiple pollutants. Nitrogen dioxide and ozone data were provided by seven EMECAM cities (Barcelona, Gijón, Huelva, Madrid, Oviedo, Seville, and Valencia). Mortality indicators included daily total mortality from all causes excluding external causes, daily cardiovascular mortality, and daily respiratory mortality. Individual estimates, obtained from city-specific generalized additive Poisson autoregressive models, were combined by means of fixed effects models and, if significant heterogeneity among local estimates was found, also by random effects models. Significant positive associations were found between daily mortality (all causes and cardiovascular) and NO2, once the rest of the air pollutants were taken into account. A 10 μg/m3 increase in the 24-hr average 1-day NO2 level was associated with an increase in the daily number of deaths of 0.43% [95% confidence interval (CI), -0.003 to 0.86%] for all causes excluding external. In the case of significant relationships, relative risks for cause-specific mortality were nearly twice as much as that for total mortality for all the photochemical pollutants. Ozone was independently related only to cardiovascular daily mortality. No independent statistically significant relationship between photochemical air pollutants and respiratory mortality was found. 
The results in this study suggest that, given the present levels of photochemical pollutants, people living in Spanish cities are exposed to health risks derived from air pollution.
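The pooling step described in this abstract, combining city-specific estimates with a fixed effects model and, under heterogeneity, a random effects model, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the EMECAM code: it uses inverse-variance weights, Cochran's Q, and the DerSimonian-Laird between-city variance estimator, none of which the abstract names. The function name `pool_estimates` is hypothetical.

```python
import math

def pool_estimates(betas, ses):
    """Pool per-city log-relative-risk coefficients (hypothetical helper).

    betas: city-specific coefficient estimates; ses: their standard errors.
    Assumes inverse-variance weighting and the DerSimonian-Laird estimator.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    # Fixed effects: inverse-variance weighted mean of the city estimates.
    fixed = sum(w * b for w, b in zip(weights, betas)) / total_w
    # Cochran's Q measures heterogeneity among the city estimates.
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # DerSimonian-Laird between-city variance, truncated at zero.
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random effects: weights also account for between-city variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    total_re = sum(re_weights)
    random = sum(w * b for w, b in zip(re_weights, betas)) / total_re
    return {
        "fixed": fixed, "fixed_se": math.sqrt(1.0 / total_w),
        "random": random, "random_se": math.sqrt(1.0 / total_re),
        "q": q, "df": df, "tau2": tau2,
    }
```

In practice Q would be compared against a chi-squared distribution with `df` degrees of freedom to decide whether the random effects summary should be reported alongside the fixed effects one.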
Abstract:
BACKGROUND: Spirometry reference values are important for the interpretation of spirometry results. Reference values should be updated regularly, be derived from a population as similar as possible to the one in which they are to be used, and span all ages. Such spirometry reference equations are currently lacking for central European populations. OBJECTIVE: To develop spirometry reference equations for central European populations between 8 and 90 years of age. MATERIALS: We used data collected between January 1993 and December 2010 from a central European population. The data were modelled using "Generalized Additive Models for Location, Scale and Shape" (GAMLSS). RESULTS: The spirometry reference equations were derived from 118,891 individuals consisting of 60,624 (51%) females and 58,267 (49%) males. Altogether, there were 18,211 (15.3%) children under the age of 18 years. CONCLUSION: We developed spirometry reference equations for a central European population between 8 and 90 years of age that can be implemented in a wide range of clinical settings.
Abstract:
Aim We examined whether species occurrences are primarily limited by physiological tolerance in the abiotically more stressful end of climatic gradients (the asymmetric abiotic stress limitation (AASL) hypothesis) and the geographical predictions of this hypothesis: abiotic stress mainly determines upper-latitudinal and upper-altitudinal species range limits, and the importance of abiotic stress for these range limits increases the further northwards and upwards a species occurs. Location Europe and the Swiss Alps. Methods The AASL hypothesis predicts that species have skewed responses to climatic gradients, with a steep decline towards the more stressful conditions. Based on presence-absence data we examined the shape of plant species responses (measured as probability of occurrence) along three climatic gradients across latitudes in Europe (1577 species) and altitudes in the Swiss Alps (284 species) using Huisman-Olff-Fresco, generalized linear and generalized additive models. Results We found that almost half of the species from Europe and one-third from the Swiss Alps showed responses consistent with the predictions of the AASL hypothesis. Cold temperatures and a short growing season seemed to determine the upper-latitudinal and upper-altitudinal range limits of up to one-third of the species, while drought provided an important constraint at lower-latitudinal range limits for up to one-fifth of the species. We found a biome-dependent influence of abiotic stress and no clear support for abiotic stress as a stronger upper range-limit determinant for species with higher latitudinal and altitudinal distributions. However, the overall influence of climate as a range-limit determinant increased with latitude. 
Main conclusions Our results support the AASL hypothesis for almost half of the studied species, and suggest that temperature-related stress controls the upper-latitudinal and upper-altitudinal range limits of a large proportion of these species, while other factors including drought stress may be important at the lower range limits.
Abstract:
Thesis written in co-mentorship with Robert Michaud.
Abstract:
Objective. To investigate the short-term effects of exposure to particulate matter from biomass burning in the Amazon on the daily demand for outpatient care due to respiratory diseases in children and the elderly. Methods. Epidemiologic study with ecologic time series design. Daily consultation records were obtained from the 14 primary health care clinics in the municipality of Alta Floresta, state of Mato Grosso, in the southern region of the Brazilian Amazon, between January 2004 and December 2005. Information on the daily levels of fine particulate matter was made available by the Brazilian National Institute for Space Research. To control for confounding factors (situations in which a non-causal association between exposure and disease is observed due to a third variable), variables related to time trends, seasonality, temperature, relative humidity, rainfall, and calendar effects (such as occurrence of holidays and weekends) were included in the model. Poisson regression with generalized additive models was used. Results. A 10 μg/m3 increase in the level of exposure to particulate matter was associated with increases of 2.9% and 2.6% in outpatient consultations due to respiratory diseases in children on the 6th and 7th days following exposure. Significant associations were not observed for elderly individuals. Conclusions. The results suggest that the levels of particulate matter from biomass burning in the Amazon are associated with adverse effects on the respiratory health of children.
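Percentage increases of this kind are usually obtained from a log-link Poisson coefficient. The sketch below shows that conversion under the assumption that the pollutant enters the model linearly on the log scale; the function names and the default 10-unit increment are illustrative and do not come from the study.

```python
import math

def percent_change(beta, increment=10.0):
    """Percent change in expected daily counts for an `increment`-unit rise
    in exposure, given a log-link coefficient `beta` per unit of exposure."""
    return (math.exp(beta * increment) - 1.0) * 100.0

def beta_from_percent(pct, increment=10.0):
    """Inverse conversion: the per-unit coefficient implied by a reported
    percent change for an `increment`-unit rise in exposure."""
    return math.log(1.0 + pct / 100.0) / increment
```

For example, `beta_from_percent(2.9)` gives the per-μg/m3 coefficient that would correspond to a 2.9% increase per 10 μg/m3, and passing it back through `percent_change` recovers 2.9.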