922 results for Generalized Linear Model
Abstract:
Background: Parkinson’s disease (PD) is an incurable neurological disease with approximately 0.3% prevalence. The hallmark symptom is gradual movement deterioration. Current scientific consensus about disease progression holds that symptoms will worsen smoothly over time unless treated. Accurate information about symptom dynamics is of critical importance to patients, caregivers, and the scientific community for the design of new treatments, clinical decision making, and individual disease management. Long-term studies characterize the typical time course of the disease as an early linear progression gradually reaching a plateau in later stages. However, symptom dynamics over durations of days to weeks remains unquantified. Currently, there is a scarcity of objective clinical information about symptom dynamics at intervals shorter than 3 months stretching over several years, but Internet-based patient self-report platforms may change this. Objective: To assess the clinical value of online self-reported PD symptom data recorded by users of the health-focused Internet social research platform PatientsLikeMe (PLM), in which patients quantify their symptoms on a regular basis on a subset of the Unified Parkinson’s Disease Ratings Scale (UPDRS). By analyzing this data, we aim for a scientific window on the nature of symptom dynamics for assessment intervals shorter than 3 months over durations of several years. Methods: Online self-reported data was validated against the gold standard Parkinson’s Disease Data and Organizing Center (PD-DOC) database, containing clinical symptom data at intervals greater than 3 months. The data were compared visually using quantile-quantile plots, and numerically using the Kolmogorov-Smirnov test. By using a simple piecewise linear trend estimation algorithm, the PLM data was smoothed to separate random fluctuations from continuous symptom dynamics. 
Subtracting the trends from the original data revealed random fluctuations in symptom severity. The average magnitude of fluctuations versus time since diagnosis was modeled by using a gamma generalized linear model. Results: Distributions of ages at diagnosis and UPDRS in the PLM and PD-DOC databases were broadly consistent. The PLM patients were systematically younger than the PD-DOC patients and showed increased symptom severity in the PD off state. The average fluctuation in symptoms (UPDRS Parts I and II) was 2.6 points at the time of diagnosis, rising to 5.9 points 16 years after diagnosis. This fluctuation exceeds the estimated minimal and moderate clinically important differences, respectively. Not all patients conformed to the current clinical picture of gradual, smooth changes: many patients had regimes where symptom severity varied in an unpredictable manner, or underwent large rapid changes in an otherwise more stable progression. Conclusions: This information about short-term PD symptom dynamics contributes new scientific understanding about the disease progression, currently very costly to obtain without self-administered Internet-based reporting. This understanding should have implications for the optimization of clinical trials into new treatments and for the choice of treatment decision timescales.
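The gamma generalized linear model step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a log link (the abstract does not state the link function), uses synthetic data whose generating curve is chosen only to echo the reported rise from 2.6 to 5.9 points over 16 years, and `fit_gamma_glm_log_link` is a hypothetical helper name.

```python
import numpy as np

def fit_gamma_glm_log_link(x, y, n_iter=25):
    """Fit E[y] = exp(b0 + b1 * x) for gamma-distributed y by IRLS.

    With a log link and gamma variance (Var[y] proportional to mu^2) the
    IRLS weights are constant, so each step is an ordinary least-squares
    fit to the working response z = eta + (y - mu) / mu.
    """
    X = np.column_stack([np.ones_like(x), x])
    # Start from a least-squares fit to log(y).
    beta = np.linalg.lstsq(X, np.log(y), rcond=None)[0]
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu
        beta = np.linalg.lstsq(X, z, rcond=None)[0]
    return beta

# Synthetic illustration only: fluctuation magnitude vs. years since diagnosis.
rng = np.random.default_rng(0)
years = rng.uniform(0.0, 16.0, size=2000)
true_b0 = np.log(2.6)
true_b1 = np.log(5.9 / 2.6) / 16.0
mean_fluct = np.exp(true_b0 + true_b1 * years)
shape = 4.0  # assumed gamma shape; it only sets the noise level here
fluct = rng.gamma(shape, mean_fluct / shape)

b0, b1 = fit_gamma_glm_log_link(years, fluct)
print("fitted mean at diagnosis:", np.exp(b0))
print("fitted mean at 16 years:", np.exp(b0 + 16.0 * b1))
```

With a log link the fitted coefficient on time acts multiplicatively on the mean fluctuation; an identity link would be an equally plausible reading of the abstract.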
Abstract:
This thesis used four different methods to diagnose precipitation extremes over Northeastern Brazil (NEB): Generalized Linear Models via logistic and Poisson regression, extreme value theory analysis via the generalized extreme value (GEV) and generalized Pareto (GPD) distributions, and Vector Generalized Linear Models via GEV (MVLG GEV). The logistic and Poisson regression models were used to identify the interactions between the precipitation extremes and other variables based on odds ratios and relative risks. Outgoing longwave radiation was found to be the indicator variable for the occurrence of extreme precipitation over eastern, northern and semi-arid NEB, and relative humidity over southern NEB. The GEV and GPD distributions (based on the 95th percentile) showed that the location and scale parameters reached their maxima on the eastern and northern coast of NEB; the GEV also showed a core of maximum values over western Pernambuco, influenced by weather systems and topography. For the GEV and GPD shape parameter, the data for most regions were fitted by negative Weibull and Beta distributions (ξ < 0), respectively. According to the GEV (GPD) return levels and return periods, northern Maranhão (central Bahia) may experience at least one extreme precipitation event exceeding 160.9 mm/day (192.3 mm/day) in the next 30 years. The MVLG GEV model found that the zonal and meridional wind components, evaporation, and Atlantic and Pacific sea surface temperature boost the precipitation extremes. The GEV parameters show the following results: a) location (μ): the highest value was 88.26 ± 6.42 mm over northern Maranhão; b) scale (σ): most regions showed positive values, except southern Maranhão; and c) shape (ξ): most of the selected regions were fitted by the negative Weibull distribution (ξ < 0). Southern Maranhão and southern Bahia showed greater accuracy. For the return level, it was estimated that central Bahia may experience at least one extreme precipitation event equal to or exceeding 571.2 mm/day in the next 30 years.
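The return levels quoted above follow from the fitted GEV parameters through the standard quantile formula. The sketch below shows that calculation under stated assumptions: the GEV is taken to describe annual maxima, the location value 88.26 is the one reported for northern Maranhão, and the scale and shape values are hypothetical placeholders because the abstract does not report them.

```python
import math

def gev_return_level(mu, sigma, xi, period):
    """Level exceeded on average once every `period` blocks (e.g. years)."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(xi) < 1e-12:                      # Gumbel limit as xi -> 0
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function, used here as a consistency check."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

mu = 88.26                 # location reported in the abstract (mm)
sigma, xi = 20.0, -0.1     # hypothetical scale and shape, for illustration only
level_30 = gev_return_level(mu, sigma, xi, period=30)
print("30-year return level (mm/day):", round(level_30, 1))
print("non-exceedance probability:", gev_cdf(level_30, mu, sigma, xi))
```

A negative shape parameter (ξ < 0), as reported for most regions, implies a finite upper bound μ − σ/ξ on the fitted distribution.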
Abstract:
Background: Identifying biological markers to aid diagnosis of bipolar disorder (BD) is critically important. To be considered a possible biological marker, neural patterns in BD should be discriminant from those in healthy individuals (HI). We examined patterns of neuromagnetic responses revealed by magnetoencephalography (MEG) during implicit emotion-processing using emotional (happy, fearful, sad) and neutral facial expressions, in sixteen BD and sixteen age- and gender-matched healthy individuals. Methods: Neuromagnetic data were recorded using a 306-channel whole-head MEG ELEKTA Neuromag System, and preprocessed using Signal Space Separation as implemented in MaxFilter (ELEKTA). Custom Matlab programs removed EOG and ECG signals from filtered MEG data, and computed means of epoched data (0-250ms, 250-500ms, 500-750ms). A generalized linear model with three factors (individual, emotion intensity and time) compared BD and HI. A principal component analysis of normalized mean channel data in selected brain regions identified principal components that explained 95% of data variation. These components were used in a quadratic support vector machine (SVM) pattern classifier. SVM classifier performance was assessed using the leave-one-out approach. Results: BD and HI showed significantly different patterns of activation for 0-250ms within both left occipital and temporal regions, specifically for neutral facial expressions. PCA analysis revealed significant differences between BD and HI for mild fearful, happy, and sad facial expressions within 250-500ms. SVM quadratic classifier showed greatest accuracy (84%) and sensitivity (92%) for neutral faces, in left occipital regions within 500-750ms. Conclusions: MEG responses may be used in the search for disease specific neural markers.
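The dimensionality-reduction step (keeping the principal components that explain 95% of the variation) can be sketched with a plain SVD. This is an illustrative sketch on synthetic data, not the authors' Matlab pipeline: the array sizes (32 subjects, 20 channels), the function name and the centering-only normalization are assumptions, and the SVM classification step is not reproduced here.

```python
import numpy as np

def pca_scores_for_variance(data, target=0.95):
    """Project rows of `data` onto the fewest PCs whose cumulative
    explained variance reaches `target`."""
    centered = data - data.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(explained)
    # First index where the cumulative share reaches the target.
    n_components = min(int(np.searchsorted(cumulative, target)) + 1, len(cumulative))
    scores = centered @ vt[:n_components].T
    return scores, n_components, cumulative

# Synthetic stand-in for mean channel data: 32 subjects x 20 channels.
rng = np.random.default_rng(1)
latent = rng.normal(size=(32, 3))
mixing = rng.normal(size=(3, 20))
data = latent @ mixing + 0.05 * rng.normal(size=(32, 20))

scores, n_components, cumulative = pca_scores_for_variance(data)
print(n_components, scores.shape)
```

The resulting `scores` matrix is what a downstream classifier would receive as features, with one row per subject.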
Abstract:
A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Pólya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the Pólya-Gamma random variable is developed.
Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.
The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.
The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.
All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the Pólya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.
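One way to picture a Gaussian approximation to a Pólya-Gamma variable is moment matching, sketched below. The dissertation's actual approximation is not described in the abstract, so this is an assumption: it uses the known mean and variance of PG(b, c) and draws from a normal distribution with those moments, which is motivated by the fact that PG(b, c) for integer b is a sum of b independent PG(1, c) variables. The function names are hypothetical.

```python
import math
import random

def pg_mean(b, c):
    """Mean of a Polya-Gamma PG(b, c) random variable."""
    if abs(c) < 1e-4:
        return b / 4.0                       # limit as c -> 0
    return b / (2.0 * c) * math.tanh(c / 2.0)

def pg_variance(b, c):
    """Variance of a Polya-Gamma PG(b, c) random variable."""
    if abs(c) < 1e-4:
        return b / 24.0                      # limit as c -> 0
    sech2 = 1.0 / math.cosh(c / 2.0) ** 2
    return b / (4.0 * c**3) * (math.sinh(c) - c) * sech2

def pg_gaussian_draw(b, c, rng=random):
    """Moment-matched normal draw; reasonable only when b is large."""
    return rng.gauss(pg_mean(b, c), math.sqrt(pg_variance(b, c)))

random.seed(0)
print(pg_mean(500, 1.2), pg_variance(500, 1.2), pg_gaussian_draw(500, 1.2))
```

Because a normal draw can be negative while a Pólya-Gamma variable cannot, a sketch like this is only sensible when the mean is many standard deviations above zero, which is the high-count regime the abstract refers to.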
Abstract:
We analyze a real data set pertaining to reindeer fecal pellet-group counts obtained from a survey conducted in a forest area in northern Sweden. In the data set, over 70% of counts are zeros, and there is high spatial correlation. We use conditionally autoregressive random effects for modeling of spatial correlation in a Poisson generalized linear mixed model (GLMM), quasi-Poisson hierarchical generalized linear model (HGLM), zero-inflated Poisson (ZIP), and hurdle models. The quasi-Poisson HGLM allows for both under- and overdispersion with excessive zeros, while the ZIP and hurdle models allow only for overdispersion. In analyzing the real data set, we see that the quasi-Poisson HGLMs can perform better than the other commonly used models, for example, ordinary Poisson HGLMs, spatial ZIP, and spatial hurdle models, and that the underdispersed Poisson HGLMs with spatial correlation fit the reindeer data best. We develop R codes for fitting these models using a unified algorithm for the HGLMs. Spatial count response with an extremely high proportion of zeros, and underdispersion can be successfully modeled using the quasi-Poisson HGLM with spatial random effects.
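The difference between the zero-inflated Poisson and hurdle formulations named above is easiest to see in their probability mass functions. The sketch below is illustrative only: it omits the spatial random effects and the HGLM machinery, and the parameter values are hypothetical, chosen so that the zero probability exceeds 70% as in the reindeer data.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson count (which may also be zero)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def hurdle_pmf(k, lam, p_zero):
    """Hurdle model: zero with probability p_zero, otherwise a
    zero-truncated Poisson count."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))

def zip_mean_var(lam, pi):
    """Mean and variance of the ZIP distribution."""
    mean = (1.0 - pi) * lam
    var = mean * (1.0 + pi * lam)
    return mean, var

lam, pi = 1.5, 0.7          # hypothetical values
mean, var = zip_mean_var(lam, pi)
print("P(0) under ZIP:", zip_pmf(0, lam, pi))
print("ZIP mean and variance:", mean, var)
```

The ZIP variance equals the mean times (1 + πλ), so it can never fall below the mean; this is the sense in which the ZIP model accommodates only overdispersion.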
Abstract:
This study investigated how the species distribution and biomass production of aquatic macrophytes are influenced by the physicochemical conditions of the environment. It also evaluated how a species with greater competitive potential can interfere with the species diversity of the macrophyte community. To this end, six transects perpendicular to the bank were laid out in each of three streams. In each transect, three 1 m² sampling units were marked out, in which the phytosociological parameters relative cover, relative frequency and importance value were recorded. Species diversity was estimated with the Shannon index, using the species cover values. To determine the biomass of the aquatic macrophytes, 0.25 m² quadrats were used, placed within the 1 m² sampling units used to quantify the phytosociological data, at the same points where the vegetation cover survey was carried out. The predictor variables were current velocity, incident solar radiation, shading coefficient, adjacent riparian tree vegetation, dissolved organic nitrogen, dissolved organic carbon and electrical conductivity. A total of 32 species of aquatic macrophytes were recorded, distributed across 19 families and 28 genera. According to Canonical Correspondence Analysis (CCA), the species with the highest biomass values were associated with sampling units with high light incidence. The sampling units dominated by Pistia stratiotes showed lower species diversity, indicating that this species, when it finds conditions that allow it to proliferate, can exclude species with lower competitive potential. According to the GLM (Generalized Linear Model), riparian vegetation that is absent or present on only one bank, together with low current velocities, constitutes favourable conditions for the establishment and development of aquatic macrophytes, enabling higher biomass values.
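The Shannon index computed from cover values can be sketched in a few lines. The cover percentages below are hypothetical and serve only to show how dominance by one species lowers the index; the natural logarithm is assumed, since the abstract does not state the base.

```python
import math

def shannon_index(cover):
    """Shannon diversity H' = -sum(p_i * ln p_i) from per-species cover."""
    total = sum(cover)
    proportions = [c / total for c in cover if c > 0]
    return -sum(p * math.log(p) for p in proportions)

even_unit = [25.0, 25.0, 25.0, 25.0]        # hypothetical: four species, equal cover
dominated_unit = [85.0, 5.0, 5.0, 5.0]      # hypothetical: one dominant species

print("even cover:", shannon_index(even_unit))
print("dominated cover:", shannon_index(dominated_unit))
```

For a fixed number of species the index is largest when cover is evenly shared, which is why a unit dominated by a single species scores lower.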
Abstract:
Breast milk is regarded as an ideal source of nutrients for the growth and development of neonates, but it can also be a potential source of pollutants. Mothers can be exposed to different contaminants as a result of their lifestyle and environmental pollution. Mercury (Hg) and arsenic (As) could adversely affect the development of the fetal and neonatal nervous system. Some fish and shellfish are rich in selenium (Se), an essential trace element that forms part of several enzymes related to the detoxification process, including glutathione S-transferase (GST). The goal of this study was to determine the interaction between Hg, As and Se and analyze its effect on the activity of GST in breast milk. Milk samples were collected from women between days 7 and 10 postpartum. The GST activity was determined spectrophotometrically; total Hg, As and Se concentrations were measured by atomic absorption spectrometry. To explain the possible association of Hg, As and Se concentrations with GST activity in breast milk, generalized linear models were constructed. The model explained 44% of the GST activity measured in breast milk. The GLM suggests that GST activity was positively correlated with Hg, As and Se concentrations. The activity of the enzyme was also explained by the frequency of consumption of marine fish and shellfish in the diet of the breastfeeding women.
Abstract:
INTRODUCTION: Attaining an accurate diagnosis in the acute phase for severely brain-damaged patients presenting Disorders of Consciousness (DOC) is crucial for prognostic validity; such a diagnosis determines further medical management, in terms of therapeutic choices and end-of-life decisions. However, DOC evaluation based on validated scales, such as the Revised Coma Recovery Scale (CRS-R), can lead to an underestimation of consciousness and to frequent misdiagnoses particularly in cases of cognitive motor dissociation due to other aetiologies. The purpose of this study is to determine the clinical signs that lead to a more accurate consciousness assessment allowing more reliable outcome prediction. METHODS: From the Unit of Acute Neurorehabilitation (University Hospital, Lausanne, Switzerland) between 2011 and 2014, we enrolled 33 DOC patients with a DOC diagnosis according to the CRS-R that had been established within 28 days of brain damage. The first CRS-R assessment established the initial diagnosis of Unresponsive Wakefulness Syndrome (UWS) in 20 patients and a Minimally Conscious State (MCS) in the remaining 13 patients. We clinically evaluated the patients over time using the CRS-R scale and concurrently from the beginning with complementary clinical items of a new observational Motor Behaviour Tool (MBT). Primary endpoint was outcome at unit discharge distinguishing two main classes of patients (DOC patients having emerged from DOC and those remaining in DOC) and 6 subclasses detailing the outcome of UWS and MCS patients, respectively. Based on CRS-R and MBT scores assessed separately and jointly, statistical testing was performed in the acute phase using a non-parametric Mann-Whitney U test; longitudinal CRS-R data were modelled with a Generalized Linear Model. RESULTS: Fifty-five per cent of the UWS patients and 77% of the MCS patients had emerged from DOC. 
First, statistical prediction of the first CRS-R scores did not permit outcome differentiation between classes; longitudinal regression modelling of the CRS-R data identified distinct outcome evolution, but not earlier than 19 days. Second, the MBT yielded a significant outcome predictability in the acute phase (p<0.02, sensitivity>0.81). Third, a statistical comparison of the CRS-R subscales weighted by MBT became significantly predictive for DOC outcome (p<0.02). DISCUSSION: The association of MBT and CRS-R scoring improves significantly the evaluation of consciousness and the predictability of outcome in the acute phase. Subtle motor behaviour assessment provides accurate insight into the amount and the content of consciousness even in the case of cognitive motor dissociation.
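The Mann-Whitney U statistic used for the acute-phase comparison can be computed directly from its pairwise definition, as sketched below. The scores are hypothetical and are not study data, the function name is an assumption, and the sketch stops at the statistic itself without computing a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): U_a counts pairs where a value from sample_a
    exceeds a value from sample_b, with ties counted as one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical scores for two outcome classes (illustration only).
emerged = [9, 11, 12, 14, 15]
remained = [4, 6, 7, 9]

u_emerged, u_remained = mann_whitney_u(emerged, remained)
print(u_emerged, u_remained)
```

The two statistics always sum to the number of cross-group pairs, so a value of U close to that product indicates nearly complete separation of the groups.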
Abstract:
One objective of this study is to classify socio-demographic components at the city-section level in the City of Lisbon. To provide a suitable basis for the restaurant potentiality map, the socio-demographic components were selected to produce a map of spatial clusters according to restaurant suitability. The second objective is to obtain a potentiality map in terms of underestimation and overestimation of the number of restaurants. To the best of our knowledge, no identical methodology for estimating restaurant potentiality has been reported. The results were obtained by combining a SOM (Self-Organizing Map), which provides a segmentation map, with a GAM (Generalized Additive Model) with a spatial component for restaurant potentiality. The final results indicate that the strongest influences on restaurant potentiality are tourist sites, spatial autocorrelation in terms of neighboring restaurants (the spatial component), and tax value, whereas households with 1 or 2 members and the employed population are of lower importance. In addition, an important conclusion is that the most attractive market sites show no change or moderate underestimation in terms of restaurant potentiality.
Abstract:
Transferring distribution models between different geographical areas may be problematic, as the performance of models outside their original scope is hard to predict. A modelling procedure is needed that gets the gist of the environmental descriptors of a distribution area, without either overfitting to the training data or overestimating the species’ distribution potential. We tested the transferability power of the favourability function, a generalized linear model, on the distribution of the Iberian desman (Galemys pyrenaicus) in the Iberian territories of Portugal and Spain. We also tested the effects of two of the main potential constraints on model transferability: the analysed ranges of the predictor variables, and the completeness of the species distribution data. We modelled 10 km × 10 km presence/absence data from Portugal and Spain separately, extrapolated each model to the other country, and compared predictions with observations. The Spanish model, despite arguably containing more false absences, showed good predictive ability in Portugal. The Portuguese model, whose predictors ranged between only a subset of the values observed in Spain, overestimated desman distribution when transferred. We discuss possible reasons for this differential model behaviour, and highlight the importance of this kind of models for prediction and conservation applications.
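A commonly cited form of the favourability function rescales a logistic-regression probability by the prevalence of the training data, which is what makes values comparable across areas with different proportions of presences. The sketch below assumes that form; the abstract itself does not give the formula, and the counts used are hypothetical.

```python
def favourability(p, n_presences, n_absences):
    """Favourability F = odds / (prevalence_odds + odds), where
    odds = p / (1 - p) and prevalence_odds = n_presences / n_absences."""
    odds = p / (1.0 - p)
    prevalence_odds = n_presences / n_absences
    return odds / (prevalence_odds + odds)

# Hypothetical training set: 30 presence cells and 70 absence cells.
n1, n0 = 30, 70
for p in (0.1, 0.3, 0.6):
    print(p, round(favourability(p, n1, n0), 3))
```

Under this definition a predicted probability equal to the training prevalence maps to a favourability of 0.5, regardless of how common the species is in the training area.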
Abstract:
In a sample of censored survival times, the presence of an immune proportion of individuals who are not subject to death, failure or relapse, may be indicated by a relatively high number of individuals with large censored survival times. In this paper the generalized log-gamma model is modified for the possibility that long-term survivors may be present in the data. The model attempts to separately estimate the effects of covariates on the surviving fraction, that is, the proportion of the population for which the event never occurs. The logistic function is used for the regression model of the surviving fraction. Inference for the model parameters is considered via maximum likelihood. Some influence methods, such as the local influence and total local influence of an individual are derived, analyzed and discussed. Finally, a data set from the medical area is analyzed under the log-gamma generalized mixture model. A residual analysis is performed in order to select an appropriate model.
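The structure of a long-term survivor mixture model can be sketched as a population survival curve that levels off at the surviving fraction. In the sketch below the surviving fraction follows a logistic model, as in the abstract, but the survival function of the susceptible individuals is an exponential stand-in rather than the generalized log-gamma model, and all coefficient values are hypothetical.

```python
import math

def surviving_fraction(x, beta0, beta1):
    """Logistic model for the proportion that never experiences the event."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def population_survival(t, x, beta0, beta1, rate):
    """S_pop(t) = pi + (1 - pi) * S_u(t), with S_u(t) = exp(-rate * t)
    used as a simple stand-in for the susceptible survival function."""
    pi = surviving_fraction(x, beta0, beta1)
    return pi + (1.0 - pi) * math.exp(-rate * t)

beta0, beta1, rate = -0.5, 1.0, 0.2     # hypothetical values
for t in (0.0, 5.0, 50.0):
    print(t, round(population_survival(t, 1.0, beta0, beta1, rate), 4))
```

The plateau of the curve at large t is the surviving fraction, which is why a cluster of large censored times in the data suggests that such a fraction is present.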
Abstract:
A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceivable that the environmental conditions and facilities shared within a clinic affect the proportion cured as well as the failure risk for the uncured individuals. This necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed. Copyright (C) 2001 John Wiley & Sons, Ltd.
Abstract:
In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
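The factorization of the likelihood into an occurrence part and a magnitude part can be made concrete with a small sketch. It deliberately ignores covariates and random effects, so it is not the GLMM of the paper; the claim costs and parameter values are hypothetical.

```python
import math

def gamma_logpdf(y, shape, mean):
    """Log density of a gamma distribution parameterized by shape and mean."""
    scale = mean / shape
    return ((shape - 1.0) * math.log(y) - y / scale
            - shape * math.log(scale) - math.lgamma(shape))

def zero_augmented_gamma_loglik(costs, p_claim, shape, mean):
    """Return the two additive log-likelihood parts:
    occurrence (claim vs. no claim) and magnitude (cost given a claim)."""
    occurrence = 0.0
    magnitude = 0.0
    for y in costs:
        if y == 0:
            occurrence += math.log(1.0 - p_claim)
        else:
            occurrence += math.log(p_claim)
            magnitude += gamma_logpdf(y, shape, mean)
    return occurrence, magnitude

costs = [0, 0, 0, 120.0, 80.0]           # hypothetical claim costs
occ, mag = zero_augmented_gamma_loglik(costs, p_claim=0.4, shape=2.0, mean=100.0)
print("occurrence part:", occ)
print("magnitude part:", mag)
print("total log-likelihood:", occ + mag)
```

Because the occurrence part depends only on `p_claim` and the magnitude part only on the gamma parameters, each can be maximized separately, which mirrors the reduction to two independent fits described in the abstract.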
Abstract:
PURPOSE: The longitudinal relaxation rate (R1) measured in vivo depends on the local microstructural properties of the tissue, such as macromolecular, iron, and water content. Here, we use whole brain multiparametric in vivo data and a general linear relaxometry model to describe the dependence of R1 on these components. We explore a) the validity of having a single fixed set of model coefficients for the whole brain and b) the stability of the model coefficients in a large cohort. METHODS: Maps of magnetization transfer (MT) and effective transverse relaxation rate (R2*) were used as surrogates for macromolecular and iron content, respectively. Spatial variations in these parameters reflected variations in underlying tissue microstructure. A linear model was applied to the whole brain, including gray/white matter and deep brain structures, to determine the global model coefficients. Synthetic R1 values were then calculated using these coefficients and compared with the measured R1 maps. RESULTS: The model's validity was demonstrated by correspondence between the synthetic and measured R1 values and by high stability of the model coefficients across a large cohort. CONCLUSION: A single set of global coefficients can be used to relate R1, MT, and R2* across the whole brain. Our population study demonstrates the robustness and stability of the model. Magn Reson Med, 2014. © 2014 The Authors. Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. Magn Reson Med 73:1309-1314, 2015. © 2014 Wiley Periodicals, Inc.
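The fit-then-synthesize procedure in the Methods can be sketched with ordinary least squares. The model form used here, R1 = b0 + b1·MT + b2·R2* with an intercept, is an assumption based on the abstract's wording, and the voxel values and coefficients are synthetic and hypothetical rather than taken from the study.

```python
import numpy as np

def fit_linear_relaxometry(r1, mt, r2s):
    """Least-squares coefficients (b0, b1, b2) for R1 = b0 + b1*MT + b2*R2*."""
    design = np.column_stack([np.ones_like(mt), mt, r2s])
    coeffs, *_ = np.linalg.lstsq(design, r1, rcond=None)
    return coeffs

def synthetic_r1(coeffs, mt, r2s):
    """R1 values predicted from MT and R2* using fitted coefficients."""
    return coeffs[0] + coeffs[1] * mt + coeffs[2] * r2s

# Synthetic "voxels" generated from hypothetical coefficients.
rng = np.random.default_rng(2)
mt = rng.uniform(0.5, 2.0, size=500)
r2s = rng.uniform(10.0, 40.0, size=500)
true_coeffs = np.array([0.3, 0.4, 0.01])
r1 = synthetic_r1(true_coeffs, mt, r2s) + 0.01 * rng.normal(size=500)

coeffs = fit_linear_relaxometry(r1, mt, r2s)
residual = r1 - synthetic_r1(coeffs, mt, r2s)
print("fitted coefficients:", coeffs)
print("RMS difference, measured vs. synthetic R1:", np.sqrt(np.mean(residual**2)))
```

Comparing the measured and synthetic maps, as in the last two lines, is the same kind of check the abstract describes for assessing whether one global set of coefficients is adequate.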
Abstract:
A physically motivated statistical model is used to diagnose variability and trends in wintertime (October-March) Global Precipitation Climatology Project (GPCP) pentad (5-day mean) precipitation. Quasi-geostrophic theory suggests that extratropical precipitation amounts should depend multiplicatively on the pressure gradient, saturation specific humidity, and the meridional temperature gradient. This physical insight has been used to guide the development of a suitable statistical model for precipitation using a mixture of generalized linear models: a logistic model for the binary occurrence of precipitation and a Gamma distribution model for the wet day precipitation amount. The statistical model allows for the investigation of the role of each factor in determining variations and long-term trends. Saturation specific humidity q(s) has a generally negative effect on global precipitation occurrence and on the tropical wet pentad precipitation amount, but has a positive relationship with the pentad precipitation amount at mid- and high latitudes. The North Atlantic Oscillation, a proxy for the meridional temperature gradient, is also found to have a statistically significant positive effect on precipitation over much of the Atlantic region. Residual time trends in wet pentad precipitation are extremely sensitive to the choice of the wet pentad threshold because of increasing trends in low-amplitude precipitation pentads; too low a choice of threshold can lead to a spurious decreasing trend in wet pentad precipitation amounts. However, for thresholds that are not too small, it is found that the meridional temperature gradient is an important factor for explaining part of the long-term trend in Atlantic precipitation.