846 resultados para Generalized linear models


Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the current concern over climate change, descriptions of how rainfall patterns are changing over time can be useful. Observations of daily rainfall data over the last few decades provide information on these trends. Generalized linear models are typically used to model patterns in the occurrence and intensity of rainfall. These models describe rainfall patterns for an average year but are more limited when describing long-term trends, particularly when these are potentially non-linear. Generalized additive models (GAMS) provide a framework for modelling non-linear relationships by fitting smooth functions to the data. This paper describes how GAMS can extend the flexibility of models to describe seasonal patterns and long-term trends in the occurrence and intensity of daily rainfall using data from Mauritius from 1962 to 2001. Smoothed estimates from the models provide useful graphical descriptions of changing rainfall patterns over the last 40 years at this location. GAMS are particularly helpful when exploring non-linear relationships in the data. Care is needed to ensure the choice of smooth functions is appropriate for the data and modelling objectives. (c) 2008 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: Regional differences in physician supply can be found in many health care systems, regardless of their organizational and financial structure. A theoretical model is developed for the physicians' decision on office allocation, covering demand-side factors and a consumption time function. METHODS: To test the propositions following the theoretical model, generalized linear models were estimated to explain differences in 412 German districts. Various factors found in the literature were included to control for physicians' regional preferences. RESULTS: Evidence in favor of the first three propositions of the theoretical model could be found. Specialists show a stronger association to higher populated districts than GPs. Although indicators for regional preferences are significantly correlated with physician density, their coefficients are not as high as population density. CONCLUSIONS: If regional disparities should be addressed by political actions, the focus should be to counteract those parameters representing physicians' preferences in over- and undersupplied regions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a two-step pseudo likelihood estimation technique for generalized linear mixed models with the random effects being correlated between groups. The core idea is to deal with the intractable integrals in the likelihood function by multivariate Taylor's approximation. The accuracy of the estimation technique is assessed in a Monte-Carlo study. An application of it with a binary response variable is presented using a real data set on credit defaults from two Swedish banks. Thanks to the use of two-step estimation technique, the proposed algorithm outperforms conventional pseudo likelihood algorithms in terms of computational time.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents the techniques of likelihood prediction for the generalized linear mixed models. Methods of likelihood prediction is explained through a series of examples; from a classical one to more complicated ones. The examples show, in simple cases, that the likelihood prediction (LP) coincides with already known best frequentist practice such as the best linear unbiased predictor. The paper outlines a way to deal with the covariate uncertainty while producing predictive inference. Using a Poisson error-in-variable generalized linear model, it has been shown that in complicated cases LP produces better results than already know methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Generalized linear mixed models are flexible tools for modeling non-normal data and are useful for accommodating overdispersion in Poisson regression models with random effects. Their main difficulty resides in the parameter estimation because there is no analytic solution for the maximization of the marginal likelihood. Many methods have been proposed for this purpose and many of them are implemented in software packages. The purpose of this study is to compare the performance of three different statistical principles - marginal likelihood, extended likelihood, Bayesian analysis-via simulation studies. Real data on contact wrestling are used for illustration.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The objective of this study was to evaluate the use of probit and logit link functions for the genetic evaluation of early pregnancy using simulated data. The following simulation/analysis structures were constructed: logit/logit, logit/probit, probit/logit, and probit/probit. The percentages of precocious females were 5, 10, 15, 20, 25 and 30% and were adjusted based on a change in the mean of the latent variable. The parametric heritability (h²) was 0.40. Simulation and genetic evaluation were implemented in the R software. Heritability estimates (ĥ²) were compared with h² using the mean squared error. Pearson correlations between predicted and true breeding values and the percentage of coincidence between true and predicted ranking, considering the 10% of bulls with the highest breeding values (TOP10) were calculated. The mean ĥ² values were under- and overestimated for all percentages of precocious females when logit/probit and probit/logit models used. In addition, the mean squared errors of these models were high when compared with those obtained with the probit/probit and logit/logit models. Considering ĥ², probit/probit and logit/logit were also superior to logit/probit and probit/logit, providing values close to the parametric heritability. Logit/probit and probit/logit presented low Pearson correlations, whereas the correlations obtained with probit/probit and logit/logit ranged from moderate to high. With respect to the TOP10 bulls, logit/probit and probit/logit presented much lower percentages than probit/probit and logit/logit. The genetic parameter estimates and predictions of breeding values of the animals obtained with the logit/logit and probit/probit models were similar. In contrast, the results obtained with probit/logit and logit/probit were not satisfactory. There is need to compare the estimation and prediction ability of logit and probit link functions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To effectively assess and mitigate risk of permafrost disturbance, disturbance-p rone areas can be predicted through the application of susceptibility models. In this study we developed regional susceptibility models for permafrost disturbances using a field disturbance inventory to test the transferability of the model to a broader region in the Canadian High Arctic. Resulting maps of susceptibility were then used to explore the effect of terrain variables on the occurrence of disturbances within this region. To account for a large range of landscape charac- teristics, the model was calibrated using two locations: Sabine Peninsula, Melville Island, NU, and Fosheim Pen- insula, Ellesmere Island, NU. Spatial patterns of disturbance were predicted with a generalized linear model (GLM) and generalized additive model (GAM), each calibrated using disturbed and randomized undisturbed lo- cations from both locations and GIS-derived terrain predictor variables including slope, potential incoming solar radiation, wetness index, topographic position index, elevation, and distance to water. Each model was validated for the Sabine and Fosheim Peninsulas using independent data sets while the transferability of the model to an independent site was assessed at Cape Bounty, Melville Island, NU. The regional GLM and GAM validated well for both calibration sites (Sabine and Fosheim) with the area under the receiver operating curves (AUROC) N 0.79. Both models were applied directly to Cape Bounty without calibration and validated equally with AUROC's of 0.76; however, each model predicted disturbed and undisturbed samples differently. Addition- ally, the sensitivity of the transferred model was assessed using data sets with different sample sizes. Results in- dicated that models based on larger sample sizes transferred more consistently and captured the variability within the terrain attributes in the respective study areas. Terrain attributes associated with the initiation of dis- turbances were similar regardless of the location. Disturbances commonly occurred on slopes between 4 and 15°, below Holocene marine limit, and in areas with low potential incoming solar radiation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In non-linear random effects some attention has been very recently devoted to the analysis ofsuitable transformation of the response variables separately (Taylor 1996) or not (Oberg and Davidian 2000) from the transformations of the covariates and, as far as we know, no investigation has been carried out on the choice of link function in such models. In our study we consider the use of a random effect model when a parameterized family of links (Aranda-Ordaz 1981, Prentice 1996, Pregibon 1980, Stukel 1988 and Czado 1997) is introduced. We point out the advantages and the drawbacks associated with the choice of this data-driven kind of modeling. Difficulties in the interpretation of regression parameters, and therefore in understanding the influence of covariates, as well as problems related to loss of efficiency of estimates and overfitting, are discussed. A case study on radiotherapy usage in breast cancer treatment is discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To effectively assess and mitigate risk of permafrost disturbance, disturbance-p rone areas can be predicted through the application of susceptibility models. In this study we developed regional susceptibility models for permafrost disturbances using a field disturbance inventory to test the transferability of the model to a broader region in the Canadian High Arctic. Resulting maps of susceptibility were then used to explore the effect of terrain variables on the occurrence of disturbances within this region. To account for a large range of landscape charac- teristics, the model was calibrated using two locations: Sabine Peninsula, Melville Island, NU, and Fosheim Pen- insula, Ellesmere Island, NU. Spatial patterns of disturbance were predicted with a generalized linear model (GLM) and generalized additive model (GAM), each calibrated using disturbed and randomized undisturbed lo- cations from both locations and GIS-derived terrain predictor variables including slope, potential incoming solar radiation, wetness index, topographic position index, elevation, and distance to water. Each model was validated for the Sabine and Fosheim Peninsulas using independent data sets while the transferability of the model to an independent site was assessed at Cape Bounty, Melville Island, NU. The regional GLM and GAM validated well for both calibration sites (Sabine and Fosheim) with the area under the receiver operating curves (AUROC) N 0.79. Both models were applied directly to Cape Bounty without calibration and validated equally with AUROC's of 0.76; however, each model predicted disturbed and undisturbed samples differently. Addition- ally, the sensitivity of the transferred model was assessed using data sets with different sample sizes. Results in- dicated that models based on larger sample sizes transferred more consistently and captured the variability within the terrain attributes in the respective study areas. Terrain attributes associated with the initiation of dis- turbances were similar regardless of the location. Disturbances commonly occurred on slopes between 4 and 15°, below Holocene marine limit, and in areas with low potential incoming solar radiation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we propose a general Linear Programming (LP) based formulation and solution methodology for obtaining optimal solution to the load distribution problem in divisible load scheduling. We exploit the power of the versatile LP formulation to propose algorithms that yield exact solutions to several very general load distribution problems for which either no solutions or only heuristic solutions were available. We consider both star (single-level tree) networks and linear daisy chain networks, having processors equipped with front-ends, that form the generic models for several important network topologies. We consider arbitrary processing node availability or release times and general models for communication delays and computation time that account for constant overheads such as start up times in communication and computation. The optimality of the LP based algorithms is proved rigorously.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the Bayesian framework, predictions for a regression problem are expressed in terms of a distribution of output values. The mode of this distribution corresponds to the most probable output, while the uncertainty associated with the predictions can conveniently be expressed in terms of error bars. In this paper we consider the evaluation of error bars in the context of the class of generalized linear regression models. We provide insights into the dependence of the error bars on the location of the data points and we derive an upper bound on the true error bars in terms of the contributions from individual data points which are themselves easily evaluated.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Endocytosis is the process by which cells internalise molecules including nutrient proteins from the extracellular media. In one form, macropinocytosis, the membrane at the cell surface ruffles and folds over to give rise to an internalised vesicle. Negatively charged phospholipids within the membrane called phosphoinositides then undergo a series of transformations that are critical for the correct trafficking of the vesicle within the cell, and which are often pirated by pathogens such as Salmonella. Advanced fluorescent video microscopy imaging now allows the detailed observation and quantification of these events in live cells over time. Here we use these observations as a basis for building differential equation models of the transformations. An initial investigation of these interactions was modelled with reaction rates proportional to the sum of the concentrations of the individual constituents. A first order linear system for the concentrations results. The structure of the system enables analytical expressions to be obtained and the problem becomes one of determining the reaction rates which generate the observed data plots. We present results with reaction rates which capture the general behaviour of the reactions so that we now have a complete mathematical model of phosphoinositide transformations that fits the experimental observations. Some excellent fits are obtained with modulated exponential functions; however, these are not solutions of the linear system. The question arises as to how the model may be modified to obtain a system whose solution provides a more accurate fit.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motorcyclists are the most crash-prone road-user group in many Asian countries including Singapore; however, factors influencing motorcycle crashes are still not well understood. This study examines the effects of various roadway characteristics, traffic control measures and environmental factors on motorcycle crashes at different location types including expressways and intersections. Using techniques of categorical data analysis, this study has developed a set of log-linear models to investigate multi-vehicle motorcycle crashes in Singapore. Motorcycle crash risks in different circumstances have been calculated after controlling for the exposure estimated by the induced exposure technique. Results show that night-time influence increases crash risks of motorcycles particularly during merging and diverging manoeuvres on expressways, and turning manoeuvres at intersections. Riders appear to exercise more care while riding on wet road surfaces particularly during night. Many hazardous interactions at intersections tend to be related to the failure of drivers to notice a motorcycle as well as to judge correctly the speed/distance of an oncoming motorcycle. Road side conflicts due to stopping/waiting vehicles and interactions with opposing traffic on undivided roads have been found to be as detrimental factors on motorcycle safety along arterial, main and local roads away from intersections. Based on the findings of this study, several targeted countermeasures in the form of legislations, rider training, and safety awareness programmes have been recommended.