919 resultados para working correlation structure
Resumo:
Selecting an appropriate working correlation structure is pertinent to clustered data analysis using generalized estimating equations (GEE) because an inappropriate choice will lead to inefficient parameter estimation. We investigate the well-known criterion of QIC for selecting a working correlation Structure. and have found that performance of the QIC is deteriorated by a term that is theoretically independent of the correlation structures but has to be estimated with an error. This leads LIS to propose a correlation information criterion (CIC) that substantially improves the QIC performance. Extensive simulation studies indicate that the CIC has remarkable improvement in selecting the correct correlation structures. We also illustrate our findings using a data set from the Madras Longitudinal Schizophrenia Study.
Resumo:
Efficiency of analysis using generalized estimation equations is enhanced when intracluster correlation structure is accurately modeled. We compare two existing criteria (a quasi-likelihood information criterion, and the Rotnitzky-Jewell criterion) to identify the true correlation structure via simulations with Gaussian or binomial response, covariates varying at cluster or observation level, and exchangeable or AR(l) intracluster correlation structure. Rotnitzky and Jewell's approach performs better when the true intracluster correlation structure is exchangeable, while the quasi-likelihood criteria performs better for an AR(l) structure.
Resumo:
The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation for the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three-features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
Resumo:
Selection criteria and misspecification tests for the intra-cluster correlation structure (ICS) in longitudinal data analysis are considered. In particular, the asymptotical distribution of the correlation information criterion (CIC) is derived and a new method for selecting a working ICS is proposed by standardizing the selection criterion as the p-value. The CIC test is found to be powerful in detecting misspecification of the working ICS structures, while with respect to the working ICS selection, the standardized CIC test is also shown to have satisfactory performance. Some simulation studies and applications to two real longitudinal datasets are made to illustrate how these criteria and tests might be useful.
Resumo:
A modeling paradigm is proposed for covariate, variance and working correlation structure selection for longitudinal data analysis. Appropriate selection of covariates is pertinent to correct variance modeling and selecting the appropriate covariates and variance function is vital to correlation structure selection. This leads to a stepwise model selection procedure that deploys a combination of different model selection criteria. Although these criteria find a common theoretical root based on approximating the Kullback-Leibler distance, they are designed to address different aspects of model selection and have different merits and limitations. For example, the extended quasi-likelihood information criterion (EQIC) with a covariance penalty performs well for covariate selection even when the working variance function is misspecified, but EQIC contains little information on correlation structures. The proposed model selection strategies are outlined and a Monte Carlo assessment of their finite sample properties is reported. Two longitudinal studies are used for illustration.
Resumo:
This paper proposes a linear quantile regression analysis method for longitudinal data that combines the between- and within-subject estimating functions, which incorporates the correlations between repeated measurements. Therefore, the proposed method results in more efficient parameter estimation relative to the estimating functions based on an independence working model. To reduce computational burdens, the induced smoothing method is introduced to obtain parameter estimates and their variances. Under some regularity conditions, the estimators derived by the induced smoothing method are consistent and have asymptotically normal distributions. A number of simulation studies are carried out to evaluate the performance of the proposed method. The results indicate that the efficiency gain for the proposed method is substantial especially when strong within correlations exist. Finally, a dataset from the audiology growth research is used to illustrate the proposed methodology.
Resumo:
Spatial data analysis has become more and more important in the studies of ecology and economics during the last decade. One focus of spatial data analysis is how to select predictors, variance functions and correlation functions. However, in general, the true covariance function is unknown and the working covariance structure is often misspecified. In this paper, our target is to find a good strategy to identify the best model from the candidate set using model selection criteria. This paper is to evaluate the ability of some information criteria (corrected Akaike information criterion, Bayesian information criterion (BIC) and residual information criterion (RIC)) for choosing the optimal model when the working correlation function, the working variance function and the working mean function are correct or misspecified. Simulations are carried out for small to moderate sample sizes. Four candidate covariance functions (exponential, Gaussian, Matern and rational quadratic) are used in simulation studies. With the summary in simulation results, we find that the misspecified working correlation structure can still capture some spatial correlation information in model fitting. When the sample size is large enough, BIC and RIC perform well even if the the working covariance is misspecified. Moreover, the performance of these information criteria is related to the average level of model fitting which can be indicated by the average adjusted R square ( [GRAPHICS] ), and overall RIC performs well.
Resumo:
Objective To discuss generalized estimating equations as an extension of generalized linear models by commenting on the paper of Ziegler and Vens "Generalized Estimating Equations. Notes on the Choice of the Working Correlation Matrix". Methods Inviting an international group of experts to comment on this paper. Results Several perspectives have been taken by the discussants. Econometricians have established parallels to the generalized method of moments (GMM). Statisticians discussed model assumptions and the aspect of missing data Applied statisticians; commented on practical aspects in data analysis. Conclusions In general, careful modeling correlation is encouraged when considering estimation efficiency and other implications, and a comparison of choosing instruments in GMM and generalized estimating equations, (GEE) would be worthwhile. Some theoretical drawbacks of GEE need to be further addressed and require careful analysis of data This particularly applies to the situation when data are missing at random.
Resumo:
The method of generalized estimating equation-, (GEEs) has been criticized recently for a failure to protect against misspecification of working correlation models, which in some cases leads to loss of efficiency or infeasibility of solutions. However, the feasibility and efficiency of GEE methods can be enhanced considerably by using flexible families of working correlation models. We propose two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The supplementary estimating equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as score equations for decoupled Gaussian pseudolikelihood. The estimating equations are solved with computational effort equivalent to that required for a first-order GEE. Full details and analytic expressions are developed for a generalized Markovian model that was evaluated through simulation. Large-sample ".sandwich" standard errors for working correlation parameter estimates are derived and shown to have good performance. The proposed estimating functions are further illustrated in an analysis of repeated measures of pulmonary function in children.
Resumo:
This paper investigates how the correlations implied by a first-order simultaneous autoregressive (SAR(1)) process are affected by the weights matrix and the autocorrelation parameter. A graph theoretic representation of the covariances in terms of walks connecting the spatial units helps to clarify a number of correlation properties of the processes. In particular, we study some implications of row-standardizing the weights matrix, the dependence of the correlations on graph distance, and the behavior of the correlations at the extremes of the parameter space. Throughout the analysis differences between directed and undirected networks are emphasized. The graph theoretic representation also clarifies why it is difficult to relate properties ofW to correlation properties of SAR(1) models defined on irregular lattices.
Resumo:
A total of 20,065 weights recorded on 3016 Nelore animals were used to estimate covariance functions for growth from birth to 630 days of age, assuming a parametric correlation structure to model within-animal correlations. The model of analysis included fixed effects of contemporary groups and age of dam as quadratic covariable. Mean trends were taken into account by a cubic regression on orthogonal polynomials of animal age. Genetic effects of the animal and its dam and maternal permanent environmental effects were modelled by random regressions on Legendre polynomials of age at recording. Changes in direct permanent environmental effect variances were modelled by a polynomial variance function, together with a parametric correlation function to account for correlations between ages. Stationary and nonstationary models were used to model within-animal correlations between different ages. Residual variances were considered homogeneous or heterogeneous, with changes modelled by a step or polynomial function of age at recording. Based on Bayesian information criterion, a model with a cubic variance function combined with a nonstationary correlation function for permanent environmental effects, with 49 parameters to be estimated, fitted best. Modelling within-animal correlations through a parametric correlation structure can describe the variation pattern adequately. Moreover, the number of parameters to be estimated can be decreased substantially compared to a model fitting random regression on Legendre polynomial of age. © 2004 Elsevier B.V. All rights reserved.
Resumo:
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988-96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, that may have had complex within-cluster correlation structures and that had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and utility of the model, and the potential for this modelling approach to apply in similar statistical situations.
Resumo:
In longitudinal data analysis, our primary interest is in the regression parameters for the marginal expectations of the longitudinal responses; the longitudinal correlation parameters are of secondary interest. The joint likelihood function for longitudinal data is challenging, particularly for correlated discrete outcome data. Marginal modeling approaches such as generalized estimating equations (GEEs) have received much attention in the context of longitudinal regression. These methods are based on the estimates of the first two moments of the data and the working correlation structure. The confidence regions and hypothesis tests are based on the asymptotic normality. The methods are sensitive to misspecification of the variance function and the working correlation structure. Because of such misspecifications, the estimates can be inefficient and inconsistent, and inference may give incorrect results. To overcome this problem, we propose an empirical likelihood (EL) procedure based on a set of estimating equations for the parameter of interest and discuss its characteristics and asymptotic properties. We also provide an algorithm based on EL principles for the estimation of the regression parameters and the construction of a confidence region for the parameter of interest. We extend our approach to variable selection for highdimensional longitudinal data with many covariates. In this situation it is necessary to identify a submodel that adequately represents the data. Including redundant variables may impact the model’s accuracy and efficiency for inference. We propose a penalized empirical likelihood (PEL) variable selection based on GEEs; the variable selection and the estimation of the coefficients are carried out simultaneously. We discuss its characteristics and asymptotic properties, and present an algorithm for optimizing PEL. Simulation studies show that when the model assumptions are correct, our method performs as well as existing methods, and when the model is misspecified, it has clear advantages. We have applied the method to two case examples.
Resumo:
The success of any diversification strategy depends upon the quality of the estimated correlation between assets. It is well known, however, that there is a tendency for the average correlation among assets to increase when the market falls and vice-versa. Thus, assuming that the correlation between assets is a constant over time seems unrealistic. Nonetheless, these changes in the correlation structure as a consequence of changes in the market’s return suggests that correlation shifts can be modelled as a function of the market return. This is the idea behind the model of Spurgin et al (2000), which models the beta or systematic risk, of the asset as a function of the returns in the market. This is an approach that offers particular attractions to fund managers as it suggest ways by which they can adjust their portfolios to benefit from changes in overall market conditions. In this paper the Spurgin et al (2000) model is applied to 31 real estate market segments in the UK using monthly data over the period 1987:1 to 2000:12. The results show that a number of market segments display significant negative correlation shifts, while others show significantly positive correlation shifts. Using this information fund managers can make strategic and tactical portfolio allocation decisions based on expectations of market volatility alone and so help them achieve greater portfolio performance overall and especially during different phases of the real estate cycle.