153 resultados para correlation analysis
Resumo:
The export of sediments from coastal catchments can have detrimental impacts on estuaries and near shore reef ecosystems such as the Great Barrier Reef. Catchment management approaches aimed at reducing sediment loads require monitoring to evaluate their effectiveness in reducing loads over time. However, load estimation is not a trivial task due to the complex behaviour of constituents in natural streams, the variability of water flows and often a limited amount of data. Regression is commonly used for load estimation and provides a fundamental tool for trend estimation by standardising the other time specific covariates such as flow. This study investigates whether load estimates and resultant power to detect trends can be enhanced by (i) modelling the error structure so that temporal correlation can be better quantified, (ii) making use of predictive variables, and (iii) by identifying an efficient and feasible sampling strategy that may be used to reduce sampling error. To achieve this, we propose a new regression model that includes an innovative compounding errors model structure and uses two additional predictive variables (average discounted flow and turbidity). By combining this modelling approach with a new, regularly optimised, sampling strategy, which adds uniformity to the event sampling strategy, the predictive power was increased to 90%. Using the enhanced regression model proposed here, it was possible to detect a trend of 20% over 20 years. This result is in stark contrast to previous conclusions presented in the literature. (C) 2014 Elsevier B.V. All rights reserved.
Resumo:
This paper proposes a linear quantile regression analysis method for longitudinal data that combines the between- and within-subject estimating functions, which incorporates the correlations between repeated measurements. Therefore, the proposed method results in more efficient parameter estimation relative to the estimating functions based on an independence working model. To reduce computational burdens, the induced smoothing method is introduced to obtain parameter estimates and their variances. Under some regularity conditions, the estimators derived by the induced smoothing method are consistent and have asymptotically normal distributions. A number of simulation studies are carried out to evaluate the performance of the proposed method. The results indicate that the efficiency gain for the proposed method is substantial especially when strong within correlations exist. Finally, a dataset from the audiology growth research is used to illustrate the proposed methodology.
Resumo:
With growing population and fast urbanization in Australia, it is a challenging task to maintain our water quality. It is essential to develop an appropriate statistical methodology in analyzing water quality data in order to draw valid conclusions and hence provide useful advices in water management. This paper is to develop robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006) which leads to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data from Total Iron and Total Cyanophytes shows the differences between the traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of the rank regression models for analyzing nonnormal data.
Resumo:
Objective To discuss generalized estimating equations as an extension of generalized linear models by commenting on the paper of Ziegler and Vens "Generalized Estimating Equations. Notes on the Choice of the Working Correlation Matrix". Methods Inviting an international group of experts to comment on this paper. Results Several perspectives have been taken by the discussants. Econometricians have established parallels to the generalized method of moments (GMM). Statisticians discussed model assumptions and the aspect of missing data Applied statisticians; commented on practical aspects in data analysis. Conclusions In general, careful modeling correlation is encouraged when considering estimation efficiency and other implications, and a comparison of choosing instruments in GMM and generalized estimating equations, (GEE) would be worthwhile. Some theoretical drawbacks of GEE need to be further addressed and require careful analysis of data This particularly applies to the situation when data are missing at random.
Resumo:
The method of generalized estimating equations (GEEs) provides consistent estimates of the regression parameters in a marginal regression model for longitudinal data, even when the working correlation model is misspecified (Liang and Zeger, 1986). However, the efficiency of a GEE estimate can be seriously affected by the choice of the working correlation model. This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation models and is almost fully efficient in all scenarios studied. The hybrid method is illustrated using data from a longitudinal study of the respiratory infection rates in 275 Indonesian children.
Resumo:
Selecting an appropriate working correlation structure is pertinent to clustered data analysis using generalized estimating equations (GEE) because an inappropriate choice will lead to inefficient parameter estimation. We investigate the well-known criterion of QIC for selecting a working correlation Structure. and have found that performance of the QIC is deteriorated by a term that is theoretically independent of the correlation structures but has to be estimated with an error. This leads LIS to propose a correlation information criterion (CIC) that substantially improves the QIC performance. Extensive simulation studies indicate that the CIC has remarkable improvement in selecting the correct correlation structures. We also illustrate our findings using a data set from the Madras Longitudinal Schizophrenia Study.
Resumo:
We consider ranked-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate covariance of the estimators is also given, which can bypass estimation of the density function. Simulation studies are carried out to compare different estimators for a number of scenarios on the correlation structure, presence/absence of outliers and different correlation values. The proposed methods appear to perform well, in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of correlation structure and outliers. A real example is provided for illustration.
Resumo:
We consider the analysis of longitudinal data when the covariance function is modeled by additional parameters to the mean parameters. In general, inconsistent estimators of the covariance (variance/correlation) parameters will be produced when the "working" correlation matrix is misspecified, which may result in great loss of efficiency of the mean parameter estimators (albeit the consistency is preserved). We consider using different "Working" correlation models for the variance and the mean parameters. In particular, we find that an independence working model should be used for estimating the variance parameters to ensure their consistency in case the correlation structure is misspecified. The designated "working" correlation matrices should be used for estimating the mean and the correlation parameters to attain high efficiency for estimating the mean parameters. Simulation studies indicate that the proposed algorithm performs very well. We also applied different estimation procedures to a data set from a clinical trial for illustration.
Resumo:
The approach of generalized estimating equations (GEE) is based on the framework of generalized linear models but allows for specification of a working matrix for modeling within-subject correlations. The variance is often assumed to be a known function of the mean. This article investigates the impacts of misspecifying the variance function on estimators of the mean parameters for quantitative responses. Our numerical studies indicate that (1) correct specification of the variance function can improve the estimation efficiency even if the correlation structure is misspecified; (2) misspecification of the variance function impacts much more on estimators for within-cluster covariates than for cluster-level covariates; and (3) if the variance function is misspecified, correct choice of the correlation structure may not necessarily improve estimation efficiency. We illustrate impacts of different variance functions using a real data set from cow growth.
Resumo:
Efficiency of analysis using generalized estimation equations is enhanced when intracluster correlation structure is accurately modeled. We compare two existing criteria (a quasi-likelihood information criterion, and the Rotnitzky-Jewell criterion) to identify the true correlation structure via simulations with Gaussian or binomial response, covariates varying at cluster or observation level, and exchangeable or AR(l) intracluster correlation structure. Rotnitzky and Jewell's approach performs better when the true intracluster correlation structure is exchangeable, while the quasi-likelihood criteria performs better for an AR(l) structure.
Resumo:
The method of generalized estimating equation-, (GEEs) has been criticized recently for a failure to protect against misspecification of working correlation models, which in some cases leads to loss of efficiency or infeasibility of solutions. However, the feasibility and efficiency of GEE methods can be enhanced considerably by using flexible families of working correlation models. We propose two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The supplementary estimating equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as score equations for decoupled Gaussian pseudolikelihood. The estimating equations are solved with computational effort equivalent to that required for a first-order GEE. Full details and analytic expressions are developed for a generalized Markovian model that was evaluated through simulation. Large-sample ".sandwich" standard errors for working correlation parameter estimates are derived and shown to have good performance. The proposed estimating functions are further illustrated in an analysis of repeated measures of pulmonary function in children.
Resumo:
The method of generalised estimating equations for regression modelling of clustered outcomes allows for specification of a working matrix that is intended to approximate the true correlation matrix of the observations. We investigate the asymptotic relative efficiency of the generalised estimating equation for the mean parameters when the correlation parameters are estimated by various methods. The asymptotic relative efficiency depends on three-features of the analysis, namely (i) the discrepancy between the working correlation structure and the unobservable true correlation structure, (ii) the method by which the correlation parameters are estimated and (iii) the 'design', by which we refer to both the structures of the predictor matrices within clusters and distribution of cluster sizes. Analytical and numerical studies of realistic data-analysis scenarios show that choice of working covariance model has a substantial impact on regression estimator efficiency. Protection against avoidable loss of efficiency associated with covariance misspecification is obtained when a 'Gaussian estimation' pseudolikelihood procedure is used with an AR(1) structure.
Resumo:
The article describes a generalized estimating equations approach that was used to investigate the impact of technology on vessel performance in a trawl fishery during 1988-96, while accounting for spatial and temporal correlations in the catch-effort data. Robust estimation of parameters in the presence of several levels of clustering depended more on the choice of cluster definition than on the choice of correlation structure within the cluster. Models with smaller cluster sizes produced stable results, while models with larger cluster sizes, that may have had complex within-cluster correlation structures and that had within-cluster covariates, produced estimates sensitive to the correlation structure. The preferred model arising from this dataset assumed that catches from a vessel were correlated in the same years and the same areas, but independent in different years and areas. The model that assumed catches from a vessel were correlated in all years and areas, equivalent to a random effects term for vessel, produced spurious results. This was an unexpected finding that highlighted the need to adopt a systematic strategy for modelling. The article proposes a modelling strategy of selecting the best cluster definition first, and the working correlation structure (within clusters) second. The article discusses the selection and interpretation of the model in the light of background knowledge of the data and utility of the model, and the potential for this modelling approach to apply in similar statistical situations.
Resumo:
OBJECTIVE To quantify genetic overlap between migraine and ischemic stroke (IS) with respect to common genetic variation. METHODS We applied 4 different approaches to large-scale meta-analyses of genome-wide data on migraine (23,285 cases and 95,425 controls) and IS (12,389 cases and 62,004 controls). First, we queried known genome-wide significant loci for both disorders, looking for potential overlap of signals. We then analyzed the overall shared genetic load using polygenic scores and estimated the genetic correlation between disease subtypes using data derived from these models. We further interrogated genomic regions of shared risk using analysis of covariance patterns between the 2 phenotypes using cross-phenotype spatial mapping. RESULTS We found substantial genetic overlap between migraine and IS using all 4 approaches. Migraine without aura (MO) showed much stronger overlap with IS and its subtypes than migraine with aura (MA). The strongest overlap existed between MO and large artery stroke (LAS; p = 6.4 x 10(-28) for the LAS polygenic score in MO) and between MO and cardioembolic stroke (CE; p = 2.7 x 10(-20) for the CE score in MO). CONCLUSIONS Our findings indicate shared genetic susceptibility to migraine and IS, with a particularly strong overlap between MO and both LAS and CE pointing towards shared mechanisms. Our observations on MA are consistent with a limited role of common genetic variants in this subtype.
Resumo:
Telomere length (TL) has been associated with aging and mortality, but individual differences are also influenced by genetic factors, with previous studies reporting heritability estimates ranging from 34 to 82%. Here we investigate the heritability, mode of inheritance and the influence of parental age at birth on TL in six large, independent cohort studies with a total of 19 713 participants. The meta-analysis estimate of TL heritability was 0.70 (95% CI 0.64–0.76) and is based on a pattern of results that is highly similar for twins and other family members. We observed a stronger mother–offspring (r=0.42; P-value=3.60 × 10−61) than father–offspring correlation (r=0.33; P-value=7.01 × 10−5), and a significant positive association with paternal age at offspring birth (β=0.005; P-value=7.01 × 10−5). Interestingly, a significant and quite substantial correlation in TL between spouses (r=0.25; P-value=2.82 × 10−30) was seen, which appeared stronger in older spouse pairs (mean age ≥55 years; r=0.31; P-value=4.27 × 10−23) than in younger pairs (mean age<55 years; r=0.20; P-value=3.24 × 10−10). In summary, we find a high and very consistent heritability estimate for TL, evidence for a maternal inheritance component and a positive association with paternal age.