990 resultados para Linear Series
Resumo:
Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.
Resumo:
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Due to the non-closed form of the likelihood, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints often make it difficult to apply these iterative procedures to data sets with very large number of cases. This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM to subsetted versions of the data. Additional gains in efficiency are achieved for Poisson models, commonly used in disease mapping problems, because of their special collapsibility property which allows data reduction through summaries. Convergence of the proposed iterative procedure is guaranteed for canonical link functions. The strategy is applied to investigate the relationship between ischemic heart disease, socioeconomic status and age/gender category in New South Wales, Australia, based on outcome data consisting of approximately 33 million records. A simulation study demonstrates the algorithm's reliability in analyzing a data set with 12 million records for a (non-collapsible) logistic regression model.
Resumo:
We introduce a diagnostic test for the mixing distribution in a generalised linear mixed model. The test is based on the difference between the marginal maximum likelihood and conditional maximum likelihood estimates of a subset of the fixed effects in the model. We derive the asymptotic variance of this difference, and propose a test statistic that has a limiting chi-square distribution under the null hypothesis that the mixing distribution is correctly specified. For the important special case of the logistic regression model with random intercepts, we evaluate via simulation the power of the test in finite samples under several alternative distributional forms for the mixing distribution. We illustrate the method by applying it to data from a clinical trial investigating the effects of hormonal contraceptives in women.
Resumo:
Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects taking into account both within-site statistical uncertainty, and across-site heterogeneity. Within a site, characteristics of time series data of air pollution and health (small pollution effects, missing data, highly correlated predictors, non linear confounding etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to be combined. In this paper we investigate the impact of variance underestimation on the pooled relative rate estimate. We focus on two-stage normal-normal hierarchical models and on under- estimation of the statistical variance at the first stage. By mathematical considerations and simulation studies, we found that variance underestimation does not affect the pooled estimate substantially. However, some sensitivity of the pooled estimate to variance underestimation is observed when the number of sites is small and underestimation is severe. These simulation results are applicable to any two-stage normal-normal hierarchical model for combining information of site-specific results, and they can be easily extended to more general hierarchical formulations. We also examined the impact of variance underestimation on the national average relative rate estimate from the National Morbidity Mortality Air Pollution Study and we found that variance underestimation as much as 40% has little effect on the national average.
Resumo:
This paper proposes Poisson log-linear multilevel models to investigate population variability in sleep state transition rates. We specifically propose a Bayesian Poisson regression model that is more flexible, scalable to larger studies, and easily fit than other attempts in the literature. We further use hierarchical random effects to account for pairings of individuals and repeated measures within those individuals, as comparing diseased to non-diseased subjects while minimizing bias is of epidemiologic importance. We estimate essentially non-parametric piecewise constant hazards and smooth them, and allow for time varying covariates and segment of the night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming piecewise constant hazards. This relationship allows us to synthesize two methods currently used to analyze sleep transition phenomena: stratified multi-state proportional hazards models and log-linear models with GEE for transition counts. An example data set from the Sleep Heart Health Study is analyzed.
Resumo:
A time series is a sequence of observations made over time. Examples in public health include daily ozone concentrations, weekly admissions to an emergency department or annual expenditures on health care in the United States. Time series models are used to describe the dependence of the response at each time on predictor variables including covariates and possibly previous values in the series. Time series methods are necessary to account for the correlation among repeated responses over time. This paper gives an overview of time series ideas and methods used in public health research.
Resumo:
An important problem in unsupervised data clustering is how to determine the number of clusters. Here we investigate how this can be achieved in an automated way by using interrelation matrices of multivariate time series. Two nonparametric and purely data driven algorithms are expounded and compared. The first exploits the eigenvalue spectra of surrogate data, while the second employs the eigenvector components of the interrelation matrix. Compared to the first algorithm, the second approach is computationally faster and not limited to linear interrelation measures.
Resumo:
Fossil pollen data from stratigraphic cores are irregularly spaced in time due to non-linear age-depth relations. Moreover, their marginal distributions may vary over time. We address these features in a nonparametric regression model with errors that are monotone transformations of a latent continuous-time Gaussian process Z(T). Although Z(T) is unobserved, due to monotonicity, under suitable regularity conditions, it can be recovered facilitating further computations such as estimation of the long-memory parameter and the Hermite coefficients. The estimation of Z(T) itself involves estimation of the marginal distribution function of the regression errors. These issues are considered in proposing a plug-in algorithm for optimal bandwidth selection and construction of confidence bands for the trend function. Some high-resolution time series of pollen records from Lago di Origlio in Switzerland, which go back ca. 20,000 years are used to illustrate the methods.
Resumo:
The rank-based nonlinear predictability score was recently introduced as a test for determinism in point processes. We here adapt this measure to time series sampled from time-continuous flows. We use noisy Lorenz signals to compare this approach against a classical amplitude-based nonlinear prediction error. Both measures show an almost identical robustness against Gaussian white noise. In contrast, when the amplitude distribution of the noise has a narrower central peak and heavier tails than the normal distribution, the rank-based nonlinear predictability score outperforms the amplitude-based nonlinear prediction error. For this type of noise, the nonlinear predictability score has a higher sensitivity for deterministic structure in noisy signals. It also yields a higher statistical power in a surrogate test of the null hypothesis of linear stochastic correlated signals. We show the high relevance of this improved performance in an application to electroencephalographic (EEG) recordings from epilepsy patients. Here the nonlinear predictability score again appears of higher sensitivity to nonrandomness. Importantly, it yields an improved contrast between signals recorded from brain areas where the first ictal EEG signal changes were detected (focal EEG signals) versus signals recorded from brain areas that were not involved at seizure onset (nonfocal EEG signals).
Resumo:
BACKGROUND The Journey bicruciate substituting (BCS) total knee replacement (TKR) is intended to improve knee kinematics by more closely approximating the surfaces of a normal knee. The purpose of this analysis was to address the safety of Journey BCS knees by studying early complication and revision rates in a consecutive case series. METHODS Between December 2006 and May 2011, a single surgeon implanted 226 Journey BCS total knee prostheses in 191 patients (124 women, 67 men) who were eligible for study. Mean age at surgery was 68 years (41-85 years).Outcome measures were early complications and minor and major revision rates. All complications were considered, irrespective of whether conservative treatment or revision was required. RESULTS The average implantation time was 3.5 years (range 1.3-5.8 years). Thirty-three complications (14.6% of 226 knees) required minor or major revision surgery in 25 patients. The remaining eight patients were treated conservatively. Sixteen minor revisions were performed in 12 patients. Thirteen major revisions were required in 13 patients, which results in a rate of 1.65 major revisions per 100 component years. The linear trend of the early complication rate by treatment year was not significant (p = .22).Multivariate logistic regression showed no significant predictors for the occurrence of a complication or for revision surgery. A tendency towards higher complication rates was observed in female patients, although it was not significant (p = .066). CONCLUSIONS The complication and revision rates of the Journey BCS knee implant are high in comparison with those reported for other established total knee systems. Caution is advised when using this implant, particularly for less experienced knee surgeons.
Resumo:
A discussion of nonlinear dynamics, demonstrated by the familiar automobile, is followed by the development of a systematic method of analysis of a possibly nonlinear time series using difference equations in the general state-space format. This format allows recursive state-dependent parameter estimation after each observation thereby revealing the dynamics inherent in the system in combination with random external perturbations.^ The one-step ahead prediction errors at each time period, transformed to have constant variance, and the estimated parametric sequences provide the information to (1) formally test whether time series observations y(,t) are some linear function of random errors (ELEM)(,s), for some t and s, or whether the series would more appropriately be described by a nonlinear model such as bilinear, exponential, threshold, etc., (2) formally test whether a statistically significant change has occurred in structure/level either historically or as it occurs, (3) forecast nonlinear system with a new and innovative (but very old numerical) technique utilizing rational functions to extrapolate individual parameters as smooth functions of time which are then combined to obtain the forecast of y and (4) suggest a measure of resilience, i.e. how much perturbation a structure/level can tolerate, whether internal or external to the system, and remain statistically unchanged. Although similar to one-step control, this provides a less rigid way to think about changes affecting social systems.^ Applications consisting of the analysis of some familiar and some simulated series demonstrate the procedure. Empirical results suggest that this state-space or modified augmented Kalman filter may provide interesting ways to identify particular kinds of nonlinearities as they occur in structural change via the state trajectory.^ A computational flow-chart detailing computations and software input and output is provided in the body of the text. IBM Advanced BASIC program listings to accomplish most of the analysis are provided in the appendix. ^
Resumo:
Life expectancy has consistently increased over the last 150 years due to improvements in nutrition, medicine, and public health. Several studies found that in many developed countries, life expectancy continued to rise following a nearly linear trend, which was contrary to a common belief that the rate of improvement in life expectancy would decelerate and was fit with an S-shaped curve. Using samples of countries that exhibited a wide range of economic development levels, we explored the change in life expectancy over time by employing both nonlinear and linear models. We then observed if there were any significant differences in estimates between linear models, assuming an auto-correlated error structure. When data did not have a sigmoidal shape, nonlinear growth models sometimes failed to provide meaningful parameter estimates. The existence of an inflection point and asymptotes in the growth models made them inflexible with life expectancy data. In linear models, there was no significant difference in the life expectancy growth rate and future estimates between ordinary least squares (OLS) and generalized least squares (GLS). However, the generalized least squares model was more robust because the data involved time-series variables and residuals were positively correlated. ^
Resumo:
Hydrocarbons, sterols and alkenones were analyzed in samples collected from a 10 month sediment trap time series deployed in the Indian Ocean sector of the Southern Ocean. Fluxes and within-class distributions varied seasonally. During higher mass and organic carbon (OC) flux periods, which occurred in austral summer and fall, fresh marine inputs were predominant. Vertical fluxes were most intense in January, but limited to one week in duration. They were, however, low compared with other oceanic regions. In contrast, low mass and OC flux periods were characterized by a strong unresolved complex mixture (UCM) in the hydrocarbon fraction and a high proportion of stanols as a result of zooplanktonic grazing. Terrigenous inputs were not detectable. The alkenone compositions were consistent with previous data on suspended particles from Antarctic waters. However, UK'37 values diverged from the linear and exponential fits established by Sikes et al. (1997, doi:10.1016/S0016-7037(97)00017-3) in the low temperature range. The seasonal pattern of alkenone production implied that IPT (integrated production temperature) is likely to be strongly imprinted by austral summer and fall SST (sea surface temperature).