23 resultados para Pick-Sloan Missouri Basin Program (U.S.) Garrison Division Unit.
em Collection Of Biostatistics Research Archive
Resumo:
In natural history studies of chronic disease, it is of interest to understand the evolution of key variables that measure aspects of disease progression. This is particularly true for immunological variables in persons infected with the Human Immunodeficiency Virus (HIV). The natural timescale for such studies is time since infection. However, most data available for analysis arise from prevalent cohorts, where the date of infection is unknown for most or all individuals. As a result, standard curve fitting algorithms are not immediately applicable. Here we propose two methods to circumvent this difficulty. The first uses repeated measurement data to provide information not only on the level of the variable of interest, but also on its rate of change, while the second uses an estimate of the expected time since infection. Both methods are based on the principal curves algorithm of Hastie and Stuetzle, and are applied to data from a prevalent cohort of HIV-infected homosexual men, giving estimates of the average pattern of CD4+ lymphocyte decline. These methods are applicable to natural history studies using data from prevalent cohorts where the time of disease origin is uncertain, provided certain ancillary information is available from external sources.
Resumo:
In biostatistical applications interest often focuses on the estimation of the distribution of a time-until-event variable T. If one observes whether or not T exceeds an observed monitoring time at a random number of monitoring times, then the data structure is called interval censored data. We extend this data structure by allowing the presence of a possibly time-dependent covariate process that is observed until end of follow up. If one only assumes that the censoring mechanism satisfies coarsening at random, then, by the curve of dimensionality, typically no regular estimators will exist. To fight the curse of dimensionality we follow the approach of Robins and Rotnitzky (1992) by modeling parameters of the censoring mechanism. We model the right-censoring mechanism by modeling the hazard of the follow up time, conditional on T and the covariate process. For the monitoring mechanism we avoid modeling the joint distribution of the monitoring times by only modeling a univariate hazard of the pooled monitoring times, conditional on the follow up time, T, and the covariates process, which can be estimated by treating the pooled sample of monitoring times as i.i.d. In particular, it is assumed that the monitoring times and the right-censoring times only depend on T through the observed covariate process. We introduce inverse probability of censoring weighted (IPCW) estimator of the distribution of T and of smooth functionals thereof which are guaranteed to be consistent and asymptotically normal if we have available correctly specified semiparametric models for the two hazards of the censoring process. Furthermore, given such correctly specified models for these hazards of the censoring process, we propose a one-step estimator which will improve on the IPCW estimator if we correctly specify a lower-dimensional working model for the conditional distribution of T, given the covariate process, that remains consistent and asymptotically normal if this latter working model is misspecified. It is shown that the one-step estimator is efficient if each subject is at most monitored once and the working model contains the truth. In general, it is shown that the one-step estimator optimally uses the surrogate information if the working model contains the truth. It is not optimal in using the interval information provided by the current status indicators at the monitoring times, but simulations in Peterson, van der Laan (1997) show that the efficiency loss is small.
Resumo:
Common goals in epidemiologic studies of infectious diseases include identification of the infectious agent, description of the modes of transmission and characterization of factors that influence the probability of transmission from infected to uninfected individuals. In the case of AIDS, the agent has been identified as the Human Immunodeficiency Virus (HIV), and transmission is known to occur through a variety of contact mechanisms including unprotected sexual intercourse, transfusion of infected blood products and sharing of needles in intravenous drug use. Relatively little is known about the probability of IV transmission associated with the various modes of contact, or the role that other cofactors play in promoting or suppressing transmission. Here, transmission probability refers to the probability that the virus is transmitted to a susceptible individual following exposure consisting of a series of potentially infectious contacts. The infectivity of HIV for a given route of transmission is defined to be the per contact probability of infection. Knowledge of infectivity and its relationship to other factors is important in understanding the dynamics of the AIDS epidemic and in suggesting appropriate measures to control its spread. The primary source of empirical data about infectivity comes from sexual partners of infected individuals. Partner studies consist of a series of such partnerships, usually heterosexual and monogamous, each composed of an initially infected "index case" and a partner who may or may not be infected by the time of data collection. However, because the infection times of both partners may be unknown and the history of contacts uncertain, any quantitative characterization of infectivity is extremely difficult. Thus, most statistical analyses of partner study data involve the simplifying assumption that infectivity is a constant common to all partnerships. The major objectives of this work are to describe and discuss the design and analysis of partner studies, providing a general statistical framework for investigations of infectivity and risk factors for HIV transmission. The development is largely based on three papers: Jewell and Shiboski (1990), Kim and Lagakos (1990), and Shiboski and Jewell (1992).
Resumo:
A large number of proposals for estimating the bivariate survival function under random censoring has been made. In this paper we discuss nonparametric maximum likelihood estimation and the bivariate Kaplan-Meier estimator of Dabrowska. We show how these estimators are computed, present their intuitive background and compare their practical performance under different levels of dependence and censoring, based on extensive simulation results, which leads to a practical advise.
Resumo:
In Malani and Neilsen (1992) we have proposed alternative estimates of survival function (for time to disease) using a simple marker that describes time to some intermediate stage in a disease process. In this paper we derive the asymptotic variance of one such proposed estimator using two different methods and compare terms of order 1/n when there is no censoring. In the absence of censoring the asymptotic variance obtained using the Greenwood type approach converges to exact variance up to terms involving 1/n. But the asymptotic variance obtained using the theory of the counting process and results from Voelkel and Crowley (1984) on semi-Markov processes has a different term of order 1/n. It is not clear to us at this point why the variance formulae using the latter approach give different results.
Resumo:
Jewell and Kalbfleisch (1992) consider the use of marker processes for applications related to estimation of the survival distribution of time to failure. Marker processes were assumed to be stochastic processes that, at a given point in time, provide information about the current hazard and consequently on the remaining time to failure. Particular attention was paid to calculations based on a simple additive model for the relationship between the hazard function at time t and the history of the marker process up until time t. Specific applications to the analysis of AIDS data included the use of markers as surrogate responses for onset of AIDS with censored data and as predictors of the time elapsed since infection in prevalent individuals. Here we review recent work on the use of marker data to tackle these kinds of problems with AIDS data. The Poisson marker process with an additive model, introduced in Jewell and Kalbfleisch (1992) may be a useful "test" example for comparison of various procedures.
Resumo:
In biostatistical applications, interest often focuses on the estimation of the distribution of time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time, then the data is described by the well known singly-censored current status model, also known as interval censored data, case I. We extend this current status model by allowing the presence of a time-dependent process, which is partly observed and allowing C to depend on T through the observed part of this time-dependent process. Because of the high dimension of the covariate process, no globally efficient estimators exist with a good practical performance at moderate sample sizes. We follow the approach of Robins and Rotnitzky (1992) by modeling the censoring variable, given the time-variable and the covariate-process, i.e., the missingness process, under the restriction that it satisfied coarsening at random. We propose a generalization of the simple current status estimator of the distribution of T and of smooth functionals of the distribution of T, which is based on an estimate of the missingness. In this estimator the covariates enter only through the estimate of the missingness process. Due to the coarsening at random assumption, the estimator has the interesting property that if we estimate the missingness process more nonparametrically, then we improve its efficiency. We show that by local estimation of an optimal model or optimal function of the covariates for the missingness process, the generalized current status estimator for smooth functionals become locally efficient; meaning it is efficient if the right model or covariate is consistently estimated and it is consistent and asymptotically normal in general. Estimation of the optimal model requires estimation of the conditional distribution of T, given the covariates. Any (prior) knowledge of this conditional distribution can be used at this stage without any risk of losing root-n consistency. We also propose locally efficient one step estimators. Finally, we show some simulation results.
Resumo:
We investigate the interplay of smoothness and monotonicity assumptions when estimating a density from a sample of observations. The nonparametric maximum likelihood estimator of a decreasing density on the positive half line attains a rate of convergence at a fixed point if the density has a negative derivative. The same rate is obtained by a kernel estimator, but the limit distributions are different. If the density is both differentiable and known to be monotone, then a third estimator is obtained by isotonization of a kernel estimator. We show that this again attains the rate of convergence and compare the limit distributors of the three types of estimators. It is shown that both isotonization and smoothing lead to a more concentrated limit distribution and we study the dependence on the proportionality constant in the bandwidth. We also show that isotonization does not change the limit behavior of a kernel estimator with a larger bandwidth, in the case that the density is known to have more than one derivative.
Resumo:
Estimation for bivariate right censored data is a problem that has had much study over the past 15 years. In this paper we propose a new class of estimators for the bivariate survival function based on locally efficient estimation. We introduce the locally efficient estimator for bivariate right censored data, present an asymptotic theorem, present the results of simulation studies and perform a brief data analysis illustrating the use of the locally efficient estimator.
Resumo:
Much controversy exists over whether the course of schizophrenia, as defined by the lengths of repeated community tenures, is progressively ameliorating or deteriorating. This article employs a new statistical method proposed by Wang and Chen (2000) to analyze the Denmark registry data in Eaton, et al (1992). The new statistical method correctly handles the bias caused by induced informative censoring, which is an interaction of the heterogeneity of schizophrenia patients and long-term follow-up. The analysis shows a progressive deterioration pattern in terms of community tenures for the full registry cohort, rather than a progressive amelioration pattern as reported for a selected sub-cohort in Eaton, et al (1992). When adjusted for the long-term chronicity of calendar time, no significant progressive pattern was found for the full cohort.
Resumo:
It is possible for a pair of dichotomous, categorical variables to have an overall positive association (say) in a stratified population, and at the same time be negatively associated in every stratum. This (essentially) is Simpson's paradox. A graphical device, the risk diagram, provides insight into Simpson's paradox and related concepts.
Resumo:
Estimation of the number of mixture components (k) is an unsolved problem. Available methods for estimation of k include bootstrapping the likelihood ratio test statistics and optimizing a variety of validity functionals such as AIC, BIC/MDL, and ICOMP. We investigate the minimization of distance between fitted mixture model and the true density as a method for estimating k. The distances considered are Kullback-Leibler (KL) and “L sub 2”. We estimate these distances using cross validation. A reliable estimate of k is obtained by voting of B estimates of k corresponding to B cross validation estimates of distance. This estimation methods with KL distance is very similar to Monte Carlo cross validated likelihood methods discussed by Smyth (2000). With focus on univariate normal mixtures, we present simulation studies that compare the cross validated distance method with AIC, BIC/MDL, and ICOMP. We also apply the cross validation estimate of distance approach along with AIC, BIC/MDL and ICOMP approach, to data from an osteoporosis drug trial in order to find groups that differentially respond to treatment.
Resumo:
Backcalculation is the primary method used to reconstruct past human immunodeficiency virus (HIV) infection rates, to estimate current prevalence of HIV infection, and to project future incidence of acquired immunodeficiency syndrome (AIDS). The method is very sensitive to uncertainty about the incubation period. We estimate incubation distributions from three sets of cohort data and find that the estimates for the cohorts are substantially different. Backcalculations employing the different estimates produce equally good fits to reported AIDS counts but quite different estimates of cumulative infections. These results suggest that the incubation distribution is likely to differ for different populations and that the differences are large enough to have a big impact on the resulting estimates of HIV infection rates. This seriously limits the usefulness of backcalculation for populations (such as intravenous drug users, heterosexuals, and women) that lack precise information on incubation times.
Resumo:
In many applications the observed data can be viewed as a censored high dimensional full data random variable X. By the curve of dimensionality it is typically not possible to construct estimators that are asymptotically efficient at every probability distribution in a semiparametric censored data model of such a high dimensional censored data structure. We provide a general method for construction of one-step estimators that are efficient at a chosen submodel of the full-data model, are still well behaved off this submodel and can be chosen to always improve on a given initial estimator. These one-step estimators rely on good estimators of the censoring mechanism and thus will require a parametric or semiparametric model for the censoring mechanism. We present a general theorem that provides a template for proving the desired asymptotic results. We illustrate the general one-step estimation methods by constructing locally efficient one-step estimators of marginal distributions and regression parameters with right-censored data, current status data and bivariate right-censored data, in all models allowing the presence of time-dependent covariates. The conditions of the asymptotics theorem are rigorously verified in one of the examples and the key condition of the general theorem is verified for all examples.
Resumo:
In estimation of a survival function, current status data arises when the only information available on individuals is their survival status at a single monitoring time. Here we briefly review extensions of this form of data structure in two directions: (i) doubly censored current status data, where there is incomplete information on the origin of the failure time random variable, and (ii) current status information on more complicated stochastic processes. Simple examples of these data forms are presented for motivation.