17 resultados para COUNT DATA MODELS

em Collection Of Biostatistics Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we study panel count data with informative observation times. We assume nonparametric and semiparametric proportional rate models for the underlying recurrent event process, where the form of the baseline rate function is left unspecified and a subject-specific frailty variable inflates or deflates the rate function multiplicatively. The proposed models allow the recurrent event processes and observation times to be correlated through their connections with the unobserved frailty; moreover, the distributions of both the frailty variable and observation times are considered as nuisance parameters. The baseline rate function and the regression parameters are estimated by maximizing a conditional likelihood function of observed event counts and solving estimation equations. Large sample properties of the proposed estimators are studied. Numerical studies demonstrate that the proposed estimation procedures perform well for moderate sample sizes. An application to a bladder tumor study is presented to illustrate the use of the proposed methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In many applications the observed data can be viewed as a censored high dimensional full data random variable X. By the curve of dimensionality it is typically not possible to construct estimators that are asymptotically efficient at every probability distribution in a semiparametric censored data model of such a high dimensional censored data structure. We provide a general method for construction of one-step estimators that are efficient at a chosen submodel of the full-data model, are still well behaved off this submodel and can be chosen to always improve on a given initial estimator. These one-step estimators rely on good estimators of the censoring mechanism and thus will require a parametric or semiparametric model for the censoring mechanism. We present a general theorem that provides a template for proving the desired asymptotic results. We illustrate the general one-step estimation methods by constructing locally efficient one-step estimators of marginal distributions and regression parameters with right-censored data, current status data and bivariate right-censored data, in all models allowing the presence of time-dependent covariates. The conditions of the asymptotics theorem are rigorously verified in one of the examples and the key condition of the general theorem is verified for all examples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider nonparametric missing data models for which the censoring mechanism satisfies coarsening at random and which allow complete observations on the variable X of interest. W show that beyond some empirical process conditions the only essential condition for efficiency of an NPMLE of the distribution of X is that the regions associated with incomplete observations on X contain enough complete observations. This is heuristically explained by describing the EM-algorithm. We provide identifiably of the self-consistency equation and efficiency of the NPMLE in order to make this statement rigorous. The usual kind of differentiability conditions in the proof are avoided by using an identity which holds for the NPMLE of linear parameters in convex models. We provide a bivariate censoring application in which the condition and hence the NPMLE fails, but where other estimators, not based on the NPMLE principle, are highly inefficient. It is shown how to slightly reduce the data so that the conditions hold for the reduced data. The conditions are verified for the univariate censoring, double censored, and Ibragimov-Has'minski models.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being more efficient computationally than other Bayesian approaches. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. Conclusions based on a real dataset of cancer cases in Taiwan are similar albeit less conclusive with respect to comparing the approaches. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A method is given for proving efficiency of NPMLE directly linked to empirical process theory. The conditions in general are appropriate consistency of the NPMLE, differentiability of the model, differentiability of the parameter of interest, local convexity of the parameter space, and a Donsker class condition for the class of efficient influence functions obtained by varying the parameters. For the case that the model is linear in the parameter and the parameter space is convex, as with most nonparametric missing data models, we show that the method leads to an identity for the NPMLE which almost says that the NPMLE is efficient and provides us straightforwardly with a consistency and efficiency proof. This identify is extended to an almost linear class of models which contain biased sampling models. To illustrate, the method is applied to the univariate censoring model, random truncation models, interval censoring case I model, the class of parametric models and to a class of semiparametric models.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Investigators interested in whether a disease aggregates in families often collect case-control family data, which consist of disease status and covariate information for families selected via case or control probands. Here, we focus on the use of case-control family data to investigate the relative contributions to the disease of additive genetic effects (A), shared family environment (C), and unique environment (E). To this end, we describe a ACE model for binary family data and then introduce an approach to fitting the model to case-control family data. The structural equation model, which has been described previously, combines a general-family extension of the classic ACE twin model with a (possibly covariate-specific) liability-threshold model for binary outcomes. Our likelihood-based approach to fitting involves conditioning on the proband’s disease status, as well as setting prevalence equal to a pre-specified value that can be estimated from the data themselves if necessary. Simulation experiments suggest that our approach to fitting yields approximately unbiased estimates of the A, C, and E variance components, provided that certain commonly-made assumptions hold. These assumptions include: the usual assumptions for the classic ACE and liability-threshold models; assumptions about shared family environment for relative pairs; and assumptions about the case-control family sampling, including single ascertainment. When our approach is used to fit the ACE model to Austrian case-control family data on depression, the resulting estimate of heritability is very similar to those from previous analyses of twin data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazard model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have population average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship of the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibited due to the high dimensional intractable integration of the likelihood function and the infinite dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Ashma Study and its performance is evaluated using simulations.