3 resultados para Modeling Methodology

em Collection Of Biostatistics Research Archive


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The last two decades have seen intense scientific and regulatory interest in the health effects of particulate matter (PM). Influential epidemiological studies that characterize chronic exposure of individuals rely on monitoring data that are sparse in space and time, so they often assign the same exposure to participants in large geographic areas and across time. We estimate monthly PM during 1988-2002 in a large spatial domain for use in studying health effects in the Nurses' Health Study. We develop a conceptually simple spatio-temporal model that uses a rich set of covariates. The model is used to estimate concentrations of PM10 for the full time period and PM2.5 for a subset of the period. For the earlier part of the period, 1988-1998, few PM2.5 monitors were operating, so we develop a simple extension to the model that represents PM2.5 conditionally on PM10 model predictions. In the epidemiological analysis, model predictions of PM10 are more strongly associated with health effects than when using simpler approaches to estimate exposure. Our modeling approach supports the application in estimating both fine-scale and large-scale spatial heterogeneity and capturing space-time interaction through the use of monthly-varying spatial surfaces. At the same time, the model is computationally feasible, implementable with standard software, and readily understandable to the scientific audience. Despite simplifying assumptions, the model has good predictive performance and uncertainty characterization.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a novel class of models for functional data exhibiting skewness or other shape characteristics that vary with spatial or temporal location. We use copulas so that the marginal distributions and the dependence structure can be modeled independently. Dependence is modeled with a Gaussian or t-copula, so that there is an underlying latent Gaussian process. We model the marginal distributions using the skew t family. The mean, variance, and shape parameters are modeled nonparametrically as functions of location. A computationally tractable inferential framework for estimating heterogeneous asymmetric or heavy-tailed marginal distributions is introduced. This framework provides a new set of tools for increasingly complex data collected in medical and public health studies. Our methods were motivated by and are illustrated with a state-of-the-art study of neuronal tracts in multiple sclerosis patients and healthy controls. Using the tools we have developed, we were able to find those locations along the tract most affected by the disease. However, our methods are general and highly relevant to many functional data sets. In addition to the application to one-dimensional tract profiles illustrated here, higher-dimensional extensions of the methodology could have direct applications to other biological data including functional and structural MRI.