4 resultados para phi value analysis
em Collection Of Biostatistics Research Archive
Resumo:
Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number (Olshen {\it et~al}, 2004). The algorithm tests for change-points using a maximal $t$-statistic with a permutation reference distribution to obtain the corresponding $p$-value. The number of computations required for the maximal test statistic is $O(N^2),$ where $N$ is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands markers and highlights the need for a faster. algorithm. Results: We present a hybrid approach to obtain the $p$-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analysis of array CGH data from a breast cancer cell line to show the impact of the new approaches on the analysis of real data. Availability: An R (R Development Core Team, 2006) version of the CBS algorithm has been implemented in the ``DNAcopy'' package of the Bioconductor project (Gentleman {\it et~al}, 2004). The proposed hybrid method for the $p$-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
Resumo:
Among the many applications of microarray technology, one of the most popular is the identification of genes that are differentially expressed in two conditions. A common statistical approach is to quantify the interest of each gene with a p-value, adjust these p-values for multiple comparisons, chose an appropriate cut-off, and create a list of candidate genes. This approach has been criticized for ignoring biological knowledge regarding how genes work together. Recently a series of methods, that do incorporate biological knowledge, have been proposed. However, many of these methods seem overly complicated. Furthermore, the most popular method, Gene Set Enrichment Analysis (GSEA), is based on a statistical test known for its lack of sensitivity. In this paper we compare the performance of a simple alternative to GSEA.We find that this simple solution clearly outperforms GSEA.We demonstrate this with eight different microarray datasets.
Resumo:
Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a complicated target distribution via simple ergodic averages. A fundamental question in MCMC applications is when should the sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a method that stops the MCMC sampling the first time the width of a confidence interval based on the ergodic averages is less than a user-specified value. Hence calculating Monte Carlo standard errors is a critical step in assessing the output of the simulation. In particular, we consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We describe sufficient conditions for the strong consistency and asymptotic normality of both methods and investigate their finite sample properties in a variety of examples.
Resumo:
We previously showed that lifetime cumulative lead dose, measured as lead concentration in the tibia bone by X-ray fluorescence, was associated with persistent and progressive declines in cognitive function and with decreases in MRI-based brain volumes in former lead workers. Moreover, larger region-specific brain volumes were associated with better cognitive function. These findings motivated us to explore a novel application of path analysis to evaluate effect mediation. Voxel-wise path analysis, at face value, represents the natural evolution of voxel-based morphometry methods to answer questions of mediation. Application of these methods to the former lead worker data demonstrated potential limitations in this approach where there was a tendency for results to be strongly biased towards the null hypothesis (lack of mediation). Moreover, a complimentary analysis using anatomically-derived regions of interest volumes yielded opposing results, suggesting evidence of mediation. Specifically, in the ROI-based approach, there was evidence that the association of tibia lead with function in three cognitive domains was mediated through the volumes of total brain, frontal gray matter, and/or possibly cingulate. A simulation study was conducted to investigate whether the voxel-wise results arose from an absence of localized mediation, or more subtle defects in the methodology. The simulation results showed the same null bias evidenced as seen in the lead workers data. Both the lead worker data results and the simulation study suggest that a null-bias in voxel-wise path analysis limits its inferential utility for producing confirmatory results.