6 results for Multi-model inference
at Duke University
Abstract:
Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analytic methodologies. Obstacles to analysis include inference in the face of multiple comparisons, complications arising from correlations among the SNPs (single nucleotide polymorphisms), choice of their genetic parametrization and missing data. In this paper we present an efficient Bayesian model search strategy that searches over the space of genetic markers and their genetic parametrization. The resulting method for Multilevel Inference of SNP Associations, MISA, allows computation of multilevel posterior probabilities and Bayes factors at the global, gene and SNP level, with the prior distribution on SNP inclusion in the model providing an intrinsic multiplicity correction. We use simulated data sets to characterize MISA's statistical power, and show that MISA has higher power to detect association than standard procedures. Using data from the North Carolina Ovarian Cancer Study (NCOCS), MISA identifies variants that were not identified by standard methods and have been externally "validated" in independent studies. We examine sensitivity of the NCOCS results to prior choice and method for imputing missing data. MISA is available in an R package on CRAN.
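The abstract says that the prior on SNP inclusion supplies an intrinsic multiplicity correction, but it does not name the prior. As an illustration only, the sketch below assumes a Beta-Binomial prior on model size (an assumption, not necessarily the prior MISA uses) and shows how the prior odds of adding a single SNP to the null model shrink as the number of candidate SNPs grows. The function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(x, y):
    """Natural log of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def model_prior(k, p, a=1.0, b=1.0):
    """Prior probability of ONE specific model that includes k of p SNPs.

    Assumes each SNP is included with probability w and w ~ Beta(a, b);
    integrating w out gives B(a + k, b + p - k) / B(a, b).
    """
    return exp(log_beta(a + k, b + p - k) - log_beta(a, b))


def prior_odds_one_vs_null(p, a=1.0, b=1.0):
    """Prior odds of a given single-SNP model against the null model."""
    return model_prior(1, p, a, b) / model_prior(0, p, a, b)


if __name__ == "__main__":
    for p in (10, 100, 1000):
        print(p, prior_odds_one_vs_null(p))
```

With a = b = 1 the odds equal 1/p, so each additional candidate SNP makes any individual inclusion less likely a priori. That is the sense in which such a prior penalizes multiplicity without a separate correction step.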
Abstract:
Network data have been studied extensively, especially weighted networks represented as symmetric matrices with discrete count entries. Motivated by statistical inference on multi-view weighted network structure, this paper proposes a Poisson-Gamma latent factor model that separates view-shared and view-specific spaces while also reducing dimensionality. A multiplicative gamma process shrinkage prior is used to avoid overparameterization, and the full conditional posteriors are conjugate, which yields an efficient Gibbs sampler. By accommodating both view-shared and view-specific parameters, the model adapts flexibly to the degree of similarity across view-specific spaces. Accuracy and efficiency are tested in simulated experiments. An application to real soccer network data is also presented to illustrate the model.
Abstract:
In this paper, we propose generalized sampling approaches for measuring a multi-dimensional object using a compact compound-eye imaging system called thin observation module by bound optics (TOMBO). This paper presents the proposed system model, physical examples, and simulations to verify TOMBO imaging using generalized sampling. In the system, an object is modulated and multiplied by a weight distribution with physical coding, and the coded optical signal is integrated onto a detector array. A numerical estimation algorithm employing a sparsity constraint is used for object reconstruction.
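The abstract does not say which sparsity-constrained estimator is used. As a stand-in, the sketch below uses the iterative shrinkage-thresholding algorithm (ISTA) for the l1-regularized least-squares problem min_x 0.5*||A x - y||^2 + lam*||x||_1, where the matrix A plays the role of the coded measurement operator. The names `A`, `y`, `lam` and `step` are assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, y, lam, step, n_iter=100):
    """Minimal ISTA for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is a list of rows; step should not exceed 1 / (largest eigenvalue of A^T A).
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual of the current estimate: A x - y.
        residual = [
            sum(A[i][j] * x[j] for j in range(n_cols)) - y[i]
            for i in range(n_rows)
        ]
        # Gradient of the data-fit term: A^T (A x - y).
        grad = [
            sum(A[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by soft thresholding.
        x = [
            soft_threshold(x[j] - step * grad[j], step * lam)
            for j in range(n_cols)
        ]
    return x


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(ista(identity, [3.0, 0.2, -2.0], lam=0.5, step=1.0))
```

With an identity operator the estimate is simply the soft-thresholded measurement, so the small entry is set to zero and the large entries are shrunk by `lam`.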
Abstract:
BACKGROUND: Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. RESULTS: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), rhinovirus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. CONCLUSIONS: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Abstract:
On-board image guidance, such as cone-beam CT (CBCT) and kV/MV 2D imaging, is essential in many radiation therapy procedures, such as intensity modulated radiotherapy (IMRT) and stereotactic body radiation therapy (SBRT). These imaging techniques provide predominantly anatomical information for treatment planning and target localization. Recently, studies have shown that treatment planning based on functional and molecular information about the tumor and surrounding tissue could potentially improve the effectiveness of radiation therapy. However, current on-board imaging systems are limited in their functional and molecular imaging capability. Single Photon Emission Computed Tomography (SPECT) is a candidate to achieve on-board functional and molecular imaging. Traditional SPECT systems typically take 20 minutes or more for a scan, which is too long for on-board imaging. A robotic multi-pinhole SPECT system was proposed in this dissertation to provide shorter imaging time by using a robotic arm to maneuver the multi-pinhole SPECT system around the patient in position for radiation therapy.
A 49-pinhole collimated SPECT detector and its shielding were designed and simulated in this work using computer-aided design (CAD) software. The trajectories of the robotic arm about the patient, treatment table and gantry in the radiation therapy room were investigated, as were several detector assemblies, including parallel-hole, single-pinhole and 49-pinhole collimated detectors. The rail-mounted system was designed to enable a full range of detector positions and orientations for various crucial treatment sites, including the head and torso, while avoiding collision with the linear accelerator (LINAC), patient table and patient.
An alignment method was developed in this work to calibrate the on-board robotic SPECT to the LINAC coordinate frame and to the coordinate frames of other on-board imaging systems such as CBCT. This alignment method utilizes line sources and one pinhole projection of these line sources. The model consists of multiple alignment parameters that map line sources in 3-dimensional (3D) space to their 2-dimensional (2D) projections on the SPECT detector. Computer-simulation studies and experimental evaluations were performed as a function of the number of line sources, Radon transform accuracy, finite line-source width, intrinsic camera resolution, Poisson noise and acquisition geometry. In computer-simulation studies, when there was no error in determining angles (α) and offsets (ρ) of the measured projections, the six alignment parameters (3 translational and 3 rotational) were estimated perfectly using three line sources. When angles (α) and offsets (ρ) were provided by the Radon transform, the estimation accuracy was reduced. The estimation error was associated with rounding errors of the Radon transform, finite line-source width, Poisson noise, number of line sources, intrinsic camera resolution and detector acquisition geometry. The estimation accuracy was significantly improved by using 4 line sources rather than 3 and also by using thinner line-source projections (obtained by better intrinsic detector resolution). With 5 line sources, median errors were 0.2 mm for the detector translations, 0.7 mm for the detector radius of rotation, and less than 0.5° for detector rotation, tilt and twist. In experimental evaluations, average errors relative to a different, independent registration technique were about 1.8 mm for detector translations, 1.1 mm for the detector radius of rotation (ROR), 0.5° and 0.4° for detector rotation and tilt, respectively, and 1.2° for detector twist.
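The alignment model maps a 3D line source to the angle (α) and offset (ρ) of its 2D projection. The abstract does not give the geometry, so the sketch below assumes an idealized pinhole at the origin and a detector plane at distance `focal_length` along the z axis; it is not the dissertation's actual parametrization. It projects two points of a line source and expresses the projected line in the normal form u cos α + v sin α = ρ, which is the same (α, ρ) pair a Radon transform peak would report. All function names are hypothetical.

```python
from math import atan2, hypot


def project_pinhole(point, focal_length):
    """Project a 3D point through a pinhole at the origin onto the plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the pinhole (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


def line_angle_offset(p, q):
    """Return (alpha, rho) with u*cos(alpha) + v*sin(alpha) = rho for the line through p and q."""
    du, dv = q[0] - p[0], q[1] - p[1]
    norm = hypot(du, dv)
    if norm == 0:
        raise ValueError("projected points coincide")
    # Unit normal to the line direction.
    nu, nv = dv / norm, -du / norm
    rho = nu * p[0] + nv * p[1]
    if rho < 0:
        # Flip the normal so that the offset is non-negative.
        nu, nv, rho = -nu, -nv, -rho
    return atan2(nv, nu), rho


def projected_line_parameters(a, b, focal_length):
    """Angle and offset of the projection of the 3D line through points a and b."""
    return line_angle_offset(
        project_pinhole(a, focal_length), project_pinhole(b, focal_length)
    )


if __name__ == "__main__":
    print(projected_line_parameters((1.0, 0.0, 2.0), (1.0, 1.0, 2.0), 1.0))
```

In an alignment fit of this kind, the six pose parameters would be adjusted until the predicted (α, ρ) pairs for all line sources match the measured ones. Each line contributes two measurements, so three lines give the minimum of six equations for six unknowns, which is consistent with the abstract's noise-free result.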
Simulation studies were performed to investigate the improvement of imaging sensitivity and accuracy of hot sphere localization for breast imaging of patients in the prone position. A 3D XCAT phantom was simulated in the prone position with nine hot spheres of 10 mm diameter added in the left breast. A no-treatment-table case and two commercial prone breast boards, 7 and 24 cm thick, were simulated. Different pinhole focal lengths were assessed by root-mean-square error (RMSE). The pinhole focal lengths resulting in the lowest RMSE values were 12 cm, 18 cm and 21 cm for no table, thin board, and thick board, respectively. In both the no-table and thin-board cases, all 9 hot spheres were easily visualized above background with 4-minute scans utilizing the 49-pinhole SPECT system, while seven of nine hot spheres were visible with the thick board. In comparison with the parallel-hole system, our 49-pinhole system shows reduced noise and bias under these simulation cases. These results correspond to smaller radii of rotation for the no-table case and the thinner prone board. Similarly, localization accuracy with the 49-pinhole system was significantly better than with the parallel-hole system for both the thin and thick prone boards. Median localization errors for the 49-pinhole system with the thin board were less than 3 mm for 5 of 9 hot spheres, and less than 6 mm for the other 4 hot spheres. Median localization errors for the 49-pinhole system with the thick board were less than 4 mm for 5 of 9 hot spheres, and less than 8 mm for the other 4 hot spheres.
Besides prone breast imaging, respiratory-gated region-of-interest (ROI) imaging of lung tumors was also investigated. A simulation study was conducted on the potential of multi-pinhole ROI SPECT to alleviate noise effects associated with respiratory-gated SPECT imaging of the thorax. Two 4D XCAT digital phantoms were constructed, with either a 10 mm or 20 mm diameter tumor added in the right lung. The maximum diaphragm motion was 2 cm (for the 10 mm tumor) or 4 cm (for the 20 mm tumor) in the superior-inferior direction and 1.2 cm in the anterior-posterior direction. Projections were simulated with a 4-minute acquisition time (40 seconds for each of 6 gates) using either the ROI SPECT system (49-pinhole) or reference single and dual conventional broad cross-section, parallel-hole collimated SPECT. The SPECT images were reconstructed using OSEM with up to 6 iterations. Images were evaluated as a function of gate by profiles, noise versus bias curves, and a numerical observer performing a forced-choice localization task. Even for the 20 mm tumor, the 49-pinhole imaging ROI was found sufficient to fully encompass the usual clinical ranges of diaphragm motion. Averaged over the 6 gates, noise at iteration 6 of 49-pinhole ROI imaging (10.9 µCi/ml) was approximately comparable to noise at iteration 2 of the dual and single parallel-hole, broad cross-section systems (12.4 µCi/ml and 13.8 µCi/ml, respectively). Corresponding biases were much lower for the 49-pinhole ROI system (3.8 µCi/ml), versus 6.2 µCi/ml and 6.5 µCi/ml for the dual and single parallel-hole systems, respectively. Median localization errors averaged over 6 gates, for the 10 mm and 20 mm tumors respectively, were 1.6 mm and 0.5 mm using the ROI imaging system and 6.6 mm and 2.3 mm using the dual parallel-hole, broad cross-section system. The results demonstrate substantially improved imaging via ROI methods. One important application may be gated imaging of patients in position for radiation therapy.
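The images above were reconstructed with OSEM (ordered-subset expectation maximization). For readers unfamiliar with the update, the following is a minimal sketch of OSEM on a toy problem. The system matrix `A`, the data `y` and the subset layout are assumptions for illustration and have nothing to do with the actual 49-pinhole system model.

```python
def osem(A, y, subsets, n_iter=6, x0=None):
    """Minimal ordered-subset EM for Poisson data y with mean A x, x >= 0.

    A: list of rows (projection bins x image voxels), non-negative entries.
    subsets: list of lists of row indices; each subset triggers one update.
    """
    n_vox = len(A[0])
    x = list(x0) if x0 is not None else [1.0] * n_vox
    for _ in range(n_iter):
        for rows in subsets:
            back = [0.0] * n_vox  # backprojected measured/estimated ratios
            sens = [0.0] * n_vox  # subset sensitivity (column sums of A)
            for i in rows:
                forward = sum(A[i][j] * x[j] for j in range(n_vox))
                ratio = y[i] / forward if forward > 0 else 0.0
                for j in range(n_vox):
                    back[j] += A[i][j] * ratio
                    sens[j] += A[i][j]
            # Multiplicative update; voxels unseen by this subset are left unchanged.
            x = [
                x[j] * back[j] / sens[j] if sens[j] > 0 else x[j]
                for j in range(n_vox)
            ]
    return x


if __name__ == "__main__":
    A = [[1.0, 0.5], [0.5, 1.0], [1.0, 1.0], [0.2, 0.8]]
    y = [4.0, 5.0, 6.0, 3.6]  # noiseless projections of x = [2, 4]
    print(osem(A, y, subsets=[[0, 2], [1, 3]], n_iter=6))
```

Because every update is multiplicative, a non-negative starting image stays non-negative. More iterations generally reduce bias at the cost of amplified noise, which is the trade-off that noise-versus-bias curves at different iteration numbers are meant to show.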
A robotic SPECT imaging system was constructed utilizing a gamma camera detector (Digirad 2020tc) and a robot (KUKA KR150-L110 robot). An imaging study was performed with a phantom (PET CT Phantom
In conclusion, the proposed on-board robotic SPECT can be aligned to LINAC/CBCT with a single pinhole projection of the line-source phantom. Alignment parameters can be estimated using one pinhole projection of line sources. This alignment method may be important for multi-pinhole SPECT, where relative pinhole alignment may vary during rotation. For single-pinhole and multi-pinhole SPECT imaging on board radiation therapy machines, the method could provide alignment of SPECT coordinates with those of CBCT and the LINAC. In simulation studies of prone breast imaging and respiratory-gated lung imaging, the 49-pinhole detector showed better tumor contrast recovery and localization in a 4-minute scan than the parallel-hole detector. On-board SPECT could be achieved by a robot maneuvering a SPECT detector about patients in position for radiation therapy on a flat-top couch. The robot's inherent coordinate frames could be an effective means to estimate detector pose for use in SPECT image reconstruction.
Abstract:
Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.
We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.
We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.
Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). By incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of the protein binding landscape and that the most accurate inference comes from modeling all available datasets.
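The abstract does not describe the structure of the SSM. To give a flavor of what the E-step of an expectation-maximization fit computes in a discrete state-space model, the sketch below runs the forward-backward algorithm on a two-state hidden Markov model with Poisson emissions. The two states, the Poisson read-count model, and every parameter value are assumptions made purely for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of observing count k under a Poisson distribution with the given rate."""
    return exp(-rate) * rate ** k / factorial(k)


def forward_backward(counts, rates, trans, init):
    """Posterior state probabilities at each position (E-step of EM for an HMM).

    counts: observed read counts per position.
    rates: Poisson rate per hidden state; trans[j][k] = P(next = k | current = j).
    """
    n, s = len(counts), len(rates)
    emis = [[poisson_pmf(c, r) for r in rates] for c in counts]

    # Scaled forward pass.
    alpha = []
    cur = [init[k] * emis[0][k] for k in range(s)]
    for t in range(n):
        if t > 0:
            cur = [
                emis[t][k] * sum(alpha[t - 1][j] * trans[j][k] for j in range(s))
                for k in range(s)
            ]
        z = sum(cur)
        alpha.append([v / z for v in cur])

    # Scaled backward pass.
    beta = [[1.0] * s for _ in range(n)]
    for t in range(n - 2, -1, -1):
        raw = [
            sum(trans[k][j] * emis[t + 1][j] * beta[t + 1][j] for j in range(s))
            for k in range(s)
        ]
        z = sum(raw)
        beta[t] = [v / z for v in raw]

    # Combine and normalize to get per-position posteriors.
    post = []
    for t in range(n):
        w = [alpha[t][k] * beta[t][k] for k in range(s)]
        z = sum(w)
        post.append([v / z for v in w])
    return post


if __name__ == "__main__":
    posteriors = forward_backward(
        counts=[0, 0, 9, 8, 0],
        rates=[1.0, 8.0],  # hypothetical: state 0 = low signal, state 1 = high signal
        trans=[[0.9, 0.1], [0.1, 0.9]],
        init=[0.5, 0.5],
    )
    for row in posteriors:
        print([round(v, 3) for v in row])
```

In a full EM loop these posteriors would be used to re-estimate the rates and transition probabilities, and the two steps would alternate until the likelihood stops improving. Joint modeling of several data types corresponds to multiplying one emission term per data type at each position.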
This dissertation provides a foundation for future research by taking a step toward genome-wide inference of the protein-DNA interaction landscape through data integration.