14 results for source code analysis
at Duke University
Abstract:
MOTIVATION: Technological advances that allow routine identification of high-dimensional risk factors have led to high demand for statistical techniques that enable full utilization of these rich sources of information for genetics studies. Variable selection for censored outcome data, as well as control of false discoveries (i.e., inclusion of irrelevant variables) in the presence of high-dimensional predictors, presents serious challenges. This article develops a computationally feasible method based on boosting and stability selection. Specifically, we modified component-wise gradient boosting to improve computational feasibility and introduced random permutation in stability selection to control false discoveries. RESULTS: We have proposed a high-dimensional variable selection method that incorporates stability selection to control false discoveries. Comparisons between the proposed method and the commonly used univariate and Lasso approaches for variable selection reveal that the proposed method yields fewer false discoveries. The proposed method is applied to study the associations of 2339 common single-nucleotide polymorphisms (SNPs) with overall survival among cutaneous melanoma (CM) patients. The results confirm that BRCA2 pathway SNPs are likely to be associated with overall survival, as reported in previous literature. Moreover, we have identified several new Fanconi anemia (FA) pathway SNPs that are likely to modulate survival of CM patients. AVAILABILITY AND IMPLEMENTATION: The related source code and documents are freely available at https://sites.google.com/site/bestumich/issues. CONTACT: yili@umich.edu.
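The stability-selection step at the heart of this approach can be illustrated compactly. The sketch below is a generic stand-in, not the authors' implementation: it repeatedly fits a sparse learner on random half-samples of the data and keeps the variables selected in a large fraction of the fits. The paper pairs this idea with component-wise gradient boosting for censored outcomes; plain Lasso on a continuous outcome is used here purely for illustration, and all parameter values are arbitrary.

```python
# Minimal stability-selection sketch (generic stand-in, not the paper's code).
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, n_subsamples=100, alpha=0.1, threshold=0.6, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=n // 2, replace=False)   # random half-sample
        model = Lasso(alpha=alpha).fit(X[idx], y[idx])
        counts += (model.coef_ != 0)                      # tally selected variables
    freq = counts / n_subsamples
    return np.flatnonzero(freq >= threshold), freq        # stable set + frequencies
```

Variables whose selection frequency clears the threshold form the stable set; the threshold trades off power against the number of false discoveries.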
Abstract:
This article describes advances in statistical computation for large-scale data analysis in structured Bayesian mixture models via graphics processing unit (GPU) programming. The developments are partly motivated by computational challenges arising in fitting models of increasing heterogeneity to increasingly large datasets. An example context concerns common biological studies using high-throughput technologies generating many, very large datasets and requiring increasingly high-dimensional mixture models with large numbers of mixture components. We outline important strategies and processes for GPU computation in Bayesian simulation and optimization approaches, give examples of the benefits of GPU implementations in terms of processing speed and scale-up in ability to analyze large datasets, and provide a detailed, tutorial-style exposition that will benefit readers interested in developing GPU-based approaches in other statistical models. Novel, GPU-oriented approaches to modifying existing algorithms and software design can lead to vast speed-ups and, critically, enable statistical analyses that would otherwise not be performed due to compute time limitations in traditional computational environments. Supplemental materials are provided with all source code, example data, and details that will enable readers to implement and explore the GPU approach in this mixture modeling context. © 2010 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America.
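The workload that maps most naturally onto GPU threads in this setting is the per-observation evaluation of component densities and their normalization into responsibilities, since each observation can be processed independently. The sketch below uses plain NumPy/SciPy as a CPU stand-in for the CUDA kernels the article describes; it illustrates the data-parallel structure of the computation, not the article's implementation.

```python
# Sketch of the per-observation workload a GPU version would parallelize:
# evaluating component log-densities and normalizing into responsibilities.
# Plain NumPy/SciPy stands in for CUDA kernels; each row of the result is an
# independent computation, which is what makes the problem GPU-friendly.
import numpy as np
from scipy.stats import multivariate_normal

def responsibilities(X, weights, means, covs):
    n, k = X.shape[0], len(weights)
    log_dens = np.empty((n, k))
    for j in range(k):  # one data-parallel pass per mixture component
        log_dens[:, j] = np.log(weights[j]) + multivariate_normal.logpdf(
            X, mean=means[j], cov=covs[j])
    log_norm = np.logaddexp.reduce(log_dens, axis=1, keepdims=True)
    return np.exp(log_dens - log_norm)  # rows sum to 1
```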
Abstract:
BACKGROUND: Computer simulations are of increasing importance in modeling biological phenomena. Their purpose is to predict behavior and guide future experiments. The aim of this project is to model the early immune response to vaccination with an agent-based immune response simulation that incorporates realistic biophysics and intracellular dynamics, is sufficiently flexible to accurately model the multi-scale nature and complexity of the immune system, and maintains the high performance critical to scientific computing. RESULTS: The Multiscale Systems Immunology (MSI) simulation framework is an object-oriented, modular simulation framework written in C++ and Python. The software implements a modular design that allows for flexible configuration of components and initialization of parameters, thus allowing simulations to be run that model processes occurring over different temporal and spatial scales. CONCLUSION: MSI addresses the need for a flexible and high-performing agent-based model of the immune system.
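A toy skeleton of such a modular agent-based loop is sketched below. All class names and parameters are hypothetical illustrations of the general design (agents updated module by module, each potentially on its own time scale), not the MSI API.

```python
# Toy skeleton of a modular agent-based simulation loop (illustrative only;
# class names and parameters are hypothetical, not the MSI API).
import random

class Agent:
    def __init__(self, position):
        self.position = position

    def step(self, dt):
        # Biophysics module: a simple random walk stands in for cell motility.
        self.position = tuple(x + random.gauss(0, dt ** 0.5) for x in self.position)

class Simulation:
    def __init__(self, agents, dt=0.1):
        self.agents, self.dt, self.time = agents, dt, 0.0

    def run(self, t_end):
        while self.time < t_end:
            for agent in self.agents:
                agent.step(self.dt)   # each module could advance on its own dt
            self.time += self.dt

sim = Simulation([Agent((0.0, 0.0)) for _ in range(100)])
sim.run(t_end=1.0)
```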
Abstract:
The outcomes of both (i) radiation therapy and (ii) preclinical small animal radiobiology studies depend on the delivery of a known quantity of radiation to a specific and intentional location. Adverse effects can result from these procedures if the dose to the target is too high or too low, and can also result from an incorrect spatial distribution in which nearby normal healthy tissue is undesirably damaged by poor radiation delivery techniques. Thus, in mice and humans alike, the spatial dose distributions from radiation sources should be well characterized in terms of the absolute dose quantity, and with pinpoint accuracy. When dealing with the steep spatial dose gradients consequential to either (i) high dose rate (HDR) brachytherapy or (ii) the small organs and tissue inhomogeneities of mice, obtaining accurate and highly precise dose results can be very challenging, given that commercially available radiation detection tools, such as ion chambers, are often too large for in-vivo use.
In this dissertation, two tools are developed and applied for both clinical and preclinical radiation measurement. The first tool is a novel radiation detector for acquiring physical measurements, fabricated from an inorganic nano-crystalline scintillator fixed on an optical fiber terminus. This dosimeter allows the measurement of point doses at sub-millimeter resolution and can be placed in vivo in humans and small animals. Real-time data are displayed to the user to provide instant quality assurance and dose-rate information. The second tool utilizes an open source Monte Carlo particle transport code and was applied in small animal dosimetry studies to calculate organ doses and recommend new techniques of dose prescription in mice, as well as to characterize dose to the murine bone marrow compartment at micron-scale resolution.
Hardware design changes were implemented to reduce the overall fiber diameter of the nano-crystalline scintillator-based fiber optic detector (NanoFOD) system to <0.9 mm. The lower limit of device sensitivity was found to be approximately 0.05 cGy/s. The detector was demonstrated to perform quality assurance of clinical 192Ir HDR brachytherapy procedures, providing dose measurements comparable to thermoluminescent dosimeters and accuracy within 20% of the treatment planning software (TPS) for the 27 treatments conducted, with an interquartile range of the measured-to-TPS dose ratio of 0.94-1.02 (IQR = 0.08). After removing contaminant signals (Cerenkov and diode background), calibration of the detector enabled accurate dose measurements for vaginal applicator brachytherapy procedures. For 192Ir use, the energy response changed by a factor of 2.25 over source-to-detector distances (SDD) of 3 to 9 cm; however, a cap made of 0.2-mm-thick silver reduced the energy dependence to a factor of 1.25 over the same SDD range, at the cost of reducing overall sensitivity by 33%.
For preclinical measurements, dose accuracy of the NanoFOD was within 1.3% of MOSFET-measured dose values in a cylindrical mouse phantom at 225 kV for x-ray irradiation at angles of 0, 90, 180, and 270°. The NanoFOD exhibited small changes in angular sensitivity, with a coefficient of variation (COV) of 3.6% at 120 kV and 1% at 225 kV. When the NanoFOD was placed alongside a MOSFET in the liver of a sacrificed mouse and treatment was delivered at 225 kV with a 0.3 mm Cu filter, the dose difference was only 1.09% with the 4×4 cm collimator and -0.03% with no collimation. Additionally, the NanoFOD utilized a scintillator of 11 µm thickness to measure small x-ray fields for microbeam radiation therapy (MRT) applications and achieved 2.7% dose accuracy at the microbeam peak in comparison to radiochromic film. Modest differences in the measured full-width at half maximum lateral dimension of the MRT beam were observed between the NanoFOD (420 µm) and radiochromic film (320 µm), but these differences are explained mostly as an artifact of the geometry used and volumetric effects in the scintillator material. Characterization of the energy dependence of the yttrium-oxide-based scintillator material was performed in the range of 40-320 kV (2 mm Al filtration), and maximum device sensitivity was achieved at 100 kV. Tissue maximum ratio measurements were carried out on a small animal x-ray irradiator system at 320 kV and demonstrated an average difference of 0.9% compared to a MOSFET dosimeter over the range of 2.5 to 33 cm depth in tissue-equivalent plastic blocks. Irradiation of the NanoFOD fiber and scintillator material on a 137Cs gamma irradiator to 1600 Gy did not produce any measurable change in light output, suggesting that the NanoFOD system may be reused without replacement or recalibration over its lifetime.
For small animal irradiator systems, researchers can deliver a given dose to a target organ by controlling exposure time. Currently, researchers calculate this exposure time by dividing the total dose they wish to deliver by a single provided dose rate value, a method that is independent of the target organ. Studies conducted here used Monte Carlo particle transport codes to justify a new method of dose prescription in mice that considers organ-specific doses. Monte Carlo simulations were performed in the Geant4 Application for Tomographic Emission (GATE) toolkit using a MOBY mouse whole-body phantom. The non-homogeneous phantom consisted of 256×256×800 voxels of size 0.145×0.145×0.145 mm³. Differences of up to 20-30% in dose to soft-tissue target organs were demonstrated, and methods for alleviating these errors during whole-body irradiation of mice were suggested, utilizing organ-specific and x-ray tube filter-specific dose rates for all irradiations.
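The prescription change described here amounts to replacing a single machine dose rate with a dose rate looked up per organ and per filter when computing exposure time. A minimal worked sketch follows; all numeric values are hypothetical placeholders, not values from the study.

```python
# Exposure time t = prescribed dose / dose rate. The conventional approach uses
# a single machine-level dose rate; the organ-specific approach looks up a dose
# rate per (organ, filter) pair derived from Monte Carlo simulation.
# All numbers below are hypothetical placeholders, not values from the study.
dose_rate_machine = 2.0   # cGy/s, single provided value (hypothetical)
dose_rate_table = {       # cGy/s, organ- and filter-specific (hypothetical)
    ("liver", "0.3mm Cu"): 1.7,
    ("lung",  "0.3mm Cu"): 2.2,
}

prescribed_dose = 400.0   # cGy

t_conventional = prescribed_dose / dose_rate_machine
t_organ = prescribed_dose / dose_rate_table[("liver", "0.3mm Cu")]
print(f"conventional: {t_conventional:.0f} s, organ-specific: {t_organ:.0f} s")
```

With these placeholder numbers the organ-specific exposure time differs from the conventional one by roughly 18%, the same order as the 20-30% organ dose discrepancies reported above.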
Monte Carlo analysis was applied to 1 µm resolution CT images of a mouse femur and a mouse vertebra to calculate the dose gradients within the bone marrow (BM) compartment of mice for different radiation beam qualities relevant to x-ray and isotope-type irradiators. Results indicated that soft x-ray beams (160 kV at 0.62 mm Cu HVL and 320 kV at 1 mm Cu HVL) lead to substantially higher dose to BM in close proximity to mineral bone (within about 60 µm) compared to hard x-ray beams (320 kV at 4 mm Cu HVL) and isotope-based gamma irradiators (137Cs). The average dose increases to the BM in the vertebra for these four beam qualities were 31%, 17%, 8%, and 1%, respectively. Both in-vitro and in-vivo experimental studies confirmed these simulation results, demonstrating that the 320 kV, 1 mm Cu HVL beam caused statistically significantly increased killing of BM cells at 6 Gy dose levels in comparison to both the 320 kV, 4 mm Cu HVL and the 662 keV 137Cs beams.
Abstract:
We describe a strategy for Markov chain Monte Carlo analysis of non-linear, non-Gaussian state-space models involving batch analysis for inference on dynamic, latent state variables and fixed model parameters. The key innovation is a Metropolis-Hastings method for the time series of state variables based on sequential approximation of filtering and smoothing densities using normal mixtures. These mixtures are propagated through the non-linearities using an accurate, local mixture approximation method, and we use a regenerating procedure to deal with potential degeneracy of mixture components. This provides accurate, direct approximations to sequential filtering and retrospective smoothing distributions, and hence a useful construction of global Metropolis proposal distributions for simulation of posteriors for the set of states. This analysis is embedded within a Gibbs sampler to include uncertain fixed parameters. We give an example motivated by an application in systems biology. Supplemental materials provide an example based on a stochastic volatility model as well as MATLAB code.
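A bare-bones version of the overall sampler structure is sketched below: a Metropolis-Hastings block update of the full state path embedded within a Gibbs-style step for a fixed parameter. The toy model, the crude random-walk proposal (standing in for the article's mixture-based global proposal), and the conjugate-form variance update are all illustrative assumptions, not the article's algorithm; the paper's supplemental materials provide the actual MATLAB code.

```python
# Bare-bones Metropolis-within-Gibbs for a nonlinear state-space model
# x_t = f(x_{t-1}) + w_t, y_t = g(x_t) + v_t. A crude random-walk proposal on
# the whole state path stands in for the article's mixture-based global
# proposal; the toy model and variance update are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)

def log_post(x, y, sig_w, sig_v):
    f = lambda x: 0.9 * np.sin(x)   # toy nonlinear state transition
    g = lambda x: x                 # toy observation map
    lp = -0.5 * np.sum((x[1:] - f(x[:-1])) ** 2) / sig_w ** 2
    lp += -0.5 * np.sum((y - g(x)) ** 2) / sig_v ** 2
    return lp

def mh_within_gibbs(y, n_iter=2000, sig_w=0.5):
    T = len(y)
    x, sig_v = np.zeros(T), 1.0
    for _ in range(n_iter):
        # MH block update of the full state path (placeholder proposal).
        prop = x + 0.1 * rng.standard_normal(T)
        if np.log(rng.random()) < log_post(prop, y, sig_w, sig_v) - log_post(x, y, sig_w, sig_v):
            x = prop
        # Gibbs-style draw of the observation scale (simple conjugate form
        # assumed here purely for illustration).
        sig_v = np.sqrt(np.sum((y - x) ** 2) / rng.chisquare(T))
    return x, sig_v
```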
Abstract:
BACKGROUND: With the globalization of clinical trials, large developing nations have substantially increased their participation in multi-site studies. This participation has raised ethical concerns, among them the fear that local customs, habits, and culture are not respected when asking potential participants to take part in a study. This knowledge gap is particularly noticeable for Indian subjects: despite the large number of participants, little is known regarding what factors affect their willingness to participate in clinical trials. METHODS: We conducted a meta-analysis of all studies evaluating the factors and barriers, from the perspective of potential Indian participants, that contribute to their participation in clinical trials. We searched both international and India-specific bibliographic databases, including Pubmed, Cochrane, Openjgate, MedInd, Scirus, and Medknow, also performing hand searches and communicating with authors to obtain additional references. We included studies dealing exclusively with the participation of Indians in clinical trials. Data extraction was conducted by three researchers, with disagreements resolved by consensus. RESULTS: Six qualitative studies and one survey were found evaluating the main themes affecting the participation of Indian subjects. Factors favoring participation included personal health benefits, altruism, trust in physicians, a source of extra income, detailed knowledge, and methods for motivating participants; barriers included mistrust of trial organizations, concerns about the efficacy and safety of trials, psychological reasons, trial burden, loss of confidentiality, dependency issues, and language. CONCLUSION: We identified factors that facilitate, and barriers that have negative implications for, trial participation decisions of Indian subjects. Due consideration and weight should be given to these factors when planning future trials in India.
Abstract:
Complex diseases will have multiple functional sites, and it will be invaluable to understand the cross-locus interaction in terms of linkage disequilibrium (LD) between those sites (epistasis) in addition to the haplotype-LD effects. We investigated the statistical properties of a class of matrix-based statistics to assess this epistasis. These statistical methods include two LD contrast tests (Zaykin et al., 2006) and partial least squares regression (Wang et al., 2008). To estimate Type 1 error rates and power, we simulated multiple two-variant disease models using the SIMLA software package. SIMLA allows for the joint action of up to two disease genes in the simulated data with all possible multiplicative interaction effects between them. Our goal was to detect an interaction between multiple disease-causing variants by means of their linkage disequilibrium (LD) patterns with other markers. We measured the effects that marginal disease effect size, haplotype LD, disease prevalence, and minor allele frequency have on cross-locus interaction (epistasis). In the setting of strong allele effects and strong interaction, the correlation between the two disease genes was weak (r = 0.2). In a complex system with multiple correlations (both marginal and interaction), it was difficult to determine the source of a significant result. Despite these complications, the partial least squares and modified LD contrast methods maintained adequate power to detect the epistatic effects; however, for many of the analyses we often could not separate interaction from a strong marginal effect. While we did not exhaust the entire parameter space of possible models, we do provide guidance on the effects that population parameters have on cross-locus interaction.
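The LD contrast idea can be illustrated with a minimal sketch: compare composite LD (genotype correlation) matrices between cases and controls, and assess the difference by permutation. This is a generic illustration of the concept, not the exact Zaykin et al. (2006) statistic with its normalization.

```python
# Minimal LD contrast sketch: compare genotype correlation (composite LD)
# matrices between cases and controls; assess significance by permutation.
# A generic illustration, not the exact Zaykin et al. (2006) statistic.
import numpy as np

def ld_contrast_stat(G, status):
    r_case = np.corrcoef(G[status == 1], rowvar=False)
    r_ctrl = np.corrcoef(G[status == 0], rowvar=False)
    return np.sum((r_case - r_ctrl) ** 2)  # squared Frobenius-norm contrast

def permutation_pvalue(G, status, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    observed = ld_contrast_stat(G, status)
    perm = [ld_contrast_stat(G, rng.permutation(status)) for _ in range(n_perm)]
    return (1 + sum(s >= observed for s in perm)) / (n_perm + 1)
```

Here G is an n-samples × n-SNPs genotype matrix (e.g., minor-allele counts) and status a 0/1 case-control array; the permutation p-value asks whether the case and control LD matrices differ more than chance would allow.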
Abstract:
We introduce a new concept for slow light based on stimulated Brillouin scattering (SBS) in optical fibers that is applicable to broadly tunable frequency-swept sources. It allows slow light to be achieved, in principle, over the entire transparency window of the optical fiber. We demonstrate a slow-light delay of 10 ns at 1.55 μm using a 10-m-long photonic crystal fiber with a source sweep rate of 400 MHz/μs and a pump power of 200 mW. We also show that there exists a maximal delay obtainable by this method, which is set by the SBS threshold, independent of sweep rate. For our fiber with optimum length, this maximum delay is ~38 ns, obtained for a pump power of 760 mW.
Abstract:
We use an information-theoretic method developed by Neifeld and Lee [J. Opt. Soc. Am. A 25, C31 (2008)] to analyze the performance of a slow-light system. Slow light is realized in this system via stimulated Brillouin scattering in a 2-km-long, room-temperature, highly nonlinear fiber pumped by a laser whose spectrum is tailored and broadened to 5 GHz. We compute the information throughput (IT), which quantifies the fraction of information transferred from the source to the receiver, and the information delay (ID), which quantifies the delay of a data stream at which the information transfer is largest, for a range of experimental parameters. We also measure the eye-opening (EO) and signal-to-noise ratio (SNR) of the transmitted data stream and find that they scale in a similar fashion to the information-theoretic quantities. Our experimental findings are compared to a model of the slow-light system that accounts for all pertinent noise sources as well as data-pulse distortion due to the filtering effect of the SBS process; the agreement between our observations and the predictions of the model is very good. Furthermore, we compare measurements of the IT for an optimal flat-top gain profile and for a Gaussian-shaped gain profile. For a given pump-beam power, we find that the optimal profile gives a 36% larger ID and somewhat higher IT than the Gaussian profile. Specifically, the optimal (Gaussian) profile produces a fractional slow-light ID of 0.94 (0.69) and an IT of 0.86 (0.86) at a pump-beam power of 450 mW and a data rate of 2.5 Gbps. Thus, the optimal profile better utilizes the available pump-beam power, which is often a valuable resource in a system design.
Abstract:
In 1995, Crawford and Ostrom proposed a grammatical syntax for examining institutional statements (i.e., rules, norms, and strategies) as part of the institutional analysis and development framework. This article constitutes the first attempt at applying the grammatical syntax to code institutional statements using two pieces of U.S. legislation. The authors illustrate how the grammatical syntax can serve as a basis for collecting, presenting, and analyzing data in a way that is reliable and conveys valid and substantive meaning for the researcher. The article concludes by describing some implementation challenges and ideas for future theoretical and field research. © 2010 University of Utah.
Abstract:
Most studies that apply qualitative comparative analysis (QCA) rely on macro-level data, but an increasing number of studies focus on units of analysis at the micro or meso level (i.e., households, firms, protected areas, communities, or local governments). For such studies, qualitative interview data are often the primary source of information. Yet, so far no procedure is available describing how to calibrate qualitative data as fuzzy sets. The authors propose a technique to do so and illustrate it using examples from a study of Guatemalan local governments. By spelling out the details of this important analytic step, the authors aim at contributing to the growing literature on best practice in QCA. © The Author(s) 2012.
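The calibration step the authors formalize can be illustrated with a common variant of the fsQCA "direct method": map ordinal interview codes to raw scores, then pass deviations from three anchors (full non-membership, crossover, full membership) through a logistic transform. The rating scale and anchor values below are hypothetical stand-ins, not the authors' Guatemalan coding scheme.

```python
# Illustrative calibration of qualitative codes into fuzzy-set memberships.
# Ordinal interview ratings are mapped onto [0, 1] via the logistic "direct
# method" with three anchors. Anchor values and the 1-5 scale are hypothetical.
import math

ANCHORS = {"non": 1.0, "cross": 3.0, "full": 5.0}  # hypothetical 1-5 rating scale

def calibrate(raw, non=ANCHORS["non"], cross=ANCHORS["cross"], full=ANCHORS["full"]):
    # Scale deviations from the crossover so the anchors land at log-odds of
    # roughly +/-3, the conventional fsQCA benchmarks.
    if raw >= cross:
        log_odds = 3.0 * (raw - cross) / (full - cross)
    else:
        log_odds = 3.0 * (raw - cross) / (cross - non)
    return 1.0 / (1.0 + math.exp(-log_odds))

for rating in [1, 2, 3, 4, 5]:
    print(rating, round(calibrate(rating), 2))
```

At the three anchors this mapping returns memberships of roughly 0.05, 0.5, and 0.95; the analytic work the article emphasizes lies in justifying where those anchors sit for each qualitative indicator.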
Abstract:
Ongoing Cryptococcus gattii outbreaks in the Western United States and Canada illustrate the impact of environmental reservoirs and both clonal and recombining propagation in driving emergence and expansion of microbial pathogens. C. gattii comprises four distinct molecular types: VGI, VGII, VGIII, and VGIV, with no evidence of nuclear genetic exchange, indicating these represent distinct species. C. gattii VGII isolates are causing the Pacific Northwest outbreak, whereas VGIII isolates frequently infect HIV/AIDS patients in Southern California. VGI, VGII, and VGIII have been isolated from patients and animals in the Western US, suggesting these molecular types occur in the environment. However, only two environmental isolates of C. gattii have ever been reported from California: CBS7750 (VGII) and WM161 (VGIII). The incongruence of frequent clinical presence and uncommon environmental isolation suggests an unknown C. gattii reservoir in California. Here we report frequent isolation of C. gattii VGIII MATα and MATa isolates and infrequent isolation of VGI MATα from environmental sources in Southern California. VGIII isolates were obtained from soil debris associated with tree species not previously reported as hosts from sites near residences of infected patients. These isolates are fertile under laboratory conditions, produce abundant spores, and are part of both locally and more distantly recombining populations. MLST and whole genome sequence analysis provide compelling evidence that these environmental isolates are the source of human infections. Isolates displayed wide-ranging virulence in macrophage and animal models. When clinical and environmental isolates with indistinguishable MLST profiles were compared, environmental isolates were less virulent. Taken together, our studies reveal an environmental source and risk of C. gattii to HIV/AIDS patients with implications for the >1,000,000 cryptococcal infections occurring annually for which the causative isolate is rarely assigned species status. Thus, the C. gattii global health burden could be more substantial than currently appreciated.
Abstract:
Early interventions are a preferred method for addressing behavioral problems in high-risk children, but often have only modest effects. Identifying sources of variation in intervention effects can suggest means to improve efficiency. One potential source of such variation is the genome. We conducted a genetic analysis of the Fast Track randomized control trial, a 10-year-long intervention to prevent high-risk kindergarteners from developing adult externalizing problems including substance abuse and antisocial behavior. We tested whether variants of the glucocorticoid receptor gene NR3C1 were associated with differences in response to the Fast Track intervention. We found that in European-American children, a variant of NR3C1 identified by the single-nucleotide polymorphism rs10482672 was associated with increased risk for externalizing psychopathology in control group children and decreased risk for externalizing psychopathology in intervention group children. Variation in NR3C1 measured in this study was not associated with differential intervention response in African-American children. We discuss implications for efforts to prevent externalizing problems in high-risk children and for public policy in the genomic era.
Abstract:
The fundamental phenotypes of growth rate, size, and morphology are the result of complex interactions between genotype and environment. We developed a high-throughput software application, WormSizer, which computes the size and shape of nematodes from brightfield images. Existing methods for estimating volume either coarsely model the nematode as a cylinder or assume the worm's shape or opacity is invariant. Our estimate is more robust to changes in morphology or optical density, as it assumes only radial symmetry. This open source software is written as a plugin for the well-known image-processing framework Fiji/ImageJ and may therefore be extended easily. We evaluated the technical performance of this framework and used it to analyze the growth and shape of several canonical Caenorhabditis elegans mutants in a developmental time series. We confirm quantitatively that a Dumpy (Dpy) mutant is short and fat and that a Long (Lon) mutant is long and thin. We show that daf-2 insulin-like receptor mutants are larger than wild-type upon hatching but grow slowly, and that WormSizer can distinguish dauer larvae from normal larvae. We also show that a Small (Sma) mutant is actually smaller than wild-type at all stages of larval development. WormSizer works with Uncoordinated (Unc) and Roller (Rol) mutants as well, indicating that it can be used with mutants despite behavioral phenotypes. We used our complete data set to perform a power analysis, giving users a sense of how many images are needed to detect different effect sizes. Our analysis confirms and extends existing phenotypic characterization of well-characterized mutants, demonstrating the utility and robustness of WormSizer.
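A simplified version of a volume estimate under the radial-symmetry assumption is sketched below: treat the worm as a stack of circular cross-sections along its midline and integrate πr(s)² ds. The inputs here are hypothetical; WormSizer itself is a Fiji/ImageJ plugin and extracts the midline and half-widths from brightfield images.

```python
# Simplified volume estimate under the radial-symmetry assumption: sum circular
# cross-sections pi * r(s)^2 along the worm's midline. Inputs are hypothetical;
# WormSizer extracts midline and half-widths from brightfield images.
import numpy as np

def worm_volume(midline, radii):
    """midline: (n, 2) points along the worm; radii: (n,) half-widths."""
    seg_len = np.linalg.norm(np.diff(midline, axis=0), axis=1)  # ds per segment
    seg_r = 0.5 * (radii[:-1] + radii[1:])                      # mean radius per segment
    return np.sum(np.pi * seg_r ** 2 * seg_len)

midline = np.column_stack([np.linspace(0, 1000, 50), np.zeros(50)])  # µm, straight worm
radii = 25 * np.sin(np.linspace(0, np.pi, 50)) + 5                   # tapered ends
print(f"volume ≈ {worm_volume(midline, radii):.0f} µm³")
```

Because only the local radius enters the sum, the estimate is unaffected by bending or changes in opacity, which is the robustness property claimed above.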