7 results for second-generation sequencing
in DigitalCommons@The Texas Medical Center
Abstract:
Second-generation antipsychotics (SGAs) are increasingly prescribed to treat psychiatric symptoms in pediatric patients infected with HIV. We examined the relationship between prescribed SGAs and physical growth in a cohort of youth with perinatally acquired HIV-1 infection. Pediatric AIDS Clinical Trials Group (PACTG) Protocol 219C (P219C), a multicenter, longitudinal observational study of children and adolescents perinatally exposed to HIV, was conducted from September 2000 until May 2007. The analysis included P219C participants who were perinatally HIV-infected, were 3-18 years old, had been prescribed their first SGA for at least 1 month, and had baseline data available from before they started that SGA. Each participant prescribed an SGA was matched (on gender, age, Tanner stage, and baseline body mass index [BMI] z score) with 1-3 controls without antipsychotic prescriptions. The main outcomes were short-term (approximately 6 months) and long-term (approximately 2 years) changes in BMI z scores from baseline. There were 236 participants in the short-term and 198 in the long-term analysis. In linear regression models, youth with SGA prescriptions had increased BMI z scores relative to youth without antipsychotic prescriptions, both for all SGAs (short-term increase = 0.192, p = 0.003; long-term increase = 0.350, p < 0.001) and for risperidone alone (short-term = 0.239, p = 0.002; long-term = 0.360, p = 0.001). Participants receiving both protease inhibitors (PIs) and SGAs showed especially large increases. These findings suggest that growth should be carefully monitored in youth with perinatally acquired HIV who are prescribed SGAs. Future research should investigate the interaction between PIs and SGAs in children and adolescents with perinatally acquired HIV infection.
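The matching step described in this abstract (each SGA-prescribed participant paired with 1-3 controls on gender, age, Tanner stage, and baseline BMI z score) might be sketched as below. This is only an illustration, not the study's procedure: the abstract does not give the matching tolerances or say whether controls could be reused, so the calipers `AGE_CALIPER` and `BMI_Z_CALIPER`, the no-reuse rule, the greedy order, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    pid: str
    gender: str
    age: float     # years at baseline
    tanner: int    # Tanner stage
    bmi_z: float   # baseline BMI z score


# Hypothetical tolerances: the abstract does not state the ones actually used.
AGE_CALIPER = 1.0
BMI_Z_CALIPER = 0.5
MAX_CONTROLS = 3


def is_match(case, control):
    """Exact match on gender and Tanner stage, caliper match on age and BMI z."""
    return (
        case.gender == control.gender
        and case.tanner == control.tanner
        and abs(case.age - control.age) <= AGE_CALIPER
        and abs(case.bmi_z - control.bmi_z) <= BMI_Z_CALIPER
    )


def match_controls(cases, controls):
    """Greedily give each case up to MAX_CONTROLS controls that are not yet used."""
    used = set()
    matches = {}
    for case in cases:
        eligible = [c for c in controls if c.pid not in used and is_match(case, c)]
        # Prefer the controls closest in BMI z score, then closest in age.
        eligible.sort(key=lambda c: (abs(c.bmi_z - case.bmi_z), abs(c.age - case.age)))
        chosen = eligible[:MAX_CONTROLS]
        used.update(c.pid for c in chosen)
        matches[case.pid] = [c.pid for c in chosen]
    return matches


if __name__ == "__main__":
    cases = [Participant("s1", "F", 10.0, 2, 0.1)]
    controls = [
        Participant("c1", "F", 10.5, 2, 0.2),
        Participant("c2", "M", 10.0, 2, 0.1),
    ]
    print(match_controls(cases, controls))
```

A greedy pass is the simplest choice; an optimal matching algorithm could yield more complete matched sets at the cost of extra code.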
Abstract:
Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited to detecting associations with common variants but are less suitable for rare variants. This poses a great challenge for sequence-based genetic studies of complex diseases.
This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data. These methods shift the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genome regions.
In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the association of the entire spectrum of genetic variation within a segment of the genome or a gene, regardless of whether the variants are common or rare.
Classical quantitative genetic methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their application, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.
This project also proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.
Abstract:
Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping, and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, showing that MTM’s performance does not depend on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM performed better than Quake and comparably to Hammer. Better error correction with MTM improved the quality of downstream analyses such as mapping and SNP detection. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (
Abstract:
Plasma low-density lipoprotein (LDL) levels are positively correlated with the incidence of coronary artery disease. In the circulation, plasma LDL is cleared mainly by uptake via the LDL receptor (LDLR). Proprotein convertase subtilisin/kexin type 9 (PCSK9) is a newly discovered gene that plays an important role in LDL metabolism. Gain-of-function mutations of PCSK9 lead to hypercholesterolemia, and loss-of-function mutations of PCSK9 are associated with decreased LDL cholesterol. The effects of PCSK9 on cholesterol levels are the consequence of a strong interaction between the catalytic domain of PCSK9 and the epidermal growth factor-like repeat A (EGF-A) domain of LDLR on the cell surface of hepatocytes. This PCSK9/LDLR complex enters the cell via endocytosis, where both PCSK9 and LDLR are removed via the lysosome pathway, resulting in decreased levels of LDLR and accumulation of LDL in the plasma. However, we challenged the view that this is the exclusive function of PCSK9 in LDL metabolism: we observed that PCSK9 interacted with apolipoprotein B (apoB) and increased apoB production, irrespective of the LDLR. ApoB is the primary structural protein of the LDL particle, and it also serves as the ligand for the LDL receptor. There is ample evidence that apoB levels are a better indicator of heart disease than either total cholesterol or LDL cholesterol levels. We used a second-generation adenoviral vector to overexpress PCSK9 (Ad-PCSK9) in wild-type C57BL/6 and LDLR-deficient mice (Ldlr-/- and Ldlr-/-Apobec1-/-). Our study revealed that overexpression of PCSK9 promoted the production and secretion of apoB in the form of very-low-density lipoprotein (VLDL), the precursor of LDL, in the 3 mouse models studied (C57BL/6J, Ldlr-/-, and Ldlr-/-Apobec1-/-).
The increased apoB production in mice was regulated at the post-transcriptional level, since there was no difference in apoB mRNA levels between mice treated with Ad-PCSK9 and those treated with the control vector Ad-Null. Using pulse-chase experiments on primary hepatocytes, we showed that overexpression of PCSK9 increased the secretion of apoB, independent of LDLR. In the circulation, we showed that PCSK9 was associated with LDL particles. Using 3 different protein–protein interaction assays (co-immunoprecipitation, a mammalian two-hybrid system, and an in situ proximity ligation assay), we demonstrated a direct protein–protein interaction between PCSK9 and apoB. This interaction inhibited the physiological removal of apoB via the autophagosome/lysosome pathway in an LDLR-independent fashion, resulting in increased production and secretion of apoB-containing lipoproteins. The significance of this process was shown by knocking out Pcsk9 on the Ldlr-/-Apobec1-/- background (triple knockout mice): in the absence of Pcsk9, the levels of cholesterol, triacylglycerol, and apoB decreased significantly compared with those of Ldlr-/-Apobec1-/- mice. Taken together, our study demonstrated a direct intracellular interaction of PCSK9 with apoB, resulting in the inhibition of apoB degradation via the autophagosome/lysosome pathway independent of LDLR. This discovery provides a new concept of the importance of PCSK9 and suggests new approaches for the therapeutic intervention of hyperlipidemia.
Abstract:
It is well accepted that tumorigenesis is a multi-step process involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis, and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high-throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA, and SNP microarrays as well as next-generation sequencing assays interrogating somatic mutations, insertions, deletions, translocations, and structural rearrangements. Given the massive amount of data, a major challenge is to integrate information from multiple sources and formulate testable hypotheses. This thesis focuses on developing methodologies for integrative analyses of genomic assays profiled on the same set of samples. We have developed several novel methods for integrative biomarker identification and cancer classification. We introduce a regression-based approach that identifies biomarkers predictive of therapy response or survival by integrating multiple assays, including gene expression, methylation, and copy number data, through penalized regression. To identify key cancer-specific genes while accounting for multiple mechanisms of regulation, we have developed the integIRTy software, which provides robust and reliable inferences about gene alteration by automatically adjusting for sample heterogeneity as well as technical artifacts using Item Response Theory.
To cope with the increasing need for accurate cancer diagnosis and individualized therapy, we have developed a robust and powerful algorithm called SIBER to systematically identify bimodally expressed genes using next-generation RNAseq data. We have shown that prediction models built from these bimodal genes have the same accuracy as models built from all genes. Further, prediction models that use gene expression measurements dichotomized according to their bimodal shapes still perform well. The effectiveness of outcome prediction using discretized signals paves the way for more accurate and interpretable cancer classification by integrating signals from multiple sources.
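To make the idea of dichotomizing a bimodally expressed gene concrete, the sketch below splits one gene's expression values into a low and a high group and computes a bimodality index of the form sqrt(pi * (1 - pi)) * |mu1 - mu2| / sigma. It is not the SIBER algorithm: the abstract does not describe SIBER's model, so a plain two-means split stands in for the mixture-model fit, and the function names and example values are hypothetical.

```python
import math
from statistics import mean


def two_means_threshold(values, iterations=50):
    """Find a cut point between a low and a high group (1-D k-means with k = 2)."""
    lo_center, hi_center = min(values), max(values)
    for _ in range(iterations):
        threshold = (lo_center + hi_center) / 2
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break  # all values fall on one side, so no split is possible
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo_center, hi_center):
            break  # converged
        lo_center, hi_center = new_lo, new_hi
    return (lo_center + hi_center) / 2


def dichotomize(values):
    """Map each expression value to 0 (low group) or 1 (high group)."""
    threshold = two_means_threshold(values)
    return [1 if v > threshold else 0 for v in values]


def bimodality_index(values):
    """sqrt(pi * (1 - pi)) * |mu_low - mu_high| / sigma, with a pooled sigma."""
    threshold = two_means_threshold(values)
    low = [v for v in values if v <= threshold]
    high = [v for v in values if v > threshold]
    if not low or not high:
        return 0.0
    mu_low, mu_high = mean(low), mean(high)
    pooled_ss = sum((v - mu_low) ** 2 for v in low) + sum((v - mu_high) ** 2 for v in high)
    sigma = math.sqrt(pooled_ss / len(values))
    if sigma == 0:
        return math.inf
    pi = len(high) / len(values)
    return math.sqrt(pi * (1 - pi)) * abs(mu_high - mu_low) / sigma


if __name__ == "__main__":
    expression = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]  # made-up values
    print(dichotomize(expression))       # [0, 0, 0, 0, 1, 1, 1, 1]
    print(bimodality_index(expression))
```

Note that a two-means split also produces a sizeable index on flat, unimodal data, so any cutoff for calling a gene bimodal would need calibrating against a null distribution.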
Abstract:
Cardiovascular disease (CVD) is a threat to public health and has been reported to be the leading cause of death in the United States. The invention of next-generation sequencing (NGS) technology has revolutionized biomedical research. Investigating NGS data on CVD-related quantitative traits would help address the unknown etiology and disease mechanisms of CVD. The NHLBI's Exome Sequencing Project (ESP) contains CVD-related phenotypes and their associated NGS exome sequence data. Initially, a subset of next-generation sequencing data covering 13 CVD-related quantitative traits was investigated. Only 6 traits, systolic blood pressure (SBP), diastolic blood pressure (DBP), height, platelet count, waist circumference, and weight, were analyzed by the functional linear model (FLM) and 7 existing methods. FLM outperformed all of the existing methods, identifying the highest number of significant genes: 96, 139, 756, 1162, 1106, and 298 genes associated with SBP, DBP, height, platelet count, waist circumference, and weight, respectively.
Abstract:
The genomic era brought about by recent advances in next-generation sequencing technology has made genome-wide scans for natural selection a reality. Currently, almost all statistical tests and analytical methods for identifying genes under selection are performed on an individual-gene basis. Although these methods have the power to identify genes subject to strong selection, they have limited power to discover genes targeted by moderate or weak selection forces, which are crucial for understanding the molecular mechanisms of complex phenotypes and diseases. The recent availability and growing completeness of many gene network and protein-protein interaction databases open avenues for enhancing the power to discover genes under natural selection. The aim of this thesis is to explore and develop normal mixture model based methods that leverage gene network information to enhance the power to discover target genes of natural selection. The results show that the developed statistical method, which combines the posterior log odds of the standard normal mixture model and the Guilt-By-Association score of the gene network in a naïve Bayes framework, has the power to discover genes under moderate or weak selection that bridge the genes under strong selection. This helps us understand the biology underlying complex diseases and phenotypes related to natural selection.
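One way to picture a naïve Bayes combination of mixture-model evidence and network evidence is sketched below. It is an assumption-laden illustration rather than the thesis's method: the abstract gives no formulas, so the two-component mixture parameters, the definition of the Guilt-By-Association score as the fraction of network neighbours already flagged as under strong selection, the binary network evidence with its conditional probabilities, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def mixture_log_odds(z, prior_selected, mu_selected, sigma_selected):
    """Posterior log odds that z comes from the 'selected' component
    rather than the N(0, 1) null component of a two-component mixture."""
    selected = prior_selected * normal_pdf(z, mu_selected, sigma_selected)
    null = (1 - prior_selected) * normal_pdf(z, 0.0, 1.0)
    return math.log(selected) - math.log(null)


def gba_score(gene, network, seed_genes):
    """Assumed GBA score: fraction of a gene's neighbours that are seed genes."""
    neighbours = network.get(gene, set())
    if not neighbours:
        return 0.0
    return len(neighbours & seed_genes) / len(neighbours)


def network_log_lr(score, p_given_selected=0.6, p_given_null=0.2):
    """Log likelihood ratio of the binary evidence 'has a seed neighbour'.
    The two conditional probabilities are made-up illustration values."""
    if score > 0:
        return math.log(p_given_selected / p_given_null)
    return math.log((1 - p_given_selected) / (1 - p_given_null))


def combined_log_odds(z, gene, network, seed_genes,
                      prior_selected=0.1, mu_selected=2.0, sigma_selected=1.0):
    """Naive Bayes: treat the two evidence sources as conditionally
    independent and add their log-odds contributions."""
    return (mixture_log_odds(z, prior_selected, mu_selected, sigma_selected)
            + network_log_lr(gba_score(gene, network, seed_genes)))


if __name__ == "__main__":
    network = {"G1": {"S1", "S2"}, "G2": {"X1"}}  # toy adjacency sets
    seeds = {"S1", "S2"}                          # genes under strong selection
    for gene in ("G1", "G2"):
        print(gene, combined_log_odds(1.5, gene, network, seeds))
```

With the same test statistic, the gene linked to seed genes receives the higher combined log odds, which is the sense in which network context can lift a moderate signal.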