931 results for 060102 Bioinformatics


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Numerous studies have sought to better understand the genetic predisposition to cardiovascular disease. Although it is widely believed that multifactorial diseases such as cardiovascular disease result from the effects of many genes, which work alone or interact with other genes, most genetic studies have focused on identifying cardiovascular disease susceptibility genes and usually ignore the effects of gene-gene interactions in the analysis. The current study applies a novel linkage disequilibrium-based statistic for testing interactions between two linked loci using data from a genome-wide study of cardiovascular disease. A total of 53,394 single nucleotide polymorphisms (SNPs) are tested for pair-wise interactions, and 8,644 interactions are found to be significant with p-values less than 3.5 × 10^-11. Results indicate that known cardiovascular disease susceptibility genes tend not to have many significant interactions. One SNP in the CACNG1 (calcium channel, voltage-dependent, gamma subunit 1) gene and one SNP in the IL3RA (interleukin 3 receptor, alpha) gene are found to have the most significant pair-wise interactions. Findings from the current study should be replicated in other independent cohorts to eliminate potential false positive results.
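The abstract does not say how the 3.5 × 10^-11 cutoff was chosen. One reading, offered here as an assumption and not as the authors' stated method, is a Bonferroni correction at a family-wise level of 0.05 over every unordered pair of the 53,394 SNPs. The short sketch below checks that arithmetic; the function name `pairwise_bonferroni_threshold` is hypothetical.

```python
from math import comb


def pairwise_bonferroni_threshold(n_snps: int, alpha: float = 0.05) -> float:
    """Per-test threshold when each unordered SNP pair is tested exactly once.

    Assumption: plain Bonferroni correction; the abstract does not confirm
    that this is how the reported cutoff was derived.
    """
    n_tests = comb(n_snps, 2)  # number of unordered pairs
    return alpha / n_tests


n_pairs = comb(53_394, 2)
threshold = pairwise_bonferroni_threshold(53_394)
print(n_pairs)              # 1425432921
print(f"{threshold:.2e}")   # 3.51e-11
```

Under this assumption the computed value agrees with the reported cutoff to two significant figures.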

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Aortic aneurysms and dissections are the 15th most common cause of death in the United States. Genetic factors contribute to the pathogenesis of thoracic aortic aneurysms and dissections (TAAD). Currently, six loci and four genes have been identified for familial TAAD. Notably, mutations in the smooth muscle cell (SMC) contractile genes ACTA2 and MYH11 are responsible for 15% of familial TAAD, suggesting that proper SMC contraction is important for normal aortic function. Therefore, we hypothesize that mutations in other genes encoding SMC contractile proteins also cause familial TAAD.

To test this hypothesis, we used a candidate gene approach to identify causative mutations in SMC contractile genes for familial TAAD. Sequencing DNA from 80 TAAD patients from unrelated families, we identified putative mutations in eight contractile genes. We chose myosin light chain kinase (MLCK) S1759P for further study for the following reasons: (1) serine 1759 is conserved between vertebrates and invertebrates; (2) S1759P is predicted to be functionally deleterious by bioinformatic analysis; (3) low blood pressure is observed in SMC-selective MLCK-deficient mice.

In the presence of Ca2+/calmodulin (CaM), MLCK, which contains CaM-binding and kinase domains, is activated to phosphorylate myosin light chain, thereby initiating SMC contraction. The CaM-binding sequence of MLCK forms an α-helix structure required for CaM binding. MLCK serine 1759 is located within the CaM-binding domain, and S1759P is predicted to decrease the α-helix content of this domain. Hence, we hypothesize that MLCK mutations cause TAAD by disturbing CaM binding and MLCK activity.

We further sequenced MLCK in DNA samples from an additional 86 probands with familial TAAD. Two more mutations, MLCK A1754T and R1480Stop, were identified, supporting the hypothesis that MLCK mutations cause familial TAAD.

To determine whether MLCK mutations disrupt CaM binding and MLCK activity, we performed co-immunoprecipitation and kinase assays. Decreased CaM binding and kinase activity were detected for A1754T and S1759P. Moreover, R1480Stop is predicted to truncate the kinase and CaM-binding domains. We conclude that MLCK mutations disrupt CaM binding and MLCK activity.

Collectively, our study is the first to show that mutations in genes regulating SMC contraction cause TAAD. This finding further highlights the importance of SMC contraction in maintaining aortic function.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and are emerging as a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited to detecting the association of common variants, but are less suitable for rare variants. This poses a great challenge for sequence-based genetic studies of complex diseases.

This dissertation used the genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools, to develop novel and powerful statistical methods for the next generation of association studies of both qualitative and quantitative traits in the context of sequencing data. The aim was to shift the paradigm of association analysis from the current locus-by-locus analysis to the collective analysis of genomic regions.

In this project, functional principal component (FPC) methods coupled with high-dimensional data reduction techniques were used to develop novel and powerful methods for testing the association of the entire spectrum of genetic variation within a genomic segment or a gene, regardless of whether the variants are common or rare.

Classical quantitative genetics methods suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their application, the functional linear models were applied to five quantitative traits in the Framingham Heart Study.

This project also proposed a novel concept of gene-gene co-association, in which a gene or a genomic region is taken as the unit of association analysis, and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets, which led to the discovery of networks significantly associated with psoriasis.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Most studies of differential gene expression have been conducted between two given conditions. The two-condition experimental (TCE) approach is simple in that all genes detected display a common differential expression pattern responsive to a common two-condition difference. Consequently, genes that are differentially expressed under conditions other than the given two are undetectable with the TCE approach. To address this problem, we propose a new approach called the multiple-condition experiment (MCE) without replication and develop corresponding statistical methods, including inference of pairs of conditions for genes, new t-statistics, and a generalized multiple-testing method for any multiple-testing procedure via a control parameter C. We applied these statistical methods to our real MCE data from breast cancer cell lines and found that 85 percent of gene-expression variation was caused by genotypic effects and genotype-ANAX1 overexpression interactions, which agrees well with our expected results. We also applied our methods to the adenoma dataset of Notterman et al. and identified 93 differentially expressed genes that could not be found with TCE. The MCE approach is a conceptual breakthrough in several respects: (a) many conditions of interest can be studied simultaneously; (b) studying the association between differential gene expression and conditions becomes easy; (c) it can provide more precise information for molecular classification and diagnosis of tumors; (d) it can save investigators a lot of experimental resources and time.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Systemic sclerosis (SSc), or scleroderma, is a complex disease whose etiopathogenesis remains unelucidated. Fibrosis in multiple organs is a key feature of SSc, and studies have shown that the transforming growth factor-β (TGF-β) pathway has a crucial role in fibrotic responses. For a complex disease such as SSc, expression quantitative trait loci (eQTL) analysis is a powerful tool for identifying genetic variations that affect the expression of genes involved in the disease. In this study, a multilevel model is described that performs a multivariate eQTL analysis to identify genetic variation (SNPs) specifically associated with the expression of three members of the TGF-β pathway: CTGF, SPARC and COL3A1. The uniqueness of this model is that all three genes are included in one model, rather than one gene being examined at a time. A protein might contribute to multiple pathways, and this approach allows the identification of important genetic variations linked to multiple genes belonging to the same pathway. In this study, 29 SNPs were identified, 16 of which are located in known genes. Exploring the roles of these genes in TGF-β regulation will help elucidate the etiology of SSc, which will in turn help to better manage this complex disease.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. Many recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. The current study incorporated gene network information into gene-based analysis of GWAS data for Crohn's disease (CD). The purpose was to develop statistical models that boost the power to identify disease-associated genes and gene subnetworks by maximizing the use of existing biological knowledge from multiple sources. The results revealed that a Markov random field (MRF)-based mixture model incorporating direct neighborhood information from a single gene network is not efficient in identifying CD-related genes from the GWAS data. Relying solely on direct neighborhood information might explain the low efficiency of these models. Alternative MRF models that look beyond direct neighborhood information will need to be developed in the future for this purpose.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Mechanisms that allow pathogens to colonize the host are not the product of isolated genes, but instead emerge from the concerted operation of regulatory networks. Therefore, identifying the components and systemic behavior of networks is necessary for a better understanding of gene regulation and pathogenesis. To this end, I have developed systems biology approaches to study transcriptional and post-transcriptional gene regulation in bacteria, with an emphasis on the human pathogen Mycobacterium tuberculosis (Mtb). First, I developed a network response method to identify parts of the Mtb global transcriptional regulatory network used by the pathogen to counteract phagosomal stresses and survive within resting macrophages. The method unveiled transcriptional regulators and associated regulons used by Mtb to establish a successful infection of macrophages throughout the first 14 days of infection. Additionally, this network-based analysis identified the production of Fe-S proteins coupled to lipid metabolism through the alkane hydroxylase complex as a possible strategy employed by Mtb to survive in the host. Second, I developed a network inference method to infer the small non-coding RNA (sRNA) regulatory network in Mtb. The method identifies sRNA-mRNA interactions by integrating a priori knowledge of possible binding sites with structure-driven identification of binding sites. The reconstructed network was useful for predicting functional roles for the multitude of sRNAs recently discovered in the pathogen; several sRNAs were postulated to be involved in virulence-related processes. Finally, I applied a combined experimental and computational approach to study post-transcriptional repression mediated by sRNAs in bacteria. Specifically, a probabilistic ranking methodology termed rank-conciliation was developed to infer sRNA-mRNA interactions based on multiple types of data. The method was shown to improve target prediction in Escherichia coli and is therefore useful for prioritizing candidate targets for experimental validation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Epilepsy is a very complex disease that can have a variety of etiologies, co-morbidities, and a long list of psychosocial factors. Clinical management of epilepsy patients typically includes serological tests, EEGs, and imaging studies to determine the single best antiepileptic drug (AED). Self-management is a vital component of achieving optimal health when living with a chronic disease. For patients with epilepsy, self-management includes any actions necessary to control seizures and cope with the subsequent effects of the condition, including aspects of treatment, seizures, and lifestyle. Computer-based applications can allow for more effective use of clinic visits and ultimately enhance the patient-provider relationship through focused discussion of the determinants affecting self-management.

The purpose of this study is to conduct a systematic literature review of informatics applications in epilepsy self-management, in an effort to describe the current evidence for informatics applications and decision support as an adjunct to successful clinical management of epilepsy. Each publication was analyzed for the type of study design used.

A total of 68 publications were included and categorized by study design, development stage, and clinical domain. Descriptive study designs comprised three-fourths of the publications, indicating an underwhelming use of prospective studies. The vast majority of prospective studies also focused on clinician use to increase knowledge in treating patients with epilepsy.

Given the chronic nature of epilepsy and the difficulty that both clinicians and patients can experience in managing it, more prospective studies are needed to evaluate applications that can effectively increase management activities. Within the last two decades of epilepsy research, management studies have employed biomedical informatics applications. While the use of computer applications to manage epilepsy has increased, more progress is needed.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Pancreatic cancer is the 4th most common cause of cancer death in the United States, with a five-year survival rate of less than 5% under current treatments, particularly because it is usually detected at a late stage. Identifying a high-risk population in order to launch an effective preventive strategy and intervention to control this highly lethal disease is urgently needed. The genetic etiology of pancreatic cancer has not been well profiled. We hypothesized that genetic variants left unidentified by previous genome-wide association studies (GWAS) of pancreatic cancer, owing to stringent statistical thresholds or missing interaction analysis, may be unveiled using alternative approaches. To this end, we explored genetic susceptibility to pancreatic cancer in terms of marginal associations of pathways and genes, as well as their interactions with risk factors. We conducted pathway- and gene-based analyses using GWAS data from 3141 pancreatic cancer patients and 3367 controls of European ancestry. Using the gene set ridge regression in association studies (GRASS) method, we analyzed 197 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Using the logistic kernel machine (LKM) test, we analyzed 17906 genes defined by the University of California Santa Cruz (UCSC) database. Using the likelihood ratio test (LRT) in a logistic regression model, we analyzed 177 pathways and 17906 genes for interactions with risk factors in 2028 pancreatic cancer patients and 2109 controls of European ancestry.

After adjusting for multiple comparisons, six pathways were marginally associated with risk of pancreatic cancer (P < 0.00025): Fc epsilon RI signaling, maturity onset diabetes of the young, neuroactive ligand-receptor interaction, and long-term depression (Ps < 0.0002), and the olfactory transduction and vascular smooth muscle contraction pathways (P = 0.0002). Nine genes were marginally associated with pancreatic cancer risk (P < 2.62 × 10−5), including five previously reported genes (ABO, HNF1A, CLPTM1L, SHH and MYC) and four novel genes (OR13C4, OR13C3, KCNA6 and HNF4G). Three pathways significantly interacted with risk factors in modifying the risk of pancreatic cancer (P < 2.82 × 10−4): chemokine signaling pathway with obesity (P < 1.43 × 10−4), calcium signaling pathway (P < 2.27 × 10−4) and MAPK signaling pathway with diabetes (P < 2.77 × 10−4). However, none of the 17906 genes tested for interactions survived the multiple-comparison correction. In summary, our current study unveiled previously unidentified genetic susceptibility to pancreatic cancer using alternative methods. These novel findings provide new perspectives on genetic susceptibility to and molecular mechanisms of pancreatic cancer and, once confirmed, will shed promising light on the prevention and treatment of this disease.
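The interaction analysis above relies on a likelihood ratio test between nested logistic regression models. As a minimal sketch, assuming the interaction is captured by a single extra coefficient (one degree of freedom), the p-value can be computed from the two maximized log-likelihoods using only the standard library. The function name and the log-likelihood values are hypothetical illustrations, not results from the study; the threshold of 0.05/177 is an assumed Bonferroni reading of the reported 2.82 × 10−4 cutoff for 177 pathways.

```python
import math


def lrt_pvalue_1df(loglik_reduced: float, loglik_full: float) -> float:
    """P-value of a likelihood ratio test with one degree of freedom.

    The statistic 2 * (ll_full - ll_reduced) is compared with a chi-square
    distribution with 1 df, whose upper tail equals erfc(sqrt(x / 2)).
    """
    # Clamp tiny negative values that can arise from numerical noise.
    stat = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for a model without and with the
# pathway-by-risk-factor interaction term.
p_value = lrt_pvalue_1df(loglik_reduced=-2710.4, loglik_full=-2702.1)

# Assumed Bonferroni threshold over 177 pathways (about 2.82e-4).
threshold = 0.05 / 177
print(p_value < threshold)
```

A test with more than one interaction coefficient would need the general chi-square tail for the corresponding degrees of freedom, which this one-line formula does not cover.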

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Metabolic reprogramming has been shown to be a major cancer hallmark, providing tumor cells with significant advantages for survival, proliferation, growth, metastasis and resistance to anti-cancer therapies. Glycolysis, glutaminolysis and mitochondrial biogenesis are among the most essential cancer metabolic alterations because these pathways provide cancer cells not only with energy but also with crucial metabolites to support large-scale biosynthesis, rapid proliferation and tumorigenesis. In this study, we find that 14-3-3σ suppresses all three of these metabolic processes by promoting the degradation of their main driver, c-Myc. Specifically, 14-3-3σ significantly enhances c-Myc poly-ubiquitination and subsequent degradation, reduces c-Myc transcriptional activity, and down-regulates the expression of c-Myc-induced metabolic target genes. As a result, 14-3-3σ blocks glycolysis, decreases glutaminolysis and diminishes the mitochondrial mass of cancer cells both in vitro and in vivo, thereby severely suppressing cancer bioenergetics and metabolism. Consistent with this, a high level of 14-3-3σ in tumors is strongly associated with increased overall and metastasis-free survival of breast cancer patients, as well as better clinical outcomes. Thus, this study reveals a new role for 14-3-3σ as a significant regulator of cancer bioenergetics and a promising target for the development of anti-cancer metabolism therapies.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Candida albicans is the most important fungal pathogen of humans. Transcript profiling studies show that upon phagocytosis by macrophages, C. albicans undergoes a massive metabolic reorganization, activating genes involved in alternative carbon metabolism, including the glyoxylate cycle, β-oxidation and gluconeogenesis. Mutations in key enzymes such as ICL1 (glyoxylate cycle) and FOX2 (fatty acid β-oxidation) revealed that alternative carbon metabolic pathways are required for full virulence in C. albicans. These studies indicate that C. albicans uses non-preferred carbon sources, allowing it to adapt to microenvironments where nutrients are scarce. It has become apparent that the regulatory networks required for regulation of alternative carbon metabolism in C. albicans are considerably different from the Saccharomyces cerevisiae paradigm and appear more analogous to the Aspergillus nidulans systems. Transcription factors that are well characterized in S. cerevisiae have no apparent phenotype or are missing in C. albicans. CTF1 was found to be a single functional homolog of the A. nidulans FarA/FarB proteins, which are transcription factors required for fatty acid utilization. Both FOX2 and ICL1 were found to be part of a large CTF1 regulon. To increase our understanding of how CTF1 regulates its target genes, including whether regulation is direct or indirect, the FOX2 and ICL1 promoter regions were analyzed using a combination of bioinformatics and promoter deletion analysis. To begin characterizing the FOX2 and ICL1 promoters, 5’ rapid amplification of cDNA ends (5’RACE) was used to identify two transcriptional initiation sites in FOX2 and one in ICL1. GFP reporter assays show that FOX2 and ICL1 are rapidly expressed in the presence of alternative carbon sources. Both FOX2 and ICL1 harbor the CCTCGG sequence known to be bound by the Far proteins, which makes the motif a putative CTF1 DNA binding element. In this study, the CCTCGG sequence was found to be essential for FOX2 regulation. However, this motif does not appear to be equally important for the regulation of ICL1. This study supports the notion that although C. albicans has diverged from the paradigms of model fungi, it has made specific adaptations to its transcription-based regulatory network that may contribute to its metabolic flexibility.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard-of-care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This matters because tumor shrinkage is an important cancer treatment endpoint that is correlated with the probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted, and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages. The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering.
The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
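The reproducibility filter described above (a concordance correlation coefficient above 0.90 per machine, followed by a set intersection across machines) can be sketched as follows. This assumes Lin's concordance correlation coefficient with population (1/n) moments; the function names, the data layout and the toy values are hypothetical, and the hierarchical clustering step for redundancy removal is not shown.

```python
from statistics import fmean


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long sequences with n >= 2")
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def reproducible_features(machine, threshold=0.90):
    """Features whose test-retest CCC exceeds the threshold on one machine.

    `machine` maps a feature name to a (test_values, retest_values) pair.
    """
    return {
        name
        for name, (test, retest) in machine.items()
        if concordance_cc(test, retest) > threshold
    }


def multi_machine_features(machines, threshold=0.90):
    """Intersection of the per-machine reproducible feature sets."""
    return set.intersection(
        *(reproducible_features(m, threshold) for m in machines)
    )


# Toy test-retest data for two hypothetical machines.
machine_a = {
    "volume": ([1, 2, 3, 4], [1.1, 2.0, 2.9, 4.1]),
    "entropy": ([1, 2, 3, 4], [4, 1, 3, 2]),
}
machine_b = {
    "volume": ([2, 4, 6, 8], [2, 4, 6, 8]),
    "entropy": ([1, 2, 3, 4], [1, 2, 3, 4]),
}
print(multi_machine_features([machine_a, machine_b]))  # {'volume'}
```

Unlike Pearson correlation, the concordance coefficient penalizes shifts in mean and scale between test and retest, which is why it is a natural fit for a reproducibility criterion.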

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Clinical text understanding (CTU) is of interest to health informatics because critical clinical information, frequently represented as unconstrained text in electronic health records, is extensively used by human experts to guide clinical practice and decision making and to document delivery of care, but is largely unusable by information systems for queries and computations. Recent initiatives advocating translational research call for technologies that can integrate structured clinical data with unstructured data, provide a unified interface to all data, and contextualize clinical information for reuse in the multidisciplinary and collaborative environment envisioned by the CTSA program. This implies that technologies for the processing and interpretation of clinical text should be evaluated not only in terms of their validity and reliability in their intended environment, but also in light of their interoperability and their ability to support information integration and contextualization in a distributed and dynamic environment. This vision adds a new layer of information representation requirements that must be accounted for when planning the implementation or acquisition of clinical text processing tools and technologies for multidisciplinary research. At the same time, electronic health records frequently contain unconstrained clinical text with high variability in the use of terms and in documentation practices, and without commitment to the grammatical or syntactic structure of the language (e.g., triage notes, physician and nurse notes, chief complaints, etc.). This hinders the performance of natural language processing technologies, which typically rely heavily on the syntax of the language and the grammatical structure of the text. This document introduces our method for transforming unconstrained clinical text found in electronic health information systems into a formal (computationally understandable) representation that is suitable for querying, integration, contextualization and reuse, and is resilient to the grammatical and syntactic irregularities of clinical text. We present our design rationale, method, and the results of an evaluation in processing chief complaints and triage notes from 8 different emergency departments in Houston, Texas. Finally, we discuss the significance of our contribution in enabling the use of clinical text in a practical bio-surveillance setting.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, indicating that MTM’s performance does not depend on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM performed better than Quake and comparably to Hammer. With the better error correction provided by MTM, the quality of downstream analyses, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Embryonic stem cells (ESCs) possess two unique characteristics: infinite self-renewal and the potential to differentiate into almost every cell type (pluripotency). Recently, global expression analyses of metastatic breast and lung cancers revealed an ESC-like expression program or signature, specifically in cancers that are mutant for p53 function. Surprisingly, although p53 is widely recognized as the guardian of the genome because of its roles in cell cycle checkpoints, programmed cell death and senescence, relatively little is known about p53 functions in normal cells, especially in ESCs. My hypothesis is that p53 has specific transcription regulatory functions in human ESCs (hESCs) that a) oppose pluripotency and b) protect the stem cell genome in response to DNA damage and stress signaling. In mouse ESCs, these roles are believed to coincide, as p53 promotes differentiation in response to DNA damage, but this is unexplored in hESCs. To determine the biological roles of p53 specifically in hESCs, we mapped genome-wide chromatin interactions of p53 by chromatin immunoprecipitation and massively parallel tag sequencing (ChIP-Seq) under three different conditions of hESC status: pluripotency, differentiation-initiated and DNA-damage-induced. ChIP-Seq showed that p53 is enriched at distinct, induction-specific gene loci under each of these conditions. Microarray gene expression analysis and functional annotation of the distinct p53 target genes revealed that p53 regulates specific genes encoding developmental regulators, which are expressed in differentiation-initiated but not DNA-damaged hESCs. We further discovered that, in response to differentiation signaling, p53 binds regions of chromatin that are repressed but also poised for rapid activation by the core pluripotency factors OCT4 and NANOG in pluripotent hESCs. In response to DNA damage, genes associated with migration and motility are targeted by p53, whereas the prime targets of p53 in the control of cell death are conserved for p53 regulation in both differentiation and DNA damage. Our genome-wide profiling and bioinformatics analyses show that p53 occupies a special set of developmental regulatory genes during early differentiation of hESCs and functions in an induction-specific manner. In conclusion, our research unveiled previously unknown functions of p53 in ESC biology, which augments our understanding of one of the most deregulated proteins in human cancers.