975 results for Complex diseases
Abstract:
Recent advances in high-throughput sequencing and genotyping protocols allow rapid investigation of Mendelian and complex diseases on a scale not previously possible. In my thesis research I took advantage of these modern techniques to study retinitis pigmentosa (RP), a rare inherited disease characterized by progressive loss of photoreceptors and leading to blindness, and hypertension, a common condition affecting 30% of the adult population. First, I compared the performance of different next-generation sequencing (NGS) platforms in sequencing the RP-linked gene PRPF31. The gene contained a mutation in an intronic repetitive element, which presented difficulties for both classic sequencing methods and NGS. We showed that all NGS platforms tested are powerful tools for identifying rare and common DNA variants, even in more complex sequences. Moreover, we evaluated the features of different NGS platforms that are important in re-sequencing projects. The main focus of my thesis was then to investigate the involvement of pre-mRNA splicing factors in autosomal dominant RP (adRP). I screened 5 candidate genes in a large cohort of patients using long-range PCR as an enrichment step, followed by NGS. We tested two different approaches: in one, all target PCRs from all patients were pooled and sequenced as a single DNA library; in the other, PCRs from each patient were separated within the pool by DNA barcodes. The first solution was more cost-effective, while the second yielded faster and more accurate results, but overall both proved to be effective strategies for gene screening in many samples. We identified novel missense mutations in the SNRNP200 gene, which encodes an RNA helicase essential for splicing catalysis. Interestingly, one of these mutations showed incomplete penetrance in one family with adRP. 
Thus, we started to study the possible molecular causes underlying phenotypic differences between asymptomatic and affected members of this family. For the study of hypertension, I joined a European consortium to perform genome-wide association studies (GWAS). Thanks to the use of very informative genotyping arrays and of phenotypically well-characterized cohorts, we identified a novel susceptibility locus for hypertension in the promoter region of the endothelial nitric oxide synthase gene (NOS3). Moreover, we demonstrated the direct causality of the associated SNP using three different methods: 1) targeted resequencing, 2) luciferase assay, and 3) population study.
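The barcoded pooling approach described in this abstract relies on assigning each sequenced read back to its patient. A minimal sketch of that demultiplexing step is given below; the barcode sequences, their length, the patient labels, and the exact-match rule are all hypothetical choices made for illustration and are not taken from the thesis.

```python
from collections import defaultdict

# Hypothetical 6-nt barcodes; real barcode sets and lengths depend on the library kit.
BARCODES = {"ACGTAC": "patient_01", "TGCATG": "patient_02"}
BARCODE_LEN = 6


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by patient using an exact match on the leading barcode."""
    by_patient = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        # Reads whose tag matches no known barcode are kept aside, not discarded.
        patient = barcodes.get(tag, "unassigned")
        by_patient[patient].append(insert)
    return dict(by_patient)


# Toy reads: barcode followed by a short insert.
reads = ["ACGTACGGGTTT", "TGCATGCCCAAA", "ACGTACTTTGGG", "NNNNNNAAAA"]
demuxed = demultiplex(reads)
```

In the unbarcoded approach there is no such step: a variant found in the pool still has to be traced to an individual patient by follow-up genotyping, which is one plausible reason the barcoded design was reported as faster.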
Abstract:
Mapping perturbed molecular circuits that underlie complex diseases remains a great challenge. We developed a comprehensive resource of 394 cell type- and tissue-specific gene regulatory networks for humans, each specifying the genome-wide connectivity among transcription factors, enhancers, promoters and genes. Integration with 37 genome-wide association studies (GWASs) showed that disease-associated genetic variants, including variants that do not reach genome-wide significance, often perturb regulatory modules that are highly specific to disease-relevant cell types or tissues. Our resource opens the door to systematic analysis of regulatory programs across hundreds of human cell types and tissues (http://regulatorycircuits.org).
Abstract:
Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining these data with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing (NGS) studies will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated not only to be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work could be run on high-end desktops, some of it was extended to run on parallel computers, helping to ensure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm. 
It was shown that there is no universally optimal algorithm for variant selection in GWAS; rather, methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can yield overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.
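The nested cross-validation mentioned above can be sketched as follows. The point is that both the filter-based variant selection and the choice of how many variants to keep happen inside the training portion of each outer fold, so the held-out fold never influences them. This is an illustrative sketch only: the filter score (difference of class means), the nearest-centroid classifier, the candidate feature counts, and the random genotype data are all assumptions, not the methods used in the work summarized here.

```python
import random


def k_folds(n, k, rng):
    """Split indices 0..n-1 into k shuffled, disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def top_features(X, y, rows, m):
    """Filter step: rank features by absolute difference of class means."""
    def score(j):
        means = []
        for label in (0, 1):
            vals = [X[i][j] for i in rows if y[i] == label]
            means.append(sum(vals) / len(vals) if vals else 0.0)
        return abs(means[1] - means[0])
    return sorted(range(len(X[0])), key=score, reverse=True)[:m]


def accuracy(X, y, train, test, m):
    """Select m features on the training rows only, then score a nearest-centroid rule."""
    feats = top_features(X, y, train, m)
    cents = {}
    for label in (0, 1):
        rows = [i for i in train if y[i] == label]
        cents[label] = [sum(X[i][j] for i in rows) / max(len(rows), 1) for j in feats]

    def predict(i):
        dist = {c: sum((X[i][j] - cj) ** 2 for j, cj in zip(feats, cents[c])) for c in (0, 1)}
        return min(dist, key=dist.get)

    return sum(predict(i) == y[i] for i in test) / len(test)


def nested_cv(X, y, sizes=(1, 2, 5), outer_k=5, inner_k=3, seed=0):
    """Outer loop estimates accuracy; inner loop picks the number of features."""
    rng = random.Random(seed)
    outer = k_folds(len(X), outer_k, rng)
    outer_scores = []
    for held_out in outer:
        train = [i for f in outer if f is not held_out for i in f]
        # Inner folds are built from the outer-training rows only.
        inner = [[train[p] for p in fold] for fold in k_folds(len(train), inner_k, rng)]

        def inner_score(m):
            return sum(
                accuracy(X, y, [i for f in inner if f is not val for i in f], val, m)
                for val in inner
            ) / inner_k

        best_m = max(sizes, key=inner_score)
        outer_scores.append(accuracy(X, y, train, held_out, best_m))
    return sum(outer_scores) / outer_k


# Hypothetical data: 60 individuals, 20 SNPs coded 0/1/2, labels unrelated to genotypes.
rng = random.Random(42)
X = [[rng.choice((0, 1, 2)) for _ in range(20)] for _ in range(60)]
y = [i % 2 for i in range(60)]
estimate = nested_cv(X, y)
```

Because the labels here carry no signal, an honest estimate should sit near chance; selecting features on the full data before splitting would typically push the reported accuracy above that, which is the over-optimism the abstract warns about.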
Abstract:
(Full text is available at http://www.manu.edu.mk/prilozi). New-generation genomic platforms enable us to decipher the complex genetic basis of complex diseases and Balkan Endemic Nephropathy (BEN) on a high-throughput basis. They give valuable information about predisposing Single Nucleotide Polymorphisms (SNPs), Copy Number Variations (CNVs) or Loss of Heterozygosity (LOH) (using SNP arrays) and about disease-causing mutations along the whole sequence of candidate genes (using Next Generation Sequencing). This information could be used for screening individuals in at-risk families and for shifting mainstream medicine toward prevention. It might also lead to more effective treatment. Here we discuss these genomic platforms and report some applications of SNP-array technology in a case of familial nephrotic syndrome. Key words: complex diseases, genome wide association studies, SNP, genomic arrays, next generation sequencing.
Abstract:
My dissertation focuses on developing methods for detecting gene-gene/environment interactions and imprinting effects in human complex diseases and quantitative traits. It includes three sections: (1) generalizing the Natural and Orthogonal interaction (NOIA) model for the coding technique originally developed for gene-gene (GxG) interaction and also to reduced models; (2) developing a novel statistical approach that allows for modeling gene-environment (GxE) interactions influencing disease risk; and (3) developing a statistical approach for modeling genetic variants displaying parent-of-origin effects (POEs), such as imprinting. In the past decade, genetic researchers have identified a large number of causal variants for human genetic diseases and traits by single-locus analysis, and interaction has now become a hot topic in the effort to search for the complex network between multiple genes or environmental exposures contributing to the outcome. Epistasis, also known as gene-gene interaction, is the departure from additivity of the genetic effects of several genes on a trait, which means that the same alleles of one gene could display different genetic effects under different genetic backgrounds. In this study, we propose to implement the NOIA model for association studies along with interaction for human complex traits and diseases. We compare the performance of the new statistical models we developed and the usual functional model by both simulation study and real data analysis. Both simulation and real data analysis revealed higher power of the NOIA GxG interaction model for detecting both main genetic effects and interaction effects. Through application to a melanoma dataset, we confirmed the previously identified significant regions for melanoma risk at 15q13.1, 16q24.3 and 9p21.3. We also identified potential interactions with these significant regions that contribute to melanoma risk. 
Based on the NOIA model, we developed a novel statistical approach that allows us to model effects from a genetic factor and a binary environmental exposure that jointly influence disease risk. Both simulation and real data analyses revealed higher power of the NOIA model for detecting both main genetic effects and interaction effects for both quantitative and binary traits. We also found that estimates of the parameters from logistic regression for binary traits are no longer statistically uncorrelated under the alternative model when there is an association. Applying our novel approach to a lung cancer dataset, we confirmed that four SNPs in the 5p15 and 15q25 regions are significantly associated with lung cancer risk in the Caucasian population: rs2736100, rs402710, rs16969968 and rs8034191. We also validated that rs16969968 and rs8034191 in the 15q25 region interact significantly with smoking in the Caucasian population. Our approach identified potential interactions of SNP rs2256543 in 6p21 with smoking that contribute to lung cancer risk. Genetic imprinting is the best-known cause of parent-of-origin effects (POEs), whereby a gene is differentially expressed depending on the parental origin of the same alleles. Genetic imprinting affects several human disorders, including diabetes, breast cancer, alcoholism, and obesity. This phenomenon has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we propose a NOIA framework for a single-locus association study that estimates both main allelic effects and POEs. We develop statistical (Stat-POE) and functional (Func-POE) models, and demonstrate conditions for orthogonality of the Stat-POE model. We conducted simulations for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs. 
Our results showed that the newly proposed Stat-POE model, which ensures orthogonality of variance components if Hardy-Weinberg Equilibrium (HWE) or equal minor and major allele frequencies is satisfied, had greater power for detecting the main allelic additive effect than a Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.
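The orthogonality that the statistical NOIA formulation aims for can be illustrated on the simpler single-locus additive/dominance case. The sketch below is not the Stat-POE model of the dissertation; the coding formulas are the ones commonly given for the statistical NOIA model in the wider literature and are reproduced here as an assumption, and the genotype frequencies are hypothetical. The checks only verify numerically that the two coded effects have zero mean and zero cross-product under the chosen frequencies.

```python
def noia_statistical_coding(p_aa, p_ab, p_bb):
    """Return (additive, dominance) codes for genotypes carrying 0, 1, 2 copies of allele b.

    Assumed formulas (statistical NOIA, single locus); not taken from the abstract.
    """
    mean_count = p_ab + 2.0 * p_bb
    # Additive code: allele count centred on its population mean.
    additive = [n - mean_count for n in (0, 1, 2)]
    v = p_aa + p_bb - (p_aa - p_bb) ** 2
    # Dominance code: scaled so that it is uncorrelated with the additive code.
    dominance = [
        -2.0 * p_ab * p_bb / v,
        4.0 * p_aa * p_bb / v,
        -2.0 * p_aa * p_ab / v,
    ]
    return additive, dominance


def weighted_sum(freqs, u, w=None):
    """Frequency-weighted mean of u, or of the product u*w when w is given."""
    w = w if w is not None else [1.0] * len(u)
    return sum(f * a * b for f, a, b in zip(freqs, u, w))


# Hypothetical genotype frequencies that deliberately violate Hardy-Weinberg proportions.
freqs = (0.5, 0.3, 0.2)
additive, dominance = noia_statistical_coding(*freqs)
```

With these codes the additive and dominance columns stay orthogonal even away from Hardy-Weinberg proportions, whereas the abstract states that the Stat-POE model needs HWE or equal allele frequencies for orthogonality.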
Abstract:
The modern approach to the development of new chemical entities against complex diseases, especially neglected endemic diseases such as tuberculosis and malaria, is based on the use of defined molecular targets. Among its advantages, this approach allows (i) the search for and identification of lead compounds with defined molecular mechanisms against a defined target (e.g. enzymes from defined pathways), (ii) the analysis of a great number of compounds with a favorable cost/benefit ratio, (iii) the development, even in the initial stages, of compounds with selective toxicity (the fundamental principle of chemotherapy), and (iv) the evaluation of plant extracts as well as of pure substances. The current use of such technology, unfortunately, is concentrated in developed countries, especially within large pharmaceutical companies. This fact significantly hampers the development of innovative new compounds to treat neglected diseases. The large biodiversity within the territory of Brazil puts the country in a strategic position to develop the rational and sustained exploration of new metabolites of therapeutic value. The country's territory covers a wide range of climates, soil types, and altitudes, providing a unique set of selective pressures for the adaptation of plant life in these scenarios. Chemical diversity is also driven by these forces, in an attempt to best fit the plant communities to the particular abiotic stresses, fauna, and microbes that co-exist with them. Certain areas of vegetation (Amazonian Forest, Atlantic Forest, Araucaria Forest, Cerrado-Brazilian Savanna, and Caatinga) are rich in species and types of environments to be used to search for natural compounds active against tuberculosis, malaria, and chronic-degenerative diseases. 
The present review describes some strategies to search for natural compounds, whose choice can be based on ethnobotanical and chemotaxonomical studies, and screen for their ability to bind to immobilized drug targets and to inhibit their activities. Molecular cloning, gene knockout, protein expression and purification, N-terminal sequencing, and mass spectrometry are the methods of choice to provide homogeneous drug targets for immobilization by optimized chemical reactions. Plant extract preparations, fractionation of promising plant extracts, propagation protocols and definition of in planta studies to maximize product yield of plant species producing active compounds have to be performed to provide a continuing supply of bioactive materials. Chemical characterization of natural compounds, determination of mode of action by kinetics and other spectroscopic methods (MS, X-ray, NMR), as well as in vitro and in vivo biological assays, chemical derivatization, and structure-activity relationships have to be carried out to provide a thorough knowledge on which to base the search for natural compounds or their derivatives with biological activity.
Abstract:
Copy number variation (CNV) has recently gained considerable interest as a source of genetic variation likely to play a role in phenotypic diversity and evolution. Much effort has been put into the identification and mapping of regions that vary in copy number among seemingly normal individuals in humans and a number of model organisms, using bioinformatics or hybridization-based methods. These efforts have uncovered associations between copy number changes and complex diseases in whole-genome association studies, and have identified new genomic disorders. At the genome-wide scale, however, the functional impact of CNV remains poorly studied. Here we review the current catalogs of CNVs, their association with diseases and how they link genotype and phenotype. We describe initial evidence which revealed that genes in CNV regions are expressed at lower and more variable levels than genes mapping elsewhere, and also that CNV not only affects the expression of genes varying in copy number, but also has a global influence on the transcriptome. Further studies are warranted for complete cataloguing and fine mapping of CNVs, as well as to elucidate the different mechanisms by which they influence gene expression.
Abstract:
Epigenetics is defined as the study of all heritable and potentially reversible changes in genome function that do not alter the nucleotide sequence within the DNA. Epigenetic mechanisms such as DNA methylation, histone modification, nucleosome positioning, and microRNAs (miRNAs) are essential to carry out key functions in the regulation of gene expression. Therefore, epigenetic mechanisms are a window to understanding the possible mechanisms involved in the pathogenesis of complex diseases such as autoimmune diseases. It is noteworthy that autoimmune diseases do not have the same epidemiology, pathology, or symptoms but do have a common origin that can be explained by the sharing of immunogenetic mechanisms. Currently, epigenetic research is looking for disruption in one or more epigenetic mechanisms to provide new insights into autoimmune diseases. The identification of cell-specific targets of epigenetic deregulation will provide clinical markers for diagnosis, disease progression, and therapeutic approaches.
Abstract:
Background: Genetic and epigenetic factors interacting with the environment over time are the main causes of complex diseases such as autoimmune diseases (ADs). Among the environmental factors are organic solvents (OSs), which are chemical compounds used routinely in commercial industries. Since controversy exists over whether ADs are caused by OSs, a systematic review and meta-analysis were performed to assess the association between OSs and ADs. Methods and Findings: The systematic search was done in the PubMed, SCOPUS, SciELO and LILACS databases up to February 2012. Any type of study that used accepted classification criteria for ADs and had information about exposure to OSs was selected. Out of a total of 103 articles retrieved, 33 were finally included in the meta-analysis. The final odds ratios (ORs) and 95% confidence intervals (CIs) were obtained with a random-effects model. A sensitivity analysis confirmed that the results were not sensitive to restrictions on the data included. Publication bias was trivial. Exposure to OSs was associated with systemic sclerosis, primary systemic vasculitis and multiple sclerosis individually, and also with all the ADs evaluated taken together as a single trait (OR: 1.54; 95% CI: 1.25-1.92; p-value, 0.001). Conclusion: Exposure to OSs is a risk factor for developing ADs. As a corollary, individuals with non-modifiable risk factors (i.e., familial autoimmunity or carrying genetic factors) should avoid any exposure to OSs in order to avoid increasing their risk of ADs.
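A random-effects pooling of odds ratios of the kind reported above can be sketched as follows. The abstract does not say which estimator was used; the DerSimonian-Laird estimator below is a common choice and is an assumption here, as are the three study results, which are invented purely for illustration and are not data from the review.

```python
import math


def pooled_or_random_effects(studies):
    """Pool (odds_ratio, ci_low, ci_high) triples with 95% CIs; needs at least two studies.

    Assumed estimator: DerSimonian-Laird between-study variance.
    """
    y = [math.log(o) for o, _, _ in studies]
    # Standard error of each log-OR recovered from the width of its 95% CI.
    se = [(math.log(hi) - math.log(lo)) / (2 * 1.96) for _, lo, hi in studies]
    w = [1.0 / s ** 2 for s in se]

    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)  # between-study variance, floored at 0

    w_re = [1.0 / (s ** 2 + tau2) for s in se]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))
    return math.exp(mu), math.exp(mu - 1.96 * se_mu), math.exp(mu + 1.96 * se_mu)


# Hypothetical study results (OR, lower 95% bound, upper 95% bound).
studies = [(1.8, 1.1, 2.9), (1.2, 0.8, 1.8), (2.1, 1.2, 3.7)]
pooled, low, high = pooled_or_random_effects(studies)
```

The between-study variance widens the pooled interval when studies disagree, which is why a random-effects model suits a review that combines different study designs and different diseases.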
Abstract:
The domestic dog offers a unique opportunity to explore the genetic basis of disease, morphology and behaviour. Humans share many diseases with our canine companions, making dogs an ideal model organism for comparative disease genetics. Using newly developed resources, genome-wide association studies in dog breeds are proving to be exceptionally powerful. Towards this aim, veterinarians and geneticists from 12 European countries are collaborating to collect and analyse the DNA from large cohorts of dogs suffering from a range of carefully defined diseases of relevance to human health. This project, named LUPA, has already delivered considerable results. The consortium has collaborated to develop a new high density single nucleotide polymorphism (SNP) array. Mutations for four monogenic diseases have been identified and the information has been utilised to find mutations in human patients. Several complex diseases have been mapped and fine mapping is underway. These findings should ultimately lead to a better understanding of the molecular mechanisms underlying complex diseases in both humans and their best friend.
Abstract:
Linkage and association studies are major analytical tools for finding susceptibility genes for complex diseases. With the availability of large collections of single nucleotide polymorphisms (SNPs) and rapid progress in high-throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. To avoid inflating the type I error rate in genome-wide linkage and association studies, the significance level of each independent linkage and/or association test must be adjusted for multiple testing, and this has led to the suggestion of a genome-wide significance cut-off as low as 5 × 10⁻⁷. Almost no linkage and/or association study can meet such a stringent threshold with the standard statistical methods. New statistics with high power are urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used with both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformation of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea for designing novel test statistics with high power and might open new ways of mapping susceptibility genes for complex diseases.
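The arithmetic behind a cut-off of this size can be shown with a Bonferroni-style correction. The abstract does not state how the threshold was derived; the family-wise level of 0.05 and the count of 100,000 independent tests below are assumptions chosen only because they reproduce the quoted figure.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def family_wise_error(per_test_alpha, n_tests):
    """Chance of at least one false positive across n independent tests under the null."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


# Assumed inputs: family-wise level 0.05 and 100,000 independent tests.
threshold = bonferroni_threshold(0.05, 100_000)
corrected_fwer = family_wise_error(threshold, 100_000)
uncorrected_fwer = family_wise_error(0.05, 100_000)
```

Without the correction a false positive somewhere in the genome is almost certain, while with it the family-wise rate stays just under 0.05; the price is the loss of power that motivates the new test statistics.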
Abstract:
As for other complex diseases, linkage analyses of schizophrenia (SZ) have produced evidence for numerous chromosomal regions, with inconsistent results reported across studies. The presence of locus heterogeneity appears likely and may reduce the power of linkage analyses if homogeneity is assumed. In addition, when multiple heterogeneous datasets are pooled, intersample variation in the proportion of linked families (α) may diminish the power of the pooled sample to detect susceptibility loci, in spite of the larger sample size obtained. We compare the significance of linkage findings obtained using allele-sharing LOD scores (LODexp), which assume homogeneity, and heterogeneity LOD scores (HLOD) in European American and African American NIMH SZ families. We also pool these two samples and evaluate the relative power of the LODexp and two different heterogeneity statistics. One of these (HLOD-P) estimates the heterogeneity parameter α only in aggregate data, while the second (HLOD-S) determines α separately for each sample. In separate and combined data, we show consistently improved performance of HLOD scores over LODexp. Notably, genome-wide significant evidence for linkage is obtained at chromosome 10p in the European American sample using a recessive HLOD score. When the two samples are combined, linkage at the 10p locus also achieves genome-wide significance under HLOD-S, but not HLOD-P. Using HLOD-S, improved evidence for linkage was also obtained for a previously reported region on chromosome 15q. In linkage analyses of complex disease, power may be maximised by routinely modelling locus heterogeneity within individual datasets, even when multiple datasets are combined to form larger samples.
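The difference between estimating α once for the pooled data (HLOD-P) and separately per sample (HLOD-S) can be made concrete with the standard admixture form of the heterogeneity LOD score. This is a sketch under assumptions: the per-family LOD scores are invented, the grid search over α is a simplification, and the study's actual software and models are not described in the abstract.

```python
import math


def hlod(family_lods, alpha):
    """Admixture HLOD at one position: sum over families of log10(alpha*10^LOD + 1 - alpha)."""
    return sum(math.log10(alpha * 10 ** lod + 1.0 - alpha) for lod in family_lods)


def max_hlod(family_lods, steps=100):
    """Maximise the HLOD over alpha on a grid in [0, 1]; returns (alpha_hat, hlod)."""
    grid = [i / steps for i in range(steps + 1)]
    best = max(grid, key=lambda a: hlod(family_lods, a))
    return best, hlod(family_lods, best)


# Hypothetical per-family LOD scores at one locus for two samples.
sample_a = [1.2, -0.4, 0.9, -0.8]   # mostly linked families
sample_b = [-0.6, -0.3, 0.2, -0.5]  # mostly unlinked families

# HLOD-P: a single alpha estimated on the pooled families.
alpha_p, hlod_p = max_hlod(sample_a + sample_b)
# HLOD-S: alpha estimated separately per sample, scores then summed.
hlod_s = max_hlod(sample_a)[1] + max_hlod(sample_b)[1]
```

Because each sample is free to take its own α, the summed separate score can never fall below the pooled one on the same grid, although it spends an extra parameter; this is consistent with the abstract's finding that HLOD-S reached significance at 10p when HLOD-P did not.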
Abstract:
Background/Aims: Statistical analysis of age-at-onset involving family data is particularly complicated because there is a correlation pattern that needs to be modeled and also because some measurements are censored. In this paper, our main purpose was to evaluate the effect of genetic and shared family environmental factors on age-at-onset of three cardiovascular risk factors: hypertension, diabetes and high cholesterol. Methods: The mixed-effects Cox model proposed by Pankratz et al. [2005] was used to analyze the data from 81 families, involving 1,675 individuals from the village of Baependi, in the state of Minas Gerais, Brazil. Results: The analyses showed that the polygenic effect plays a greater role than the shared family environmental effect in explaining the variability of the age-at-onset of hypertension, diabetes and high cholesterol. The model that simultaneously evaluated both effects indicated that there are individuals whose risk of hypertension due to polygenic effects may be 130% higher than the overall average risk for the entire sample. For diabetes and high cholesterol, the risks of some individuals were 115% and 45% higher, respectively, than the overall average risk for the entire population. Conclusions: Results showed evidence of significant polygenic effects, indicating that age-at-onset is a useful trait for gene mapping of the common complex diseases analyzed. In addition, we found that the polygenic random component might absorb the effects of some covariates usually considered in the risk evaluation, such as gender, age and BMI. Copyright (C) 2008 S. Karger AG, Basel