4 resultados para Genetic variant

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of geneticvariants that are predictive of complex diseases. Modern studies, in the formof genome-wide association studies (GWAS) have afforded researchers with the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining this with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing studies (NGS) will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to have the ability to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated to not only be effective at predicting the disease phenotypes, but also doing so efficiently through the use of computational shortcuts. While much of the work was able to be run on high-end desktops, some work was further extended so that it could be implemented on parallel computers helping to assure that they will also scale to the NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm. It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly-optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The genetic and environmental risk factors of vascular cognitive impairment are still largely unknown. This thesis aimed to assess the genetic background of two clinically similar familial small vessel diseases (SVD), CADASIL (Cerebral Autosomal Dominant Arteriopathy with Subcortical Infarcts and Leukoencephalopathy) and Swedish hMID (hereditary multi-infarct dementia of Swedish type). In the first study, selected genetic modifiers of CADASIL were studied in a homogenous Finnish CADASIL population of 134 patients, all carrying the p.Arg133Cys mutation in NOTCH3. Apolipoprotein E (APOE) genotypes, angiotensinogen (AGT) p.Met268Thr polymorphism and eight NOTCH3 polymorphisms were studied, but no associations between any particular genetic variant and first-ever stroke or migraine were seen. In the second study, smoking, statin medication and physical activity were suggested to be the most profound environmental differences among the monozygotic twins with CADASIL. Swedish hMID was for long misdiagnosed as CADASIL. In the third study, the CADASIL diagnosis in the Swedish hMID family was ruled out on the basis of genetic, radiological and pathological findings, and Swedish hMID was suggested to represent a novel SVD. In the fourth study, the gene defect of Swedish hMID was then sought using whole exome sequencing paired with a linkage analysis. The strongest candidate for the pathogenic mutation was a 3’UTR variant in the COL4A1 gene, but further studies are needed to confirm its functionality. This study provided new information about the genetic background of two inherited SVDs. Profound knowledge about the pathogenic mutations causing familial SVD is also important for correct diagnosis and treatment options.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Protein engineering aims to improve the properties of enzymes and affinity reagents by genetic changes. Typical engineered properties are affinity, specificity, stability, expression, and solubility. Because proteins are complex biomolecules, the effects of specific genetic changes are seldom predictable. Consequently, a popular strategy in protein engineering is to create a library of genetic variants of the target molecule, and render the population in a selection process to sort the variants by the desired property. This technique, called directed evolution, is a central tool for trimming protein-based products used in a wide range of applications from laundry detergents to anti-cancer drugs. New methods are continuously needed to generate larger gene repertoires and compatible selection platforms to shorten the development timeline for new biochemicals. In the first study of this thesis, primer extension mutagenesis was revisited to establish higher quality gene variant libraries in Escherichia coli cells. In the second study, recombination was explored as a method to expand the number of screenable enzyme variants. A selection platform was developed to improve antigen binding fragment (Fab) display on filamentous phages in the third article and, in the fourth study, novel design concepts were tested by two differentially randomized recombinant antibody libraries. Finally, in the last study, the performance of the same antibody repertoire was compared in phage display selections as a genetic fusion to different phage capsid proteins and in different antibody formats, Fab vs. single chain variable fragment (ScFv), in order to find out the most suitable display platform for the library at hand. As a result of the studies, a novel gene library construction method, termed selective rolling circle amplification (sRCA), was developed. The method increases mutagenesis frequency close to 100% in the final library and the number of transformants over 100-fold compared to traditional primer extension mutagenesis. In the second study, Cre/loxP recombination was found to be an appropriate tool to resolve the DNA concatemer resulting from error-prone RCA (epRCA) mutagenesis into monomeric circular DNA units for higher efficiency transformation into E. coli. Library selections against antigens of various size in the fourth study demonstrated that diversity placed closer to the antigen binding site of antibodies supports generation of antibodies against haptens and peptides, whereas diversity at more peripheral locations is better suited for targeting proteins. The conclusion from a comparison of the display formats was that truncated capsid protein three (p3Δ) of filamentous phage was superior to the full-length p3 and protein nine (p9) in obtaining a high number of uniquely specific clones. Especially for digoxigenin, a difficult hapten target, the antibody repertoire as ScFv-p3Δ provided the clones with the highest affinity for binding. This thesis on the construction, design, and selection of gene variant libraries contributes to the practical know-how in directed evolution and contains useful information for scientists in the field to support their undertakings.