864 results for Textual Genetics
Abstract:
Cardiovascular diseases (CVD) are major contributors to morbidity and mortality worldwide. Several interacting environmental, biochemical, and genetic risk factors can increase disease susceptibility. While some of the genes involved in the etiology of CVD are known, many are yet to be discovered. During the last few decades, scientists have searched for these genes with genome-wide linkage and association methods, and with more targeted candidate gene studies. This thesis investigates variation within the upstream transcription factor 1 (USF1) gene locus in relation to CVD risk factors, atherosclerosis, and incidence and prevalence of CVD. This candidate gene was first identified in Finnish families ascertained for familial combined hyperlipidemia, a common dyslipidemia predisposing to coronary heart disease. The gene encodes a ubiquitously expressed transcription factor that regulates the expression of several genes involved in lipid and glucose metabolism, inflammation, and endothelial function. First, we examined associations between USF1 variants and several CVD risk factors, such as lipid phenotypes, body composition measures, and metabolic syndrome, in two prospective population cohorts. Our data suggested that USF1 contributes to these CVD risk factors at the population level. Notably, the associations with quantitative measurements were mostly detected among study subjects with CVD or metabolic syndrome, suggesting complex interactions between USF1 effects and the pathophysiological state of an individual. Second, we investigated how variation at the USF1 locus contributes to atherosclerotic lesions of the coronary arteries and abdominal aorta. For this, we used two study samples of middle-aged men with detailed measurements of atherosclerosis obtained at autopsy. USF1 variation was significantly associated with the areas of several types of lesions, especially with calcification of the arteries.
Next, we tested what effect the USF1 risk variants have on sudden cardiac death and incidence of CVD. The atherosclerosis-associated risk variant increased the risk of sudden cardiac death in the same study subjects. Furthermore, USF1 alleles were associated with incidence of CVD in the Finnish population follow-up cohorts. These associations were especially prominent among women, suggesting a sex-specific effect, which has also been detected in subsequent studies. Finally, as some of the low-yield DNA samples of the Finnish follow-up study cohort needed to be whole-genome amplified (WGA) prior to genotyping, we evaluated whether the produced WGA genotypes were of good quality. Although the samples giving genotype discrepancies could not be detected before genotyping with standard laboratory quality control methods, our results suggested that enhanced quality control at the time of genotyping could identify such samples. In addition, combining two WGA reactions into one pooled DNA sample for genotyping markedly reduced the number of discrepancies and of samples showing them. In conclusion, USF1 seems to have a role in the etiology of CVD. Additional studies are warranted to identify functional variants and to study interactions between USF1 and other genetic or environmental factors. This USF1 study, and other studies in which some samples have a low DNA yield, can benefit from whole-genome amplification of the low-yield samples prior to genotyping. Careful quality control procedures are, however, needed in WGA genotyping.
Abstract:
Digital Image
Abstract:
Puumala virus (PUUV) is the causative agent of nephropathia epidemica (NE), a mild form of hemorrhagic fever with renal syndrome. Finland has the highest documented incidence of NE, with around 1000 cases diagnosed annually. PUUV is also found in other Scandinavian countries, Central Europe and the European part of Russia. PUUV belongs to the genus Hantavirus in the family Bunyaviridae. Hantaviruses are rodent-borne viruses, each carried by a specific host that is persistently and asymptomatically infected by the virus. PUUV is carried by bank voles (Myodes glareolus, previously known as Clethrionomys glareolus). Hantaviruses have co-evolved with their carrier rodents for millions of years, and these host animals are the evolutionary scene of hantaviruses. In this study, PUUV sequences were recovered from bank voles captured in Denmark and Russian Karelia to study the evolution of PUUV in Scandinavia. Phylogenetic analysis of these strains showed a geographical clustering of genetic variants following the presumed migration pattern of bank voles during the recolonization of Scandinavia after the last ice age approximately 10 000 years ago. The currently known PUUV genome sequences were subjected to in-depth phylogenetic analyses, and the results showed that genetic drift seems to be the major mechanism of PUUV evolution. In general, PUUV seems to evolve quite slowly, following a molecular clock. We also found evidence for recombination in the evolution of some genetic lineages of PUUV. Viral microevolution was studied in controlled virus transmission in colonized bank voles, and changes in quasispecies dynamics were recorded as the virus was transmitted from one animal to another. We witnessed PUUV evolution in vivo, as one synonymous mutation became repeatedly fixed in the viral genome during the experiment. The detailed knowledge of PUUV diversity was used to establish new sensitive and specific detection methods for this virus.
Direct viral invasion of the hypophysis was demonstrated for the first time in a lethal case of NE. PUUV was detected by immunohistochemistry, in situ hybridization and RT-nested-PCR of the autopsy tissue samples.
Abstract:
As for other complex diseases, linkage analyses of schizophrenia (SZ) have produced evidence for numerous chromosomal regions, with inconsistent results reported across studies. The presence of locus heterogeneity appears likely and may reduce the power of linkage analyses if homogeneity is assumed. In addition, when multiple heterogeneous datasets are pooled, inter-sample variation in the proportion of linked families (alpha) may diminish the power of the pooled sample to detect susceptibility loci, in spite of the larger sample size obtained. We compare the significance of linkage findings obtained using allele-sharing LOD scores (LOD(exp)), which assume homogeneity, and heterogeneity LOD scores (HLOD) in European American and African American NIMH SZ families. We also pool these two samples and evaluate the relative power of the LOD(exp) and two different heterogeneity statistics. One of these (HLOD-P) estimates the heterogeneity parameter alpha only in aggregate data, while the second (HLOD-S) determines alpha separately for each sample. In separate and combined data, we show consistently improved performance of HLOD scores over LOD(exp). Notably, genome-wide significant evidence for linkage is obtained at chromosome 10p in the European American sample using a recessive HLOD score. When the two samples are combined, linkage at the 10p locus also achieves genome-wide significance under HLOD-S, but not HLOD-P. Using HLOD-S, improved evidence for linkage was also obtained for a previously reported region on chromosome 15q. In linkage analyses of complex disease, power may be maximised by routinely modelling locus heterogeneity within individual datasets, even when multiple datasets are combined to form larger samples.
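The difference between the pooled and sample-specific heterogeneity statistics can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard admixture form of the heterogeneity LOD score, HLOD(alpha) = sum over families of log10(alpha * 10^LOD_i + 1 - alpha), evaluated at a single fixed position and maximised over alpha by grid search. The function names and the reading of HLOD-S as a sum of per-sample maxima are assumptions made for illustration.

```python
import math


def hlod(family_lods, steps=1000):
    """Maximise the admixture HLOD over a grid of alpha values in [0, 1].

    family_lods: per-family LOD scores at one fixed position (assumed input).
    Returns (best HLOD, alpha at which it is attained).
    """
    best_score, best_alpha = 0.0, 0.0  # alpha = 0 always gives HLOD = 0
    for k in range(steps + 1):
        alpha = k / steps
        score = sum(
            math.log10(alpha * 10.0 ** lod + (1.0 - alpha))
            for lod in family_lods
        )
        if score > best_score:
            best_score, best_alpha = score, alpha
    return best_score, best_alpha


def hlod_pooled(samples, steps=1000):
    """HLOD-P style: one alpha estimated on the aggregate of all samples."""
    pooled = [lod for sample in samples for lod in sample]
    return hlod(pooled, steps)[0]


def hlod_separate(samples, steps=1000):
    """HLOD-S style (assumed reading): alpha estimated per sample, scores summed."""
    return sum(hlod(sample, steps)[0] for sample in samples)
```

Under these assumptions the sample-specific score can never fall below the pooled score on the same grid, because each sample is free to take its own alpha. This mirrors the abstract's observation that a signal may reach significance under HLOD-S but not under HLOD-P, although a fair comparison of significance must also account for the extra free parameters.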
Abstract:
Nested association mapping (NAM) offers power to dissect complex, quantitative traits. This study made use of a recently developed sorghum backcross (BC)-NAM population to dissect the genetic architecture of flowering time in sorghum; to compare the QTL identified with other genomic regions identified in previous sorghum and maize flowering time studies; and to highlight the implications of our findings for plant breeding. A subset of the sorghum BC-NAM population consisting of over 1,300 individuals from 24 families was evaluated for flowering time across multiple environments. Two QTL analysis methodologies were used to identify 40 QTL with predominantly small, additive effects on flowering time; 24 of these co-located with previously identified QTL for flowering time in sorghum and 16 were novel in sorghum. Significant synteny was also detected with the QTL for flowering time detected in a comparable NAM resource recently developed for maize (Zea mays) by Buckler et al. (Science 325:714-718, 2009). The use of the sorghum BC-NAM population allowed us to catalogue allelic variants at a maximal number of QTL and to understand their contribution to the flowering time phenotype and their distribution across diverse germplasm. The power of the sorghum BC-NAM population is demonstrated not only by correspondence with QTL previously identified in sorghum, but also by correspondence with QTL in different taxa, specifically maize in this case. The unification across taxa of the candidate genes influencing complex traits, such as flowering time, can further facilitate the detailed dissection of the genetic control and causal genes.
Abstract:
Wood is an important biological resource which contributes to nutrient and hydrology cycles through ecosystems, and provides structural support at the plant level. Thousands of genes are involved in wood development, yet their effects on phenotype are not well understood. We have exploited the low genomic linkage disequilibrium (LD) and abundant phenotypic variation of forest trees to explore allelic diversity underlying wood traits in an association study. Candidate gene allelic diversity was modelled against quantitative variation to identify SNPs influencing wood properties, growth and disease resistance across three populations of Corymbia citriodora subsp. variegata, a forest tree of eastern Australia. Nine single nucleotide polymorphism (SNP) associations from six genes were identified in a discovery population (833 individuals). Associations were subsequently tested in two smaller populations (130-160 individuals), validating our findings in three cases for actin 7 (ACT7) and COP1 interacting protein 7 (CIP7). The results imply a functional role for these genes in mediating wood chemical composition and growth, respectively. A flip in the effect of ACT7 on pulp yield between populations suggests gene by environment interactions are at play. Existing evidence of gene function lends strength to the observed associations, and in the case of CIP7 supports a role in cortical photosynthesis.
Abstract:
Since the first investigation 25 years ago, the application of genetic tools to address ecological and evolutionary questions in elasmobranch studies has greatly expanded. Major developments in genetic theory as well as in the availability, cost effectiveness and resolution of genetic markers were instrumental for particularly rapid progress over the last 10 years. Genetic studies of elasmobranchs are of direct importance and have application to fisheries management and conservation issues such as the definition of management units and identification of species from fins. In the future, increased application of the most recent and emerging technologies will enable accelerated genetic data production and the development of new markers at reduced costs, paving the way for a paradigm shift from gene to genome-scale research, and more focus on adaptive rather than just neutral variation. Current literature is reviewed in six fields of elasmobranch molecular genetics relevant to fisheries and conservation management (species identification, phylogeography, philopatry, genetic effective population size, molecular evolutionary rate and emerging methods). Where possible, examples from the Indo-Pacific region, which has been underrepresented in previous reviews, are emphasized within a global perspective. © 2012 The Authors. Journal of Fish Biology © 2012 The Fisheries Society of the British Isles.
Abstract:
Reproduction records from 2137 cows first mated at 2 years of age and recorded through to 8.5 years of age were used to study the genetics of early and lifetime reproductive performance from two genotypes (1020 Brahman and 1117 Tropical Composite) in tropical Australian production systems. Regular ultrasound scanning of the reproductive tract, coupled with full recording of mating, calving and weaning histories, allowed a comprehensive evaluation of a range of reproductive traits. Results showed that component traits of early reproductive performance had moderate to high heritabilities, especially in Brahmans. The heritability of lactation anoestrous interval in 3-year-old cows was 0.51 +/- 0.18 and 0.26 +/- 0.11 for Brahman and Tropical Composite, respectively. Heritabilities of binary reproductive output traits (conception rate, pregnancy rate, calving rate and weaning rate) from first and second matings were generally moderate to high on the underlying scale. Estimates ranged from 0.15 to 0.69 in Brahman and 0.15 to 0.34 in Tropical Composite, but were considerably lower when expressed on the observed scale, particularly for those traits with high mean levels. Heritabilities of lifetime reproduction traits were low, with estimates of 0.11 +/- 0.06 and 0.07 +/- 0.06 for lifetime annual weaning rate in Brahman and Tropical Composite, respectively. Significant differences in mean reproductive performance were observed between the two genotypes, especially for traits associated with anoestrus in first-lactation cows. Genetic correlations between early-in-life reproductive measures and lifetime reproduction traits were moderate to high. Genetic correlations between lactation anoestrous interval and lifetime annual weaning rate were -0.62 +/- 0.24 in Brahman and -0.87 +/- 0.32 in Tropical Composite.
The results emphasise the substantial opportunity that exists to genetically improve weaning rates in tropical beef cattle breeds by focusing recording and selection on early-in-life female reproduction traits, particularly in Brahman for traits associated with lactation anoestrus.
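The gap between underlying-scale and observed-scale heritabilities for binary traits can be illustrated with the classical threshold-model approximation of Dempster and Lerner, h2_observed = h2_underlying * z^2 / (p * (1 - p)), where p is the mean rate of the trait and z is the standard normal density at the corresponding threshold. The abstract does not say which transformation the authors used, so this is only a sketch under that assumption; the function name and the example rates are hypothetical.

```python
from statistics import NormalDist


def observed_scale_h2(h2_underlying, mean_rate):
    """Convert an underlying-scale heritability to the observed (0/1) scale.

    Uses the Dempster-Lerner approximation for a threshold trait:
        h2_obs = h2_underlying * z**2 / (p * (1 - p))
    where p is the mean rate and z is the standard normal density
    at the threshold that cuts off a proportion p of the distribution.
    """
    if not 0.0 < mean_rate < 1.0:
        raise ValueError("mean_rate must lie strictly between 0 and 1")
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - mean_rate)
    z = std_normal.pdf(threshold)
    return h2_underlying * z * z / (mean_rate * (1.0 - mean_rate))
```

For instance, calling `observed_scale_h2(0.69, 0.5)` and `observed_scale_h2(0.69, 0.95)` with a hypothetical intermediate and a hypothetical high mean rate shows the observed-scale value shrinking as the mean rate moves toward 1, which is consistent with the pattern the abstract reports for traits with high mean levels.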
Abstract:
Marine species generally have large population sizes, continuous distributions and high dispersal capacity. Despite this, they are often subdivided into separate populations, which are the basic units of fisheries management. For example, populations of some fisheries species across the deep water of the Timor Trench are genetically different, implying minimal movement and interbreeding. When connectivity is higher than in the Timor Trench example, but not so high that the populations become one, connectivity between populations is crinkled. Crinkled connectivity occurs when migration is above the threshold required to link populations genetically, but below the threshold for demographic links. In future, genetic estimates of connectivity over crinkled links could be uniquely combined with other data, such as estimates of population size and tagging and tracking data, to quantify demographic connectedness between these types of populations. Elasmobranch species may be ideal targets for this research because connectivity between populations is more likely to be crinkled than for finfish species. Fisheries stock-assessment models could be strengthened with estimates of connectivity to improve the strategic and sustainable harvesting of biological resources.
Abstract:
Despite international protection of white sharks Carcharodon carcharias, important conservation parameters such as abundance, population structure and genetic diversity are largely unknown. The tissue of 97 predominantly juvenile white sharks sampled from spatially distant eastern and southwestern Australian coastlines was sequenced for the mitochondrial DNA (mtDNA) control region and genotyped with 6 nuclear-encoded microsatellite loci. MtDNA population structure was found between the eastern and southwestern coasts (F_ST = 0.142, p < 0.0001), implying female reproductive philopatry. This concurs with recent satellite and acoustic tracking findings which suggest the sustained presence of discrete east coast nursery areas. Furthermore, population subdivision was found between the same regions with biparentally inherited microsatellite markers (F_ST = 0.009, p < 0.05), suggesting that males may also exhibit some degree of reproductive philopatry. Five sharks captured along the east coast had mtDNA haplotypes that resembled western Indian Ocean sharks more closely than Australian/New Zealand sharks, suggesting that transoceanic dispersal, or migration resulting in breeding, may occur sporadically. Our most robust estimate of contemporary genetic effective population size was low and close to thresholds at which adaptive potential may be lost. For a variety of reasons, these contemporary estimates were at least 1, possibly 2, orders of magnitude below our historical effective size estimates. Population decline could expose these genetically isolated populations to detrimental genetic effects. Regional Australian white shark conservation management units should be implemented until genetic population structure, size and diversity can be investigated in more detail.
Abstract:
Genetics, the science of heredity and variation in living organisms, has a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of biological sciences such as evolution and cell functioning. The field of genetics is currently undergoing rapid development because of recent advances in the technologies by which molecular data can be obtained from living organisms. To extract the most information from such data, the analyses need to be carried out using statistical models that are tailored to take account of the particular genetic processes. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on modeling the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out by using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. For such a recent history, the recombination process is the major genetic force that shapes the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between adjacent markers are known. The posterior distribution of the unobserved history of the individuals is studied conditionally on the observed marker data by using a Markov chain Monte Carlo (MCMC) algorithm. The example analyses consider estimation of the population structure, relatedness structure (both at the level of whole genomes and at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm to create an initial state for the MCMC algorithm is given. Furthermore, the thesis includes an extension of the model for the recent genetic history to situations where a quantitative phenotype has also been measured from the contemporary individuals.
In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditionally on the observed genetic and phenotypic data. In addition, the thesis contains an extension of a widely-used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together, and the allele frequencies of each pool are determined in a single genotyping.
Abstract:
In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering systems are an important research problem because, as the amount of natural language text in digital format keeps growing, the need for novel methods for pinpointing important knowledge in vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context (plain words, part-of-speech tags, punctuation marks and capitalization patterns) can be used in the answer extraction module of a question answering system. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns.
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and is already provided in the publicly available data.
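An edit-distance-based similarity of the kind mentioned above could look like the following. This is a minimal sketch and not the thesis's actual metric: it assumes unit-cost Levenshtein distance computed over token sequences (words or part-of-speech tags), normalised by the longer sequence length. The function names, the costs and the normalisation are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance between two sequences.

    Works on strings or on lists of tokens (e.g. words or POS tags).
    Uses a rolling row, so memory is proportional to len(b).
    """
    prev = list(range(len(b) + 1))  # distance from empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting the first i items of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x by y (or match)
            ))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalised similarity in [0, 1]; 1.0 means identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

A pairwise similarity like this is the kind of input a hierarchical clustering step needs in order to group answer contexts before aligned patterns are generalised from each cluster. For example, the hypothetical token sequences `["NNP", "was", "born", "in", "CD"]` and `["NNP", "was", "born", "on", "CD"]` differ by a single substitution.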