934 resultados para Whole-Genome Association


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Lichens are symbiotic organisms, which consist of the fungal partner and the photosynthetic partner, which can be either an alga or a cyanobacterium. In some lichen species the symbiosis is tripartite, where the relationship includes both an alga and a cyanobacterium alongside the primary symbiont, fungus. The lichen symbiosis is an evolutionarily old adaptation to life on land and many extant fungal species have evolved from lichenised ancestors. Lichens inhabit a wide range of habitats and are capable of living in harsh environments and on nutrient poor substrates, such as bare rocks, often enduring frequent cycles of drying and wetting. Most lichen species are desiccation tolerant, and they can survive long periods of dehydration, but can rapidly resume photosynthesis upon rehydration. The molecular mechanisms behind lichen desiccation tolerance are still largely uncharacterised and little information is available for any lichen species at the genomic or transcriptomic level. The emergence of the high-throughput next generation sequencing (NGS) technologies and the subsequent decrease in the cost of sequencing new genomes and transcriptomes has enabled non-model organism research on the whole genome level. In this doctoral work the transcriptome and genome of the grey reindeer lichen, Cladonia rangiferina, were sequenced, de novo assembled and characterised using NGS and traditional expressed sequence tag (EST) technologies. RNA extraction methods were optimised to improve the yield and quality of RNA extracted from lichen tissue. The effects of rehydration and desiccation on C. rangiferina gene expression on whole transcriptome level were studied and the most differentially expressed genes were identified. The secondary metabolites present in C. rangiferina decreased the quality – integrity, optical characteristics and utility for sensitive molecular biological applications – of the extracted RNA requiring an optimised RNA extraction method for isolating sufficient quantities of high-quality RNA from lichen tissue in a time- and cost-efficient manner. The de novo assembly of the transcriptome of C. rangiferina was used to produce a set of contiguous unigene sequences that were used to investigate the biological functions and pathways active in a hydrated lichen thallus. The de novo assembly of the genome yielded an assembly containing mostly genes derived from the fungal partner. The assembly was of sufficient quality, in size similar to other lichen-forming fungal genomes and included most of the core eukaryotic genes. Differences in gene expression were detected in all studied stages of desiccation and rehydration, but the largest changes occurred during the early stages of rehydration. The most differentially expressed genes did not have any annotations, making them potentially lichen-specific genes, but several genes known to participate in environmental stress tolerance in other organisms were also identified as differentially expressed.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Genome sequence varies in numerous ways among individuals although the gross architecture is fixed for all humans. Retrotransposons create one of the most abundant structural variants in the human genome and are divided in many families, with certain members in some families, e.g., L1, Alu, SVA, and HERV-K, remaining active for transposition. Along with other types of genomic variants, retrotransponson-derived variants contribute to the whole spectrum of genome variants in humans. With the advancement of sequencing techniques, many human genomes are being sequenced at the individual level, fueling the comparative research on these variants among individuals. In this thesis, the evolution and functional impact of structural variations is examined primarily focusing on retrotransposons in the context of human evolution. The thesis comprises of three different studies on the topics that are presented in three data chapters. First, the recent evolution of all human specific AluYb members, representing the second most active subfamily of Alus, was tracked to identify their source/master copy using a novel approach. All human-specific AluYb elements from the reference genome were extracted, aligned with one another to construct clusters of similar copies and each cluster was analyzed to generate the evolutionary relationship between the members of the cluster. The approach resulted in identification of one major driver copy of all human specific Yb8 and the source copy of the Yb9 lineage. Three new subfamilies within the AluYb family – Yb8a1, Yb10 and Yb11 were also identified, with Yb11 being the youngest and most polymorphic. Second, an attempt to construct a relation between transposable elements (TEs) and tandem repeats (TRs) was made at a genome-wide scale for the first time. Upon sequence comparison, positional cross-checking and other relevant analyses, it was observed that over 20% of all TRs are derived from TEs. This result established the first connection between these two types of repetitive elements, and extends our appreciation for the impact of TEs on genomes. Furthermore, only 6% of these TE-derived TRs follow the already postulated initiation and expansion mechanisms, suggesting that the others are likely to follow a yet-unidentified mechanism. Third, by taking a combination of multiple computational approaches involving all types of genetic variations published so far including transposable elements, the first whole genome sequence of the most recent common ancestor of all modern human populations that diverged into different populations around 125,000-100,000 years ago was constructed. The study shows that the current reference genome sequence is 8.89 million base pairs larger than our common ancestor’s genome, contributed by a whole spectrum of genetic mechanisms. The use of this ancestral reference genome to facilitate the analysis of personal genomes was demonstrated using an example genome and more insightful recent evolutionary analyses involving the Neanderthal genome. The three data chapters presented in this thesis conclude that the tandem repeats and transposable elements are not two entirely distinctly isolated elements as over 20% TRs are actually derived from TEs. Certain subfamilies of TEs themselves are still evolving with the generation of newer subfamilies. The evolutionary analyses of all TEs along with other genomic variants helped to construct the genome sequence of the most recent common ancestor to all modern human populations which provides a better alternative to human reference genome and can be a useful resource for the study of personal genomics, population genetics, human and primate evolution.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Les facteurs de transcription sont des protéines spécialisées qui jouent un rôle important dans différents processus biologiques tel que la différenciation, le cycle cellulaire et la tumorigenèse. Ils régulent la transcription des gènes en se fixant sur des séquences d’ADN spécifiques (éléments cis-régulateurs). L’identification de ces éléments est une étape cruciale dans la compréhension des réseaux de régulation des gènes. Avec l’avènement des technologies de séquençage à haut débit, l’identification de tout les éléments fonctionnels dans les génomes, incluant gènes et éléments cis-régulateurs a connu une avancée considérable. Alors qu’on est arrivé à estimer le nombre de gènes chez différentes espèces, l’information sur les éléments qui contrôlent et orchestrent la régulation de ces gènes est encore mal définie. Grace aux techniques de ChIP-chip et de ChIP-séquençage il est possible d’identifier toutes les régions du génome qui sont liées par un facteur de transcription d’intérêt. Plusieurs approches computationnelles ont été développées pour prédire les sites fixés par les facteurs de transcription. Ces approches sont classées en deux catégories principales: les algorithmes énumératifs et probabilistes. Toutefois, plusieurs études ont montré que ces approches génèrent des taux élevés de faux négatifs et de faux positifs ce qui rend difficile l’interprétation des résultats et par conséquent leur validation expérimentale. Dans cette thèse, nous avons ciblé deux objectifs. Le premier objectif a été de développer une nouvelle approche pour la découverte des sites de fixation des facteurs de transcription à l’ADN (SAMD-ChIP) adaptée aux données de ChIP-chip et de ChIP-séquençage. Notre approche implémente un algorithme hybride qui combine les deux stratégies énumérative et probabiliste, afin d’exploiter les performances de chacune d’entre elles. Notre approche a montré ses performances, comparée aux outils de découvertes de motifs existants sur des jeux de données simulées et des jeux de données de ChIP-chip et de ChIP-séquençage. SAMD-ChIP présente aussi l’avantage d’exploiter les propriétés de distributions des sites liés par les facteurs de transcription autour du centre des régions liées afin de limiter la prédiction aux motifs qui sont enrichis dans une fenêtre de longueur fixe autour du centre de ces régions. Les facteurs de transcription agissent rarement seuls. Ils forment souvent des complexes pour interagir avec l’ADN pour réguler leurs gènes cibles. Ces interactions impliquent des facteurs de transcription dont les sites de fixation à l’ADN sont localisés proches les uns des autres ou bien médier par des boucles de chromatine. Notre deuxième objectif a été d’exploiter la proximité spatiale des sites liés par les facteurs de transcription dans les régions de ChIP-chip et de ChIP-séquençage pour développer une approche pour la prédiction des motifs composites (motifs composés par deux sites et séparés par un espacement de taille fixe). Nous avons testé ce module pour prédire la co-localisation entre les deux demi-sites ERE qui forment le site ERE, lié par le récepteur des œstrogènes ERα. Ce module a été incorporé à notre outil de découverte de motifs SAMD-ChIP.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Most speciation events probably occur gradually, without complete and immediate reproductive isolation, but the full extent of gene flow between diverging species has rarely been characterized on a genome-wide scale. Documenting the extent and timing of admixture between diverging species can clarify the role of geographic isolation in speciation. Here we use new methodology to quantify admixture at different stages of divergence in Heliconius butterflies, based on whole-genome sequences of 31 individuals. Comparisons between sympatric and allopatric populations of H. melpomene, H. cydno, and H. timareta revealed a genome-wide trend of increased shared variation in sympatry, indicative of pervasive interspecific gene flow. Up to 40% of 100-kb genomic windows clustered by geography rather than by species, demonstrating that a very substantial fraction of the genome has been shared between sympatric species. Analyses of genetic variation shared over different time intervals suggested that admixture between these species has continued since early in speciation. Alleles shared between species during recent time intervals displayed higher levels of linkage disequilibrium than those shared over longer time intervals, suggesting that this admixture took place at multiple points during divergence and is probably ongoing. The signal of admixture was significantly reduced around loci controlling divergent wing patterns, as well as throughout the Z chromosome, consistent with strong selection for Müllerian mimicry and with known Z-linked hybrid incompatibility. Overall these results show that species divergence can occur in the face of persistent and genome-wide admixture over long periods of time.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Mean platelet volume (MPV) and platelet count (PLT) are highly heritable and tightly regulated traits. We performed a genome-wide association study for MPV and identified one SNP, rs342293, as having highly significant and reproducible association with MPV (per-G allele effect 0.016 +/- 0.001 log fL; P < 1.08 x 10(-24)) and PLT (per-G effect -4.55 +/- 0.80 10(9)/L; P < 7.19 x 10(-8)) in 8586 healthy subjects. Whole-genome expression analysis in the 1-MB region showed a significant association with platelet transcript levels for PIK3CG (n = 35; P = .047). The G allele at rs342293 was also associated with decreased binding of annexin V to platelets activated with collagen-related peptide (n = 84; P = .003). The region 7q22.3 identifies the first QTL influencing platelet volume, counts, and function in healthy subjects. Notably, the association signal maps to a chromosome region implicated in myeloid malignancies, indicating this site as an important regulatory site for hematopoiesis. The identification of loci regulating MPV by this and other studies will increase our insight in the processes of megakaryopoiesis and proplatelet formation, and it may aid the identification of genes that are somatically mutated in essential thrombocytosis. (Blood. 2009; 113: 3831-3837)

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background. The anaerobic spirochaete Brachyspira pilosicoli causes enteric disease in avian, porcine and human hosts, amongst others. To date, the only available genome sequence of B. pilosicoli is that of strain 95/1000, a porcine isolate. In the first intra-species genome comparison within the Brachyspira genus, we report the whole genome sequence of B. pilosicoli B2904, an avian isolate, the incomplete genome sequence of B. pilosicoli WesB, a human isolate, and the comparisons with B. pilosicoli 95/1000. We also draw on incomplete genome sequences from three other Brachyspira species. Finally we report the first application of the high-throughput Biolog phenotype screening tool on the B. pilosicoli strains for detailed comparisons between genotype and phenotype. Results. Feature and sequence genome comparisons revealed a high degree of similarity between the three B. pilosicoli strains, although the genomes of B2904 and WesB were larger than that of 95/1000 (~2,765, 2.890 and 2.596 Mb, respectively). Genome rearrangements were observed which correlated largely with the positions of mobile genetic elements. Through comparison of the B2904 and WesB genomes with the 95/1000 genome, features that we propose are non-essential due to their absence from 95/1000 include a peptidase, glycine reductase complex components and transposases. Novel bacteriophages were detected in the newly-sequenced genomes, which appeared to have involvement in intra- and inter-species horizontal gene transfer. Phenotypic differences predicted from genome analysis, such as the lack of genes for glucuronate catabolism in 95/1000, were confirmed by phenotyping. Conclusions. The availability of multiple B. pilosicoli genome sequences has allowed us to demonstrate the substantial genomic variation that exists between these strains, and provides an insight into genetic events that are shaping the species. In addition, phenotype screening allowed determination of how genotypic differences translated to phenotype. Further application of such comparisons will improve understanding of the metabolic capabilities of Brachyspira species.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Numerous CCT domain genes are known to control flowering in plants. They belong to the CONSTANS-like (COL) and PREUDORESPONSE REGULATOR (PRR) gene families, which in addition to a CCT domain possess B-box or response-regulator domains, respectively. Ghd7 is the most recently identified COL gene to have a proven role in the control of flowering time in the Poaceae. However, as it lacks B-box domains, its inclusion within the COL gene family, technically, is incorrect. Here, we show Ghd7 belongs to a larger family of previously uncharacterized Poaceae genes which possess just a single CCT domain, termed here CCT MOTIF FAMILY (CMF) genes. We molecularly describe the CMF (and related COL and PRR) gene families in four sequenced Poaceae species, as well as in the draft genome assembly of barley (Hordeum vulgare). Genetic mapping of the ten barley CMF genes identified, as well as twelve previously unmapped HvCOL and HvPRR genes, finds the majority map to colinear positions relative to their Poaceae orthologues. Combined inter-/intra-species comparative and phylogenetic analysis of CMF, COL and PRR gene families indicates they evolved prior to the monocot/dicot divergence ~200 mya, with Poaceae CMF evolution described as the interplay between whole genome duplication in the ancestral cereal, and subsequent clade-specific mutation, deletion and duplication events. Given the proven role of CMF genes in the modulation of cereals flowering, the molecular, phylogenetic and comparative analysis of the Poaceae CMF, COL and PRR gene families presented here provides the foundation from which functional investigation can be undertaken.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The past years have shown an enormous advancement in sequencing and array-based technologies, producing supplementary or alternative views of the genome stored in various formats and databases. Their sheer volume and different data scope pose a challenge to jointly visualize and integrate diverse data types. We present AmalgamScope a new interactive software tool focusing on assisting scientists with the annotation of the human genome and particularly the integration of the annotation files from multiple data types, using gene identifiers and genomic coordinates. Supported platforms include next-generation sequencing and microarray technologies. The available features of AmalgamScope range from the annotation of diverse data types across the human genome to integration of the data based on the annotational information and visualization of the merged files within chromosomal regions or the whole genome. Additionally, users can define custom transcriptome library files for any species and use the file exchanging distant server options of the tool.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: The differential susceptibly hypothesis suggests that certain genetic variants moderate the effects of both negative and positive environments on mental health and may therefore be important predictors of response to psychological treatments. Nevertheless, the identification of such variants has so far been limited to preselected candidate genes. In this study we extended the differential susceptibility hypothesis from a candidate gene to a genome-wide approach to test whether a polygenic score of environmental sensitivity predicted response to Cognitive Behavioural Therapy (CBT) in children with anxiety disorders. Methods: We identified variants associated with environmental sensitivity using a novel method in which within-pair variability in emotional problems in 1026 monozygotic (MZ) twin pairs was examined as a function of the pairs’ genotype. We created a polygenic score of environmental sensitivity based on the whole-genome findings and tested the score as a moderator of parenting on emotional problems in 1,406 children and response to individual, group and brief parent-led CBT in 973 children with anxiety disorders. Results: The polygenic score significantly moderated the effects of parenting on emotional problems and the effects of treatment. Individuals with a high score responded significantly better to individual CBT than group CBT or brief parent-led CBT (remission rates: 70.9%, 55.5% and 41.6% respectively). Conclusions: Pending successful replication, our results should be considered exploratory. Nevertheless, if replicated, they suggest that individuals with the greatest environmental sensitivity may be more likely to develop emotional problems in adverse environments, but also benefit more from the most intensive types of treatment.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision.  Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes.  The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The water buffalo is vital to the lives of small farmers and to the economy of many countries worldwide. Not only are they draught animals, but they are also a source of meat, horns, skin and particularly the rich and precious milk that may be converted to creams, butter, yogurt and many cheeses. Genome analysis of water buffalo has advanced significantly in recent years. This review focuses on currently available genome resources in water buffalo in terms of cytogenetic characterization, whole genome mapping and next generation sequencing. No doubt, these resources indicate that genome science comes of age in the species and will provide knowledge and technologies to help optimize production potential, reproduction efficiency, product quality, nutritional value and resistance to diseases. As water buffalo and domestic cattle, both members of the Bovidae family, are closely related, the vast amount of cattle genetic/genomic resources might serve as shortcuts for the buffalo community to further advance genome science and biotechnologies in the species.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Background: New challenges are rising in the animal protein market, and one of the main world challenges is to produce more in shorter time, with better quality and in a sustainable way. Brazil is the largest beef exporter in volume hence the factors affecting the beef meat chain are of major concern in countrýs economy. An emerging class of biotechnological approaches, the molecular markers, is bringing new perspectives to face these challenges, particularly after the publication of the first complete livestock genome (bovine), which has triggered a massive initiative to put in practice the benefits of the so called the Post-Genomic Era. Review: This article aimed at showing the directions and insights in the application of molecular markers on livestock genetic improvement and reproduction as well at organizing the progress so far, pointing some perspectives of these emerging technologies in Brazilian ruminant production context. An overview on the nature of the main molecular markers explored in ruminant production is provided, which describes the molecular bases and detection approaches available for microsatellites (STR) and single nucleotide polymorphisms (SNP). A topic is dedicated to review the history of association studies between markers and important trait variation in livestock, showing the timeline starting on quantitative trait loci (QTL) identification using STR markers and ending in high resolution SNP panels to proceed whole genome scans for phenotype/genotype association. Also the article organizes this information to reveal how QTL prospection using STR could open ground to the feasibility of marker-assisted selection and why this approach is quickly being replaced by studies involving the application of genome-wide association using SNP research in a new concept called genomic selection. Conclusion: The world's scientific community is dedicating effort and resources to apply SNP information in livestock selection through the development of high density panels for genomic association studies, connecting molecular genetic data with phenotypes of economic interest. Once generated, this information can be used to take decisions in genetic improvement programs by selecting animals with the assistance of molecular markers.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The pattern of global gene expression in Salmonella enterica serovar Typhimurium bacteria harvested from the chicken intestinal lumen (cecum) was compared with that of a late-log-phase LB broth culture using a whole-genome microarray. Levels of transcription, translation, and cell division in vivo were lower than those in vitro. S. Typhimurium appeared to be using carbon sources, such as propionate, 1,2-propanediol, and ethanolamine, in addition to melibiose and ascorbate, the latter possibly transformed to D-xylulose. Amino acid starvation appeared to be a factor during colonization. Bacteria in the lumen were non- or weakly motile and nonchemotactic but showed upregulation of a number of fimbrial and Salmonella pathogenicity island 3 (SPI-3) and 5 genes, suggesting a close physical association with the host during colonization. S. Typhimurium bacteria harvested from the cecal mucosa showed an expression profile similar to that of bacteria from the intestinal lumen, except that levels of transcription, translation, and cell division were higher and glucose may also have been used as a carbon source. © 2011, American Society for Microbiology.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The aim of this study was to identify single-nucleotide polymorphisms (SNPs) in buffaloes associated with milk yield and content, in addition to somatic cell scores based on the cross-species transferability of SNPs from cattle to buffalo. A total of 15,745 SNPs were analyzed, of which 1562 showed 1% significance and 4742 with 5% significance, which were associated for all traits studied. After application of Bonferroni's correction for multiple tests of the traits analyzed, we found 2 significant SNPs placed on cattle chromosomes BTA15 and BTA20, which are homologous to buffalo chromosomes BBU16 and BBU19, respectively. In this genome association study, we found several significant SNPs affecting buffalo milk production and quality. Furthermore, the use of the high-density bovine BeadChip was suitable for genomic analysis in buffaloes. Although extensive chromosome arm homology was described between cattle and buffalo, the exact chromosomal position of SNP markers associated with these economically important traits in buffalo can be determined only through buffalo genome sequencing.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Introduction: Genetic factors predisposing individuals to cancer remain elusive in the majority of patients with a familial or clinical history suggestive of hereditary breast cancer. Germline DNA copy number variation (CNV) has recently been implicated in predisposition to cancers such as neuroblastomas as well as prostate and colorectal cancer. We evaluated the role of germline CNVs in breast cancer susceptibility, in particular those with low population frequencies (rare CNVs), which are more likely to cause disease." Methods: Using whole-genome comparative genomic hybridization on microarrays, we screened a cohort of women fulfilling criteria for hereditary breast cancer who did not carry BRCA1/BRCA2 mutations. Results: The median numbers of total and rare CNVs per genome were not different between controls and patients. A total of 26 rare germline CNVs were identified in 68 cancer patients, however, a proportion that was significantly different (P = 0.0311) from the control group (23 rare CNVs in 100 individuals). Several of the genes affected by CNV in patients and controls had already been implicated in cancer. Conclusions: This study is the first to explore the contribution of germline CNVs to BRCA1/2-negative familial and early-onset breast cancer. The data suggest that rare CNVs may contribute to cancer predisposition in this small cohort of patients, and this trend needs to be confirmed in larger population samples.