976 results for GENOMIC DNA


Abstract:

The use of molecular data to reconstruct the history of divergence and gene flow between populations of closely related taxa represents a challenging problem. It has been proposed that the long-standing debate about the geography of speciation can be resolved by comparing the likelihoods of a model of isolation with migration and a model of secondary contact. However, data are commonly only fit to a model of isolation with migration and rarely tested against the secondary contact alternative. Furthermore, most demographic inference methods have neglected variation in introgression rates and assume that the gene flow parameter (Nm) is similar among loci. Here, we show that neglecting this source of variation can give misleading results. We analysed DNA sequences sampled from populations of the marine mussels, Mytilus edulis and M. galloprovincialis, across a well-studied mosaic hybrid zone in Europe and evaluated various scenarios of speciation, with or without variation in introgression rates, using an Approximate Bayesian Computation (ABC) approach. Models with heterogeneous gene flow across loci always outperformed models assuming equal migration rates irrespective of the history of gene flow being considered. By incorporating this heterogeneity, the best-supported scenario was a long period of allopatric isolation during the first three-quarters of the time since divergence followed by secondary contact and introgression during the last quarter. By contrast, constraining migration to be homogeneous failed to discriminate among any of the different models of gene flow tested. Our simulations thus provide statistical support for the secondary contact scenario in the European Mytilus hybrid zone that the standard coalescent approach failed to confirm. Our results demonstrate that genomic variation in introgression rates can have profound impacts on the biological conclusions drawn from inference methods and needs to be incorporated in future studies.
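The model comparison described above can be illustrated with a toy rejection-ABC sketch. This is not the authors' pipeline: the simulator, the priors, the summary statistics, and the function names (`simulate`, `summary`, `abc_model_choice`) are all assumptions chosen for illustration, and per-locus differentiation is approximated with the island-model expectation Fst ≈ 1/(1 + 4Nm).

```python
import random
import statistics


def simulate(model, n_loci, rng):
    """Toy simulator (assumption): per-locus differentiation values in [0, 1]."""
    if model == "homogeneous":
        nm = rng.uniform(0.1, 10.0)          # one Nm shared by every locus
        rates = [nm] * n_loci
    else:
        mean_nm = rng.uniform(0.1, 10.0)     # each locus draws its own Nm
        rates = [rng.expovariate(1.0 / mean_nm) for _ in range(n_loci)]
    values = []
    for nm in rates:
        fst = 1.0 / (1.0 + 4.0 * nm) + rng.gauss(0.0, 0.02)  # add sampling noise
        values.append(min(1.0, max(0.0, fst)))
    return values


def summary(values):
    """Summary statistics: mean and standard deviation across loci."""
    return statistics.mean(values), statistics.pstdev(values)


def abc_model_choice(observed, n_sims, tolerance, seed=0):
    """Rejection ABC: posterior model probability = share of accepted simulations."""
    rng = random.Random(seed)
    obs_mean, obs_sd = summary(observed)
    accepted = {"homogeneous": 0, "heterogeneous": 0}
    for _ in range(n_sims):
        model = rng.choice(list(accepted))   # uniform prior over the two models
        sim_mean, sim_sd = summary(simulate(model, len(observed), rng))
        distance = ((sim_mean - obs_mean) ** 2 + (sim_sd - obs_sd) ** 2) ** 0.5
        if distance < tolerance:
            accepted[model] += 1
    total = sum(accepted.values())
    return {m: (c / total if total else float("nan")) for m, c in accepted.items()}


if __name__ == "__main__":
    # Hypothetical per-locus values with large variation among loci.
    observed = [0.02, 0.5, 0.9, 0.05, 0.3, 0.7, 0.1, 0.03, 0.6, 0.2]
    print(abc_model_choice(observed, n_sims=5000, tolerance=0.1, seed=1))
```

Because the among-locus standard deviation is one of the summary statistics, a model that forces every locus to share a single Nm has little chance of reproducing data in which loci differ widely, which is the intuition behind the abstract's result.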

Abstract:

Little is known about the relation between genome organization and gene expression in Leishmania. Bioinformatic analysis can be used to predict genes and find homologies with known proteins. A model was proposed in which genes are organized into large clusters and transcribed from only one strand, in the form of large polycistronic primary transcripts. To verify the validity of this model, we studied gene expression at the transcriptional, post-transcriptional and translational levels in a unique locus of 34 kb located on chr27 and represented by cosmid L979. Sequence analysis revealed 115 ORFs on either DNA strand. Using computer programs developed for Leishmania genes, only nine of these ORFs, localized on the same strand, were predicted to code for proteins, some of which show homologies with known proteins. Additionally, one pseudogene was identified. We verified the biological relevance of these predictions. mRNAs from nine predicted genes and proteins from seven were detected. Nuclear run-on analyses confirmed that the top strand is transcribed by RNA polymerase II and suggested that there is no polymerase entry site. Low levels of transcription were detected in regions of the bottom strand, and stable transcripts were identified for four ORFs on this strand not predicted to be protein-coding. In conclusion, the transcriptional organization of the Leishmania genome is complex, raising the possibility that computer predictions may not be comprehensive.

Abstract:

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set are of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200-kb-long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX, was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments.
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
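The distinction the abstract draws between sensitivity and specificity can be made concrete with a nucleotide-level sketch. The definitions below follow the convention common in gene-finding evaluations, where "specificity" is the fraction of predicted coding nucleotides that are truly coding; whether the abstract's authors used exactly these definitions is an assumption, and the function names and coordinates are hypothetical.

```python
def coding_positions(exons):
    """Expand half-open (start, end) exon intervals into a set of positions."""
    return {pos for start, end in exons for pos in range(start, end)}


def nucleotide_accuracy(annotated_exons, predicted_exons):
    """Return (sensitivity, specificity) at the nucleotide level.

    sensitivity = TP / (TP + FN): share of true coding bases that were predicted
    specificity = TP / (TP + FP): share of predicted coding bases that are true
    """
    truth = coding_positions(annotated_exons)
    predicted = coding_positions(predicted_exons)
    tp = len(truth & predicted)
    fn = len(truth - predicted)
    fp = len(predicted - truth)
    sensitivity = tp / (tp + fn) if truth else float("nan")
    specificity = tp / (tp + fp) if predicted else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical gene with two exons, plus one spurious exon predicted
    # in a long intergenic region.
    annotated = [(100, 200), (300, 400)]
    predicted = [(100, 200), (300, 400), (1000, 1100)]
    print(nucleotide_accuracy(annotated, predicted))
```

In this example every real exon is recovered, so sensitivity stays at its maximum, while the extra exon predicted in intergenic sequence lowers specificity. That is the pattern the abstract reports for ab initio prediction on long sequences.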

Abstract:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Abstract:

The pericentric inversion on chromosome 16 [inv(16)(p13q22)] and the related t(16;16)(p13;q22) are recurrent aberrations associated with acute myeloid leukemia (AML) M4 Eo. Both aberrations result in a fusion of the core binding factor beta (CBFB) and smooth muscle myosin heavy chain (MYH11) genes. A selected genomic 6.9-kb BamHI probe detects MYH11 DNA rearrangements in 18 of 19 inv(16)/t(16;16) patients tested using HindIII-digested DNA. The rearranged fragments were not detectable after remission in the two cases tested, but they were present after relapse in one of these two cases.

Abstract:

Background and aim of the study: Genomic gains and losses play a crucial role in the development and progression of DLBCL and are closely related to gene expression profiles (GEP), including the germinal center B-cell like (GCB) and activated B-cell like (ABC) cell of origin (COO) molecular signatures. To identify new oncogenes or tumor suppressor genes (TSG) involved in DLBCL pathogenesis and to determine their prognostic values, an integrated analysis of high-resolution gene expression and copy number profiling was performed. Patients and methods: Two hundred and eight adult patients with de novo CD20+ DLBCL enrolled in the prospective multicentric randomized LNH-03 GELA trials (LNH03-1B, -2B, -3B, 39B, -5B, -6B, -7B) with available frozen tumour samples, centralized reviewing and adequate DNA/RNA quality were selected. 116 patients were treated by Rituximab(R)-CHOP/R-miniCHOP and 92 patients were treated by the high dose (R)-ACVBP regimen dedicated to patients younger than 60 years (y) in frontline. Tumour samples were simultaneously analysed by high resolution comparative genomic hybridization (CGH, Agilent, 144K) and gene expression arrays (Affymetrix, U133+2). Minimal common regions (MCR), as defined by segments that affect the same chromosomal region in different cases, were delineated. Gene expression and MCR data sets were merged using Gene expression and dosage integrator algorithm (GEDI, Lenz et al. PNAS 2008) to identify new potential driver genes. Results: A total of 1363 recurrent (defined by a penetrance > 5%) MCRs within the DLBCL data set, ranging in size from 386 bp, affecting a single gene, to more than 24 Mb were identified by CGH. Of these MCRs, 756 (55%) showed a significant association with gene expression: 396 (59%) gains, 354 (52%) single-copy deletions, and 6 (67%) homozygous deletions. 
By this integrated approach, in addition to previously reported genes (CDKN2A/2B, PTEN, DLEU2, TNFAIP3, B2M, CD58, TNFRSF14, FOXP1, REL...), several genes targeted by gene copy abnormalities with a dosage effect and potential physiopathological impact were identified, including genes with TSG activity involved in cell cycle (HACE1, CDKN2C), immune response (CD68, CD177, CD70, TNFSF9, IRAK2), DNA integrity (XRCC2, BRCA1, NCOR1, NF1, FHIT) or oncogenic functions (CD79b, PTPRT, MALT1, AUTS2, MCL1, PTTG1...), with distinct distribution according to COO signature. The CDKN2A/2B tumor suppressor locus (9p21) was deleted homozygously in 27% of cases and hemizygously in 9% of cases. Biallelic loss was observed in 49% of ABC DLBCL and in 10% of GCB DLBCL. This deletion was strongly correlated with age and associated with a limited number of additional genetic abnormalities, including trisomy 3, 18 and short gains/losses of Chr. 1, 2, 19 regions (FDR < 0.01), making it possible to identify genes that may have synergistic effects with CDKN2A/2B inactivation. With a median follow-up of 42.9 months, only CDKN2A/2B biallelic deletion correlated strongly (FDR p-value < 0.01) with a poor outcome in the entire cohort (4y PFS = 44% [32-61] vs. 74% [66-82] for patients in germline configuration; 4y OS = 53% [39-72] vs. 83% [76-90]). In a Cox proportional hazard prediction of the PFS, CDKN2A/2B deletion remains predictive (HR = 1.9 [1.1-3.2], p = 0.02) when combined with IPI (HR = 2.4 [1.4-4.1], p = 0.001) and GCB status (HR = 1.3 [0.8-2.3], p = 0.31). This difference remains predictive in the subgroup of patients treated by R-CHOP (4y PFS = 43% [29-63] vs. 66% [55-78], p = 0.02), in patients treated by R-ACVBP (4y PFS = 49% [28-84] vs. 83% [74-92], p = 0.003), and in GCB (4y PFS = 50% [27-93] vs. 81% [73-90], p = 0.02) or ABC/unclassified (5y PFS = 42% [28-61] vs. 67% [55-82], p = 0.009) molecular subtypes (Figure 1).
Conclusion: We report for the first time an integrated genetic analysis of a large cohort of DLBCL patients included in a prospective multicentric clinical trial program, allowing the identification of new potential driver genes with pathogenic impact. However, CDKN2A/2B deletion constitutes the strongest and unique prognostic factor of chemoresistance to R-CHOP, regardless of the COO signature, and this chemoresistance is not overcome by a more intensified immunochemotherapy. Patients displaying this frequent genomic abnormality warrant new and dedicated therapeutic approaches.

Abstract:

Phosphorylation of transcription factors is a rapid and reversible process linking cell signaling and control of gene expression; understanding how it controls transcription factor function is therefore one of the challenges of functional genomics. We performed such an analysis for the forkhead transcription factor FOXC2, which is mutated in the human hereditary disease lymphedema-distichiasis and is important for the development of venous and lymphatic valves and lymphatic collecting vessels. We found that FOXC2 is phosphorylated in a cell-cycle-dependent manner on eight evolutionarily conserved serine/threonine residues, seven of which are clustered within a 70-amino-acid domain. Surprisingly, the mutation of phosphorylation sites or a complete deletion of the domain did not affect the transcriptional activity of FOXC2 in a synthetic reporter assay. However, overexpression of the wild type or phosphorylation-deficient mutant resulted in overlapping but distinct gene expression profiles, suggesting that binding of FOXC2 to individual sites under physiological conditions is affected by phosphorylation. To gain a direct insight into the role of FOXC2 phosphorylation, we performed comparative genome-wide location analysis (ChIP-chip) of wild type and phosphorylation-deficient FOXC2 in primary lymphatic endothelial cells. The effect of loss of phosphorylation on FOXC2 binding to genomic sites ranged from no effect to nearly complete inhibition of binding, suggesting a mechanism for how the FOXC2 transcriptional program can be differentially regulated depending on FOXC2 phosphorylation status. Based on these results, we propose an extension to the enhanceosome model, where a network of genomic context-dependent DNA-protein and protein-protein interactions not only distinguishes a functional site from a nonphysiological site, but also determines whether binding to the functional site can be regulated by phosphorylation.
Moreover, our results indicate that FOXC2 may have different roles in quiescent versus proliferating lymphatic endothelial cells in vivo.

Abstract:

Abstract: Transcriptional regulation is the result of a combination of positive and negative effectors, such as transcription factors, cofactors and chromatin modifiers. During my thesis project I studied chromatin association, and transcriptional and cell cycle regulatory functions of dHCF, the Drosophila homologue of the human protein HCF-1 (host cell factor-1). The human and Drosophila HCF proteins are synthesized as large polypeptides that are cleaved into two subunits (HCFN and HCFC), which remain associated with one another by noncovalent interactions. Studies in mammalian cells over the past 20 years have been devoted to understanding the cellular functions of HCF-1 and have revealed that it is a key regulator of transcription and the cell cycle. In human cells, HCF-1 interacts with the histone methyltransferase Set1/Ash2 and MLL/Ash2 complexes and the histone deacetylase Sin3 complex, which are involved in transcriptional activation and repression, respectively. HCF-1 is also recruited to promoters to regulate G1-to-S phase progression during the cell cycle by the activator transcription factors E2F1 and E2F3, and by the repressor transcription factor E2F4. HCF-1 protein structure and these interactions between HCF-1 and E2F transcriptional regulator proteins are also conserved in Drosophila. In this doctoral thesis, I use proliferating Drosophila SL2 cells to study both the genomic binding sites of dHCF, using a combination of chromatin immunoprecipitation and ultra-high-throughput sequencing (ChIP-seq) analysis, and dHCF-regulated genes, employing RNAi and microarray expression analysis. I show that dHCF is bound to over 7500 chromosomal sites in proliferating SL2 cells, and is located within ±200 bp of the transcriptional start sites of about 30% of Drosophila genes. There is also a direct relationship between dHCF promoter association and promoter-associated transcriptional activity.
Thus, dHCF binding levels at promoters correlated directly with transcriptional activity. In contrast, expression studies showed that dHCF appears to be involved in both transcriptional activation and repression. Analysis of dHCF-binding sites identified nine dHCF-associated motifs, four of which linked dHCF to (i) two insulator proteins, GAGA and BEAF, (ii) the E-box motif, and (iii) a degenerate TATA-box. The dHCF-associated motifs allowed the organization of the dHCF-bound genes into five biological processes: differentiation, cell cycle, gene expression, regulation of endocytosis, and cellular localization. I further show that different mechanisms regulate dHCF association with chromatin. Although the dHCFN and dHCFC subunits remain associated after dHCF cleavage, the two subunits showed different affinities for chromatin and differential binding to a set of tested promoters, suggesting that dHCF could target specific promoters through each of the two subunits. Moreover, in addition to the interaction between dHCF and E2F transcription factors, the dHCF binding pattern is correlated with dE2F2 genomic distribution. I show that dE2F factors are necessary for recruitment of dHCF to the promoter of a set of dHCF-regulated genes. Therefore dHCF, as in mammals, is involved in regulation of G1-to-S phase progression in collaboration with the dE2F transcription factors. In addition, gene expression arrays reveal that dHCF could indirectly regulate cell cycle progression by promoting expression of genes involved in gene expression and protein synthesis, and inhibiting expression of genes involved in cell-cell adhesion. Therefore, dHCF is an evolutionarily conserved protein, which binds to many specific sites of the Drosophila genome via interaction with DNA or chromatin-binding proteins to regulate the expression of genes involved in many different cellular functions.
Summary: Transcriptional regulation results from the positive and negative effects of transcription factors, cofactors, and effector proteins that modify chromatin. During my thesis project, I studied the chromatin association of dHCF, the Drosophila homologue of the human protein HCF-1 (host cell factor-1), as well as its regulation of transcription and the cell cycle. In humans and Drosophila, the two HCF proteins are synthesized as a long polypeptide that is then cleaved into two subunits at the center of the protein. The two subunits remain associated through noncovalent interactions. Studies carried out over the past 20 years have established that HCF-1 is a key factor in the regulation of transcription and the cell cycle. In human cells, HCF-1 activates and represses transcription by interacting with protein complexes that activate transcription by methylating histones (HMT), such as Set1/Ash2 and MLL/Ash2, and with other complexes that repress transcription and are responsible for histone deacetylation (HDAC), such as the Sin3 protein. HCF-1 is also recruited to promoters by the transcriptional activators E2F1 and E2F3a, and by the transcriptional repressor E2F4, to regulate the transition between the G1 and S phases of the cell cycle. The structure of HCF-1 and the interactions between HCF-1 and transcriptional regulators are conserved in Drosophila. During my thesis, I used cultured Drosophila SL2 cells to study the chromatin binding sites of HCF-1, using chromatin immunoprecipitation and massively parallel DNA sequencing, as well as the genes regulated by dHCF, using RNAi and microarrays.
My results showed that dHCF binds to about 7565 sites and is located within ±200 base pairs of the transcription start sites of 30% of Drosophila genes. I observed a relationship between dHCF and the level of transcription: the level of dHCF binding at the promoter correlates with transcriptional activity. However, my expression studies showed that dHCF is involved in both activation and repression of transcription. Analysis of the DNA sequences bound by dHCF revealed nine motifs; four of these linked dHCF to two insulator proteins, GAGA and BEAF, to the E-box motif, and to a degenerate TATA-box. The nine dHCF-associated motifs made it possible to assign the genes bound by dHCF at the promoter to five biological processes: differentiation, cell cycle, gene expression, regulation of endocytosis, and cellular localization. I also showed that several mechanisms regulate the association of dHCF with chromatin. Although the two subunits, dHCFN and dHCFC, remain associated after cleavage, they show different affinities for chromatin and bind differently to a group of promoters; these results suggest that dHCF can bind promoters through either of its subunits. In addition to the association of dHCF with the dE2F transcription factors, the distribution of dHCF across the genome correlates with that of the transcription factor dE2F2. I also showed that the dE2Fs are necessary for the recruitment of dHCF to the promoters of a subgroup of dHCF-regulated genes. My results also showed that, in Drosophila as in humans, dHCF is involved in regulating progression from G1 to S phase of the cell cycle in collaboration with the dE2Fs.
Furthermore, the expression arrays suggested that dHCF could regulate the cell cycle indirectly by activating the expression of genes involved in gene expression and protein synthesis, and by inhibiting the expression of genes involved in cell adhesion. In conclusion, dHCF is an evolutionarily conserved protein that binds specifically to many sites in the Drosophila genome, through interaction with other proteins, to regulate the expression of genes involved in several cellular functions.
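A statistic such as "bound within ±200 bp of the transcription start sites of about 30% of genes" could be computed along the following lines. This is only a sketch under simplifying assumptions: a single chromosome, binding sites reduced to point coordinates (for example peak summits), and a hypothetical function name; it is not the analysis used in the thesis.

```python
from bisect import bisect_left


def fraction_of_tss_with_peak(tss_positions, peak_positions, max_dist=200):
    """Fraction of TSSs with at least one binding site within max_dist bp."""
    if not tss_positions:
        return float("nan")
    peaks = sorted(peak_positions)
    hits = 0
    for tss in tss_positions:
        # Index of the first peak at or beyond the left edge of the window.
        i = bisect_left(peaks, tss - max_dist)
        # That peak counts only if it also lies before the right edge.
        if i < len(peaks) and peaks[i] <= tss + max_dist:
            hits += 1
    return hits / len(tss_positions)


if __name__ == "__main__":
    # Hypothetical coordinates on one chromosome.
    tss = [1000, 5000, 9000, 20000]
    peaks = [1150, 4700, 9200, 30000]
    print(fraction_of_tss_with_peak(tss, peaks))
```

Sorting the binding sites once and using a binary search per start site keeps the lookup fast even with thousands of sites and genes.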

Abstract:

The genomic era has revealed that the large repertoire of observed animal phenotypes is dependent on changes in the expression patterns of a finite number of genes, which are mediated by a plethora of transcription factors (TFs) with distinct specificities. The dimerization of TFs can also increase the complexity of a genetic regulatory network manifold, by combining a small number of monomers into dimers with distinct functions. Therefore, studying the evolution of these dimerizing TFs is vital for understanding how complexity increased during animal evolution. We focus on the second largest family of dimerizing TFs, the basic-region leucine zipper (bZIP), and infer when it expanded and how bZIP DNA-binding and dimerization functions evolved during the major phases of animal evolution. Specifically, we classify the metazoan bZIPs into 19 families and confirm the ancient nature of at least 13 of these families, predating the split of the cnidaria. We observe fixation of a core dimerization network in the last common ancestor of protostomes-deuterostomes. This was followed by an expansion of the number of proteins in the network, but no major dimerization changes in interaction partners, during the emergence of vertebrates. In conclusion, the bZIPs are an excellent model with which to understand how DNA binding and protein interactions of TFs evolved during animal evolution.

Abstract:

BACKGROUND: Analysis of DNA sequence polymorphisms can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: i) exhaustive population-genetic analyses, including those based on coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
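The sliding-window approach mentioned in feature iv) can be sketched with a minimal example that computes nucleotide diversity (the mean number of pairwise differences per site) in successive windows of an alignment. This is an illustration only and not VariScan's implementation: it assumes equal-length, gap-free aligned sequences, and the function names are hypothetical.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Nucleotide diversity: mean pairwise differences per site."""
    if len(seqs) < 2 or not seqs[0]:
        return float("nan")
    n_pairs = 0
    n_diffs = 0
    for a, b in combinations(seqs, 2):
        n_diffs += sum(x != y for x, y in zip(a, b))
        n_pairs += 1
    return n_diffs / (n_pairs * len(seqs[0]))


def sliding_window_pi(seqs, window, step):
    """Return (window_start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Hypothetical three-sequence alignment, variable only near its right end.
    alignment = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT"]
    for start, pi in sliding_window_pi(alignment, window=4, step=4):
        print(start, pi)
```

Windows with unusually low or high diversity relative to the rest of the genome are the kind of candidate regions such a scan is meant to flag. A production tool would also need to handle missing data and alignment gaps, which this sketch ignores.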

Abstract:

Background: Chemoreception is a widespread mechanism that is involved in critical biologic processes, including individual and social behavior. The insect peripheral olfactory system comprises three major multigene families: the olfactory receptor (Or), the gustatory receptor (Gr), and the odorant-binding protein (OBP) families. Members of the latter family establish the first contact with the odorants, and thus constitute the first step in the chemosensory transduction pathway. Results: Comparative analysis of the OBP family in 12 Drosophila genomes allowed the identification of 595 genes that encode putative functional and nonfunctional members in extant species, with 43 gene gains and 28 gene losses (15 deletions and 13 pseudogenization events). The evolution of this family shows tandem gene duplication events, progressive divergence in DNA and amino acid sequence, and prevalence of pseudogenization events in external branches of the phylogenetic tree. We observed that the OBP arrangement in clusters is maintained across the Drosophila species and that purifying selection governs the evolution of the family; nevertheless, OBP genes differ in their levels of functional constraint. Finally, we detect that the OBP repertoire evolves more rapidly in the specialist lineages of the Drosophila melanogaster group (D. sechellia and D. erecta) than in their closest generalists. Conclusion: Overall, the evolution of the OBP multigene family is consistent with the birth-and-death model. We also found that members of this family exhibit different functional constraints, which is indicative of some functional divergence, and that they might be involved in some of the specialization processes that occurred through the diversification of the Drosophila genus.

Abstract:

Peroxisome proliferator-activated receptors are ligand-activated transcription factors belonging to the nuclear hormone receptor superfamily. Three cDNAs encoding such receptors have been isolated from Xenopus laevis (xPPAR alpha, beta, and gamma). Furthermore, the gene coding for xPPAR beta has been cloned, making it the first member of this subfamily whose genomic organization has been solved. Functionally, xPPAR alpha as well as its mouse and rat homologs are thought to play an important role in lipid metabolism due to their ability to activate transcription of a reporter gene through the promoter of the acyl-CoA oxidase (ACO) gene. ACO catalyzes the rate-limiting step in the peroxisomal beta-oxidation of fatty acids. Activation is achieved by the binding of xPPAR alpha to a regulatory element (DR1) found in the promoter region of this gene. xPPAR beta and gamma are also able to recognize the same type of element and are, like PPAR alpha, able to form heterodimers with retinoid X receptor. All three xPPARs appear to be activated by synthetic peroxisome proliferators as well as by naturally occurring fatty acids, suggesting that a common mode of action exists for all the members of this subfamily of nuclear hormone receptors.

Abstract:

Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
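The two percentages quoted for the projected figure are linked by the heritability of height. As a back-of-the-envelope check, assuming that "heritable variation" means the heritability h² multiplied by total phenotypic variation (the symbol h² and this reading are assumptions, not stated in the abstract):

```latex
\[
h^2 \approx \frac{0.16}{0.20} = 0.8,
\qquad
\frac{0.10}{h^2} \approx \frac{0.10}{0.8} = 0.125 .
\]
```

Under that reading, the variants already identified would account for roughly 12.5% of heritable variation in height.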

Abstract:

Genomic islands (GEIs) are large DNA segments, present in most bacterial genomes, that are most likely acquired via horizontal gene transfer. Here, we study the self-transfer system of the integrative and conjugative element ICEclc of Pseudomonas knackmussii B13, which serves as a model for a larger group of ICE/GEI with syntenic core gene organization. Functional screening revealed that, unlike conjugative plasmids and other ICEs, ICEclc carries two separate origins of transfer, with different sequence contexts but containing a similar repeat motif. Conjugation experiments with GFP-labelled ICEclc variants showed that both oriTs are used for transfer, with indistinguishable efficiencies, but that having two oriTs results in an estimated fourfold increase of ICEclc transfer rates in a population compared with having a single oriT. A gene for a relaxase essential for ICEclc transfer was also identified, but in vivo strand exchange assays suggested that the relaxase processes the two oriTs in different ways. This unique dual origin of transfer system might have provided an evolutionary advantage for distribution of ICE, a hypothesis that is supported by the fact that both oriT regions are conserved in several GEIs related to ICEclc.