201 results for Human Genome Project.
Abstract:
Regulation of viral genome expression is the result of complex cooperation between viral proteins and host cell factors. We report here the characterization of a novel cellular factor sharing homology with the specific cysteine-rich C-terminal domain of the basic helix-loop-helix repressor protein I-mfa. The synthesis of this new factor, called HIC for Human I-mfa domain-Containing protein, is controlled at the translational level by two different codons, an ATG and an upstream non-ATG translational initiator, allowing the production of two protein isoforms, p32 and p40, respectively. We show that the HIC protein isoforms present different subcellular localizations, p32 being mainly distributed throughout the cytoplasm, whereas p40 is targeted to the nucleolus. Moreover, in trying to understand the function of HIC, we have found that both isoforms stimulate in T-cells the expression of a luciferase reporter gene driven by the human T-cell leukemia virus type I-long terminal repeat in the presence of the viral transactivator Tax. We demonstrate by mutagenesis that the I-mfa-like domain of HIC is involved in this regulation. Finally, we also show that HIC is able to down-regulate the luciferase expression from the human immunodeficiency virus type 1-long terminal repeat induced by the viral transactivator Tat. From these results, we propose that HIC and I-mfa represent two members of a new family of proteins regulating gene expression and characterized by a particular cysteine-rich C-terminal domain.
Abstract:
The use of comparative genomics to infer genome function relies on the understanding of how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, and other functional sequences such as regulatory regions, as well as other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes or not. Here we present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between human chromosome 21 (ref. 8) and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. The integration of the properties of the conserved components of human chromosome 21 to the rapidly accumulating functional data for this chromosome will improve considerably our understanding of the role of sequence conservation in mammalian genomes.
Abstract:
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
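The two indexing strategies described above (index the database, or index the query set) can be illustrated with a minimal Python sketch. It is not the fetchGWI or tagger implementation: it assumes exact matching only, probes all of one fixed length `k`, and everything held in memory. The function names and toy sequences are hypothetical.

```python
from collections import defaultdict

def index_database(sequences, k):
    """Map every k-mer in the database to the (sequence name, offset) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index

def search_with_database_index(index, probes):
    """Strategy 1: look up each probe in a prebuilt database index (exact matches only)."""
    return {p: index.get(p, []) for p in probes}

def search_with_query_index(sequences, probes, k):
    """Strategy 2: index the probe set instead, then stream once through the database."""
    wanted = set(probes)
    hits = {p: [] for p in wanted}
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if word in wanted:
                hits[word].append((name, i))
    return hits

# Toy data (hypothetical): two short "genomes" and three 4-mer probes.
genome = {"chrA": "ACGTACGTTAGC", "chrB": "TTAGCACGT"}
probes = ["ACGT", "TAGC", "GGGG"]
k = 4

by_db = search_with_database_index(index_database(genome, k), probes)
by_query = search_with_query_index(genome, probes, k)
```

Both strategies return the same hits. Indexing the database pays off when it is reused across many query sets, while indexing the queries avoids building a genome-sized index for a one-off search.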
Abstract:
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores however tended to lack specificity suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses. 
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
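As a rough illustration of an allelic score, the sketch below computes a weighted sum of allele counts per individual and the share of trait variance that the score explains. It is a simplified sketch and not the authors' pipeline: the variant identifiers, weights, and trait values are invented, and counting a missing genotype as zero is an assumption made only to keep the example short.

```python
def allelic_score(dosages, weights):
    """Weighted sum of allele counts for one individual.

    dosages: variant id -> allele count (0, 1 or 2)
    weights: variant id -> per-allele effect on the intermediate trait
    Missing genotypes are counted as 0 (a simplifying assumption).
    """
    return sum(weights[v] * dosages.get(v, 0) for v in weights)

def variance_explained(scores, trait):
    """Squared Pearson correlation between the score and the measured trait."""
    n = len(scores)
    mean_s, mean_t = sum(scores) / n, sum(trait) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(scores, trait))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_t = sum((t - mean_t) ** 2 for t in trait)
    return cov * cov / (var_s * var_t)

# Hypothetical weights and genotypes; "rsA" and "rsB" are placeholder ids.
weights = {"rsA": 0.5, "rsB": 0.25}
people = [
    {"rsA": 2, "rsB": 0},
    {"rsA": 1, "rsB": 1},
    {"rsA": 0, "rsB": 2},
    {"rsA": 0, "rsB": 0},
]
trait = [27.1, 26.0, 25.2, 23.9]  # invented trait values

scores = [allelic_score(p, weights) for p in people]
r2 = variance_explained(scores, trait)
```

In practice the weights would come from an independent GWAS meta-analysis, and the score would then be tested for association with disease in a separate collection.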
Abstract:
Recent genome-wide association (GWA) studies described 95 loci controlling serum lipid levels. These common variants explain ∼25% of the heritability of the phenotypes. To date, no unbiased screen for gene-environment interactions for circulating lipids has been reported. We screened for variants that modify the relationship between known epidemiological risk factors and circulating lipid levels in a meta-analysis of GWA data from 18 population-based cohorts with European ancestry (maximum N = 32,225). We collected 8 further cohorts (N = 17,102) for replication, and rs6448771 on 4p15 demonstrated genome-wide significant interaction with waist-to-hip ratio (WHR) on total cholesterol (TC) with a combined P-value of 4.79×10⁻⁹. There were two potential candidate genes in the region, PCDH7 and CCKAR, with differential expression levels for rs6448771 genotypes in adipose tissue. The effect of WHR on TC was strongest for individuals carrying two copies of the G allele, for whom a one standard deviation (sd) difference in WHR corresponds to a 0.19 sd difference in TC concentration, while for A-allele homozygotes the difference was 0.12 sd. Our findings may open up possibilities for targeted intervention strategies for people characterized by specific genomic profiles. However, more refined measures of both body-fat distribution and metabolic measures are needed to understand how their joint dynamics are modified by the newly found locus.
Abstract:
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
Abstract:
The R package EasyStrata facilitates the evaluation and visualization of stratified genome-wide association meta-analysis (GWAMA) results. It provides (i) statistical methods to test and account for between-strata differences as a means to tackle gene-strata interaction effects and (ii) extended graphical features tailored for stratified GWAMA results. The software provides further features also suitable for general GWAMAs, including functions to annotate, exclude or highlight specific loci in plots or to extract independent subsets of loci from genome-wide datasets. It is freely available and includes a user-friendly scripting interface that simplifies data handling and allows for combining statistical and graphical functions in a flexible fashion. AVAILABILITY: EasyStrata is available for free (under the GNU General Public License v3) from our Web site www.genepi-regensburg.de/easystrata and from the CRAN R package repository cran.r-project.org/web/packages/EasyStrata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
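A common way to test for a between-strata difference at a single variant is a Z-test on the two stratum-specific effect estimates. The Python sketch below shows that generic test. It does not reproduce EasyStrata's R interface, it assumes the two strata are independent (no correlation term), and the function name and effect sizes are hypothetical.

```python
import math

def strata_difference_test(beta1, se1, beta2, se2):
    """Z-test for a difference between two stratum-specific effect estimates.

    Assumes the two estimates come from independent samples.
    Returns the Z statistic and a two-sided P-value from the normal distribution.
    """
    z = (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Hypothetical per-stratum estimates for one variant (e.g. women vs. men).
z, p = strata_difference_test(0.10, 0.02, 0.04, 0.02)
```

When the strata share individuals or are otherwise correlated, the denominator needs a covariance term, which this sketch leaves out.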
Abstract:
Hemolytic disease of the newborn is an often fatal condition of some newborn babies due to the immunogenicity of their Rh D positive erythrocytes in the Rh D negative mother. This condition can be prevented by injecting anti-Rh D antibodies. The current source of these antibodies is blood from immunized human donors. In order to avoid problems with limited supply and donor safety, the Rh D project was set up to develop recombinant monoclonal anti-Rh D antibodies as a possible replacement. In a multidisciplinary collaboration between the Zentrallaboratorium Blutspendedienst (ZLB) of the Swiss Red Cross, the Center of Biotechnology of the University and the EPFL (CBUE), and the Institute of Chemical and Biochemical Engineering (EPFL), co-funded by the Swiss National Science Foundation and ZLB, a candidate monoclonal anti-Rh D antibody has been selected and expressed in CHO cells, and a manufacturing process for large-scale production has been developed.
Abstract:
As production and use of nanomaterials in commercial products grow it is imperative to ensure these materials are used safely with minimal unwanted impacts on human health or the environment. Foremost among the populations of potential concern are workers who handle nanomaterials in a variety of occupational settings, including university laboratories, industrial manufacturing plants and other institutions. Knowledge about prudent practices for handling nanomaterials is being developed by many groups around the world but may be communicated in a way that is difficult for practitioners to access or use. The GoodNanoGuide is a collaborative, open-access project aimed at creating an international forum for the development and discussion of prudent practices that can be used by researchers, workers and their representatives, occupational safety professionals, governmental officials and even the public. The GoodNanoGuide is easily accessed by anyone with access to a web browser and aims to become a living repository of good practices for the nanotechnology enterprise. Interested individuals are invited to learn more about the GoodNanoGuide at http://goodnanoguide.org.
Abstract:
Summary: The automation of genome sequencing and annotation, together with the large-scale application of methods for measuring gene expression, generates a phenomenal quantity of data for model organisms such as human and mouse. In this deluge of data, it becomes very difficult to obtain information specific to an organism or a gene, and such a search frequently yields fragmented, or even incomplete, answers. Creating a database able to manage and integrate both genomic and transcriptomic data can greatly improve search speed as well as the quality of the results obtained, by allowing direct comparison of gene expression measurements from experiments carried out with different techniques. The main objective of this project, called CleanEx, is to provide direct access to public expression data through official gene names, and to represent expression data produced under different protocols in a way that facilitates general analysis and comparison across several datasets. Consistent and regular updating of the gene nomenclature is ensured by associating each gene expression experiment with a permanent identifier of the target sequence, giving a physical description of the RNA population targeted by the experiment. These identifiers are then associated at regular intervals with the constantly evolving catalogues of genes from model organisms. This automatic tracking procedure relies in part on external genomic information resources such as UniGene and RefSeq. The central part of CleanEx consists of a gene index built weekly, which contains links to all public expression data already incorporated into the system. In addition, the target sequence database provides a link to the corresponding gene as well as a quality check of this link for different types of experimental resources, such as clones or Affymetrix probes. The CleanEx online search system offers access to individual entries as well as to tools for cross-analysis of datasets. These tools have proven very effective for comparing gene expression and, to a certain extent, for detecting variation in expression linked to alternative splicing. The CleanEx files and tools are accessible online (http://www.cleanex.isb-sib.ch/).
Abstract: Automated genome sequencing and annotation, as well as large-scale gene expression measurement methods, generate a massive amount of data for model organisms. Searching for gene-specific or organism-specific information throughout all the different databases has become a very difficult task, and often results in fragmented and unrelated answers. A database that federates and integrates genomic and transcriptomic data will greatly improve the search speed as well as the quality of the results by allowing a direct comparison of expression results obtained by different techniques. The main goal of this project, called the CleanEx database, is thus to provide access to public gene expression data via unique gene names and to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-dataset comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single gene expression experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of genes from model organisms, such as human and mouse. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly-built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resources, such as cDNA clones or Affymetrix probe sets. The Affymetrix mapping files are accessible as text files, for further use in external applications, and as individual entries, via the web-based interfaces. The CleanEx web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria, as well as cross-dataset analysis tools and cross-chip gene comparison. These tools have proven to be very efficient in expression data comparison and even, to a certain extent, in detection of differentially expressed splice variants. The CleanEx flat files and tools are available online at: http://www.cleanex.isb-sib.ch/.
Abstract:
Plants have the ability to use the composition of incident light as a cue to adapt development and growth to their environment. Arabidopsis thaliana as well as many crops are best adapted to sunny habitats. When subjected to shade, these plants exhibit a variety of physiological responses collectively called shade avoidance syndrome (SAS). It includes increased growth of hypocotyl and petioles, decreased growth rate of cotyledons and reduced branching and crop yield. These responses are mainly mediated by phytochrome photoreceptors, which exist either in an active, far-red light (FR) absorbing or an inactive, red light (R) absorbing isoform. In direct sunlight, the R to FR light (R/FR) ratio is high and converts the phytochromes into their physiologically active state. The phytochromes interact with downstream transcription factors such as PHYTOCHROME INTERACTING FACTOR (PIF), which are subsequently degraded. Light filtered through a canopy is strongly depleted in R, which results in a low R/FR ratio and renders the phytochromes inactive. Protein levels of downstream transcription factors are stabilized, which initiates the expression of shade-induced genes such as HFR1, PIL1 or ATHB-2. In my thesis, I investigated transcriptional responses mediated by the SAS in whole Arabidopsis seedlings. Using microarray and chromatin immunoprecipitation data, we identified genome-wide PIF4- and PIF5-dependent shade-regulated genes as well as putative direct target genes of PIF5. This revealed evidence for a direct regulatory link between phytochrome signaling and the growth-promoting phytohormone auxin (IAA) at the level of biosynthesis, transport and signaling. Subsequently, it was shown that free-IAA levels are upregulated in response to shade. It is assumed that shade-induced auxin production takes place predominantly in cotyledons of seedlings. This implies that IAA is subsequently transported basipetally to the hypocotyl and enhances elongation growth.
The importance of auxin transport for growth responses has been established by chemical and genetic approaches. To gain a better understanding of the spatio-temporal transcriptional regulation of shade-induced auxin, I generated, in a second project, an organ-specific high-throughput dataset focusing on the cotyledons and hypocotyl of young Arabidopsis seedlings. Interestingly, the two organs show opposite growth regulation by shade. I first investigated the spatio-transcriptional regulation of auxin-responsive genes, in order to determine how far broad gene expression patterns can be explained by the hypothesized movement of auxin from cotyledons to hypocotyls in shade. The analysis suggests that several genes are indeed regulated according to our prediction and others are regulated in a more complex manner. In addition, analysis of gene families of auxin biosynthetic and transport components led to the identification of essential family members for shade-induced growth responses, which were subsequently experimentally confirmed. Finally, the analysis of expression patterns identified several candidate genes, which possibly explain aspects of the opposite growth response of the different organs.
Abstract:
AIM: Heart disease is recognized as a consequence of dysregulation of cardiac gene regulatory networks. Previously unappreciated components of such networks are the long non-coding RNAs (lncRNAs). Their roles in the heart remain to be elucidated. Thus, this study aimed to systematically characterize the cardiac long non-coding transcriptome post-myocardial infarction and to elucidate their potential roles in cardiac homoeostasis. METHODS AND RESULTS: We annotated the mouse transcriptome after myocardial infarction via RNA sequencing and ab initio transcript reconstruction, and integrated genome-wide approaches to associate specific lncRNAs with developmental processes and physiological parameters. Expression of specific lncRNAs strongly correlated with defined parameters of cardiac dimensions and function. Using chromatin maps to infer lncRNA function, we identified many with potential roles in cardiogenesis and pathological remodelling. The vast majority was associated with active cardiac-specific enhancers. Importantly, oligonucleotide-mediated knockdown implicated novel lncRNAs in controlling expression of key regulatory proteins involved in cardiogenesis. Finally, we identified hundreds of human orthologues and demonstrated that particular candidates were differentially modulated in human heart disease. CONCLUSION: These findings reveal hundreds of novel heart-specific lncRNAs with unique regulatory and functional characteristics relevant to maladaptive remodelling, cardiac function and possibly cardiac regeneration. This new class of molecules represents potential therapeutic targets for cardiac disease. Furthermore, their exquisite correlation with cardiac physiology renders them attractive candidate biomarkers to be used in the clinic.
Abstract:
Therapeutic nanoparticles (NPs) are used in nanomedicine as drug carriers or imaging agents, providing increased selectivity/specificity for diseased tissues. The first NPs in nanomedicine were developed for increasing the efficacy of known drugs displaying dose-limiting toxicity and poor bioavailability and for enhancing disease detection. Nanotechnologies have gained much interest owing to their huge potential for applications in industry and medicine. It is necessary to ensure and control the biocompatibility of the components of therapeutic NPs to guarantee that intrinsic toxicity does not overtake the benefits. In addition to monitoring their toxicity in vitro, in vivo and in silico, it is also necessary to understand their distribution in the human body, their biodegradation and excretion routes and dispersion in the environment. Therefore, a deep understanding of their interactions with living tissues and of their possible effects in the human (and animal) body is required for the safe use of nanoparticulate formulations. Obtaining this information was the main aim of the NanoTEST project, and the goals of the reports collected together in this special issue are to summarise the observations and results obtained by the participating research teams and to provide methodological tools for evaluating the biological impact of NPs.
Abstract:
Because natural selection is likely to act on multiple genes underlying a given phenotypic trait, we study here the potential effect of ongoing and past selection on the genetic diversity of human biological pathways. We first show that genes included in gene sets are generally under stronger selective constraints than other genes and that their evolutionary response is correlated. We then introduce a new procedure to detect selection at the pathway level based on a decomposition of the classical McDonald-Kreitman test extended to multiple genes. This new test, called 2DNS, detects outlier gene sets and takes into account past demographic effects and evolutionary constraints specific to gene sets. Selective forces acting on gene sets can be easily identified by a mere visual inspection of the position of the gene sets relative to their two-dimensional null distribution. We thus find several outlier gene sets that show signals of positive, balancing, or purifying selection but also others showing an ancient relaxation of selective constraints. The principle of the 2DNS test can also be applied to other genomic contrasts. For instance, the comparison of patterns of polymorphisms private to African and non-African populations reveals that most pathways show a higher proportion of nonsynonymous mutations in non-Africans than in Africans, potentially due to different demographic histories and selective pressures.
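For orientation, the classical McDonald-Kreitman contrast that the abstract builds on compares nonsynonymous and synonymous counts within a species (polymorphisms, Pn and Ps) and between species (fixed differences, Dn and Ds). The Python sketch below pools these counts over the genes of a set and computes the neutrality index. It illustrates only the classical test with pooled counts, not the 2DNS decomposition or its null distribution, and the gene names and counts are invented.

```python
def pooled_counts(genes):
    """Sum Pn, Ps, Dn, Ds over all genes in a gene set."""
    totals = {"Pn": 0, "Ps": 0, "Dn": 0, "Ds": 0}
    for counts in genes.values():
        for key in totals:
            totals[key] += counts[key]
    return totals

def neutrality_index(c):
    """NI = (Pn/Ps) / (Dn/Ds); requires nonzero Ps, Dn and Ds.

    NI < 1 indicates an excess of nonsynonymous divergence (as expected
    under positive selection); NI > 1 indicates an excess of nonsynonymous
    polymorphism.
    """
    return (c["Pn"] / c["Ps"]) / (c["Dn"] / c["Ds"])

# Hypothetical gene set with invented counts.
gene_set = {
    "geneA": {"Pn": 2, "Ps": 10, "Dn": 6, "Ds": 12},
    "geneB": {"Pn": 1, "Ps": 10, "Dn": 4, "Ds": 8},
}

totals = pooled_counts(gene_set)
ni = neutrality_index(totals)
alpha = 1 - ni  # estimated proportion of adaptive nonsynonymous substitutions
```

Pooling raw counts across genes is the simplest multi-gene extension. It can be biased when genes differ strongly in diversity, which is one reason to prefer a pathway-level test that models gene-set-specific constraints.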
Abstract:
The second scientific meeting of the European systems genetics network for the study of complex genetic human disease using genetic reference populations (SYSGENET) took place at the Center for Cooperative Research in Biosciences in Bilbao, Spain, December 10-12, 2012. SYSGENET is funded by the European Cooperation in the Field of Scientific and Technological Research (COST) and represents a network of scientists in Europe that use mouse genetic reference populations (GRPs) to identify complex genetic factors influencing disease phenotypes (Schughart, Mamm Genome 21:331-336, 2010). About 50 researchers working in the field of systems genetics attended the meeting, which consisted of 27 oral presentations, a poster session, and a management committee meeting. Participants exchanged results, set up future collaborations, and shared phenotyping and data analysis methodologies. This meeting was particularly instrumental for conveying the current status of the US, Israeli, and Australian Collaborative Cross (CC) mouse GRP. The CC is an open source project initiated nearly a decade ago by members of the Complex Trait Consortium to aid the mapping of multigenetic traits (Threadgill, Mamm Genome 13:175-178, 2002). In addition, representatives of the International Mouse Phenotyping Consortium were invited to exchange ongoing activities between the knockout and complex genetics communities and to discuss and explore potential fields for future interactions.