983 results for Functional genomics


Relevance: 70.00%

Abstract:

The main focus of this thesis is the use of high-throughput sequencing technologies in functional genomics (in particular in the form of ChIP-seq, chromatin immunoprecipitation coupled with sequencing, and RNA-seq) and the study of the structure and regulation of transcriptomes. Some parts of it are of a more methodological nature while others describe the application of these functional genomic tools to address various biological problems. A significant part of the research presented here was conducted as part of the ENCODE (ENCyclopedia Of DNA Elements) Project.

The first part of the thesis focuses on the structure and diversity of the human transcriptome. Chapter 1 contains an analysis of the diversity of the human polyadenylated transcriptome based on RNA-seq data generated for the ENCODE Project. Chapter 2 presents a simulation-based examination of the performance of some of the most popular computational tools used to assemble and quantify transcriptomes. Chapter 3 includes a study of variation in gene expression, alternative splicing and allelic expression bias at the single-cell level and on a genome-wide scale in human lymphoblastoid cells; it also raises a number of methodological considerations that are critical to the practice of single-cell RNA-seq measurements.

The second part presents several studies applying functional genomic tools to the study of the regulatory biology of organellar genomes, primarily in mammals but also in plants. Chapter 5 contains an analysis of the occupancy of the human mitochondrial genome by TFAM, an important structural and regulatory protein in mitochondria, using ChIP-seq. In Chapter 6, the mitochondrial DNA occupancy of the TFB2M transcriptional regulator, the MTERF termination factor, and the mitochondrial RNA and DNA polymerases is characterized. Chapter 7 consists of an investigation into the curious phenomenon of the physical association of nuclear transcription factors with mitochondrial DNA, based on the diverse collections of transcription factor ChIP-seq datasets generated by the ENCODE, mouseENCODE and modENCODE consortia. In Chapter 8 this line of research is further extended to existing publicly available ChIP-seq datasets in plants and their mitochondrial and plastid genomes.

The third part is dedicated to the analytical and experimental practice of ChIP-seq. As part of the ENCODE Project, a set of metrics for assessing the quality of ChIP-seq experiments was developed, and the results of this activity are presented in Chapter 9. These metrics were later used to carry out a global analysis of ChIP-seq quality in the published literature (Chapter 10). In Chapter 11, the development and initial application of an automated robotic ChIP-seq (in which these metrics also played a major role) is presented.

The fourth part presents the results of some additional projects the author has been involved in, including the study of the role of the Piwi protein in the transcriptional regulation of transposon expression in Drosophila (Chapter 12), and the use of single-cell RNA-seq to characterize the heterogeneity of gene expression during cellular reprogramming (Chapter 13).

The last part of the thesis provides a review of the results of the ENCODE Project and the interpretation of the complexity of the biochemical activity exhibited by mammalian genomes that they have revealed (Chapters 15 and 16), an overview of the technical developments expected in the near future and their impact on the field of functional genomics (Chapter 14), and a discussion of some so far insufficiently explored research areas, the future study of which will, in the opinion of the author, provide deep insights into many fundamental but not yet completely answered questions about the transcriptional biology of eukaryotes and its regulation.

Relevance: 70.00%

Abstract:

Clare, A. and King, R.D. (2002) Machine learning of functional class from phenotype data. Bioinformatics 18(1), 160-166.

Relevance: 70.00%

Abstract:

Nicastrin (NCSTN) is a component of the γ-secretase complex and therefore potentially a candidate risk gene for Alzheimer's disease. Here, we have developed a novel functional genomics methodology to express common locus haplotypes to assess functional differences. DNA recombination was used to engineer 5 bacterial artificial chromosomes (BACs) to each express a different haplotype of the NCSTN locus. Each NCSTN-BAC was delivered to knockout nicastrin (Ncstn(-/-)) cells and clonal NCSTN-BAC(+)/Ncstn(-/-) cell lines were created for functional analyses. We showed that all NCSTN-BAC haplotypes expressed nicastrin protein and rescued γ-secretase activity and amyloid beta (Aβ) production in NCSTN-BAC(+)/Ncstn(-/-) lines. We then showed that genetic variation at the NCSTN locus affected alternative splicing in human postmortem brain tissue. However, there was no robust functional difference between clonal cell lines rescued by each of the 5 different haplotypes. Finally, there was no statistically significant association of NCSTN with disease risk in the 4 cohorts. We therefore conclude that it is unlikely that common variation at the NCSTN locus is a risk factor for Alzheimer's disease.

Relevance: 70.00%

Abstract:

BACKGROUND: Despite the frequent isolation of Salmonella enterica subsp. enterica serovars Derby and Mbandaka from livestock in the UK and USA, little is known about the biological processes maintaining their prevalence. Statistics for Salmonella isolations from livestock production in the UK show that S. Derby is most commonly associated with pigs and turkeys and S. Mbandaka with cattle and chickens. Here we compare the first sequenced genomes of S. Derby and S. Mbandaka as a basis for further analysis of the potential host adaptations that contribute to their distinct host species distributions.

RESULTS: Comparative functional genomics using the RAST annotation system showed that the mechanisms distinguishing S. Derby from S. Mbandaka relate predominantly to metabolite utilisation, in vivo and ex vivo persistence, and pathogenesis. Alignment of the genome nucleotide sequences of S. Derby D1 and D2 and S. Mbandaka M1 and M2 with Salmonella pathogenicity islands (SPI) identified unique complements of genes associated with host adaptation. We also describe a new genomic island with a putative role in pathogenesis, SPI-23. SPI-23 is present in several S. enterica serovars, including S. Agona, S. Dublin and S. Gallinarum, but is absent in its entirety from S. Mbandaka.

CONCLUSIONS: We discovered a new 37 kb genomic island, SPI-23, in the chromosome sequence of S. Derby, encoding 42 ORFs, ten of which are putative TTSS effector proteins. We infer from full-genome synonymous SNP analysis that these two serovars diverged between 182 kya and 625 kya, coinciding with the divergence of domestic pigs. The differences between the genomes of these serovars suggest they have been exposed to different stresses, including phage, transposons and prolonged externalisation. The two serovars possess distinct complements of metabolic genes, many of which cluster into pathways for catabolism of carbon sources.

Relevance: 70.00%

Abstract:

Legumes develop root nodules from pluripotent stem cells in the root pericycle in response to mitogenic activation by a decorated chitin-like nodulation factor synthesized in Rhizobium bacteria. The soybean genes encoding the receptor for such signals were cloned using map-based cloning approaches. Pluripotent cells in the root pericycle and the outer or inner cortex undergo repeated cell divisions to initiate a composite nodule primordium that develops into a functional nitrogen-fixing nodule. The process itself is autoregulated, leading to the characteristic nodulation of the upper root system. Autoregulation of nodulation (AON) in all legumes is controlled in part by a leucine-rich repeat receptor kinase gene (GmNARK). Mutations of GmNARK and its orthologues in other legumes result in abundant nodulation caused by the loss of an as yet undefined negative nodulation repressor system. AON receptor kinases are involved in perception of a long-distance, root-derived signal to negatively control nodule proliferation. GmNARK and LjHAR1 are expressed in phloem parenchyma. The GmNARK kinase domain interacts with Kinase Associated Protein Phosphatase (KAPP). NARK gene expression did not mirror biological NARK activity in nodulation control, as q-RT-PCR in soybean revealed high NARK expression in roots, root tips, leaves, petioles, stems and hypocotyls, while shoot and root apical meristems were devoid of NARK RNA. High-throughput transcript analysis in soybean leaf and root indicated that major genes involved in JA synthesis or response are preferentially down-regulated in leaf but not root of wild type, but not NARK mutants, suggesting that AON signaling may in part be controlled by events relating to hormone metabolism. Ethylene- and abscisic acid-insensitive mutants of L. japonicus are described.
Nodulation in legumes has significance to global economies and ecologies, as the nitrogen input into the biosphere allows food, feed and biofuel production without the inherent costs associated with nitrogen fertilization [1]. Nodulation involves the production of a new organ capable of nitrogen fixation [2] and as such is an excellent system to study plant – microbe interaction, plant development, long distance signaling and functional genomics of stem cell proliferation [3, 4]. Concerted international effort over the last 20 years, using a combination of induced mutagenesis followed by gene discovery (forward genetics), and molecular/biochemical approaches revealed a complex developmental pathway that ‘loans’ genetic programs from various sources and orchestrates these into a novel contribution. We report our laboratory’s contribution to the present analysis in the field.

Relevance: 70.00%

Abstract:

The last few years have seen the advent of high-throughput technologies to analyze various properties of the transcriptome and proteome of several organisms. The congruency of these different data sources, or lack thereof, can shed light on the mechanisms that govern cellular function. A central challenge for bioinformatics research is to develop a unified framework for combining the multiple sources of functional genomics information and testing associations between them, thus obtaining a robust and integrated view of the underlying biology. We present a graph theoretic approach to test the significance of the association between multiple disparate sources of functional genomics data by proposing two statistical tests, namely edge permutation and node label permutation tests. We demonstrate the use of the proposed tests by finding significant association between a Gene Ontology-derived "predictome" and data obtained from mRNA expression and phenotypic experiments for Saccharomyces cerevisiae. Moreover, we employ the graph theoretic framework to recast a surprising discrepancy presented in Giaever et al. (2002) between gene expression and knockout phenotype, using expression data from a different set of experiments.
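The two permutation tests named in the abstract above can be illustrated with a minimal sketch. It assumes that each data source is represented as an undirected graph over the same set of genes and that association is scored by the number of shared edges; that statistic, the function names and the toy gene labels are illustrative assumptions, not the authors' implementation.

```python
import random
from itertools import combinations


def _norm(u, v):
    """Store an undirected edge with its endpoints in a fixed order."""
    return (u, v) if u <= v else (v, u)


def node_label_permutation_test(nodes, edges_a, edges_b, n_perm=1000, seed=0):
    """Shuffle the node labels of graph B and recount edges shared with A."""
    rng = random.Random(seed)
    nodes = list(nodes)
    a = {_norm(u, v) for u, v in edges_a}
    b = {_norm(u, v) for u, v in edges_b}
    observed = len(a & b)
    hits = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        b_perm = {_norm(relabel[u], relabel[v]) for u, v in b}
        if len(a & b_perm) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


def edge_permutation_test(nodes, edges_a, edges_b, n_perm=1000, seed=0):
    """Replace graph B by a random graph with the same number of edges."""
    rng = random.Random(seed)
    a = {_norm(u, v) for u, v in edges_a}
    b = {_norm(u, v) for u, v in edges_b}
    all_pairs = list(combinations(sorted(nodes), 2))
    observed = len(a & b)
    hits = 0
    for _ in range(n_perm):
        b_perm = set(rng.sample(all_pairs, len(b)))
        if len(a & b_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: two identical 12-gene path graphs (hypothetical labels).
genes = [f"g{i:02d}" for i in range(12)]
path = [(genes[i], genes[i + 1]) for i in range(11)]
obs_node, p_node = node_label_permutation_test(genes, path, path)
obs_edge, p_edge = edge_permutation_test(genes, path, path)
```

The node label test keeps the topology of the second graph and randomises only which gene sits where, whereas the edge test discards the topology and keeps only the edge count, so the two null models can disagree on graphs with strong degree structure.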

Relevance: 70.00%

Abstract:

The flood of new genomic sequence information together with technological innovations in protein structure determination have led to worldwide structural genomics (SG) initiatives. The goals of SG initiatives are to accelerate the process of protein structure determination, to fill in protein fold space and to provide information about the function of uncharacterized proteins. In the long term, these outcomes are likely to have an impact on medical biotechnology and drug discovery, leading to a better understanding of disease as well as the development of new therapeutics. Here we describe the high-throughput pipeline established at the University of Queensland in Australia. In this focused pipeline, the targets for structure determination are proteins that are expressed in mouse macrophage cells and that are inferred to have a role in innate immunity. The aim is to characterize the molecular structure and the biochemical and cellular function of these targets by using a parallel processing pipeline. The pipeline is designed to work with tens to hundreds of target gene products and comprises target selection, cloning, expression, purification, crystallization and structure determination. The structures from this pipeline will provide insights into the function of previously uncharacterized macrophage proteins and could lead to the validation of new drug targets for chronic obstructive pulmonary disease and arthritis. (c) 2006 Elsevier B.V. All rights reserved.

Relevance: 70.00%

Abstract:

G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow.

Relevance: 70.00%

Abstract:

BACKGROUND: Left ventricular (LV) hypertrophy is a risk factor for cardiovascular death, but the genetic factors determining LV size and predisposition to hypertrophy are not well understood. We have previously linked the quantitative trait locus cardiac mass 22 (Cm22) on chromosome 2 with cardiac hypertrophy independent of blood pressure in the spontaneously hypertensive rat. From an original cross of spontaneously hypertensive rat with F344 rats, we derived a normotensive polygenic model of spontaneous cardiac hypertrophy, the hypertrophic heart rat (HHR) and its control strain, the normal heart rat (NHR).

METHODS AND RESULTS: To identify the genes and molecular mechanisms underlying spontaneous LV hypertrophy we sequenced the HHR genome with special focus on quantitative trait locus Cm22. For correlative analyses of function, we measured global RNA transcripts in LV of neonatal HHR and NHR and 198 neonatal rats of an HHR × NHR F2 crossbred population. Only one gene within locus Cm22 was differentially expressed in the parental generation: tripartite motif-containing 55 (Trim55), with mRNA downregulation in HHR (P < 0.05) and reduced protein expression. Trim55 mRNA levels were negatively correlated with LV mass in the F2 cross (r = -0.16, P = 0.025). In exon nine of Trim55 in HHR, we found one missense mutation that functionally alters protein structure. This mutation was strongly associated with Trim55 mRNA expression in F2 rats (F = 10.35, P < 0.0001). Similarly, in humans, we found reduced Trim55 expression in hearts of subjects with idiopathic dilated cardiomyopathy.

CONCLUSION: Our study suggests that the Trim55 gene, located in Cm22, is a novel candidate gene for polygenic LV hypertrophy independent of blood pressure.
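As a rough plausibility check on the reported F2 correlation (r = -0.16, P = 0.025), the sketch below converts a Pearson correlation into a t statistic and an approximate two-sided p-value. It assumes that all 198 F2 animals entered the correlation and uses a normal approximation to the t distribution; the abstract does not state which test the authors used, so this is an illustration rather than a reconstruction of their analysis.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r against zero (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def two_sided_p_normal_approx(t):
    """Two-sided p-value, treating t as standard normal (reasonable for large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Assumed inputs: r from the abstract, n = 198 F2 rats.
t_stat = correlation_t_statistic(-0.16, 198)
p_approx = two_sided_p_normal_approx(t_stat)
print(f"t = {t_stat:.2f}, approximate two-sided p = {p_approx:.3f}")
```

Under these assumptions the approximate p-value falls in the same range as the reported one, which suggests the weak correlation is statistically detectable mainly because of the sample size.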

Relevance: 60.00%

Abstract:

Recent studies of gene silencing in plants have revealed two RNA-mediated epigenetic processes, RNA-directed RNA degradation and RNA-directed DNA methylation. These natural processes have provided new avenues for developing high-efficiency, high-throughput technology for gene suppression in plants.

Relevance: 60.00%

Abstract:

Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world’s poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high-coverage (16–45×) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.

Relevance: 60.00%

Abstract:

Ankylosing spondylitis (AS) is a common, highly heritable, inflammatory arthropathy. In addition to being strongly associated with HLA-B27, a further 13 genes have been robustly associated with the disease. These genes highlight the involvement of the IL-23 pathway in disease pathogenesis, and indicate overlaps between the pathogenesis of AS, and of inflammatory bowel disease. Genetic associations in B27-positive and -negative disease are similar, with the main exception of association with ERAP1, which is restricted in association to B27-positive cases. This restriction, and the known function of ERAP1 in peptide trimming prior to HLA Class I presentation, indicates that HLA-B27 is likely to operate in AS by a mechanism involving aberrant peptide handling. These advances point to several potential novel therapeutic approaches in AS.

Relevance: 60.00%

Abstract:

During the past ten years, large-scale transcript analysis using microarrays has become a powerful tool to identify and predict functions for new genes. It allows simultaneous monitoring of the expression of thousands of genes and has become a routinely used tool in laboratories worldwide. Microarray analysis will, together with other functional genomics tools, take us closer to understanding the functions of all genes in genomes of living organisms. Flower development is a genetically regulated process which has mostly been studied in the traditional model species Arabidopsis thaliana, Antirrhinum majus and Petunia hybrida. The molecular mechanisms behind flower development in these species are partly applicable to other plant systems. However, not all biological phenomena can be approached with just a few model systems. In order to understand and apply the knowledge to ecologically and economically important plants, other species also need to be studied. Sequencing of 17 000 ESTs from nine different cDNA libraries of the ornamental plant Gerbera hybrida made it possible to construct a cDNA microarray with 9000 probes. The probes of the microarray represent all different ESTs in the database. Of the gerbera ESTs, 20% were unique to gerbera while 373 were specific to the Asteraceae family of flowering plants. Gerbera has composite inflorescences with three different types of flowers that differ from each other morphologically. The marginal ray flowers are large, often pigmented and female, while the central disc flowers are smaller and more radially symmetrical perfect flowers. Intermediate trans flowers are similar to ray flowers but smaller in size. This feature, together with the molecular tools applied to gerbera, makes gerbera a unique system in comparison to the common model plants, which have only a single kind of flower in their inflorescence.
In the first part of this thesis, conditions for gerbera microarray analysis were optimised, including experimental design, sample preparation and hybridization, as well as data analysis and verification. Moreover, in the first study, the flower and flower organ-specific genes were identified. After the reliability and reproducibility of the method were confirmed, the microarrays were utilized to investigate transcriptional differences between ray and disc flowers. This study revealed novel information about the morphological development as well as the transcriptional regulation of early stages of development in various flower types of gerbera. The most interesting finding was differential expression of MADS-box genes, suggesting the existence of flower type-specific regulatory complexes in the specification of different types of flowers. The gerbera microarray was further used to profile changes in expression during petal development. Gerbera ray flower petals are large, which makes them an ideal model to study organogenesis. Six different stages were compared and specifically analysed. Expression profiles of genes related to cell structure and growth implied that during stage 2, cells divide, a process which is marked by expression of histones, cyclins and tubulins. Stage 4 was found to be a transition stage between cell division and expansion, and by stage 6 cells had stopped dividing and instead underwent expansion. Interestingly, at the last analysed stage, stage 9, when cells no longer grew, the highest number of upregulated genes was detected. The gerbera microarray is a fully functioning tool for large-scale studies of flower development, and correlation with real-time RT-PCR results shows that it is also highly sensitive and reliable. Gene expression data presented here will be a source for gene expression mining or marker gene discovery in future studies that will be performed in the Gerbera Laboratory. The publicly available data will also serve the plant research community worldwide.

Relevance: 60.00%

Abstract:

Protein Kinase-Like Non-kinases (PKLNKs), which are closely related to protein kinases but lack the crucial catalytic aspartate in the catalytic loop and hence cannot function as protein kinases, have been analysed. Using various sensitive sequence analysis methods, we have recognized 82 PKLNKs from four higher eukaryotic organisms, namely, Homo sapiens, Mus musculus, Rattus norvegicus, and Drosophila melanogaster. On the basis of their domain combination and function, PKLNKs have been classified mainly into four categories: (1) ligand-binding PKLNKs, (2) PKLNKs with an extracellular protein-protein interaction domain, (3) PKLNKs involved in dimerization, and (4) PKLNKs with a cytoplasmic protein-protein interaction module. While members of the first two classes of PKLNKs have a transmembrane domain tethered to the PKLNK domain, members of the other two classes are cytoplasmic in nature. The current classification scheme is intended to provide a convenient framework for classifying PKLNKs from other eukaryotes, which would be helpful in deciphering their roles in cellular processes.
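The defining criterion above, a kinase-like catalytic loop in which the catalytic aspartate is missing, can be illustrated with a toy check. The sketch assumes a deliberately simplified catalytic-loop pattern modelled on the well-known HRD motif (for example HRDLKPEN); the regular expressions, labels and test peptides are hypothetical, and the authors relied on more sensitive sequence analysis methods than a pattern match.

```python
import re

# Simplified catalytic-loop pattern: His/Tyr, Arg, Asp, a hydrophobic residue,
# any three residues, then Asn. This is an illustrative assumption only.
LOOP_WITH_ASP = re.compile(r"[HY]RD[LIVM].{3}N")
# Same loop context, but with the catalytic Asp replaced by any other residue.
LOOP_WITHOUT_ASP = re.compile(r"[HY]R[^D][LIVM].{3}N")


def classify_catalytic_loop(sequence):
    """Return a coarse label for the catalytic loop found in a protein sequence."""
    sequence = sequence.upper()
    if LOOP_WITH_ASP.search(sequence):
        return "aspartate_present"
    if LOOP_WITHOUT_ASP.search(sequence):
        return "aspartate_missing"
    return "no_loop_found"


# Hypothetical peptides, not real protein sequences.
examples = {
    "kinase_like": "GLEYLHSKGIIHRDLKPENLLLD",
    "pklnk_like": "GLEYLHSKGIIHRNLKPENLLLD",
    "unrelated": "MSTAGKVIKCKAAVLWE",
}
labels = {name: classify_catalytic_loop(seq) for name, seq in examples.items()}
```

A pattern match of this kind would miss divergent loops and gives no evidence of overall kinase-domain homology, which is why it is only a first-pass filter in this sketch.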