940 results for Transcription factor binding site motifs


Abstract:

Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem, which is then posed as a LASSO (l1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations.
In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence for integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
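
The abstract above does not spell out the linear program, so the following is only a sketch of one standard way an l1-constrained fit becomes an LP. It assumes an absolute-error loss (with squared error the problem would be a quadratic program rather than an LP). The symbols are illustrative and not taken from the paper: y_n is the abundance of the target gene in sample n, x_{nj} the abundance of candidate neighbour gene j, t the l1 budget, and the weight vector is split as w = u - v with u, v >= 0.

```latex
\begin{aligned}
\min_{u,\,v,\,\xi}\quad & \sum_{n=1}^{N} \xi_n \\
\text{subject to}\quad
 & -\xi_n \;\le\; y_n - \sum_{j=1}^{p} (u_j - v_j)\, x_{nj} \;\le\; \xi_n,
   \qquad n = 1,\dots,N, \\
 & \sum_{j=1}^{p} (u_j + v_j) \;\le\; t, \\
 & u_j \ge 0,\quad v_j \ge 0,\quad \xi_n \ge 0 .
\end{aligned}
```

Under this sketch, one such program would be solved per gene, and an undirected edge would be drawn between the target gene and every gene j whose fitted weight u_j - v_j is nonzero.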

Abstract:

In a study towards elucidating the role of aromatases during puberty in female grey mullet, the cDNAs of the brain (muCyp19b) and ovarian (muCyp19a) aromatase were isolated by RT-PCR and their relative expression levels were determined by quantitative real-time RT-PCR. The muCyp19a ORF of 1515 bp encoded 505 predicted amino acid residues, while that of muCyp19b was 1485 bp and encoded 495 predicted amino acid residues. The expression level of muCyp19b significantly increased in the brain as puberty advanced; however, its expression level in the pituitary increased only slightly with pubertal development. In the ovary, the muCyp19a expression level markedly increased as puberty progressed. The promoter regions of the two genes were also isolated and their functionality evaluated in vitro using luciferase as the reporter gene. The muCyp19a promoter sequence (650 bp) contained a consensus TATA box and putative transcription factor binding sites, including two half EREs, an SF-1, an AhR/Arnt, a PR and two GATA-3s. The muCyp19b promoter sequence (2500 bp) showed consensus TATA and CCAAT boxes and putative transcription factor binding sites, namely: a PR, an ERE, a half ERE, an SP-1, two GATA-binding factors, one half GATA-1, two C/EBPs, a GRE, an NFkappaB, three STATs, a PPAR/RXR, an AhR/Arnt and a CRE. Basal activity of serially deleted promoter constructs transiently transfected into COS-7, [alpha]T3 and TE671 cells demonstrated the enhancing and silencing roles of the putative transcription factor binding sites. Quinpirole, a dopamine agonist, significantly reduced the promoter activity of muCyp19b in TE671. The results suggest tissue-specific regulation of the muCyp19 genes and a putative alternative promoter for muCyp19b.

Abstract:

Genome-wide association studies show strong evidence of association with endometriosis for markers on chromosome 1p36 spanning the potential candidate genes WNT4, CDC42 and LINC00339. WNT4 is involved in development of the uterus, and the expression of CDC42 and LINC00339 is altered in women with endometriosis. We conducted fine mapping to examine the role of coding variants in WNT4 and CDC42 and determine the key SNPs with strongest evidence of association in this region. We identified rare coding variants in WNT4 and CDC42 present only in endometriosis cases. The frequencies were low and cannot account for the common signal associated with increased risk of endometriosis. Genotypes for five common SNPs in the region of chromosome 1p36 show stronger association signals when compared with rs7521902 reported in published genome scans. Of these, three SNPs (rs12404660, rs3820282, and rs55938609) were located in DNA sequences with potential functional roles, including overlap with transcription factor binding sites for FOXA1, FOXA2, ESR1, and ESR2. Functional studies will be required to identify the gene or genes implicated in endometriosis risk.

Abstract:

This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved in the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements was found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution.
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses are stored in a relational database, which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
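
The idea of scoring conserved clusters of binding sites with dynamic programming can be illustrated with a deliberately simplified sketch. This is not the EEL scoring model (the thesis describes a four-parameter conservation model that is not reproduced here). The function name, the site tuples, and the single `spacing_penalty` parameter are all hypothetical. The sketch chains same-factor site pairs that appear in the same order in two orthologous sequences and penalizes changes in spacing.

```python
from typing import List, Tuple

# Hypothetical site record: (position in sequence, factor name, match score)
Site = Tuple[int, str, float]


def best_conserved_cluster(
    sites_a: List[Site], sites_b: List[Site], spacing_penalty: float = 0.1
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the best-scoring colinear chain of same-factor site pairs."""
    # Candidate pairs: the same factor has a site in both sequences.
    pairs = sorted(
        (pos_a, pos_b, score_a + score_b)
        for pos_a, name_a, score_a in sites_a
        for pos_b, name_b, score_b in sites_b
        if name_a == name_b
    )
    if not pairs:
        return 0.0, []

    best = [0.0] * len(pairs)  # best[i]: top score of a chain ending at pair i
    prev = [-1] * len(pairs)   # back-pointers for recovering the chain
    for i, (pos_a, pos_b, score) in enumerate(pairs):
        best[i] = score        # a chain may start at any pair
        for k in range(i):
            q_a, q_b, _ = pairs[k]
            if q_a < pos_a and q_b < pos_b:  # same order in both sequences
                # Penalize a change in spacing between the two sequences.
                shift = abs((pos_a - q_a) - (pos_b - q_b))
                candidate = best[k] + score - spacing_penalty * shift
                if candidate > best[i]:
                    best[i], prev[i] = candidate, k

    end = max(range(len(pairs)), key=lambda i: best[i])
    top = best[end]
    chain = []
    while end != -1:
        chain.append((pairs[end][0], pairs[end][1]))
        end = prev[end]
    return top, chain[::-1]


# Made-up example sites, for illustration only.
human = [(10, "GATA", 2.0), (40, "SP1", 1.5), (90, "STAT", 1.0)]
mouse = [(15, "GATA", 2.0), (47, "SP1", 1.5), (300, "STAT", 1.0)]
score, chain = best_conserved_cluster(human, mouse)
print(score, chain)
```

In this toy input the GATA and SP1 pairs keep nearly the same spacing and are chained together, while the distant STAT site in the second sequence is left out because the spacing penalty outweighs its score.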

Abstract:

Growth is a fundamental aspect of the life cycle of all organisms. Body size varies highly in most animal groups, such as mammals. Moreover, growth of a multicellular organism is not a uniform enlargement of size, but different body parts and organs grow to their characteristic sizes at different times. Currently very little is known about the molecular mechanisms governing this organ-specific growth. The genome sequencing projects have provided complete genomic DNA sequences of several species over the past decade. The amount of genomic sequence information, including sequence variants within species, is constantly increasing. Based on the universal genetic code, we can make sense of this sequence information as far as it codes for proteins. However, less is known about the molecular mechanisms that control expression of genes, and about the variations in gene expression that underlie many pathological states in humans. This is caused in part by a lack of information about the second genetic code that consists of the binding specificities of transcription factors and the combinatorial code by which transcription factor binding sites are assembled to form tissue-specific and/or ligand-regulated enhancer elements. This thesis presents a high-throughput assay for identification of transcription factor binding specificities, which were then used to measure the DNA binding profiles of transcription factors involved in growth control. We developed ‘enhancer element locator’, a computational tool, which can be used to predict functional enhancer elements. A genome-wide prediction of human and mouse enhancer elements generated a large database of enhancer elements. This database can be used to identify target genes of signaling pathways, and to predict activated transcription factors based on changes in gene expression.
Predictions validated in transgenic mouse embryos revealed the presence of multiple tissue-specific enhancers in mouse c- and N-Myc genes, which has implications for organ-specific growth control and tumor type specificity of oncogenes. Furthermore, we were able to locate a variation in a single nucleotide, which carries a susceptibility to colorectal cancer, to an enhancer element and propose a mechanism by which this SNP might be involved in the generation of colorectal cancer.

Abstract:

The rapid recent increase in microarray-based gene expression studies in the corpus luteum (CL) utilizing macaque models has produced an increasing volume of data in publicly accessible microarray expression databases. Examining gene pathways in different functional states of the CL may help to understand the factors that control luteal function and hence human fertility. Co-regulation of genes in microarray experiments may imply common transcriptional regulation by sequence-specific DNA-binding transcription factors. We have computationally analyzed the transcription factor binding sites (TFBS) in a previously reported macaque luteal microarray gene set (n = 15) that are common targets of luteotropin (luteinizing hormone (LH) and human chorionic gonadotropin (hCG)) and luteolysin (prostaglandin (PG) F-2 alpha). This in silico approach can reveal transcriptional networks that control these important genes, which are representative of the interplay between luteotropic and luteolytic factors in the control of luteal function. Our computational analyses revealed 6 matrix families whose binding sites are significantly over-represented in promoters of these genes. The roles of these factors are discussed, which might help to understand the transcriptional regulatory network in the control of luteal function. These factors might be promising experimental targets for investigation of human luteal insufficiency. (C) 2012 Elsevier B.V. All rights reserved.
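
The abstract does not say which statistic was used to call a matrix family over-represented. One common choice, shown here purely as an assumption, is a hypergeometric upper-tail probability: given a background of promoters of which some carry a site, how likely is it that a small gene set contains at least the observed number of promoters with that site? The function name and the background counts in the example are hypothetical; only the gene-set size of 15 comes from the abstract.

```python
from math import comb


def hypergeom_upper_tail(background: int, with_site: int,
                         gene_set: int, observed: int) -> float:
    """P(X >= observed) when `gene_set` promoters are drawn without
    replacement from `background` promoters, `with_site` of which
    contain the binding site."""
    upper = min(with_site, gene_set)
    total = comb(background, gene_set)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    favourable = sum(
        comb(with_site, i) * comb(background - with_site, gene_set - i)
        for i in range(observed, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 1000 background promoters, 100 with the site,
# and 6 of the 15 gene-set promoters carrying it.
p_value = hypergeom_upper_tail(1000, 100, 15, 6)
print(f"{p_value:.2e}")
```

With several matrix families tested, the resulting p-values would normally also need a multiple-testing correction.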

Abstract:

Transcriptional regulation enables adaptation in bacteria. Typically, only a few transcriptional events are well understood, leaving many others unidentified. The recent genome-wide identification of transcription factor binding sites in Mycobacterium tuberculosis has changed this by deciphering a molecular road-map of transcriptional control, indicating active events and their immediate downstream effects.

Abstract:

MOTIVATION: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. RESULTS: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs. AVAILABILITY: If interested in the code for the work presented in this article, please contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Abstract:

TRAIL (Apo2 ligand), described as a type II transmembrane protein belonging to the TNF superfamily, can induce apoptotic cell death in a variety of cell types. In the present study, a putative cDNA sequence encoding the 299 amino acids of TRAIL (GC-TRAIL) and its genomic organization were identified in grass carp Ctenopharyngodon idella. The predicted GC-TRAIL sequence showed 44 and 41% identities to chicken and human TRAILs, respectively. In a domain search, a tumor necrosis factor homology domain (THD) was identified in the C-terminal portion of TRAILs. The GC-TRAIL gene consists of five exons, with four intervening introns, spaced over approximately 4 kb of genomic sequence. Analysis of the GC-TRAIL promoter region revealed the presence of a number of putative transcription factor binding sites, such as Sp1, NF-kappaB, AP-1, GATA, NFAT, HNF, STAT, P53 and IRFI sequences, which are important for the expression of other TNF family members. Phylogenetic analysis placed GC-TRAIL and the putative zebrafish (Danio rerio) TRAIL obtained from searching the zebrafish database into one separate cluster near mammalian TRAIL genes, but apart from the reported zebrafish TRAIL-like protein, indicating that the GC-TRAIL is an authentic fish TRAIL. Expression analysis revealed that GC-TRAIL is expressed in many tissues, such as in gills, liver, trunk kidney, head kidney, intestine and spleen. (c) 2005 Elsevier B.V. All rights reserved.

Abstract:

We report the cloning of a novel antimicrobial peptide gene, termed rtCATH_1, found in the rainbow trout, Oncorhynchus mykiss. The predicted 216-residue rtCATH_1 prepropeptide consists of three domains: a 22-residue signal peptide, a 128-residue cathelin-like region containing two identifiable cathelicidin family signatures, and a predicted 66-residue C-terminal cationic antimicrobial peptide. This predicted mature peptide was unique in possessing features of different known (mammalian) cathelicidin subgroups, such as the cysteine-bridged family and the specific amino-acid-rich family. The rtCATH_1 gene comprises four exons, as seen in all known mammalian cathelicidin genes, and several transcription factor binding sites known to be of relevance to host defenses were identified in the 5' flanking region. By Northern blot analysis, the expression of rtCATH_1 was detected in gill, head kidney, and spleen of bacterially challenged fish. Primary cultures of head kidney leukocytes from rainbow trout stimulated with lipopolysaccharide or poly(I·C) also expressed rtCATH_1. A 36-residue peptide corresponding to the core part of the fish cathelicidin was chemically synthesized and shown to exhibit potent antimicrobial activity and a low hemolytic effect. Thus, rtCATH_1 represents a novel antimicrobial peptide gene belonging to the cathelicidin family and may play an important role in the innate immunity of rainbow trout.

Abstract:

Extracellular superoxide dismutase (ECSOD) is a major extracellular antioxidant enzyme that protects organs from damage by reactive oxygen species (ROS). We cloned a novel ECSOD from the bay scallop Argopecten irradians (AiECSOD) by 3' and 5' RACE. The full-length cDNA of AiECSOD was 893 bp with a 657 bp open reading frame encoding 218 amino acids. The deduced amino acid sequence contained a putative signal peptide of 20 amino acids, and sequence comparison showed that AiECSOD had a low degree of homology to ECSODs of other organisms. The genomic length of the AiECSOD gene was about 5276 bp containing five exons and six introns. The promoter region contained many putative transcription factor binding sites such as c-Myb, Oct-1, Sp1, Kruppel-like, c-ETS, NF kappa B, GATA-1, AP-1, and Ubx binding sites. Furthermore, the tissue-specific expression of AiECSOD and its temporal expression in haemocytes of bay scallops challenged with the bacterium Vibrio anguillarum were quantified using qRT-PCR. High levels of expression were detected in haemocytes, but not in gonad and mantle. The expression of AiECSOD reached the highest level at 12 h post-injection with V. anguillarum and then returned to normal between 24 h and 48 h post-injection. These results indicated that AiECSOD was an inducible protein and that it may play an important role in the immune responses against V. anguillarum. Crown Copyright (C) 2008 Published by Elsevier Ltd. All rights reserved.

Abstract:

Superoxide dismutases are a ubiquitous family of enzymes that function to efficiently catalyze the dismutation of superoxide anions. Two unique and highly compartmentalized bay scallop Argopecten irradians superoxide dismutases, MnSOD and ecCuZnSOD, have been molecularly characterized in our previous study. To completely characterize the SOD family in A. irradians, a novel intracellular copper/zinc SOD from A. irradians (Ai-icCuZnSOD) was obtained and characterized. The full-length cDNA of Ai-icCuZnSOD was 1047 bp with a 459 bp open reading frame encoding 152 amino acids. The genomic length of the Ai-icCuZnSOD gene was about 4279 bp containing 4 exons and 3 introns. The promoter region, containing many putative transcription factor binding sites, was analyzed. Furthermore, quantitative reverse transcriptase real-time PCR (qRT-PCR) analysis indicated that the highest expression of Ai-icCuZnSOD was detected in gill and that the expression profiles in hemocytes of bay scallops challenged with the bacterium Vibrio anguillarum and with lipopolysaccharide (LPS) were different. Expression increased after injection with LPS, whereas no significant changes were observed after V. anguillarum injection. A fusion protein containing Ai-icCuZnSOD was produced in vitro. The rAi-icCuZnSOD is a stable enzyme, retaining more than 80% of its activity between 10 and 60 degrees C and keeping above 88% of its activity at pH values between 5.8 and 9. Ai-icCuZnSOD is more stable under alkaline than acidic conditions. Crown Copyright (C) 2009 Published by Elsevier Ltd. All rights reserved.

Abstract:

The p75 neurotrophin receptor (p75NTR) is a member of the tumour necrosis factor superfamily, which relies on the recruitment of cytosolic protein partners - including the TNF receptor associated factor 6 (TRAF6) E3 ubiquitin ligase - to produce cellular responses such as apoptosis, survival, and inhibition of neurite outgrowth. Recently, p75NTR was also shown to undergo γ-secretase-mediated regulated intramembrane proteolysis, and the receptor ICD was found to migrate to the nucleus where it regulates gene transcription. Moreover, γ-secretase-mediated proteolysis was shown to be involved in glioblastoma cell migration and invasion. In this study we report that TRAF6-mediated K63-linked polyubiquitination at multiple or alternative lysine residues influences p75NTR-ICD stability in vitro. In addition, we found that TRAF6-mediated ubiquitination of p75NTR is not influenced by inhibition of dynamin. Moreover, we report beta-transducin repeats-containing protein (β-TrCP) as a novel E3 ligase that ubiquitinates p75NTR, which is independent of serine phosphorylation of the p75NTR destruction motif. In contrast to its influence on other substrates, co-expression of β-TrCP did not reduce p75NTR stability. We created U87-MG glioblastoma cell lines stably expressing wild type, γ-secretase-resistant and constitutively cleaved receptor, as well as the ICD-stabilized mutant K301R. Interestingly, only wild-type p75NTR induces increased glioblastoma cell migration, which could be reversed by application of γ-secretase inhibitor. Microarray and qRT-PCR analysis of mRNA transcripts in these cell lines yielded several promising genes that might be involved in glioblastoma cell migration and invasion, such as cadherin 11 and matrix metalloproteinase 12.
Analysis of potential transcription factor binding sites revealed that transcription of these genes might be regulated by well known p75NTR signalling cascades such as NF-κB or JNK signalling, which are independent of γ-secretase-mediated cleavage of the receptor. In contrast, while p75NTR overexpression was confirmed in melanoma cell lines and a patient sample of melanoma metastasis to the brain, inhibition of γ-secretase did not influence melanoma cell migration. Collectively, this study provides several avenues to better understand the physiological importance of posttranslational modifications of p75NTR and the significance of the receptor in glioblastoma cell migration and invasion.

Abstract:

We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model-building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.
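
To make the Ising prior on the model space concrete, here is a minimal sketch under assumed conventions that the abstract does not state: inclusion indicators are coded 0/1, and the unnormalized log prior is a * (number of included covariates) + b * (number of graph edges whose two endpoints are both included). The names `ising_log_prior`, `chain_edges`, `a`, and `b` are illustrative only.

```python
from typing import List, Sequence, Tuple


def chain_edges(p: int) -> List[Tuple[int, int]]:
    """Edges of a linear chain on p covariates: 0-1, 1-2, ..., (p-2)-(p-1)."""
    return [(i, i + 1) for i in range(p - 1)]


def ising_log_prior(gamma: Sequence[int],
                    edges: Sequence[Tuple[int, int]],
                    a: float, b: float) -> float:
    """Unnormalized log prior of an inclusion vector gamma (entries 0 or 1).

    a < 0 favours sparse models; b > 0 rewards selecting covariates that
    are neighbours on the graph.
    """
    n_included = sum(gamma)
    n_joint = sum(gamma[i] * gamma[j] for i, j in edges)
    return a * n_included + b * n_joint


# Two models with three included covariates on a 4-node chain:
# adjacent selections score higher than scattered ones when b > 0.
edges = chain_edges(4)
clustered = ising_log_prior([1, 1, 1, 0], edges, a=-2.0, b=1.0)
scattered = ising_log_prior([1, 1, 0, 1], edges, a=-2.0, b=1.0)
print(clustered, scattered)
```

The comparison shows how the prior carries structural information: both models include three covariates, but the one whose covariates are contiguous on the chain receives more prior mass.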

Abstract:

Gremlin, a cell growth and differentiation factor, promotes the development of diabetic nephropathy in animal models, but whether GREM1 gene variants associate with diabetic nephropathy is unknown. We comprehensively screened the 5' upstream region (including the predicted promoter), all exons, intron-exon boundaries, complete untranslated regions, and the 3' region downstream of the GREM1 gene. We identified 31 unique variants, including 24 with a minor allele frequency exceeding 5%, and 9 haplotype-tagging single nucleotide polymorphisms (htSNPs). We selected one additional variant that we predicted to alter transcription factor binding. We genotyped 709 individuals with type 1 diabetes of whom 267 had nephropathy (cases) and 442 had no evidence of kidney disease (controls). Three individual SNPs significantly associated with nephropathy at the 5% level, and two remained significant after adjustment for multiple testing. Subsequently, we genotyped a replicate population comprising 597 cases and 502 controls: this population supported an association with one of the SNPs (rs1129456; P = 0.0003). Combined analysis, adjusted for recruitment center (n = 8), suggested that the T allele conferred greater odds of nephropathy (OR 1.69; 95% CI 1.36 to 2.11). In summary, the GREM1 variant rs1129456 associates with diabetic nephropathy, perhaps explaining some of the genetic susceptibility to this condition.
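
The reported odds ratio and confidence interval can be checked for internal consistency without access to the genotype counts. The sketch below assumes a Wald-type 95% interval that is symmetric on the log scale, which the abstract does not confirm. Under that assumption it recovers the implied standard error of the log odds ratio, the point estimate implied by the interval bounds, and the corresponding z statistic. The function name is hypothetical.

```python
import math


def log_scale_summary(odds_ratio: float, lower: float, upper: float,
                      z_crit: float = 1.959964):
    """Back out the implied SE, interval midpoint and z statistic from a
    reported odds ratio and its confidence interval (Wald interval assumed)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    se = (log_upper - log_lower) / (2 * z_crit)        # SE of log(OR)
    midpoint = math.exp((log_lower + log_upper) / 2)   # implied point estimate
    z = math.log(odds_ratio) / se                      # test statistic vs OR = 1
    return se, midpoint, z


# Values taken from the abstract: OR 1.69; 95% CI 1.36 to 2.11.
se, midpoint, z = log_scale_summary(1.69, 1.36, 2.11)
print(f"SE(log OR) = {se:.3f}, implied OR = {midpoint:.3f}, z = {z:.2f}")
```

The implied point estimate lands very close to the reported 1.69, which is what one would expect if the interval was computed on the log scale and then rounded.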