709 results for Positional Weight Matrices
Abstract:
There is evidence that gene silencing through epigenetic mechanisms contributes significantly to cancer development. We hypothesize that the genetic architecture based on retrotransposon elements surrounding the transcription start site plays an important role in the suppression and promotion of DNA methylation. In our investigation we found a high rate of SINE and LINE retrotransposon elements near the transcription start site of unmethylated genes when compared to methylated genes. The presence of these elements was positively associated with promoter methylation, contrary to expectation, given the deleterious effects of retrotransposon elements, which insert themselves randomly into the genome and can cause loss of gene function. In our genome-wide analysis of human genes, 22% of the genes in cancer were predicted to be methylation-prone; in cancer these genes are generally down-regulated and function in developmental processes. In summary, our investigation validated our hypothesis and showed that these widespread genomic elements in cancer are highly associated with promoter DNA methylation and may further participate in influencing epigenetic regulation.
Abstract:
GeneID is a program, designed with a hierarchical structure, to predict genes in anonymous genomic sequences. In the first step, splice sites and start and stop codons are predicted and scored along the sequence using position weight matrices (PWMs). In the second step, exons are built from the sites. Exons are scored as the sum of the scores of the defining sites, plus the log-likelihood ratio of a Markov model for coding DNA. In the last step, the gene structure is assembled from the set of predicted exons, maximizing the sum of the scores of the assembled exons. In this paper we describe how the PWMs for sites and the Markov model of coding DNA were derived for Drosophila melanogaster. We also compare other models of coding DNA with the Markov model. Finally, we present and discuss the results obtained when GeneID is used to predict genes in the Adh region. These results show that the accuracy of GeneID predictions is currently comparable to that of other existing tools, but that GeneID is likely to be more efficient in terms of speed and memory usage.
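The site-scoring and exon-scoring steps described in this abstract can be illustrated with a minimal sketch. It is not GeneID's implementation: the function names, the toy donor sites, the uniform background and the pseudocount are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, background=None, pseudocount=1.0):
    """Build a log-odds PWM from aligned, equal-length example sites.

    Assumption: uniform background and an additive pseudocount.
    """
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for pos in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of the observed base frequency to the background frequency.
        pwm.append({b: math.log2((counts[b] / total) / background[b]) for b in BASES})
    return pwm

def score_site(pwm, window):
    """Score one window as the sum of per-position log-odds."""
    return sum(column[base] for column, base in zip(pwm, window))

def scan(pwm, sequence):
    """Score every window of PWM width along the sequence."""
    width = len(pwm)
    return [(i, score_site(pwm, sequence[i:i + width]))
            for i in range(len(sequence) - width + 1)]

def exon_score(first_site_score, second_site_score, coding_llr):
    """Exon score as described above: defining site scores plus a coding log-likelihood ratio."""
    return first_site_score + second_site_score + coding_llr

# Hypothetical aligned donor-site examples, for illustration only.
toy_donor_sites = ["GTAAGT", "GTGAGT", "GTAAGA", "GTAAGT"]
toy_pwm = build_pwm(toy_donor_sites)
best_position, best_score = max(scan(toy_pwm, "CCAGGTAAGTCC"), key=lambda hit: hit[1])
```

In this toy scan the highest-scoring window is the one that matches the consensus of the training sites. The Markov model that would supply `coding_llr` is left out of the sketch.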
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is thus the first important step that can affect the life of a cell and modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the location of TF binding to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. Upon estrogen signaling, hERa binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of the hERa partners FOXA1 and SP1. We combine ChIP-chip and ChIP paired-end diTag (ChIP-PET) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, such as SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses well-established penalized linear regression models in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
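The idea of sign constraints in a sparse penalized regression can be sketched as follows. This is a simplified illustration, not the thesis method: it assumes the sign of each regulator is fixed in advance (the thesis infers a coherent sign per TF), it fits a single target gene, and the solver (proximal gradient descent), function name, data and parameters are all hypothetical.

```python
def sign_constrained_lasso(X, y, signs, lam=0.1, step=1.0, iters=200):
    """Minimize (1/2n)||y - Xw||^2 + lam * ||w||_1 subject to signs[j] * w[j] >= 0.

    Assumptions: plain proximal gradient descent; `step` must not exceed
    1/L, where L is the largest eigenvalue of X^T X / n.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iters):
        residual = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(p)]
        for j in range(p):
            v = w[j] - step * grad[j]
            if signs[j] > 0:
                # Activator: shrink toward zero, never below zero.
                w[j] = max(0.0, v - step * lam)
            else:
                # Repressor: shrink toward zero, never above zero.
                w[j] = min(0.0, v + step * lam)
    return w

# Toy data: the target responds as 2 * regulator0 - 1 * regulator1; regulator2 is irrelevant.
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, 0, 0], [0, -1, 0], [0, 0, -1]]
y = [2, -1, 0, -2, 1, 0]

coherent = sign_constrained_lasso(X, y, signs=[+1, -1, +1])
# Forcing regulator0 to be a repressor contradicts the data, so its weight stays at zero.
mismatched = sign_constrained_lasso(X, y, signs=[-1, -1, +1])
```

The penalty drives the irrelevant regulator's coefficient to zero, and a sign constraint that contradicts the data removes the corresponding edge rather than flipping its sign.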
Abstract:
Inverse problems for dynamical system models of cognitive processes comprise the determination of synaptic weight matrices or kernel functions for neural networks or neural/dynamic field models, respectively. We introduce dynamic cognitive modeling as a three-tier top-down approach in which cognitive processes are first described as algorithms that operate on complex symbolic data structures. Second, symbolic expressions and operations are represented by states and transformations in abstract vector spaces. Third, prescribed trajectories through representation space are implemented in neurodynamical systems. We discuss the Amari equation for a neural/dynamic field theory as a special case and show that the kernel construction problem is particularly ill-posed. We suggest a Tikhonov-Hebbian learning method as a regularization technique and demonstrate its validity and robustness for basic examples of cognitive computations.
Abstract:
A database (SpliceDB) of known mammalian splice site sequences has been developed. We extracted 43 337 splice pairs from mammalian divisions of the gene-centered Infogene database, including sites from incomplete or alternatively spliced genes. Known EST sequences supported 22 815 of them. After discarding sequences with putative errors and ambiguous location of splice junctions the verified dataset includes 22 489 entries. Of these, 98.71% contain canonical GT–AG junctions (22 199 entries) and 0.56% have non-canonical GC–AG splice site pairs. The remainder (0.73%) occurs in a lot of small groups (with a maximum size of 0.05%). We especially studied non-canonical splice sites, which comprise 3.73% of GenBank annotated splice pairs. EST alignments allowed us to verify only the exonic part of splice sites. To check the conservative dinucleotides we compared sequences of human non-canonical splice sites with sequences from the high throughput genome sequencing project (HTG). Out of 171 human non-canonical and EST-supported splice pairs, 156 (91.23%) had a clear match in the human HTG. They can be classified after sequence analysis as: 79 GC–AG pairs (of which one was an error that corrected to GC–AG), 61 errors corrected to GT–AG canonical pairs, six AT–AC pairs (of which two were errors corrected to AT–AC), one case was produced from a non-existent intron, seven cases were found in HTG that were deposited to GenBank and finally there were only two other cases left of supported non-canonical splice pairs. The information about verified splice site sequences for canonical and non-canonical sites is presented in SpliceDB with the supporting evidence. We also built weight matrices for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Centre: http://genomic.sanger.ac.uk/spldb/SpliceDB.html and at http://www.softberry.com/spldb/SpliceDB.html.
Abstract:
rSNP_Guide is a novel curated database system for analysis of transcription factor (TF) binding to target sequences in regulatory gene regions altered by mutations. It accumulates experimental data on naturally occurring site variants in regulatory gene regions and site-directed mutations. This database system also contains the web tools for SNP analysis, i.e., active applet applying weight matrices to predict the regulatory site candidates altered by a mutation. The current version of the rSNP_Guide is supplemented by six sub-databases: (i) rSNP_DB, on DNA–protein interaction caused by mutation; (ii) SYSTEM, on experimental systems; (iii) rSNP_BIB, on citations to original publications; (iv) SAMPLES, on experimentally identified sequences of known regulatory sites; (v) MATRIX, on weight matrices of known TF sites; (vi) rSNP_Report, on characteristic examples of successful rSNP_Tools implementation. These databases are useful for the analysis of natural SNPs and site-directed mutations. The databases are available through the Web, http://wwwmgs.bionet.nsc.ru/mgs/systems/rsnp/.
Abstract:
Nelore is the major beef cattle breed in Brazil, with more than 130 million head. Genome-wide association studies (GWAS) are often used to associate markers and genomic regions with growth and meat quality traits that can be used to assist selection programs. An alternative to traditional GWAS, which involves the construction of gene network interactions derived from the results of several GWAS, is the AWM (Association Weight Matrices)/PCIT (Partial Correlation and Information Theory) methodology. With the aim of evaluating the genetic architecture of Brazilian Nelore cattle, we used high-density SNP genotyping data (~770,000 SNPs) from 780 Nelore animals comprising 34 half-sibling families derived from highly disseminated and unrelated sires from across Brazil. The AWM/PCIT methodology was employed to evaluate the genes that participate in a series of eight phenotypes related to growth and meat quality obtained from this Nelore sample.
Abstract:
We detail the automatic construction of R matrices corresponding to (the tensor products of) the $(\dot{0}_m | \dot{\alpha}_n)$ families of highest-weight representations of the quantum superalgebras $U_q[gl(m|n)]$. These representations are irreducible, contain a free complex parameter $\alpha$, and are $2^{mn}$-dimensional. Our R matrices are actually (sparse) rank 4 tensors, containing a total of $2^{4mn}$ components, each of which is in general an algebraic expression in the two complex variables $q$ and $\alpha$. Although the constructions are straightforward, we describe them in full here, to fill a perceived gap in the literature. As the algorithms are generally impracticable for manual calculation, we have implemented the entire process in MATHEMATICA, illustrating our results with $U_q[gl(3|1)]$. (C) 2002 Published by Elsevier Science B.V.
Abstract:
Exchange matrices represent spatial weights as symmetric probability distributions on pairs of regions, whose margins yield regional weights, generally well-specified and known in most contexts. This contribution proposes a mechanism for constructing exchange matrices, derived from quite general symmetric proximity matrices, in such a way that the margin of the exchange matrix coincides with the regional weights. Exchange matrices generate in turn diffusive squared Euclidean dissimilarities, measuring spatial remoteness between pairs of regions. Unweighted and weighted spatial frameworks are reviewed and compared, regarding in particular their impact on permutation and normal tests of spatial autocorrelation. Applications include tests of spatial autocorrelation with diagonal weights, factorial visualization of the network of regions, multivariate generalizations of Moran's I, as well as "landscape clustering", aimed at creating regional aggregates both spatially contiguous and endowed with similar features.
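A small sketch may help make the objects in this abstract concrete. It assumes one common way of writing Moran's I for an exchange matrix: the covariance between neighbouring regions under the pair distribution, divided by the variance under the regional weights. The function names and the two-region examples are hypothetical, and the paper's construction of exchange matrices from proximity matrices is not reproduced here.

```python
def regional_weights(E):
    """Margins of a symmetric exchange matrix E (entries non-negative, summing to 1)."""
    return [sum(row) for row in E]

def morans_i(E, x):
    """Moran's I of feature x: local covariance under E over weighted variance.

    Assumption: this weighted form is taken as the definition for the sketch.
    """
    f = regional_weights(E)
    mean = sum(fi * xi for fi, xi in zip(f, x))
    dev = [xi - mean for xi in x]
    n = len(x)
    local_cov = sum(E[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    variance = sum(f[i] * dev[i] ** 2 for i in range(n))
    return local_cov / variance

# Two regions with equal weights 0.5 and feature values 0 and 1.
isolated = [[0.5, 0.0], [0.0, 0.5]]    # each region exchanges only with itself
alternating = [[0.0, 0.5], [0.5, 0.0]]  # each region exchanges only with the other
```

With these toy matrices, `morans_i(isolated, [0, 1])` gives 1.0 and `morans_i(alternating, [0, 1])` gives -1.0, the two extremes of spatial autocorrelation.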
Abstract:
The progressive degradation of resin-dentin bonds is due, in part, to the slow degradation of collagen fibrils in the hybrid layer by endogenous matrix metalloproteinases (MMPs) of the dentin matrix. In in vitro durability studies, the storage medium composition might be important because the optimum activity of MMPs requires both zinc and calcium. Objective. This study evaluated the effect of different storage media on changes in matrix stiffness, loss of dry weight or solubilization of collagen from demineralized dentin beams incubated in vitro for up to 60 days. Methods. Dentin beams (1 mm x 2 mm x 6 mm) were completely demineralized in 10% phosphoric acid. After baseline measurements of dry mass and elastic modulus (E) (3-point bending, 15% strain), the beams were divided into 5 groups (n = 11/group) and incubated at 37 °C in either medium containing both zinc and calcium, designated as complete medium (CM), calcium-free medium, zinc-free medium, a doubled-zinc medium or water. Beams were retested at 3, 7, 14, 30, and 60 days of incubation. The incubation media were hydrolyzed with HCl for the quantitation of hydroxyproline (HOP) as an index of solubilization of collagen by MMPs. Data were analyzed using repeated-measures ANOVA. Results. Both the storage medium and the storage time showed significant effects on E, mass loss and HOP release (p < 0.05). The incubation in CM resulted in relatively rapid and significant (p < 0.05) decreases in stiffness, and increasing amounts of mass loss. The HOP content of the experimental media also increased with incubation time but was significantly lower (p < 0.05) than in the control CM medium, the recommended storage medium. Conclusions. The storage solutions used to age resin-dentin bonds should be buffered solutions that contain both calcium and zinc. The common use of water as an aging medium may underestimate the hydrolytic activity of endogenous dentin MMPs. (c) 2010 Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
Abstract:
Innovative composite materials made of continuous fibers embedded in mortar matrices have recently received attention for externally bonded reinforcement of masonry structures. In this regard, the application of natural fibers for strengthening repair mortars is attractive due to their low specific weight, sustainability and recyclability. This paper presents an experimental characterization of the tensile and pull-out behavior of natural fibers embedded in two different mortar-based matrices. A lime-based and a geopolymer-based mortar are used as sustainable and innovative matrices. The experimental results and observations are presented and discussed.
Abstract:
Unmodified and modified starches represent a particularly interesting group of biodegradable and abundant excipients. They have been widely used as excipients for various purposes in tablet formulations, for example as binders and/or disintegrants. Spray-dried high-amylose sodium carboxymethyl starch (SD HASCA) was recently proposed as an innovative hydrophilic sustained-release excipient for solid oral dosage forms. Amorphous high-amylose sodium carboxymethyl starch (HASCA) was first produced by etherification of high-amylose corn starch with chloroacetate. HASCA was then spray-dried to obtain SD HASCA. This new excipient has shown properties that offer certain advantages in the production of sustained-release dosage forms. Matrix tablets produced from SD HASCA are inexpensive, simple to formulate and easy to produce by direct compression. The main objective of this research was to continue the development and optimization of matrix tablets using SD HASCA as an excipient for sustained-release oral formulations. To this end, dissolution tests simulating the most relevant physiological conditions of the gastrointestinal tract, taking into account the nature of the polymer under study, were used to evaluate the sustained-release characteristics and demonstrate the performance of SD HASCA formulations. An exploratory clinical study was also carried out to evaluate the sustained-release properties of this new dosage form in the gastrointestinal tract.
The first article presented in this thesis evaluated the sustained-release properties and physical integrity of formulations containing a compressed blend of drug, sodium chloride and SD HASCA in biologically relevant dissolution media. The influence of different acidic pH values and of the residence time in the acidic medium was studied. The sustained-release profile of the drug from an optimized SD HASCA formulation was not significantly affected by either the acidic pH value or the residence time in the acidic medium. These results suggest a limited influence of intra- and inter-individual variability in gastric pH on the release kinetics from SD HASCA matrices. In addition, the optimized formulation maintained its integrity throughout the dissolution tests. The exploratory in vivo study demonstrated prolonged absorption of the drug after oral administration of SD HASCA matrix tablets and showed that the tablets did not disintegrate while passing through the stomach and that they resisted hydrolysis by α-amylases in the intestine. The second article presents the development of SD HASCA tablets for once-daily and twice-daily oral administration containing tramadol hydrochloride (100 mg and 200 mg). These sustained-release formulations showed high hardness values without requiring the addition of binders, which facilitates the production and handling of the tablets at the industrial level. The compression force applied to produce the tablets did not significantly affect the drug release profiles. The total release time from SD HASCA tablets increased significantly with tablet weight, which can therefore be used to modulate the release time from these formulations.
When the tablets were exposed to a pH gradient and to a medium containing 40% ethanol, a very rigid gel formed progressively on their surface, leading to sustained release of the drug. These properties indicated that SD HASCA is a robust excipient for the production of sustained-release oral dosage forms, able to reduce the likelihood of a massive release of the drug and, consequently, of side effects, even in the case of co-administration with a large dose of alcohol. The third article studied the effect of α-amylase on drug release from SD HASCA tablets containing acetaminophen and tramadol hydrochloride that were developed in the early stages of this research (Acetaminophen SR and Tramadol SR). Mathematical modeling showed that an increase in α-amylase concentration led to an increase in polymer erosion relative to drug diffusion as the main mechanism controlling drug release, for both formulations and both residence times in acidic medium. However, even though the release mechanism may be affected, α-amylase concentrations ranging from 0 IU/L to 20000 IU/L did not significantly affect the sustained-release profiles from SD HASCA tablets, regardless of the residence time in acidic medium, the drug used, the polymer content and the differing composition of each formulation. The work presented in this thesis clearly demonstrates the usefulness of SD HASCA as an effective sustained-release excipient.
Abstract:
A high-resolution physical and genetic map of a major fruit weight quantitative trait locus (QTL), fw2.2, has been constructed for a region of tomato chromosome 2. Using an F2 nearly isogenic line mapping population (3472 individuals) derived from Lycopersicon esculentum (domesticated tomato) × Lycopersicon pennellii (wild tomato), fw2.2 has been placed near TG91 and TG167, which have an interval distance of 0.13 ± 0.03 centimorgan. The physical distance between TG91 and TG167 was estimated to be ≤ 150 kb by pulsed-field gel electrophoresis of tomato DNA. A physical contig composed of six yeast artificial chromosomes (YACs) and encompassing fw2.2 was isolated. No rearrangements or chimerisms were detected within the YAC contig based on restriction fragment length polymorphism analysis using YAC-end sequences and anchored molecular markers from the high-resolution map. Based on genetic recombination events, fw2.2 could be narrowed down to a region less than 150 kb between molecular markers TG91 and HSF24 and included within two YACs: YAC264 (210 kb) and YAC355 (300 kb). This marks the first time, to our knowledge, that a QTL has been mapped with such precision and delimited to a segment of cloned DNA. The fact that the phenotypic effect of the fw2.2 QTL can be mapped to a small interval suggests that the action of this QTL is likely due to a single gene. The development of the high-resolution genetic map, in combination with the physical YAC contig, suggests that the gene responsible for this QTL and other QTLs in plants can be isolated using a positional cloning strategy. The cloning of fw2.2 will likely lead to a better understanding of the molecular biology of fruit development and to the genetic engineering of fruit size characteristics.
Abstract:
Investigations were undertaken to study the role of the protein cross-linking enzyme tissue transglutaminase in changes associated with the extracellular matrix and in the cell death of human dermal fibroblasts following exposure to a solarium ultraviolet A source consisting of 98.8% ultraviolet A and 1.2% ultraviolet B. Exposure to nonlethal ultraviolet doses of 60 to 120 kJ per m2 resulted in increased tissue transglutaminase activity when measured either in cell homogenates, "in situ" by incorporation of fluorescein-cadaverine into the extracellular matrix or by changes in the epsilon(gamma-glutamyl) lysine cross-link. This increase in enzyme activity did not require de novo protein synthesis. Incorporation of fluorescein-cadaverine into matrix proteins was accompanied by the cross-linking of fibronectin and tissue transglutaminase into nonreducible high molecular weight polymers. Addition of exogenous tissue transglutaminase to cultured cells mimicking extensive cell leakage of the enzyme resulted in increased extracellular matrix deposition and a decreased rate of matrix turnover. Exposure of cells to 180 kJ per m2 resulted in 40% to 50% cell death with dying cells showing extensive tissue transglutaminase cross-linking of intracellular proteins and increased cross-linking of the surrounding extracellular matrix, the latter probably occurring as a result of cell leakage of tissue transglutaminase. These cells demonstrated negligible caspase activation and DNA fragmentation but maintained their cell morphology. In contrast, exposure of cells to 240 kJ per m2 resulted in increased cell death with caspase activation and some DNA fragmentation. These cells could be partially rescued from death by addition of caspase inhibitors. 
These data suggest that changes in cross-linking in both the intracellular and extracellular compartments, elicited by tissue transglutaminase following exposure to ultraviolet radiation, provide a rapid tissue stabilization process following damage, but as such may be a contributory factor to the scarring process that results.