997 results for Computational Lexical Semantics
Abstract:
An active strain formulation for orthotropic constitutive laws arising in cardiac mechanics modeling is introduced and studied. The passive mechanical properties of the tissue are described by the Holzapfel-Ogden relation. In the active strain formulation, the Euler-Lagrange equations for minimizing the total energy are written in terms of active and passive deformation factors, where the active part is assumed to depend, at the cell level, on the electrodynamics and on the specific orientation of the cardiac cells. The well-posedness of the linear system derived from a generic Newton iteration of the original problem is analyzed and different mechanical activation functions are considered. In addition, the active strain formulation is compared with the classical active stress formulation from both numerical and modeling perspectives. Taylor-Hood and MINI finite elements are employed to discretize the mechanical problem. The results of several numerical experiments show that the proposed formulation is mathematically consistent and is able to represent the key features of the phenomenon, while allowing savings in computational cost.
Abstract:
In silico screening has become a valuable tool in drug design, but some drug targets represent real challenges for docking algorithms. This is especially true for metalloproteins, whose interactions with ligands are difficult to parametrize. Our docking algorithm, EADock, is based on the CHARMM force field, which assures a physically sound scoring function and good transferability to a wide range of systems, but also exhibits difficulties in the case of some metalloproteins. Here, we consider the therapeutically important case of heme proteins featuring an iron core at the active site. Using a standard docking protocol, where the iron-ligand interaction is underestimated, we obtained a success rate of 28% for a test set of 50 heme-containing complexes with iron-ligand contact. By introducing Morse-like metal binding potentials (MMBP), which are fitted to reproduce density functional theory calculations, we are able to increase the success rate to 62%. The remaining failures are mainly due to specific ligand-water interactions in the X-ray structures. Testing of the MMBP on a second data set of non-iron binders (14 cases) demonstrates that they do not introduce a spurious bias towards metal binding, which suggests that they may also be used reliably for cross-docking studies.
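For readers unfamiliar with the functional form, the following is a minimal sketch of the textbook Morse potential, shifted so that the energy tends to zero at large separation. It is not the MMBP parametrization from the abstract: the "Morse-like" form used there may differ, and the parameter names and values (`depth`, `width`, `r_eq`) are purely hypothetical.

```python
import math


def morse_energy(r, depth, width, r_eq):
    """Textbook Morse potential, shifted to approach 0 as r grows.

    r      -- metal-ligand distance (hypothetical units)
    depth  -- well depth, i.e. the energy gained at the minimum
    width  -- controls how narrow the well is
    r_eq   -- equilibrium distance, where the energy equals -depth
    """
    x = 1.0 - math.exp(-width * (r - r_eq))
    return depth * (x * x - 1.0)


# Illustrative values only; they are not fitted to any DFT data.
for distance in (1.8, 2.0, 2.5, 4.0):
    print(distance, round(morse_energy(distance, 10.0, 2.0, 2.0), 3))
```

In a fitting workflow, the three parameters would be adjusted so that the curve matches reference energies along the metal-ligand distance; that step is not shown here.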
Abstract:
Biochemical systems are commonly modelled by systems of ordinary differential equations (ODEs). A particular class of such models, called S-systems, has recently gained popularity in biochemical system modelling. The parameters of an S-system are usually estimated from time-course profiles. However, finding these estimates is a difficult computational problem. Moreover, although several methods have been recently proposed to solve this problem for ideal profiles, relatively little progress has been reported for noisy profiles. We describe a special feature of a Newton-flow optimisation problem associated with S-system parameter estimation. This enables us to significantly reduce the search space, and also lends itself to parameter estimation for noisy data. We illustrate the applicability of our method by applying it to noisy time-course data synthetically produced from previously published 4- and 30-dimensional S-systems. In addition, we propose an extension of our method that allows the detection of network topologies for small S-systems. We introduce a new method for estimating S-system parameters from time-course profiles. We show that the performance of this method compares favorably with competing methods for ideal profiles, and that it also allows the determination of parameters for noisy profiles.
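As background for the abstract above, an S-system models each variable as a power-law production term minus a power-law degradation term: dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The sketch below evaluates that right-hand side and integrates it with a plain forward Euler step. It only illustrates the model class, not the Newton-flow estimation method described in the abstract; the function names and the one-variable example are assumptions made for illustration.

```python
import math


def s_system_rhs(x, alpha, g, beta, h):
    """Right-hand side of an S-system for state vector x (all x[j] > 0)."""
    rates = []
    for i in range(len(x)):
        production = alpha[i] * math.prod(xj ** g[i][j] for j, xj in enumerate(x))
        degradation = beta[i] * math.prod(xj ** h[i][j] for j, xj in enumerate(x))
        rates.append(production - degradation)
    return rates


def euler_simulate(x0, alpha, g, beta, h, dt, steps):
    """Forward Euler integration; returns the list of visited states."""
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, g, beta, h)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        trajectory.append(x)
    return trajectory


# Hypothetical one-variable system: dX/dt = 2 - X, with steady state X = 2.
time_course = euler_simulate([1.0], [2.0], [[0.0]], [1.0], [[1.0]], dt=0.1, steps=200)
print(time_course[-1])
```

Parameter estimation then amounts to choosing alpha, beta, g and h so that a simulated time course like `time_course` matches measured profiles. Forward Euler is used here only to keep the sketch dependency-free; a stiff or high-dimensional system would call for a proper ODE solver.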
Abstract:
SUMMARY: The function of sleep for the organism is one of the most persistent and perplexing questions in biology. Current findings lead to the conclusion that sleep is primarily for the brain. In particular, a role for sleep in cognitive aspects of brain function is supported by behavioral evidence both in humans and animals. However, in spite of remarkable advancement in the understanding of the mechanisms underlying sleep generation and regulation, it has proven difficult to determine the neurobiological mechanisms underlying the beneficial effect of sleep, and the detrimental impact of sleep loss, on learning and memory processes. In my thesis, I present results that lead to several critical steps forward in the link between sleep and cognitive function. My major result is the molecular identification and physiological analysis of a protein, the NR2A subunit of NMDA receptor (NMDAR), that confers sensitivity to sleep loss to the hippocampus, a brain structure classically involved in mnemonic processes. Specifically, I used a novel behavioral approach to achieve sleep deprivation in adult C57BL6/J mice while minimizing the impact of secondary factors associated with the procedure, such as stress. By using in vitro electrophysiological analysis, I show, for the first time, that sleep loss dramatically affects bidirectional plasticity at CA3 to CA1 synapses in the hippocampus, a well-established cellular model of learning and memory. 4-6 hours of sleep loss elevate the modification threshold for bidirectional synaptic plasticity (MT), thereby promoting long-term depression of CA3 to CA1 synaptic strength after stimulation in the theta frequency range (5 Hz), and rendering long-term potentiation induction more difficult. Remarkably, 3 hours of recovery sleep, after the deprivation, reset the MT at control values, thus re-establishing the normal proneness of synapses to undergo long-term plastic changes.
At the molecular level, these functional changes are paralleled by a change in the NMDAR subunit composition. In particular, the expression of the NR2A subunit protein of NMDAR at CA3 to CA1 synapses is selectively and rapidly increased by sleep deprivation, whereas recovery sleep resets NR2A synaptic content to control levels. By using an array of genetic, pharmacological and computational approaches, I demonstrate here an obligatory role for NR2A-containing NMDARs in conveying the effect of sleep loss on CA3 to CA1 MT. Moreover, I show that a genetic deletion of the NR2A subunit fully preserves hippocampal plasticity from the impact of sleep loss, whereas it does not alter sleep-wake behavior and homeostatic response to sleep deprivation. As to the mechanism underlying the effects of the NR2A subunit on hippocampal synaptic plasticity, I show that the increased NR2A expression after sleep loss distinctly affects the contribution of synaptic and more slowly recruited NMDAR pools activated during plasticity-induction protocols. This study represents a major step forward in understanding the mechanistic basis underlying sleep's role for the brain. By showing that sleep and sleep loss affect neuronal plasticity by regulating the expression and function of a synaptic neurotransmitter receptor, I propose that an important aspect of sleep function could consist in maintaining and regulating protein redistribution and ion channel trafficking at central synapses. These findings provide a novel starting point for investigations into the connections between sleep and learning, and they may open novel ways for pharmacological control over hippocampal function during periods of sleep restriction.
PROJECT SUMMARY (translated from the French): The function of sleep for the organism is one of the most persistent and difficult questions in biology. Current findings lead to the conclusion that sleep is essential for the brain. In particular, the role of sleep in cognitive aspects is supported by behavioral studies in both humans and animals. However, despite remarkable progress in understanding the mechanisms underlying the generation and regulation of sleep, the neurobiological mechanisms that could explain the beneficial effect of sleep on learning and memory are not yet clear. In my thesis, I present results that help clarify the link between sleep and cognitive function. My most significant result is the molecular identification and physiological analysis of a protein, the NR2A subunit of the NMDA receptor, that makes the hippocampus sensitive to sleep loss. In this study, we used a new experimental approach that allowed us to induce sleep deprivation in adult C57BL6/J mice while minimizing the impact of confounding factors such as stress. Using in vitro electrophysiology, I demonstrated, for the first time, that sleep loss drastically affects bidirectional plasticity at the CA3-CA1 synapses of the hippocampus, a well-established cellular mechanism of learning and memory. In particular, 4-6 hours of sleep deprivation raise the modification threshold for bidirectional synaptic plasticity (MT). As a consequence, long-term depression of synaptic transmission is induced by stimulation of the afferent fibers in the theta frequency band (5 Hz), whereas long-term potentiation becomes more difficult. Furthermore, 3 hours of recovery sleep are sufficient to restore the MT to control values. At the molecular level, the changes in synaptic plasticity are associated with an alteration in the composition of the NMDA receptor.
In particular, synaptic expression of the NR2A protein of the NMDA receptor is rapidly and selectively increased by sleep deprivation, whereas recovery sleep restores expression of the protein to control levels. Using genetic, pharmacological and computational approaches, I demonstrated that NMDA receptors expressing the NR2A subunit are responsible for the effect of sleep deprivation on the MT. Moreover, we showed that a genetic deletion of the NR2A subunit completely preserves hippocampal synaptic plasticity from the impact of sleep loss, whereas this manipulation does not change the mechanisms of homeostatic sleep regulation. Regarding the mechanisms, I discovered that the increased expression of the NR2A subunit at the synaptic level modifies the properties of the NMDA receptor response to the stimulation protocols used to induce plasticity. This study represents an important step forward in understanding the mechanistic basis underlying the role of sleep for the brain. By showing that sleep and sleep loss affect neuronal plasticity by regulating the expression and function of a neurotransmitter receptor, I propose that an important aspect of sleep function may lie in regulating protein redistribution and receptor trafficking at central synapses. These findings provide a starting point for better understanding the links between sleep and learning, and they may also open avenues for pharmacological treatments aimed at preserving hippocampal function during periods of sleep restriction.
Abstract:
Background: The human chromosome 8p23.1 region contains a 3.8–4.5 Mb segment which can be found in different orientations (defined as genomic inversion) among individuals. The identification of single nucleotide polymorphisms (SNPs) tightly linked to the genomic orientation of a given region should be useful for indirectly genotyping the orientation of large genomic segments in individuals. Results: We have identified 16 SNPs, which are in linkage disequilibrium (LD) with the 8p23.1 inversion as detected by fluorescent in situ hybridization (FISH). The variability of the 8p23.1 orientation in 150 HapMap samples was predicted using this set of SNPs and was verified by FISH in a subset of samples. Four genes (NEIL2, MSRA, CTSB and BLK) were found differentially expressed (p<0.0005) according to the orientation of the 8p23.1 region. Finally, we have found variable levels of mosaicism for the orientation of the 8p23.1 region as determined by FISH. Conclusion: By means of dense SNP genotyping of the region, haplotype-based computational analyses and FISH experiments we could infer and verify the orientation status of alleles in the 8p23.1 region by detecting two short haplotype stretches at both ends of the inverted region, which are likely the relic of the chromosome in which the original inversion occurred. Moreover, an impact of 8p23.1 inversion on gene expression levels cannot be ruled out, since four genes from this region have statistically significant different expression levels depending on the inversion status. FISH results in lymphoblastoid cell lines suggest the presence of mosaicism regarding the 8p23.1 inversion.
Abstract:
One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set are of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200-kb-long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX, was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments.
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
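The abstract does not define its accuracy measures. A minimal sketch, assuming the nucleotide-level convention common in gene-prediction evaluation (sensitivity = TP / (TP + FN), specificity = TP / (TP + FP)), shows how sensitivity can stay high while specificity falls when exons are over-predicted in intergenic sequence. The function names and coordinates are hypothetical.

```python
def coding_positions(exons):
    """Expand inclusive (start, end) exon coordinates into a set of positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(predicted_exons, annotated_exons):
    """Return (sensitivity, specificity) at the nucleotide level."""
    predicted = coding_positions(predicted_exons)
    annotated = coding_positions(annotated_exons)
    true_positives = len(predicted & annotated)
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# The real exon is found, plus a spurious exon in intergenic sequence.
print(nucleotide_accuracy([(1, 100), (201, 300)], [(1, 100)]))
```

Here every annotated coding nucleotide is recovered, so sensitivity is unaffected, while the spurious exon halves specificity.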
Abstract:
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are “genomic fossils” valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome’s structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (∼80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high-throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Abstract:
Functional RNA structures play an important role both in noncoding RNA transcripts and as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic–stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to ∼2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3′-UTRs. While we estimate a significant false discovery rate of ∼50%–70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Abstract:
Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus of 18.5 million tandem mass spectra (MS/MS) from human proteomic samples, we validate 39,000 exons and 11,000 introns at the level of translation. We present translation-level evidence for novel or extended exons in 16 genes, confirm translation of 224 hypothetical proteins, and discover or confirm over 40 alternative splicing events. Polymorphisms are efficiently encoded in our database, allowing us to observe variant alleles for 308 coding SNPs. Finally, we demonstrate the use of mass spectrometry to improve automated gene prediction, adding 800 correct exons to our predictions using a simple rescoring strategy. Our results demonstrate that proteomic profiling should play a role in any genome sequencing project.
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model specifies simply which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
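The scanning idea in this abstract can be sketched in simplified form. The code below is an illustration under stated assumptions, not the authors' implementation: it ignores reading-frame compatibility and the Gene Model, uses a purely additive score, and treats two exons as compatible whenever the first donor lies strictly upstream of the second acceptor. The `Exon` type and `best_gene_score` are hypothetical names.

```python
from typing import List, NamedTuple


class Exon(NamedTuple):
    acceptor: int  # start coordinate
    donor: int     # end coordinate
    score: float


def best_gene_score(exons: List[Exon]) -> float:
    """Score of the best chain of non-overlapping exons (0.0 for an empty gene)."""
    by_acceptor = sorted(range(len(exons)), key=lambda i: exons[i].acceptor)
    by_donor = sorted(range(len(exons)), key=lambda i: exons[i].donor)

    best_ending_at = [0.0] * len(exons)  # best gene ending in each exon
    best_upstream = 0.0  # best gene ending before the current acceptor
    next_donor = 0       # cursor into the donor-sorted list
    best_overall = 0.0

    for i in by_acceptor:
        # Fold in every exon whose donor lies upstream of this acceptor.
        # Each exon is consumed once, so the scan after sorting is linear.
        while (next_donor < len(by_donor)
               and exons[by_donor[next_donor]].donor < exons[i].acceptor):
            j = by_donor[next_donor]
            best_upstream = max(best_upstream, best_ending_at[j])
            next_donor += 1
        best_ending_at[i] = exons[i].score + best_upstream
        best_overall = max(best_overall, best_ending_at[i])
    return best_overall


candidates = [Exon(1, 10, 5.0), Exon(5, 20, 8.0), Exon(12, 30, 4.0),
              Exon(25, 40, 3.0), Exon(35, 50, 6.0)]
print(best_gene_score(candidates))
```

The single running value `best_upstream` replaces the inner loop over all compatible preceding exons that makes the naive approach quadratic. In this sketch the two sorts cost O(n log n), so the linear bound applies only to the scan itself, for example when the exon pool already arrives sorted by position.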
Abstract:
The “one-gene, one-protein” rule, coined by Beadle and Tatum, has been fundamental to molecular biology. The rule implies that the genetic complexity of an organism depends essentially on its gene number. The discovery, however, that alternative gene splicing and transcription are widespread phenomena dramatically altered our understanding of the genetic complexity of higher eukaryotic organisms; in these, a limited number of genes may potentially encode a much larger number of proteins. Here we investigate yet another phenomenon that may contribute to generating additional protein diversity. Indeed, by relying on both computational and experimental analysis, we estimate that at least 4%–5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein. While the functional significance of most of these chimeric transcripts remains to be determined, we provide strong evidence that this phenomenon does not correspond to mere technical artifacts and that it is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.
Abstract:
Descriptors based on Molecular Interaction Fields (MIF) are highly suitable for drug discovery, but their size (thousands of variables) often limits their application in practice. Here we describe a simple and fast computational method that extracts from a MIF a handful of highly informative points (hot spots) which summarize the most relevant information. The method was specifically developed for drug discovery, is fast, and does not require human supervision, making it suitable for application to very large series of compounds. The quality of the results has been tested by running the method on the ligand structure of a large number of ligand-receptor complexes and then comparing the position of the selected hot spots with actual atoms of the receptor. As an additional test, the hot spots obtained with the novel method were used to obtain GRIND-like molecular descriptors which were compared with the original GRIND. In both cases the results show that the novel method is highly suitable for describing ligand-receptor interactions and compares favorably with other state-of-the-art methods.
Abstract:
Understanding the molecular mechanisms responsible for the regulation of the transcriptome present in eukaryotic cells is one of the most challenging tasks in the postgenomic era. In this regard, alternative splicing (AS) is a key phenomenon contributing to the production of different mature transcripts from the same primary RNA sequence. As a plethora of different transcript forms is available in databases, a first step to uncover the biology that drives AS is to identify the different types of reflected splicing variation. In this work, we present a general definition of the AS event along with a notation system that involves the relative positions of the splice sites. This nomenclature univocally and dynamically assigns a specific “AS code” to every possible pattern of splicing variation. On the basis of this definition and the corresponding codes, we have developed a computational tool (AStalavista) that automatically characterizes the complete landscape of AS events in a given transcript annotation of a genome, thus providing a platform to investigate the transcriptome diversity across genes, chromosomes, and species. Our analysis reveals that a substantial part (in human, more than a quarter) of the observed splicing variations are ignored in common classification pipelines. We have used AStalavista to investigate and to compare the AS landscape of different reference annotation sets in human and in other metazoan species and found that proportions of AS events change substantially depending on the annotation protocol, species-specific attributes, and coding constraints acting on the transcripts. The AStalavista system therefore provides a general framework to conduct specific studies investigating the occurrence, impact, and regulation of AS.
Abstract:
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics 'In silico Biology: Gene Discovery and Systems Genomics', Atlanta, USA, 15-17 November, 2007.