22 results for Mining machinery
Abstract:
Nuclear microautophagy is a process discovered in the yeast S. cerevisiae that degrades portions of the nucleus in the vacuolar lumen. This process, called PMN (piecemeal microautophagy of the nucleus), is induced under cellular stress conditions such as nutrient deprivation, and also by treatment with the drug rapamycin. PMN depends on the direct interaction between Nvj1p, a protein of the outer membrane of the nuclear envelope, and Vac8p, a protein of the vacuolar membrane. The interaction of these two proteins forms the nucleus-vacuole junction. This junction guides the formation of an invagination that engulfs part of the nucleus and stretches it toward the vacuolar lumen in the form of a sac. A vesicle is then released and degraded by hydrolases. The molecular mechanisms acting at the different steps of this process are unknown. The aim of my thesis is to identify new players involved in PMN. In the first part of this study, we present a screening procedure to search for candidates that play a role in PMN. The screen was carried out on the mutant collection distributed by Euroscarf. The procedure relied on the observation that the nucleolus (represented by Nop1p) is the preferred substrate of PMN in microscopy experiments performed after inducing PMN with rapamycin. We therefore transformed the mutant collection with a plasmid carrying the nucleolar marker Nop1p. We then searched by microscopy for mutants unable to transfer Nop1p from the nucleus to the vacuole. We found 318 genes showing a defect in Nop1p transfer by PMN. These genes were classified into broad functional families and also by the severity of their PMN defect.
Also in this part of the study, we described mutants involved in the process at different steps. In the second part of the study, we examined the involvement and role in PMN of the V-ATPase (a proton pump of the vacuolar membrane), which was selected from among the candidates. Inhibitors of this complex, such as concanamycin A, block PMN activity and appear to affect the process at two different steps. In addition, nucleus-vacuole junctions form a diffusion barrier in the vacuolar membrane, from which Vph1p, a V-ATPase protein, is excluded.
Abstract:
Our view of the RNA polymerase III (Pol III) transcription machinery in mammalian cells arises mostly from studies of the RN5S (5S) gene, the Ad2 VAI gene, and the RNU6 (U6) gene, as paradigms for genes with type 1, 2, and 3 promoters. Recruitment of Pol III onto these genes requires prior binding of well-characterized transcription factors. Technical limitations in dealing with repeated genomic units, typically found at mammalian Pol III genes, have so far hampered genome-wide studies of the Pol III transcription machinery and transcriptome. We have localized, genome-wide, Pol III and some of its transcription factors. Our results reveal broad usage of the known Pol III transcription machinery and define a minimal Pol III transcriptome in dividing IMR90hTert fibroblasts. This transcriptome consists of some 500 actively transcribed genes including a few dozen candidate novel genes, of which we confirmed nine as Pol III transcription units by additional methods. It does not contain any of the microRNA genes previously described as transcribed by Pol III, but reveals two other microRNA genes, MIR886 (hsa-mir-886) and MIR1975 (RNY5, hY5, hsa-mir-1975), which are genuine Pol III transcription units.
Abstract:
BACKGROUND: The annotation of protein post-translational modifications (PTMs) is an important task of UniProtKB curators and, with continuing improvements in experimental methodology, an ever greater number of articles are being published on this topic. To help curators cope with this growing body of information, we have developed a system which extracts information from the scientific literature for the most frequently annotated PTMs in UniProtKB. RESULTS: The procedure uses a pattern-matching and rule-based approach to extract sentences with information on the type and site of modification. A ranked list of protein candidates for the modification is also provided. For PTM extraction, precision varies from 57% to 94%, and recall from 75% to 95%, according to the type of modification. The procedure was used to track new publications on PTMs and to recover potential supporting evidence for phosphorylation sites annotated based on the results of large-scale proteomics experiments. CONCLUSIONS: The information retrieval and extraction method we have developed in this study forms the basis of a simple tool for the manual curation of protein post-translational modifications in UniProtKB/Swiss-Prot. Our work demonstrates that even simple text-mining tools can be effectively adapted for database curation tasks, provided that a thorough understanding of the working process and requirements is first obtained. This system can be accessed at http://eagl.unige.ch/PTM/.
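The pattern-matching step described in this abstract can be illustrated with a minimal sketch. The regular expression, the keyword list, and the function name `extract_ptm_mentions` are assumptions made for illustration only; the abstract does not give the actual patterns or rules used by the system.

```python
import re

# Hypothetical pattern (not taken from the paper): a modification keyword
# followed, within the same sentence, by a residue name and a position,
# e.g. "phosphorylated at Ser-15" or "acetylation of Lys 382".
PTM_PATTERN = re.compile(
    r"\b(?P<ptm>phosphorylat\w+|acetylat\w+|methylat\w+|ubiquitinat\w+)\b"
    r"[^.]*?\b(?:at|on|of)\s+"
    r"(?P<residue>Ser|Thr|Tyr|Lys|Arg)[- ]?(?P<position>\d+)",
    re.IGNORECASE,
)


def extract_ptm_mentions(text):
    """Return (sentence, keyword, residue, position) for each matching sentence."""
    mentions = []
    # Naive sentence split: break after a period followed by whitespace.
    for sentence in re.split(r"(?<=\.)\s+", text):
        match = PTM_PATTERN.search(sentence)
        if match:
            mentions.append(
                (
                    sentence,
                    match.group("ptm").lower(),
                    match.group("residue").capitalize(),
                    int(match.group("position")),
                )
            )
    return mentions


if __name__ == "__main__":
    sample = (
        "p53 is phosphorylated at Ser-15 after DNA damage. "
        "The protein binds DNA. "
        "Acetylation of Lys 382 was also observed."
    )
    for mention in extract_ptm_mentions(sample):
        print(mention)
```

A real curation tool would need a far richer rule set (one-letter residue codes, protein name recognition, negation handling); the sketch only shows how a sentence, a modification type, and a site can be pulled out together.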
Abstract:
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified, it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores, however, tended to lack specificity, suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses.
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.
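As a rough illustration of what an allelic score is, the sketch below assumes the common weighted-sum form: each variant's effect-allele count (0, 1 or 2) is multiplied by a per-allele weight and the products are summed. The abstract does not state how its scores were weighted, so the weighting scheme, the variant identifiers, and both function names are hypothetical.

```python
def allelic_score(genotypes, weights):
    """Weighted sum of effect-allele counts for one individual.

    genotypes: dict mapping variant id -> effect-allele count (0, 1 or 2)
    weights:   dict mapping variant id -> per-allele weight (assumed to come
               from an external GWAS meta-analysis)
    Variants missing from `genotypes` are treated as count 0 (an assumption).
    """
    return sum(weight * genotypes.get(variant, 0)
               for variant, weight in weights.items())


def variance_explained(scores, phenotype):
    """Squared Pearson correlation between scores and a measured intermediate."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_p = sum(phenotype) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, phenotype))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_p = sum((p - mean_p) ** 2 for p in phenotype)
    return cov * cov / (var_s * var_p)


if __name__ == "__main__":
    # Made-up variant ids and weights, for illustration only.
    weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 1.0}
    person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}
    print(allelic_score(person, weights))
```

The squared correlation is one simple way to express "proportion of variance explained" by a score; the study itself may have used a different estimator.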
Abstract:
The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are concentrated on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model, machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends and sequential simulations of the residuals. ML algorithms deliver non-linear solutions for spatial non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms by analyzing the quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. A case study from the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be efficiently used in the decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
Target identification for tractography studies requires solid anatomical knowledge validated by an extensive literature review across species for each seed structure to be studied. Manual literature review to identify targets for a given seed region is tedious and potentially subjective. Therefore, complementary approaches would be useful. We propose to use text-mining models to automatically suggest potential targets from the neuroscientific literature, full-text articles and abstracts, so that they can be used for anatomical connection studies and more specifically for tractography. We applied text-mining models to three structures: the internal globus pallidus and the subthalamic nucleus, two well-studied structures that are validated deep brain stimulation targets, and the nucleus accumbens, an exploratory target for treating psychiatric disorders. We performed a systematic review of the literature to document the projections of the three selected structures and compared it with the targets proposed by text-mining models, both in rat and primate (including human). We ran probabilistic tractography on the nucleus accumbens and compared the output with the results of the text-mining models and literature review. Overall, text-mining the literature could find three times as many targets as two man-weeks of curation could. The overall efficiency of the text-mining against literature review in our study was 98% recall (at 36% precision), meaning that over all the targets for the three selected seeds, only one target was missed by text-mining. We demonstrate that connectivity for a structure of interest can be extracted from a very large number of publications and abstracts. We believe this tool will be useful in helping the neuroscience community to facilitate connectivity studies of particular brain regions. The text-mining tools used for the study are part of the HBP Neuroinformatics Platform, publicly available at http://connectivity-brainer.rhcloud.com/.
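The recall and precision figures quoted in this abstract compare the targets suggested by text-mining with those found by manual review. A minimal sketch of that comparison follows, assuming both results are available as sets of target names; the function name and the region labels in the example are hypothetical and are not data from the study.

```python
def precision_recall(predicted, reference):
    """Compare predicted targets against a reference (curated) set.

    precision = share of predicted targets that are in the reference set
    recall    = share of reference targets that were predicted
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative labels only, not results from the paper.
    text_mined = {"region_A", "region_B", "region_C", "region_D"}
    curated = {"region_A", "region_B", "region_E"}
    precision, recall = precision_recall(text_mined, curated)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

High recall with modest precision, as reported above, corresponds to a predicted set that covers almost all curated targets while also containing many suggestions that the manual review did not list.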