16 resultados para literature-data integration

em Universidade do Minho


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Transcriptional Regulatory Networks (TRNs) are powerful tool for representing several interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information to build TRNs is biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large amount of papers that have been published about this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, as the number of databases increases, several inconsistencies concerning gene and protein names and identifiers are common. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieve the relevant information from important biological databases and insert it into a unique repository, named KREN. Also, we applied text mining techniques over this integrated repository to build TRNs. However, was necessary to create a dictionary of names and synonyms associated with these entities and also develop an approach that retrieves all the abstracts from the related scientific papers stored on PubMed, in order to create a corpora of data about genes. Furthermore, these tasks were integrated into @Note, a software system that allows to use some methods from the Biomedical Text Mining field, including an algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, extraction relationships between biological entities (genes, proteins and transcription factors). And finally, extended this tool to allow the reconstruction Transcriptional Regulatory Networks through using scientific literature.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The MAP-i Doctoral Program of the Universities of Minho, Aveiro and Porto

Relevância:

90.00% 90.00%

Publicador:

Resumo:

DNA microarrays are one of the most used technologies for gene expression measurement. However, there are several distinct microarray platforms, from different manufacturers, each with its own measurement protocol, resulting in data that can hardly be compared or directly integrated. Data integration from multiple sources aims to improve the assertiveness of statistical tests, reducing the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprehends a set of tasks that range from the re-annotation of the features used on gene expression, to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, which comprehends a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases, such as The Cancer Genome Atlas and Gene Expression Omnibus.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Under the framework of constraint based modeling, genome-scale metabolic models (GSMMs) have been used for several tasks, such as metabolic engineering and phenotype prediction. More recently, their application in health related research has spanned drug discovery, biomarker identification and host-pathogen interactions, targeting diseases such as cancer, Alzheimer, obesity or diabetes. In the last years, the development of novel techniques for genome sequencing and other high-throughput methods, together with advances in Bioinformatics, allowed the reconstruction of GSMMs for human cells. Considering the diversity of cell types and tissues present in the human body, it is imperative to develop tissue-specific metabolic models. Methods to automatically generate these models, based on generic human metabolic models and a plethora of omics data, have been proposed. However, their results have not yet been adequately and critically evaluated and compared. This work presents a survey of the most important tissue or cell type specific metabolic model reconstruction methods, which use literature, transcriptomics, proteomics and metabolomics data, together with a global template model. As a case study, we analyzed the consistency between several omics data sources and reconstructed distinct metabolic models of hepatocytes using different methods and data sources as inputs. The results show that omics data sources have a poor overlapping and, in some cases, are even contradictory. Additionally, the hepatocyte metabolic models generated are in many cases not able to perform metabolic functions known to be present in the liver tissue. We conclude that reliable methods for a priori omics data integration are required to support the reconstruction of complex models of human cells.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents a mobile information system denominated as Vehicle-to-Anything Application (V2Anything App), and explains its conceptual aspects. This application is aimed at giving relevant information to Full Electric Vehicle (FEV) drivers, by supporting the integration of several sources of data in a mobile application, thus contributing to the deployment of the electric mobility process. The V2Anything App provides recommendations to the drivers about the FEV range autonomy, location of battery charging stations, information of the electricity market, and also a route planner taking into account public transportations and car or bike sharing systems. The main contributions of this application are related with the creation of an Information and Communication Technology (ICT) platform, recommender systems, data integration systems, driver profile, and personalized range prediction. Thus, it is possible to deliver relevant information to the FEV drivers related with the electric mobility process, electricity market, public transportation, and the FEV performance.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Developing and implementing data-oriented workflows for data migration processes are complex tasks involving several problems related to the integration of data coming from different schemas. Usually, they involve very specific requirements - every process is almost unique. Having a way to abstract their representation will help us to better understand and validate them with business users, which is a crucial step for requirements validation. In this demo we present an approach that provides a way to enrich incrementally conceptual models in order to support an automatic way for producing their correspondent physical implementation. In this demo we will show how B2K (Business to Kettle) system works transforming BPMN 2.0 conceptual models into Kettle data-integration executable processes, approaching the most relevant aspects related to model design and enrichment, model to system transformation, and system execution.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This article compiles the main topics addressed by management systems (MSs) literature concerningMSs integration by performing a systematic literature review. In this paper, it is intended to present themain limitations of non-integratedmanagement systems (IMSs), the main motivations driving an IMS implementation, the major resistances faced, the most common resultant benefits, the suitable guidelines and standards and the critical success factors. In addition, this paper addresses the issues concerning integration strategies and models, the integration levels or degrees achieved by an IMS and the audit function in an integrated context. The motivations that drive companies to integrate their management subsystems, the obstacles faced and the benefits collected may have internal or external origins. The publishing of standards guiding companies on how to integrate their management subsystems has been done mainly at a national level. There are several models that could be used in order to support companies in their management subsystems integration processes, and a sequential or an all-in strategy may be adopted. Four audit typologies can be distinguished, and the adoption of any of these typologies should consider resource availability and audit team know-how, among other features.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper aims at developing a collision prediction model for three-leg junctions located in national roads (NR) in Northern Portugal. The focus is to identify factors that contribute for collision type crashes in those locations, mainly factors related to road geometric consistency, since literature is scarce on those, and to research the impact of three modeling methods: generalized estimating equations, random-effects negative binomial models and random-parameters negative binomial models, on the factors of those models. The database used included data published between 2008 and 2010 of 177 three-leg junctions. It was split in three groups of contributing factors which were tested sequentially for each of the adopted models: at first only traffic, then, traffic and the geometric characteristics of the junctions within their area of influence; and, lastly, factors which show the difference between the geometric characteristics of the segments boarding the junctionsâ area of influence and the segment included in that area were added. The choice of the best modeling technique was supported by the result of a cross validation made to ascertain the best model for the three sets of researched contributing factors. The models fitted with random-parameters negative binomial models had the best performance in the process. In the best models obtained for every modeling technique, the characteristics of the road environment, including proxy measures for the geometric consistency, along with traffic volume, contribute significantly to the number of collisions. Both the variables concerning junctions and the various national highway segments in their area of influence, as well as variations from those characteristics concerning roadway segments which border the already mentioned area of influence have proven their relevance and, therefore, there is a rightful need to incorporate the effect of geometric consistency in the three-leg junctions safety studies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

telligence applications for the banking industry. Searches were performed in relevant journals resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit for relevant terms on both business intelligence and banking domains. Moreover, the latent Dirichlet allocation modeling was used in or- der to group articles in several relevant topics. The analysis was conducted using a dictionary of terms belonging to both banking and business intelli- gence domains. Such procedure allowed for the identification of relationships between terms and topics grouping articles, enabling to emerge hypotheses regarding research directions. To confirm such hypotheses, relevant articles were collected and scrutinized, allowing to validate the text mining proce- dure. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or de- nial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of ar- ticles focused more on business intelligence techniques and its applications, using the banking industry just for evaluation, thus, not clearly acclaiming for benefits in the banking business. By identifying these current research topics, this study also highlights opportunities for future research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Tese de Doutoramento em Ciências da Educação (área de especialização em Tecnologia Educativa).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To better understand the dynamic behavior of metabolic networks in a wide variety of conditions, the field of Systems Biology has increased its interest in the use of kinetic models. The different databases, available these days, do not contain enough data regarding this topic. Given that a significant part of the relevant information for the development of such models is still wide spread in the literature, it becomes essential to develop specific and powerful text mining tools to collect these data. In this context, this work has as main objective the development of a text mining tool to extract, from scientific literature, kinetic parameters, their respective values and their relations with enzymes and metabolites. The approach proposed integrates the development of a novel plug-in over the text mining framework @Note2. In the end, the pipeline developed was validated with a case study on Kluyveromyces lactis, spanning the analysis and results of 20 full text documents.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação de mestrado integrado em Engenharia e Gestão de Sistemas de Informação

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação de mestrado integrado em Engenharia e Gestão de Sistemas de Informação

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb. 2016.00275

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genome-scale metabolic models are valuable tools in the metabolic engineering process, based on the ability of these models to integrate diverse sources of data to produce global predictions of organism behavior. At the most basic level, these models require only a genome sequence to construct, and once built, they may be used to predict essential genes, culture conditions, pathway utilization, and the modifications required to enhance a desired organism behavior. In this chapter, we address two key challenges associated with the reconstruction of metabolic models: (a) leveraging existing knowledge of microbiology, biochemistry, and available omics data to produce the best possible model; and (b) applying available tools and data to automate the reconstruction process. We consider these challenges as we progress through the model reconstruction process, beginning with genome assembly, and culminating in the integration of constraints to capture the impact of transcriptional regulation. We divide the reconstruction process into ten distinct steps: (1) genome assembly from sequenced reads; (2) automated structural and functional annotation; (3) phylogenetic tree-based curation of genome annotations; (4) assembly and standardization of biochemistry database; (5) genome-scale metabolic reconstruction; (6) generation of core metabolic model; (7) generation of biomass composition reaction; (8) completion of draft metabolic model; (9) curation of metabolic model; and (10) integration of regulatory constraints. Each of these ten steps is documented in detail.