2 results for Data and Information

in Repositório Institucional da Universidade de Aveiro - Portugal


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The rapid evolution and proliferation of a world-wide computerized network, the Internet, resulted in an overwhelming and constantly growing amount of publicly available data and information, a trend also observed in biomedicine. However, the lack of structure in textual data inhibits its direct processing by computational solutions. Information extraction is the text mining task that aims to automatically collect information from unstructured text data sources. The goal of the work described in this thesis was to build innovative solutions for biomedical information extraction from scientific literature, through the development of simple software artifacts for developers and biocurators, delivering more accurate, usable and faster results. We started by tackling named entity recognition - a crucial initial task - with the development of Gimli, a machine-learning-based solution that follows an incremental approach to optimize the extracted linguistic characteristics for each concept type. Afterwards, Totum was built to harmonize concept names provided by heterogeneous systems, delivering a robust solution with improved performance results. This approach takes advantage of heterogeneous corpora to deliver cross-corpus harmonization that is not constrained to specific characteristics. Since the previous solutions do not provide links to knowledge bases, Neji was built to streamline the development of complex and custom solutions for biomedical concept name recognition and normalization. This was achieved through a modular and flexible framework focused on speed and performance, integrating a large number of processing modules optimized for the biomedical domain. To offer on-demand heterogeneous biomedical concept identification, we developed BeCAS, a web application, service and widget.
We also tackled relation mining by developing TrigNER, a machine-learning-based solution for biomedical event trigger recognition, which applies an automatic algorithm to obtain the best linguistic features and model parameters for each event type. Finally, in order to assist biocurators, Egas was developed to support rapid, interactive and real-time collaborative curation of biomedical documents, through manual and automatic in-line annotation of concepts and relations. Overall, the research work presented in this thesis contributed to a more accurate update of current biomedical knowledge bases, towards improved hypothesis generation and knowledge discovery.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Information Visualization is gradually emerging to assist the representation and comprehension of large datasets about Higher Education Institutions, making the data more easily understood. The importance of gaining insights and knowledge regarding higher education institutions is little disputed. Within this knowledge, the emerging and pressing area in need of a systematic understanding is the use of communication technologies, an area that is having a transformative impact on educational practices worldwide. This study focused on the need to visually represent a dataset about how Portuguese Public Higher Education Institutions are using Communication Technologies to support teaching and learning processes. Project TRACER identified this need in the Portuguese public higher education context and carried out a national data collection. This study was developed within project TRACER and worked with the collected dataset in order to conceptualize an information visualization tool, U-TRACER®. The main goals of this study were: to conceptualize the information visualization tool U-TRACER®, which represents the data collected by project TRACER; and to understand higher education decision makers' perception of the tool's usefulness. These goals allowed us to contextualize the phenomenon of information visualization tools for higher education data and to identify the existing trends. The research undertaken was qualitative in nature and followed the case study method, with four moments of data collection. The first moment regarded the conceptualization of the U-TRACER®, with two focus group sessions with Higher Education professionals, with the aim of defining the interaction features the U-TRACER® should offer. The second data collection moment involved the proposal of the graphical displays that would represent the dataset, whose reading effectiveness was tested by end-users.
The third moment involved the development of a usability test of the U-TRACER®, performed by higher education professionals, which resulted in the proposal of improvements to the final prototype of the tool. The fourth moment of data collection involved conducting exploratory, semi-structured interviews with institutional decision makers regarding their perceived usefulness of the U-TRACER®. We consider that the results of this study contribute towards two moments of reflection. The first relates to the challenges of involving end-users in the conceptualization of an information visualization tool, and to the relevance of effective visual displays for an effective communication of the data and information. The second relates to how higher education decision makers, stakeholders of the U-TRACER® tool, perceive the usefulness of the tool, both for communicating their institutions' data and for benchmarking exercises, as well as a support for decision processes. It also prompts reflection on the main concerns about opening up data about higher education institutions in a global market.