2 resultados para Codes and repertoires of language
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
The overwhelming amount and unprecedented speed of publication in the biomedical domain make it difficult for life science researchers to acquire and maintain a broad view of the field and gather all information that would be relevant for their research. As a response to this problem, the BioNLP (Biomedical Natural Language Processing) community of researches has emerged and strives to assist life science researchers by developing modern natural language processing (NLP), information extraction (IE) and information retrieval (IR) methods that can be applied at large-scale, to scan the whole publicly available biomedical literature and extract and aggregate the information found within, while automatically normalizing the variability of natural language statements. Among different tasks, biomedical event extraction has received much attention within BioNLP community recently. Biomedical event extraction constitutes the identification of biological processes and interactions described in biomedical literature, and their representation as a set of recursive event structures. The 2009–2013 series of BioNLP Shared Tasks on Event Extraction have given raise to a number of event extraction systems, several of which have been applied at a large scale (the full set of PubMed abstracts and PubMed Central Open Access full text articles), leading to creation of massive biomedical event databases, each of which containing millions of events. Sinece top-ranking event extraction systems are based on machine-learning approach and are trained on the narrow-domain, carefully selected Shared Task training data, their performance drops when being faced with the topically highly varied PubMed and PubMed Central documents. Specifically, false-positive predictions by these systems lead to generation of incorrect biomolecular events which are spotted by the end-users. This thesis proposes a novel post-processing approach, utilizing a combination of supervised and unsupervised learning techniques, that can automatically identify and filter out a considerable proportion of incorrect events from large-scale event databases, thus increasing the general credibility of those databases. The second part of this thesis is dedicated to a system we developed for hypothesis generation from large-scale event databases, which is able to discover novel biomolecular interactions among genes/gene-products. We cast the hypothesis generation problem as a supervised network topology prediction, i.e predicting new edges in the network, as well as types and directions for these edges, utilizing a set of features that can be extracted from large biomedical event networks. Routine machine learning evaluation results, as well as manual evaluation results suggest that the problem is indeed learnable. This work won the Best Paper Award in The 5th International Symposium on Languages in Biology and Medicine (LBM 2013).
Resumo:
SD card (Secure Digital Memory Card) is widely used in portable storage medium. Currently, latest researches on SD card, are mainly SD card controller based on FPGA (Field Programmable Gate Array). Most of them are relying on API interface (Application Programming Interface), AHB bus (Advanced High performance Bus), etc. They are dedicated to the realization of ultra high speed communication between SD card and upper systems. Studies about SD card controller, really play a vital role in the field of high speed cameras and other sub-areas of expertise. This design of FPGA-based file systems and SD2.0 IP (Intellectual Property core) does not only exhibit a nice transmission rate, but also achieve the systematic management of files, while retaining a strong portability and practicality. The file system design and implementation on a SD card covers the main three IP innovation points. First, the combination and integration of file system and SD card controller, makes the overall system highly integrated and practical. The popular SD2.0 protocol is implemented for communication channels. Pure digital logic design based on VHDL (Very-High-Speed Integrated Circuit Hardware Description Language), integrates the SD card controller in hardware layer and the FAT32 file system for the entire system. Secondly, the document management system mechanism makes document processing more convenient and easy. Especially for small files in batch processing, it can ease the pressure of upper system to frequently access and process them, thereby enhancing the overall efficiency of systems. Finally, digital design ensures the superior performance. For transmission security, CRC (Cyclic Redundancy Check) algorithm is for data transmission protection. Design of each module is platform-independent of macro cells, and keeps a better portability. Custom integrated instructions and interfaces may facilitate easily to use. Finally, the actual test went through multi-platform method, Xilinx and Altera FPGA developing platforms. The timing simulation and debugging of each module was covered. Finally, Test results show that the designed FPGA-based file system IP on SD card can support SD card, TF card and Micro SD with 2.0 protocols, and the successful implementation of systematic management for stored files, and supports SD bus mode. Data read and write rates in Kingston class10 card is approximately 24.27MB/s and 16.94MB/s.