5 resultados para Information Extraction

em AMS Tesi di Laurea - Alm@DL - Università di Bologna


Relevância:

70.00% 70.00%

Publicador:

Resumo:

In questo lavoro si introducono i concetti di base di Natural Language Processing, soffermandosi su Information Extraction e analizzandone gli ambiti applicativi, le attività principali e la differenza rispetto a Information Retrieval. Successivamente si analizza il processo di Named Entity Recognition, focalizzando l’attenzione sulle principali problematiche di annotazione di testi e sui metodi per la valutazione della qualità dell’estrazione di entità. Infine si fornisce una panoramica della piattaforma software open-source di language processing GATE/ANNIE, descrivendone l’architettura e i suoi componenti principali, con approfondimenti sugli strumenti che GATE offre per l'approccio rule-based a Named Entity Recognition.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The central objective of research in Information Retrieval (IR) is to discover new techniques to retrieve relevant information in order to satisfy an Information Need. The Information Need is satisfied when relevant information can be provided to the user. In IR, relevance is a fundamental concept which has changed over time, from popular to personal, i.e., what was considered relevant before was information for the whole population, but what is considered relevant now is specific information for each user. Hence, there is a need to connect the behavior of the system to the condition of a particular person and his social context; thereby an interdisciplinary sector called Human-Centered Computing was born. For the modern search engine, the information extracted for the individual user is crucial. According to the Personalized Search (PS), two different techniques are necessary to personalize a search: contextualization (interconnected conditions that occur in an activity), and individualization (characteristics that distinguish an individual). This movement of focus to the individual's need undermines the rigid linearity of the classical model overtaken the ``berry picking'' model which explains that the terms change thanks to the informational feedback received from the search activity introducing the concept of evolution of search terms. The development of Information Foraging theory, which observed the correlations between animal foraging and human information foraging, also contributed to this transformation through attempts to optimize the cost-benefit ratio. This thesis arose from the need to satisfy human individuality when searching for information, and it develops a synergistic collaboration between the frontiers of technological innovation and the recent advances in IR. The search method developed exploits what is relevant for the user by changing radically the way in which an Information Need is expressed, because now it is expressed through the generation of the query and its own context. As a matter of fact the method was born under the pretense to improve the quality of search by rewriting the query based on the contexts automatically generated from a local knowledge base. Furthermore, the idea of optimizing each IR system has led to develop it as a middleware of interaction between the user and the IR system. Thereby the system has just two possible actions: rewriting the query, and reordering the result. Equivalent actions to the approach was described from the PS that generally exploits information derived from analysis of user behavior, while the proposed approach exploits knowledge provided by the user. The thesis went further to generate a novel method for an assessment procedure, according to the "Cranfield paradigm", in order to evaluate this type of IR systems. The results achieved are interesting considering both the effectiveness achieved and the innovative approach undertaken together with the several applications inspired using a local knowledge base.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Hybrid vehicles represent the future for automakers, since they allow to improve the fuel economy and to reduce the pollutant emissions. A key component of the hybrid powertrain is the Energy Storage System, that determines the ability of the vehicle to store and reuse energy. Though electrified Energy Storage Systems (ESS), based on batteries and ultracapacitors, are a proven technology, Alternative Energy Storage Systems (AESS), based on mechanical, hydraulic and pneumatic devices, are gaining interest because they give the possibility of realizing low-cost mild-hybrid vehicles. Currently, most literature of design methodologies focuses on electric ESS, which are not suitable for AESS design. In this contest, The Ohio State University has developed an Alternative Energy Storage System design methodology. This work focuses on the development of driving cycle analysis methodology that is a key component of Alternative Energy Storage System design procedure. The proposed methodology is based on a statistical approach to analyzing driving schedules that represent the vehicle typical use. Driving data are broken up into power events sequence, namely traction and braking events, and for each of them, energy-related and dynamic metrics are calculated. By means of a clustering process and statistical synthesis methods, statistically-relevant metrics are determined. These metrics define cycle representative braking events. By using these events as inputs for the Alternative Energy Storage System design methodology, different system designs are obtained. Each of them is characterized by attributes, namely system volume and weight. In the last part the work, the designs are evaluated in simulation by introducing and calculating a metric related to the energy conversion efficiency. Finally, the designs are compared accounting for attributes and efficiency values. In order to automate the driving data extraction and synthesis process, a specific script Matlab based has been developed. Results show that the driving cycle analysis methodology, based on the statistical approach, allows to extract and synthesize cycle representative data. The designs based on cycle statistically-relevant metrics are properly sized and have satisfying efficiency values with respect to the expectations. An exception is the design based on the cycle worst-case scenario, corresponding to same approach adopted by the conventional electric ESS design methodologies. In this case, a heavy system with poor efficiency is produced. The proposed new methodology seems to be a valid and consistent support for Alternative Energy Storage System design.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Ontology design and population -core aspects of semantic technologies- re- cently have become fields of great interest due to the increasing need of domain-specific knowledge bases that can boost the use of Semantic Web. For building such knowledge resources, the state of the art tools for ontology design require a lot of human work. Producing meaningful schemas and populating them with domain-specific data is in fact a very difficult and time-consuming task. Even more if the task consists in modelling knowledge at a web scale. The primary aim of this work is to investigate a novel and flexible method- ology for automatically learning ontology from textual data, lightening the human workload required for conceptualizing domain-specific knowledge and populating an extracted schema with real data, speeding up the whole ontology production process. Here computational linguistics plays a fundamental role, from automati- cally identifying facts from natural language and extracting frame of relations among recognized entities, to producing linked data with which extending existing knowledge bases or creating new ones. In the state of the art, automatic ontology learning systems are mainly based on plain-pipelined linguistics classifiers performing tasks such as Named Entity recognition, Entity resolution, Taxonomy and Relation extraction [11]. These approaches present some weaknesses, specially in capturing struc- tures through which the meaning of complex concepts is expressed [24]. Humans, in fact, tend to organize knowledge in well-defined patterns, which include participant entities and meaningful relations linking entities with each other. In literature, these structures have been called Semantic Frames by Fill- 6 Introduction more [20], or more recently as Knowledge Patterns [23]. Some NLP studies has recently shown the possibility of performing more accurate deep parsing with the ability of logically understanding the structure of discourse [7]. In this work, some of these technologies have been investigated and em- ployed to produce accurate ontology schemas. The long-term goal is to collect large amounts of semantically structured information from the web of crowds, through an automated process, in order to identify and investigate the cognitive patterns used by human to organize their knowledge.

Relevância:

30.00% 30.00%

Publicador: