144 results for NLP
Abstract:
Citation corpus composed of 85 articles randomly selected from the ACL Anthology, with a total of 2195 bibliographic citations.
Abstract:
The sheer volume of information available on the Internet is making it increasingly difficult for users to digest it all; doing so is now almost unthinkable without the help of tools based on Human Language Technologies (HLT), such as information retrieval systems or automatic summarizers. The interest of this emerging project (and hence its main objective) is motivated precisely by the need to define and create an HLT-based technological framework capable of processing and semantically annotating information, as well as enabling information to be generated automatically, making the type of information presented more flexible and adapting it to users' needs. This article provides an overview of the project, focusing on the proposed architecture and its current status.
Abstract:
Today's generation of Internet devices has changed how users interact with media, turning them from passive, unidirectional consumers into proactive, interactive participants. Users can use these devices to comment on or rate a TV show and to search for related information regarding characters, facts or personalities. This phenomenon is known as second screen. This paper describes SAM, an EU-funded research project that focuses on developing an advanced digital media delivery platform based on second screen interaction and content syndication within a social media context, providing open and standardised ways of characterising, discovering and syndicating digital assets. This work provides an overview of the project and its main objectives, focusing on the NLP challenges to be faced and the technologies developed so far.
Abstract:
A method for quantitative mineralogical analysis by ATR-FTIR has been developed. The method relies on the use of the main band of calcite as a reference for the normalization of the IR spectrum of a mineral sample. In this way, the molar absorptivity coefficient in the Lambert–Beer law and the components of a mixture in mole percentage can be calculated. The GAMS equation modeling environment and the NLP solver CONOPT (©ARKI Consulting and Development) were used to correlate the experimental data in the samples considered. Mixtures of different minerals and gypsum were used in order to measure the minimum band intensity that must be considered for calculations and the detection limit. Accordingly, bands of intensity lower than 0.01 were discarded. The detection limit for gypsum was about 7% (mol/total mole). Good agreement was obtained when this FTIR method was applied to ceramic tiles previously analyzed by X-ray diffraction (XRD) or mineral mixtures prepared in the lab.
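The normalization idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' GAMS/CONOPT model. It assumes that each mineral contributes one band whose intensity, after normalization to the main calcite band, is proportional to a relative absorptivity times its mole amount (a Lambert-Beer-type relation). The function name, the dictionary layout and the absorptivity values are hypothetical; only the 0.01 intensity cut-off comes from the abstract.

```python
def mole_percentages(band_intensity, rel_absorptivity, min_intensity=0.01):
    """Estimate mole percentages from calcite-normalized band intensities.

    band_intensity: mineral -> band intensity normalized to the calcite band
    rel_absorptivity: mineral -> absorptivity relative to calcite (assumed known)
    min_intensity: bands weaker than this are discarded (0.01 in the abstract)
    """
    amounts = {}
    for mineral, intensity in band_intensity.items():
        if intensity < min_intensity:
            continue  # band too weak to be used in the calculation
        # Lambert-Beer-type assumption: intensity = absorptivity * amount
        amounts[mineral] = intensity / rel_absorptivity[mineral]

    total = sum(amounts.values())
    if total == 0:
        return {}
    return {mineral: 100.0 * a / total for mineral, a in amounts.items()}


# Hypothetical illustration values, not measured data.
example = mole_percentages(
    {"calcite": 1.0, "gypsum": 0.5, "quartz": 0.005},
    {"calcite": 1.0, "gypsum": 0.5, "quartz": 1.0},
)
print(example)
```

In the work described, the absorptivity coefficients are themselves fitted to experimental data with an NLP solver; here they are simply taken as given.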
Abstract:
In this work we present a semantic framework suitable for use as a support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarity and sentiment analysis. Once the texts have been semantically enriched, we are able to recommend similar content or even to rate texts according to different dimensions. First, we describe the main characteristics of the integrated semantic resources, together with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provides enough knowledge to be used as a support tool for recommender systems. Finally, we illustrate a case study with information related to movies and TV series to demonstrate that our framework works properly.
Abstract:
In the field of Artificial Intelligence, one of the open problems that is hardest to solve is natural language understanding. The syntactic complexity and the knowledge required to understand references, relations and implicit concepts make this problem very interesting, and solving it is of great importance for the development of applications that can interact directly with people. This thesis does not claim to study and find a complete solution to that problem; rather, its goal is to understand and solve logical and arithmetic mathematical problems written in English. The difficulty of the task is reduced because the infinite range of knowledge domains need not be considered, and the work can focus on a single interpretation of the text: the mathematical one. Despite this simplification, the problem remains very difficult, since it is still necessary to deal with the complexity of natural language. Examples of the mathematical problems to be solved can be found on the website of Bocconi University, in the Giochi Matematici (Mathematical Games) section. Solving these problems requires knowledge of concepts from logic, set theory and linear algebra. The mathematical model describing these problems is not complex and, once correctly derived, is easy to solve with an automatic solver. The difficulty lies in correctly understanding the text and extracting the right model from it. The work presented below starts from the analysis of the text and goes all the way to the solution of the mathematical question. It therefore begins with a general-purpose analysis of the text, followed by a semantic analysis aimed at building the mathematical model, which is then solved by an automatic solver that returns the final result.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-04
Abstract:
The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.
Abstract:
A method has been constructed for the solution of a wide range of chemical plant simulation models including differential equations and optimization. Double orthogonal collocation on finite elements is applied to convert the model into an NLP problem that is solved either by the VF13AD package, based on successive quadratic programming, or by the GRG2 package, based on the generalized reduced gradient method. This approach is termed the simultaneous optimization and solution strategy. The objective functional can contain integral terms. The state and control variables can have time delays. Equalities and inequalities containing state and control variables can be included in the model, as well as algebraic equations and inequalities. The maximum number of independent variables is 2. Problems containing 3 independent variables can be transformed into problems having 2 independent variables using finite differencing. The maximum number of NLP variables and constraints is 1500. The method is also suitable for solving ordinary and partial differential equations. The state functions are approximated by a linear combination of Lagrange interpolation polynomials. The control function can be approximated either by a linear combination of Lagrange interpolation polynomials or by a piecewise constant function over finite elements. The number of internal collocation points can vary from one finite element to another. The residual error is evaluated at arbitrarily chosen equidistant grid points, thus enabling the user to check the accuracy of the solution between the collocation points (at which the solution is exact). The solution functions can be tabulated. There is an option to use control vector parameterization to solve optimization problems containing initial value ordinary differential equations. This approach should be used when there are many differential equations or when the upper integration limit is to be selected optimally.
The portability of the package has been addressed by converting it from VAX FORTRAN 77 into IBM PC FORTRAN 77 and into SUN SPARC 2000 FORTRAN 77. Computer runs have shown that the method can reproduce the solutions of optimization problems published in the literature. The GRG2 and the VF13AD packages, integrated into the optimization package, proved to be robust and reliable. The package contains an executive module, a module performing control vector parameterization and 2 nonlinear problem solver modules, GRG2 and VF13AD. There is a stand-alone module that converts the differential-algebraic optimization problem into a nonlinear programming problem.
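The approximation of a state function by a linear combination of Lagrange interpolation polynomials, mentioned in the abstract above, can be sketched as follows. This is a generic illustration of Lagrange interpolation on one finite element, not the package's FORTRAN code. The function names and the sample nodes are assumptions made for the example.

```python
def lagrange_basis(nodes, j, t):
    """Value at t of the j-th Lagrange polynomial defined on the given nodes.

    The polynomial equals 1 at nodes[j] and 0 at every other node.
    """
    value = 1.0
    for m, node in enumerate(nodes):
        if m != j:
            value *= (t - node) / (nodes[j] - node)
    return value


def interpolate(nodes, coefficients, t):
    """Approximate a state function at t as a sum of coefficient * basis terms.

    In a collocation scheme the coefficients (the state values at the nodes)
    would be the unknowns of the resulting nonlinear programming problem.
    """
    return sum(c * lagrange_basis(nodes, j, t)
               for j, c in enumerate(coefficients))


# Hypothetical element [0, 1] with three nodes; coefficients sample x(t) = t**2.
nodes = [0.0, 0.5, 1.0]
coefficients = [t ** 2 for t in nodes]
print(interpolate(nodes, coefficients, 0.25))
```

Because each basis polynomial is 1 at its own node and 0 at the others, the coefficients are exactly the state values at the nodes, which is why they can serve directly as optimization variables.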
Abstract:
The semantic web vision is one in which rich, ontology-based semantic markup will become widely available. The availability of semantic markup on the web opens the way to novel, sophisticated forms of question answering. AquaLog is a portable question-answering system which takes queries expressed in natural language and an ontology as input, and returns answers drawn from one or more knowledge bases (KBs). We say that AquaLog is portable because the configuration time required to customize the system for a particular ontology is negligible. AquaLog presents an elegant solution in which different strategies are combined in a novel way. It makes use of the GATE NLP platform, string metric algorithms, WordNet and a novel ontology-based relation similarity service to make sense of user queries with respect to the target KB. Moreover, it also includes a learning component, which ensures that the performance of the system improves over time, in response to the particular community jargon used by end users.
Abstract:
The paper reports preliminary results of ongoing research aimed at developing an automatic procedure for recognizing the discourse-compositional structure of scientific and technical texts, which is required in many NLP applications. As discourse markers, the procedure exploits various domain-independent words and expressions that are specific to scientific and technical texts and that organize scientific discourse. The paper discusses features of scientific discourse and the common scientific lexicon comprising such words and expressions. Methodological issues in developing a computer dictionary for the common scientific lexicon are addressed, and the basic principles of its organization are described. The main steps of the discourse-analyzing procedure, based on the dictionary and on surface syntactic analysis, are outlined.
Abstract:
This paper presents an adaptive method that uses a genetic algorithm (GA) to modify users' queries, based on relevance judgments. The algorithm was adapted for three well-known document collections (CISI, NLP and CACM). The method is shown to be applicable to large text collections, where the genetic modification presents more relevant documents to users. The results show the effect of applying a GA to improve the effectiveness of queries in IR systems. Further studies are planned to adjust the system parameters to improve its effectiveness. The goal is to retrieve the most relevant documents, with fewer non-relevant documents, for a user's query in an information retrieval system using a genetic algorithm.
Abstract:
The Universal Networking Language (UNL) is an interlingua designed to be the basis of several natural language processing systems that aim to support multilinguality on the Internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As in any NLP system, the coverage and accuracy of its lexical resources are crucial for the development of the system. In this paper, the authors describe how a large-coverage UW dictionary was automatically created from an existing, well-known resource, the English WordNet. Other aspects, such as implementation details and the evaluation of the final UW set, are also described.
Abstract:
Some basic points concerning the automated creation of a Bulgarian WordNet, an analogue of the Princeton WordNet, are discussed. The computer tools used, the results obtained and their evaluation are presented. A side effect of the proposed approach is that it yields patterns for the Bulgarian syntactic analyzer.