967 resultados para Natural language techniques, Semantic spaces, Random projection, Documents
Resumo:
EmotiBlog is a corpus labelled with the homonymous annotation schema designed for detecting subjectivity in the new textual genres. Preliminary research demonstrated its relevance as a Machine Learning resource to detect opinionated data. In this paper we compare EmotiBlog with the JRC corpus in order to check the EmotiBlog robustness of annotation. For this research we concentrate on its coarse-grained labels. We carry out a deep ML experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC demonstrating the EmotiBlog validity as a resource for the SA task.
Resumo:
In this paper a multilingual method for event ordering based on temporal expression resolution is presented. This method has been implemented through the TERSEO system which consists of three main units: temporal expression recognizing, resolution of the coreference introduced by these expressions, and event ordering. By means of this system, chronological information related to events can be extracted from documental databases. This information is automatically added to the documental database in order to allow its use by question answering systems in those cases referring to temporality. The system has been evaluated obtaining results of 91 % precision and 71 % recall. For this, a blind evaluation process has been developed guaranteing a reliable annotation process that was measured through the kappa factor.
Resumo:
In this paper, a proposal of a multi-modal dialogue system oriented to multilingual question-answering is presented. This system includes the following ways of access: voice, text, avatar, gestures and signs language. The proposal is oriented to the question-answering task as a user interaction mechanism. The proposal here presented is in the first stages of its development phase and the architecture is presented for the first time on the base of the experiences in question-answering and dialogues previously developed. The main objective of this research work is the development of a solid platform that will permit the modular integration of the proposed architecture.
Resumo:
The main goal of this paper is to present the initial version of a Textile Chemical Ontology, to be used by textile professionals with the purpose of conceptualising and representing the banned and harmful chemical substances that are forbidden in this domain. After analysing different methodologies and determining that “Methontology” is the most appropriate for the purposes, this methodology is explored and applied to the domain. In this manner, an initial set of concepts are defined, together with their hierarchy and the relationships between them. This paper shows the benefits of using the ontology through a real use case in the context of Information Retrieval. The potentiality of the proposed ontology in this preliminary evaluation encourages extending the ontology with a higher number of concepts and relationships, and validating it within other Natural Language Processing applications.
Resumo:
Recent years have witnessed a surge of interest in computational methods for affect, ranging from opinion mining, to subjectivity detection, to sentiment and emotion analysis. This article presents a brief overview of the latest trends in the field and describes the manner in which the articles contained in the special issue contribute to the advancement of the area. Finally, we comment on the current challenges and envisaged developments of the subjectivity and sentiment analysis fields, as well as their application to other Natural Language Processing tasks and related domains.
imaxin|software: PLN aplicada a la mejora de la comunicación multilingüe de empresas e instituciones
Resumo:
imaxin|software es una empresa creada en 1997 por cuatro titulados en ingeniería informática cuyo objetivo ha sido el de desarrollar videojuegos multimedia educativos y procesamiento del lenguaje natural multilingüe. 17 años más tarde, hemos desarrollado recursos, herramientas y aplicaciones multilingües de referencia para diferentes lenguas: Portugués (Galicia, Portugal, Brasil, etc.), Español (España, Argentina, México, etc.), Inglés, Catalán y Francés. En este artículo haremos una descripción de aquellos principales hitos en relación a la incorporación de estas tecnologías PLN al sector industrial e institucional.
Resumo:
In the past years, an important volume of research in Natural Language Processing has concentrated on the development of automatic systems to deal with affect in text. The different approaches considered dealt mostly with explicit expressions of emotion, at word level. Nevertheless, expressions of emotion are often implicit, inferrable from situations that have an affective meaning. Dealing with this phenomenon requires automatic systems to have “knowledge” on the situation, and the concepts it describes and their interaction, to be able to “judge” it, in the same manner as a person would. This necessity motivated us to develop the EmotiNet knowledge base — a resource for the detection of emotion from text based on commonsense knowledge on concepts, their interaction and their affective consequence. In this article, we briefly present the process undergone to build EmotiNet and subsequently propose methods to extend the knowledge it contains. We further on analyse the performance of implicit affect detection using this resource. We compare the results obtained with EmotiNet to the use of alternative methods for affect detection. Following the evaluations, we conclude that the structure and content of EmotiNet are appropriate to address the automatic treatment of implicitly expressed affect, that the knowledge it contains can be easily extended and that overall, methods employing EmotiNet obtain better results than traditional emotion detection approaches.
Resumo:
Decision support systems (DSS) support business or organizational decision-making activities, which require the access to information that is internally stored in databases or data warehouses, and externally in the Web accessed by Information Retrieval (IR) or Question Answering (QA) systems. Graphical interfaces to query these sources of information ease to constrain dynamically query formulation based on user selections, but they present a lack of flexibility in query formulation, since the expressivity power is reduced to the user interface design. Natural language interfaces (NLI) are expected as the optimal solution. However, especially for non-expert users, a real natural communication is the most difficult to realize effectively. In this paper, we propose an NLI that improves the interaction between the user and the DSS by means of referencing previous questions or their answers (i.e. anaphora such as the pronoun reference in “What traits are affected by them?”), or by eliding parts of the question (i.e. ellipsis such as “And to glume colour?” after the question “Tell me the QTLs related to awn colour in wheat”). Moreover, in order to overcome one of the main problems of NLIs about the difficulty to adapt an NLI to a new domain, our proposal is based on ontologies that are obtained semi-automatically from a framework that allows the integration of internal and external, structured and unstructured information. Therefore, our proposal can interface with databases, data warehouses, QA and IR systems. Because of the high NL ambiguity of the resolution process, our proposal is presented as an authoring tool that helps the user to query efficiently in natural language. Finally, our proposal is tested on a DSS case scenario about Biotechnology and Agriculture, whose knowledge base is the CEREALAB database as internal structured data, and the Web (e.g. PubMed) as external unstructured information.
Resumo:
In the article relevance of system development for subject search using computational linguistics is considered. The basic principles of system functioning are defined. The principle of grammar development for information retrieval from the partially structured text in a natural language is considered. The ranging principle of results of information search is defined.
Resumo:
Includes bibliographical references.
Resumo:
Thesis (Master's)--University of Washington, 2016-06
Resumo:
Mycorthizae play a critical role in nutrient capture from soils. Arbuscular mycorrhizae (AM) and ectomycorrhizae (EM) are the most important mycorrhizae in agricultural and natural ecosystems. AM and EM fungi use inorganic NH4+ and NO3-, and most EM fungi are capable of using organic nitrogen. The heavier stable isotope N-15 is discriminated against during biogeochemical and biochemical processes. Differences in N-15 (atom%) or delta(15)N (parts per thousand) provide nitrogen movement information in an experimental system. A range of 20 to 50% of one-way N-transfer has been observed from legumes to nonlegumes. Mycorrhizal fungal mycelia can extend from one plant's roots to another plant's roots to form common mycorrhizal networks (CMNs). Individual species, genera, even families of plants can be interconnected by CMNs. They are capable of facilitating nutrient uptake and flux. Nutrients such as carbon, nitrogen and phosphorus and other elements may then move via either AM or EM networks from plant to plant. Both N-15 labeling and N-15 natural abundance techniques have been employed to trace N movement between plants interconnected by AM or EM networks. Fine mesh (25similar to45 mum) has been used to separate root systems and allow only hyphal penetration and linkages but no root contact between plants. In many studies, nitrogen from N-2-fixing mycorrhizal plants transferred to non-N-2-fixing mycorrhizal plants (one-way N-transfer). In a few studies, N is also transferred from non-N-2-fixing mycorrhizal plants to N-2-fixing mycorrhizal plants (two-way N-transfer). There is controversy about whether N-transfer is direct through CMNs, or indirect through the soil. The lack of convincing data underlines the need for creative, careful experimental manipulations. Nitrogen is crucial to productivity in most terrestrial ecosystems, and there are potential benefits of management in soil-plant systems to enhance N-transfer. Thus, two-way N-transfer warrants further investigation with many species and under field conditions.
Resumo:
Spoken word production is assumed to involve stages of processing in which activation spreads through layers of units comprising lexical-conceptual knowledge and their corresponding phonological word forms. Using high-field (4T) functional magnetic resonance imagine (fMRI), we assessed whether the relationship between these stages is strictly serial or involves cascaded-interactive processing, and whether central (decision/control) processing mechanisms are involved in lexical selection. Participants performed the competitor priming paradigm in which distractor words, named from a definition and semantically related to a subsequently presented target picture, slow picture-naming latency compared to that with unrelated words. The paradigm intersperses two trials between the definition and the picture to be named, temporally separating activation in the word perception and production networks. Priming semantic competitors of target picture names significantly increased activation in the left posterior temporal cortex, and to a lesser extent the left middle temporal cortex, consistent with the predictions of cascaded-interactive models of lexical access. In addition, extensive activation was detected in the anterior cingulate and pars orbitalis of the inferior frontal gyrus. The findings indicate that lexical selection during competitor priming is biased by top-down mechanisms to reverse associations between primed distractor words and target pictures to select words that meet the current goal of speech.
Resumo:
Automatic ontology building is a vital issue in many fields where they are currently built manually. This paper presents a user-centred methodology for ontology construction based on the use of Machine Learning and Natural Language Processing. In our approach, the user selects a corpus of texts and sketches a preliminary ontology (or selects an existing one) for a domain with a preliminary vocabulary associated to the elements in the ontology (lexicalisations). Examples of sentences involving such lexicalisation (e.g. ISA relation) in the corpus are automatically retrieved by the system. Retrieved examples are validated by the user and used by an adaptive Information Extraction system to generate patterns that discover other lexicalisations of the same objects in the ontology, possibly identifying new concepts or relations. New instances are added to the existing ontology or used to tune it. This process is repeated until a satisfactory ontology is obtained. The methodology largely automates the ontology construction process and the output is an ontology with an associated trained leaner to be used for further ontology modifications.
Resumo:
The fundamental failure of current approaches to ontology learning is to view it as single pipeline with one or more specific inputs and a single static output. In this paper, we present a novel approach to ontology learning which takes an iterative view of knowledge acquisition for ontologies. Our approach is founded on three open-ended resources: a set of texts, a set of learning patterns and a set of ontological triples, and the system seeks to maintain these in equilibrium. As events occur which disturb this equilibrium, actions are triggered to re-establish a balance between the resources. We present a gold standard based evaluation of the final output of the system, the intermediate output showing the iterative process and a comparison of performance using different seed input. The results are comparable to existing performance in the literature.