1000 results for Lenguajes y Sistemas Informáticos
Abstract:
In this article we present COMPENDIUM, a modular text summarization tool. The tool comprises a core module with five well-differentiated stages: (i) linguistic analysis; (ii) redundancy detection; (iii) topic identification; (iv) relevance detection; and (v) summary generation. A set of additional modules extends the tool's functionality and enables the generation of different types of summaries, such as topic-oriented ones. We carry out an exhaustive evaluation in two different domains (newswire and documents about tourist places) and analyze different types of summaries generated with COMPENDIUM (single-document, multi-document, generic and topic-oriented). In addition, we compare our system with other current summarization systems. The results show that COMPENDIUM is able to generate competitive summaries for each of the proposed summary types.
Abstract:
The analysis of Web 2.0 texts is a relevant research topic today. However, many problems arise when current tools are applied to this kind of text. To be able to measure these difficulties, we first need to know the different registers or degrees of informality that can be found. Therefore, in this work we attempt to characterize informality levels for English Web 2.0 texts by means of unsupervised machine learning techniques, obtaining results of 68% in F1.
Abstract:
In this paper we present the enrichment of the Integration of Semantic Resources based on WordNet (ISR-WN Enriched). This new proposal improves on the previous one, in which several semantic resources such as SUMO, WordNet Domains and WordNet Affects were related, by adding further semantic resources such as Semantic Classes and SentiWordNet. First, the paper describes the architecture of the proposal and explains the particularities of each integrated resource. We then analyze some problems related to the mappings between different versions and how we solve them. Moreover, we show the advantages that this kind of tool can provide to different applications of Natural Language Processing. In this regard, we demonstrate that the integration of semantic resources makes it possible to acquire a multidimensional view in the analysis of natural language.
Abstract:
The overall goal of the Araknion project is to provide Spanish and Catalan with a basic infrastructure of linguistic resources for the semantic processing of corpora, whether of oral or written origin, in the framework of the Web 2.0.
Abstract:
The overall goal of this project is the study, development and experimentation of different techniques and systems based on Human Language Technologies (HLT) for developing the next generation of systems for the intelligent processing of digital information (modelling, retrieval, treatment, understanding and discovery), addressing the current challenges of digital communication. In this new scenario, systems must incorporate reasoning capabilities that will uncover the subjectivity of information in all its contexts (spatial, temporal and emotional), analysing the different dimensions of use (multilinguality, multimodality and register).
Abstract:
Emerging project focused on the detection and interpretation of metaphors with unsupervised methods. It presents the characterization of the metaphor problem in Natural Language Processing, the theoretical foundations of the project and the first results.
Abstract:
Unit 5. Inheritance.
Abstract:
Problem statements and files needed to carry out the practical activities of the course Tecnologías de la Traducción.
Abstract:
Nowadays, data mining is based on low-level specifications of the employed techniques, typically bound to a specific analysis platform. Therefore, data mining lacks a modelling architecture that allows analysts to consider it as a truly software-engineering process. Here, we propose a model-driven approach based on (i) a conceptual modelling framework for data mining, and (ii) a set of model transformations to automatically generate both the data under analysis (via data-warehousing technology) and the analysis models for data mining (tailored to a specific platform). Thus, analysts can concentrate on the analysis problem via conceptual data-mining models instead of low-level programming tasks related to the technical details of the underlying platform. These tasks are now entrusted to the model-transformations scaffolding.
Abstract:
Data mining is one of the most important analysis techniques for automatically extracting knowledge from large amounts of data. Nowadays, data mining is based on low-level specifications of the employed techniques, typically bound to a specific analysis platform. Therefore, data mining lacks a modelling architecture that allows analysts to consider it as a truly software-engineering process. Bearing this situation in mind, we propose a model-driven approach based on (i) a conceptual modelling framework for data mining, and (ii) a set of model transformations to automatically generate both the data under analysis (deployed via data-warehousing technology) and the analysis models for data mining (tailored to a specific platform). Thus, analysts can concentrate on understanding the analysis problem via conceptual data-mining models instead of wasting effort on low-level programming tasks related to the technical details of the underlying platform. These time-consuming tasks are now entrusted to the model-transformations scaffolding. The feasibility of our approach is shown by means of a hypothetical data-mining scenario in which a time series analysis is required.
Abstract:
Geographic knowledge discovery (GKD) is the process of extracting information and knowledge from massive georeferenced databases. The process is usually accomplished by two different systems: Geographic Information Systems (GIS) and data mining engines. However, developing those systems is a complex task because it does not follow a systematic, integrated and standard methodology. To overcome these pitfalls, in this paper we propose a modeling framework that addresses the development of the different parts of a multilayer GKD process. The main advantages of our framework are that (i) it reduces the design effort, (ii) it improves the quality of the resulting systems, (iii) it is platform-independent, (iv) it facilitates the use of data mining techniques on georeferenced data, and finally, (v) it improves communication between different users.
Abstract:
Excerpt from the course notes covering relational calculus and the view of the relational model from first-order predicate calculus.
Abstract:
Doctoral thesis with European mention in natural language processing, written at the Universidad de Alicante by Ester Boldrini under the supervision of Dr. Patricio Martínez-Barco. The thesis defence took place at the Universidad de Alicante on 23 January 2012 before a committee formed by Dr. Manuel Palomar (Universidad de Alicante), Dr. Paloma Moreda (UA), Dr. Mariona Taulé (Universidad de Barcelona), Dr. Horacio Saggion (Universitat Pompeu Fabra) and Dr. Mike Thelwall (University of Wolverhampton). Grade: Sobresaliente Cum Laude, awarded unanimously.
Abstract:
This paper outlines the approach adopted by the PLSI research group at the University of Alicante in the PASCAL-2006 second Recognising Textual Entailment challenge. Our system is composed of two main components. The first derives the logic forms of the text/hypothesis pairs, and the second provides a similarity score based on the semantic relations between the derived logic forms. To obtain this score, we apply several similarity and relatedness measures based on the structure and content of WordNet.
Abstract:
The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as those related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained with different settings of the system, together with the error analysis performed after the competition, offered us some clear insights into the combination of techniques that best balances precision and recall. Contrary to our initial intuitions, we have also seen that including specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers system performance, while the use of topic detection techniques based on faceted search with Wikipedia and Latent Semantic Analysis leads to satisfactory performance in both the monolingual and the multilingual settings.