1000 results for Lenguajes y Sistemas Informáticos
Abstract:
Recent years have witnessed a surge of interest in computational methods for affect, ranging from opinion mining and subjectivity detection to sentiment and emotion analysis. This article presents a brief overview of the latest trends in the field and describes how the articles contained in the special issue contribute to the advancement of the area. Finally, we comment on the current challenges and envisaged developments of the subjectivity and sentiment analysis fields, as well as their application to other Natural Language Processing tasks and related domains.
Abstract:
The literature states that project duration is affected by various scope factors. Using 168 building projects carried out in Spain, this paper applies multiple regression analysis to develop a forecast model for estimating the duration of new-build projects. The proposed model uses project type, gross floor area (GFA), the cost/GFA ratio and the number of floors as predictor variables. The research identified the logarithmic form of construction speed as the most appropriate response variable. GFA has a greater influence than cost on project duration, but both factors are necessary to achieve a forecast model with the highest accuracy. We developed an analysis to verify the stability of the forecasted values and showed how a model with high fit and accuracy may still display anomalous behavior in its forecasts. The sensitivity of the proposed forecast model to the variability of construction costs was also analyzed.
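The model form described above can be sketched in a few lines. Note that the coefficients below are hypothetical placeholders for illustration; the paper's fitted values are not reproduced here. The key structural idea is that the response variable is the logarithm of construction speed (GFA divided by duration), so the duration estimate is recovered as GFA / speed:

```python
import math

# Hypothetical coefficients, for illustration only (NOT the paper's fitted values).
B0, B_GFA, B_RATIO, B_FLOORS = -2.0, 0.45, 0.15, 0.05

def predicted_duration(gfa_m2, cost_per_gfa, floors):
    """Estimate project duration (days) from a log-linear model of
    construction speed, where speed = GFA / duration."""
    log_speed = (B0
                 + B_GFA * math.log(gfa_m2)        # gross floor area
                 + B_RATIO * math.log(cost_per_gfa)  # cost/GFA ratio
                 + B_FLOORS * floors)                # number of floors
    speed = math.exp(log_speed)   # m2 built per day
    return gfa_m2 / speed         # duration = GFA / speed

duration = predicted_duration(2000.0, 800.0, 4)
```

Project type, being categorical, would enter such a model as dummy variables, one per type; that detail is omitted here for brevity.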
Abstract:
Compilation tuning is the process of adjusting the values of compiler options to improve some features of the final application. In this paper, a strategy based on a genetic algorithm and a multi-objective scheme is proposed to deal with this task. Unlike previous works, we try to take advantage of domain knowledge to provide a problem-specific genetic operator that improves both the speed of convergence and the quality of the results. The strategy is evaluated through a case study aimed at improving the performance of the well-known web server Apache. Experimental results show that an overall improvement of 7.5% can be achieved. Furthermore, the adaptive approach has shown an ability to markedly speed up the convergence of the original strategy.
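A minimal sketch of genetic-algorithm flag tuning may make the setup concrete. Everything here is an assumption for illustration: the flag names are ordinary GCC-style options used as labels, and the `benchmark` function is a toy stand-in for actually compiling and timing a workload such as Apache:

```python
import random

random.seed(42)

FLAGS = ["-funroll-loops", "-finline-functions", "-ftree-vectorize",
         "-fomit-frame-pointer", "-fprefetch-loop-arrays"]

def benchmark(config):
    # Toy fitness standing in for a real compile-and-measure run:
    # each enabled flag helps, and one pair of flags "synergises".
    score = sum(config)
    if config[0] and config[2]:
        score += 2
    return score

def mutate(config, rate=0.2):
    # Flip each bit with probability `rate`.
    return [bit ^ (random.random() < rate) for bit in config]

def crossover(a, b):
    # Single-point crossover between two parent configurations.
    point = random.randrange(1, len(a))
    return a[:point] + b[point:]

pop = [[random.randint(0, 1) for _ in FLAGS] for _ in range(20)]
for generation in range(30):
    pop.sort(key=benchmark, reverse=True)
    parents = pop[:10]                      # truncation selection + elitism
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]

best = max(pop, key=benchmark)
```

The paper's problem-specific operator and multi-objective scheme would replace the generic `mutate`/`crossover` and scalar `benchmark` above; this sketch only shows the overall loop they plug into.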
Abstract:
Complete EIT material
Abstract:
Business Intelligence (BI) applications have been gradually ported to the Web in search of a global platform for the consumption and publication of data and services. On the Internet, apart from techniques for data/knowledge management, BI Web applications need interfaces with a high level of interoperability (similar to traditional desktop interfaces) for the visualisation of data/knowledge. In some cases, this has been provided by Rich Internet Applications (RIA). The development of these BI RIAs is a process traditionally performed manually and, given the complexity of the final application, one which might be prone to errors. The application of model-driven engineering techniques can reduce the cost of development and maintenance (in terms of time and resources) of these applications, as has been demonstrated for other types of Web applications. In the light of these issues, the paper introduces the Sm4RIA-B methodology, i.e., a model-driven methodology for the development of RIAs as BI Web applications. In order to overcome the limitations of RIAs regarding knowledge management on the Web, this paper also presents a new RIA platform for BI, called RI@BI, which extends the functionalities of traditional RIAs by means of Semantic Web technologies and B2B techniques. Finally, we evaluate the whole approach on a case study: the development of a social network site for an enterprise project manager.
Abstract:
Hospitals attached to the Spanish Ministry of Health currently use the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. This work is at present done manually by experts. This paper tackles the automatic classification of real discharge records in Spanish following the ICD9-CM standard. The challenge is that the discharge records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest proved the most competitive, achieving an F-measure of 0.876.
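Since the abstract reports its result as an F-measure, it may help to show how that score is computed for a multi-label coding task like ICD9-CM assignment. The micro-averaging choice and the example codes below are illustrative assumptions, not details taken from the paper:

```python
def f_measure(gold, predicted):
    """Micro-averaged F1 over multi-label code assignments:
    pool true/false positives and false negatives across documents."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)   # codes correctly assigned
        fp += len(p - g)   # codes wrongly assigned
        fn += len(g - p)   # codes missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical ICD9-CM codes for two discharge records.
gold = [{"428.0", "401.9"}, {"250.00"}]
pred = [{"428.0"}, {"250.00", "401.9"}]
score = f_measure(gold, pred)
```

Here tp=2, fp=1, fn=1, so precision and recall are both 2/3 and the F-measure is 0.667; the paper's 0.876 is the analogous figure for its Random Forest classifier.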
Abstract:
The vast amount of text produced every day on the Web has turned it into one of the main sources for obtaining linguistic corpora, which are then analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese - official in 9 countries - appear on the Web in several varieties, with lexical, morphological and syntactic (among other) differences. Moreover, a unified spelling system for Portuguese has recently been approved, and its implementation has already started in some countries. However, this process will take several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are built for a particular variety, this work analyzes different combinations of training corpora and lexica aimed at building a model with high-precision annotation across several varieties and spelling systems of this language. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement) as well as a new freely available testing corpus, containing different varieties and textual typologies.
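The core idea of combining corpora from several varieties into one tagging model can be sketched with a toy unigram tagger. The two miniature "corpora" below (with the European `facto` vs. Brazilian `fato` lexical difference) and the coarse tag names are illustrative assumptions, far simpler than the taggers evaluated in the paper:

```python
from collections import Counter, defaultdict

def train(tagged_corpora):
    """Build a unigram tag model from one or more tagged corpora,
    e.g. European and Brazilian Portuguese combined."""
    counts = defaultdict(Counter)
    for corpus in tagged_corpora:
        for word, tag in corpus:
            counts[word.lower()][tag] += 1
    # Assign each word its most frequent tag across all corpora.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, tokens, default="NOUN"):
    return [(t, model.get(t.lower(), default)) for t in tokens]

pt_pt = [("o", "DET"), ("facto", "NOUN"), ("é", "VERB")]
pt_br = [("o", "DET"), ("fato", "NOUN"), ("é", "VERB")]
model = train([pt_pt, pt_br])
```

Because the model is trained on both corpora, it covers both spellings at once, which is the effect the corpus-combination experiments in the paper aim for at scale.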
Abstract:
This emerging project focuses on the intelligent processing of information from diverse sources such as micro-blogs, blogs, forums, specialised portals, etc. Its aim is to generate knowledge from the semantic information retrieved. As a result, it will be possible to determine users' needs or to improve the reputation of different organisations. This article describes the problems addressed, the working hypothesis, the tasks to be carried out and the partial objectives already achieved.
Abstract:
Autism Spectrum Disorder (ASD) impairs the proper development of cognitive functions and of social and communicative skills. A significant percentage of people with autism also have difficulties with reading comprehension. The European project FIRST aims to develop a multilingual tool, called Open Book, that uses Human Language Technologies to identify obstacles that hinder the reading comprehension of a document. The tool helps carers and people with autism by transforming written documents into a simpler format, removing the obstacles identified in the text. This article presents the FIRST project and the Open Book tool developed within it.
Abstract:
The recent massive growth of online media and the rise of user-generated content (e.g. weblogs, Twitter, Facebook) pose challenges for accessing and interpreting multilingual data efficiently, quickly and affordably. The goal of the TredMiner project is to develop innovative, portable, open-source methods that work in real time for summarization and cross-lingual mining of large-scale social media. The results are being validated in three use cases: decision support in the financial domain (with analysts, entrepreneurs, regulators and economists), political monitoring and analysis (with journalists, economists and politicians), and social media monitoring in the health domain to detect information on adverse drug reactions.
Abstract:
We present a tool based on drug-effect co-occurrences for detecting adverse reactions and indications in user comments from a Spanish medical forum. We also describe the automatic construction of the first Spanish database of drug indications and adverse effects.
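The co-occurrence idea can be sketched in a few lines. The two-entry lexicons and the sample comments below are toy assumptions; the actual tool relies on full Spanish drug and effect resources and on real forum data:

```python
from collections import Counter

DRUGS = {"ibuprofeno", "omeprazol"}        # toy lexicon (illustrative)
EFFECTS = {"náuseas", "dolor de cabeza"}   # toy lexicon (illustrative)

def cooccurrences(comments):
    """Count drug-effect pairs that appear in the same user comment."""
    pairs = Counter()
    for text in comments:
        low = text.lower()
        for drug in DRUGS:
            if drug in low:
                for effect in EFFECTS:
                    if effect in low:
                        pairs[(drug, effect)] += 1
    return pairs

comments = ["Tomo ibuprofeno y tengo náuseas",
            "El omeprazol me quitó el dolor de cabeza"]
pairs = cooccurrences(comments)
```

A real system would additionally need to separate adverse reactions from indications (the second comment above describes a benefit, not a side effect), which is precisely the distinction the presented tool addresses.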
imaxin|software: NLP applied to improving the multilingual communication of companies and institutions
Abstract:
imaxin|software is a company founded in 1997 by four computer engineering graduates with the aim of developing educational multimedia video games and multilingual natural language processing. Seventeen years later, we have developed reference multilingual resources, tools and applications for several languages: Portuguese (Galicia, Portugal, Brazil, etc.), Spanish (Spain, Argentina, Mexico, etc.), English, Catalan and French. In this article we describe the main milestones in bringing these NLP technologies to the industrial and institutional sectors.
Abstract:
This introduction provides an overview of the state-of-the-art technology in Applications of Natural Language to Information Systems. Specifically, we analyze the need for such technologies to successfully address the new challenges of modern information systems, in which the exploitation of the Web as a main data source for business systems becomes a key requirement. We also discuss the reasons why Human Language Technologies themselves have shifted their focus to new areas of interest directly linked to the development of technology for the treatment and understanding of Web 2.0. These new technologies are expected to be the future interfaces for the information systems to come. Moreover, we review current topics of interest to this research community and present the selection of manuscripts chosen by the program committee of the NLDB 2011 conference as representative cornerstone research works, especially highlighting their contribution to the advancement of such technologies.
Abstract:
Automatic Text Summarization has been shown to be useful for Natural Language Processing tasks such as Question Answering or Text Classification, and for related fields of computer science such as Information Retrieval. Since Geographical Information Retrieval can be considered an extension of the Information Retrieval field, the generation of summaries could be integrated into these systems as an intermediate stage, with the purpose of reducing document length. In this manner, the access time for information searching is improved while relevant documents are still retrieved. Therefore, in this paper we propose the generation of two types of summaries (generic and geographical), applying several compression rates in order to evaluate their effectiveness in the Geographical Information Retrieval task. The evaluation was carried out using GeoCLEF as the evaluation framework, following an Information Retrieval perspective without the geo-reranking phase commonly used in these systems. Although single-document summarization did not perform well in general, the slight improvements obtained for some of the proposed summary types, particularly those based on geographical information, lead us to believe that integrating Text Summarization with Geographical Information Retrieval may be beneficial; consequently, the experimental set-up developed in this research work serves as a basis for further investigation in this field.
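The notion of a compression rate in extractive summarization can be illustrated briefly. The sentence-scoring scheme below (counting geographic mentions) is an assumption standing in for the paper's actual generic and geographical summarizers; only the selection mechanism is the point here:

```python
def summarize(sentences, scores, compression=0.5):
    """Keep the top-scoring fraction of sentences, preserving their
    original order. `compression` is the fraction of sentences kept."""
    n = max(1, round(len(sentences) * compression))
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]

doc = ["Madrid is the capital of Spain.",
       "It rained yesterday.",
       "The city lies on the Manzanares river.",
       "Lunch was good."]
geo_scores = [3, 0, 2, 0]   # e.g. counts of geographic entities per sentence
summary = summarize(doc, geo_scores, compression=0.5)
```

Feeding such shortened documents to the retrieval engine, instead of the full texts, is the intermediate stage the paper evaluates at several values of `compression`.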