11 resultados para Knowledge retrieval, Ontology, User information needs, User profiles, Information retrieval
em Universidad de Alicante
Resumo:
En este artículo se presenta un método para recomendar artículos científicos teniendo en cuenta su grado de generalidad o especificidad. Este enfoque se basa en la idea de que personas menos expertas en un tema preferirían leer artículos más generales para introducirse en el mismo, mientras que personas más expertas preferirían artículos más específicos. Frente a otras técnicas de recomendación que se centran en el análisis de perfiles de usuario, nuestra propuesta se basa puramente en el análisis del contenido. Presentamos dos aproximaciones para recomendar artículos basados en el modelado de tópicos (Topic Modelling). El primero de ellos se basa en la divergencia de tópicos que se dan en los documentos, mientras que el segundo se basa en la similitud que se dan entre estos tópicos. Con ambas medidas se consiguió determinar lo general o específico de un artículo para su recomendación, superando en ambos casos a un sistema de recuperación de información tradicional.
Resumo:
Paper submitted to the 44th European Congress of the European Regional Science Association, Porto, 25-29 August 2004.
Resumo:
The exponential increase of subjective, user-generated content since the birth of the Social Web, has led to the necessity of developing automatic text processing systems able to extract, process and present relevant knowledge. In this paper, we tackle the Opinion Retrieval, Mining and Summarization task, by proposing a unified framework, composed of three crucial components (information retrieval, opinion mining and text summarization) that allow the retrieval, classification and summarization of subjective information. An extensive analysis is conducted, where different configurations of the framework are suggested and analyzed, in order to determine which is the best one, and under which conditions. The evaluation carried out and the results obtained show the appropriateness of the individual components, as well as the framework as a whole. By achieving an improvement over 10% compared to the state-of-the-art approaches in the context of blogs, we can conclude that subjective text can be efficiently dealt with by means of our proposed framework.
Open business intelligence: on the importance of data quality awareness in user-friendly data mining
Resumo:
Citizens demand more and more data for making decisions in their daily life. Therefore, mechanisms that allow citizens to understand and analyze linked open data (LOD) in a user-friendly manner are highly required. To this aim, the concept of Open Business Intelligence (OpenBI) is introduced in this position paper. OpenBI facilitates non-expert users to (i) analyze and visualize LOD, thus generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and to (ii) share the new acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is related to data mining, since non-experts (as citizens) need guidance during preprocessing and application of mining algorithms due to the complexity of the mining process and the low quality of the data sources. This is even worst when dealing with LOD, not only because of the different kind of links among data, but also because of its high dimensionality. As a consequence, in this position paper we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.
Resumo:
La gran cantidad de información disponible en Internet está dificultando cada vez más que los usuarios puedan digerir toda esa información, siendo actualmente casi impensable sin la ayuda de herramientas basadas en las Tecnologías del Lenguaje Humano (TLH), como pueden ser los recuperadores de información o resumidores automáticos. El interés de este proyecto emergente (y por tanto, su objetivo principal) viene motivado precisamente por la necesidad de definir y crear un marco tecnológico basado en TLH, capaz de procesar y anotar semánticamente la información, así como permitir la generación de información de forma automática, flexibilizando el tipo de información a presentar y adaptándola a las necesidades de los usuarios. En este artículo se proporciona una visión general de este proyecto, centrándonos en la arquitectura propuesta y el estado actual del mismo.
Resumo:
Decision support systems (DSS) support business or organizational decision-making activities, which require the access to information that is internally stored in databases or data warehouses, and externally in the Web accessed by Information Retrieval (IR) or Question Answering (QA) systems. Graphical interfaces to query these sources of information ease to constrain dynamically query formulation based on user selections, but they present a lack of flexibility in query formulation, since the expressivity power is reduced to the user interface design. Natural language interfaces (NLI) are expected as the optimal solution. However, especially for non-expert users, a real natural communication is the most difficult to realize effectively. In this paper, we propose an NLI that improves the interaction between the user and the DSS by means of referencing previous questions or their answers (i.e. anaphora such as the pronoun reference in “What traits are affected by them?”), or by eliding parts of the question (i.e. ellipsis such as “And to glume colour?” after the question “Tell me the QTLs related to awn colour in wheat”). Moreover, in order to overcome one of the main problems of NLIs about the difficulty to adapt an NLI to a new domain, our proposal is based on ontologies that are obtained semi-automatically from a framework that allows the integration of internal and external, structured and unstructured information. Therefore, our proposal can interface with databases, data warehouses, QA and IR systems. Because of the high NL ambiguity of the resolution process, our proposal is presented as an authoring tool that helps the user to query efficiently in natural language. Finally, our proposal is tested on a DSS case scenario about Biotechnology and Agriculture, whose knowledge base is the CEREALAB database as internal structured data, and the Web (e.g. PubMed) as external unstructured information.
Resumo:
Comunicación presentada en las XVI Jornadas de Ingeniería del Software y Bases de Datos, JISBD 2011, A Coruña, 5-7 septiembre 2011.
Resumo:
Information Retrieval systems normally have to work with rather heterogeneous sources, such as Web sites or documents from Optical Character Recognition tools. The correct conversion of these sources into flat text files is not a trivial task since noise may easily be introduced as a result of spelling or typeset errors. Interestingly, this is not a great drawback when the size of the corpus is sufficiently large, since redundancy helps to overcome noise problems. However, noise becomes a serious problem in restricted-domain Information Retrieval specially when the corpus is small and has little or no redundancy. This paper devises an approach which adds noise-tolerance to Information Retrieval systems. A set of experiments carried out in the agricultural domain proves the effectiveness of the approach presented.
Resumo:
Information Technology and Communications (ICT) is presented as the main element in order to achieve more efficient and sustainable city resource management, while making sure that the needs of the citizens to improve their quality of life are satisfied. A key element will be the creation of new systems that allow the acquisition of context information, automatically and transparently, in order to provide it to decision support systems. In this paper, we present a novel distributed system for obtaining, representing and providing the flow and movement of people in densely populated geographical areas. In order to accomplish these tasks, we propose the design of a smart sensor network based on RFID communication technologies, reliability patterns and integration techniques. Contrary to other proposals, this system represents a comprehensive solution that permits the acquisition of user information in a transparent and reliable way in a non-controlled and heterogeneous environment. This knowledge will be useful in moving towards the design of smart cities in which decision support on transport strategies, business evaluation or initiatives in the tourism sector will be supported by real relevant information. As a final result, a case study will be presented which will allow the validation of the proposal.
Resumo:
Vacunas.org (http://www.vacunas.org), a website founded by the Spanish Association of Vaccinology offers a personalized service called Ask the Expert, which answers any questions posed by the public or health professionals about vaccines and vaccination. The aim of this study was to analyze the factors associated with questions on vaccination safety and determine the characteristics of questioners and the type of question asked during the period 2008–2010. A total of 1341 questions were finally included in the analysis. Of those, 30% were related to vaccine safety. Questions about pregnant women had 5.01 higher odds of asking about safety (95% CI 2.82–8.93) than people not belonging to any risk group. Older questioners (>50 years) were less likely to ask about vaccine safety compared to younger questioners (OR: 0.44, 95% CI 0.25–0.76). Questions made after vaccination or related to influenza (including H1N1) or travel vaccines were also associated with a higher likelihood of asking about vaccine safety. These results identify risk groups (pregnant women), population groups (older people) and some vaccines (travel and influenza vaccines, including H1N1) where greater efforts to provide improved, more-tailored vaccine information in general and on the Internet are required.
Resumo:
A single and very easy to use Graphical User Interface (GUI- MATLAB) based on the topological information contained in the Gibbs energy of mixing function has been developed as a friendly tool to check the coherence of NRTL parameters obtained in a correlation data procedure. Thus, the analysis of the GM/RT surface, the GM/RT for the binaries and the GM/RT in planes containing the tie lines should be necessary to validate the obtained parameters for the different models for correlating phase equlibrium data.