40 results for Text Mining

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

100.00%

Abstract:

The main objective of this Master Thesis is to learn more about Girona’s image as a tourism destination from the perspective of different agents and to study how their promotion or opinions differ. To meet this objective, three components of Girona’s destination image are studied: the attribute-based component, the holistic component, and the affective component. Much research has been done on tourism destination image, but far less on Girona as a destination. Some studies have already focused on Girona as a tourist destination, but they used a different type of sample and different methodological steps. This study is new among destination studies in that it is based only on online textual data and follows a text-mining methodology. Text mining is a methodology for extracting relevant information from texts. Once this information has been extracted, multivariate statistical analyses are carried out with the aim of discovering more about Girona’s tourism image.

Relevance:

60.00%

Abstract:

Customer Experience Management (CEM) has become a key factor in business success. CEM manages all the experiences a customer has with a service or product provider. It is very important to know how a customer feels at each contact and then to be able to suggest the next task automatically, simplifying tasks otherwise carried out by people. This project develops a solution for evaluating experiences. First, web services are created that classify experiences into emotional states depending on the level of satisfaction, interest, … This is done through text mining. Unstructured information (text documents) that represents or describes the experiences is processed and classified. Supervised learning methods are used. This part is developed with a service-oriented architecture (SOA) to ensure the use of standards and that the services are accessible to any application. These services are deployed on an application server. In the second part, two applications based on real cases are developed. In this phase, cloud computing is key. An online development platform is used to create the entire application, including tables, objects, business logic and user interfaces. Finally, the classification services are integrated into the platform, ensuring that experiences are evaluated and that follow-up tasks are created automatically.

Relevance:

60.00%

Abstract:

This paper analyzes and evaluates, in the context of ontology learning, some techniques for identifying and extracting candidate terms for the classes of a taxonomy. In addition, it points out some inconsistencies that may occur in the preprocessing of text corpora, and proposes techniques for obtaining good candidate terms for the classes of a taxonomy.

Relevance:

60.00%

Abstract:

Background: To enhance our understanding of complex biological systems like diseases we need to put all of the available data into context and use this to detect relations, patterns and rules which allow predictive hypotheses to be defined. Life science has become a data-rich science with information about the behaviour of millions of entities like genes, chemical compounds, diseases, cell types and organs, which are organised in many different databases and/or spread throughout the literature. Existing knowledge such as genotype-phenotype relations or signal transduction pathways must be semantically integrated and dynamically organised into structured networks that are connected with clinical and experimental data. Different approaches to this challenge exist but so far none has proven entirely satisfactory. Results: To address this challenge we previously developed a generic knowledge management framework, BioXM™, which allows the dynamic, graphic generation of domain-specific knowledge representation models based on specific objects and their relations supporting annotations and ontologies. Here we demonstrate the utility of BioXM for knowledge management in systems biology as part of the EU FP6 BioBridge project on translational approaches to chronic diseases. From clinical and experimental data, text-mining results and public databases we generate a chronic obstructive pulmonary disease (COPD) knowledge base and demonstrate its use by mining specific molecular networks together with integrated clinical and experimental data. Conclusions: We generate the first semantically integrated COPD-specific public knowledge base and find that for the integration of clinical and experimental data with pre-existing knowledge the configuration-based set-up enabled by BioXM reduced implementation time and effort for the knowledge base compared to similar systems implemented as classical software development projects.
The knowledge base enables the retrieval of sub-networks including protein-protein interaction, pathway, gene-disease and gene-compound data which are used for subsequent data analysis, modelling and simulation. Pre-structured queries and reports enhance usability; establishing their use in everyday clinical settings requires further simplification with a browser-based interface which is currently under development.

Relevance:

60.00%

Abstract:

Background: Reconstruction of gene and/or protein networks from automated analysis of the literature is one of the current targets of text mining in biomedical research. Some user-friendly tools already perform this analysis on precompiled databases of abstracts of scientific papers. Other tools allow expert users to elaborate and analyze the full content of a corpus of scientific documents. However, to our knowledge, no user-friendly tool is available that simultaneously analyzes the latest set of scientific documents available online and reconstructs the set of genes referenced in those documents. Results: This article presents such a tool, Biblio-MetReS, and compares its functioning and results to those of other widely used user-friendly applications (iHOP, STRING). Under similar conditions, Biblio-MetReS creates networks that are comparable to those of other user-friendly tools. Furthermore, analysis of full-text documents provides more complete reconstructions than those that result from using only the abstract of the document. Conclusions: Literature-based automated network reconstruction is still far from providing complete reconstructions of molecular networks. However, its value as an auxiliary tool is high and it will increase as standards for reporting biological entities and relationships become more widely accepted and enforced. Biblio-MetReS is an application that can be downloaded from http://metres.udl.cat/. It provides an easy-to-use environment for researchers to reconstruct their networks of interest from an always up-to-date set of scientific documents.

Relevance:

60.00%

Abstract:

Recommender systems attempt to predict items in which a user might be interested, given some information about the user's and items' profiles. Most existing recommender systems use content-based or collaborative filtering methods or hybrid methods that combine both techniques (see the sidebar for more details). We created Informed Recommender to address the problem of using consumer opinion about products, expressed online in free-form text, to generate product recommendations. Informed Recommender uses prioritized consumer product reviews to make recommendations. Using text-mining techniques, it automatically maps each piece of each review comment into an ontology.

Relevance:

20.00%

Abstract:

Study produced from a stay at Xerox Research Centre Europe in Grenoble, France, between June and December 2006. The project translates English technical terms into Norwegian. It is asymmetric because linguistic resources are available only for English, not for Norwegian. Methods that check contiguity ("local reordering" and selective permutation) were developed and put into practice to improve the performance of an earlier tool. Contiguity means that when a word is translated into multiple words, those words must be adjacent in the sentence. In addition, a lookup table for the technical terms was built and integrated into a demonstration program.
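
The contiguity constraint in the abstract above can be illustrated with a minimal sketch. It is not the project's tool: the representation of an alignment as a mapping from a source word to target-sentence positions, and the names `is_contiguous` and `contiguity_violations`, are assumptions made for illustration.

```python
from typing import Dict, List


def is_contiguous(target_positions: List[int]) -> bool:
    """Return True when the positions form one unbroken run (no gaps, no repeats)."""
    ordered = sorted(target_positions)
    # Every neighbouring pair must differ by exactly one position.
    return all(b - a == 1 for a, b in zip(ordered, ordered[1:]))


def contiguity_violations(alignment: Dict[str, List[int]]) -> List[str]:
    """List the source words whose translations are not adjacent in the target sentence.

    `alignment` is a hypothetical structure: source word -> target word positions.
    """
    return [word for word, positions in alignment.items()
            if not is_contiguous(positions)]


if __name__ == "__main__":
    # Hypothetical alignment: "network" maps to two adjacent target words,
    # "printer" maps to two target words separated by a gap.
    example = {"network": [0, 1], "printer": [2, 4]}
    print(contiguity_violations(example))
```

A candidate translation that produces any violation would be rejected or reordered under such a constraint. How the original tool acts on a violation is not stated in the abstract.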

Relevance:

20.00%

Abstract:

"Lecciones de Historia Natural", published in Barcelona in 1820, is considered the first original natural history textbook published in Spanish. Its author, Agustí Yàñez, was a very well-known figure in Barcelona society in the first half of the 19th century and a key player in the development of university pharmacy teaching and in the dissemination of science to society. The study of his textbook and of other sources makes it possible to understand how natural history was taught at the Barcelona College of Pharmacy in that period.

Relevance:

20.00%

Abstract:

This report details the steps and processes carried out to build an application that enables the cross-referencing of genetic data from information held in remote databases. It presents an in-depth study of the content and structure of the remote NCBI and KEGG databases, documenting a data mining exercise aimed at extracting from them the information needed to develop the genetic data cross-referencing application. Finally, it sets out the programs, scripts and graphical environments implemented to build and subsequently deploy the application, which provides the cross-referencing functionality that is the subject of this final-year project.

Relevance:

20.00%

Abstract:

Study carried out during a stay at the Computer Science and Artificial Intelligence Lab of the Massachusetts Institute of Technology between 2006 and 2008. The research developed in this project focuses on machine learning methods for the syntactic analysis of language. As a starting point, we establish that the complexity of language requires understanding not only the computational processes associated with language but also how the knowledge needed to carry out these processes can be learned automatically.

Relevance:

20.00%

Abstract:

Consider a model with parameter phi, and an auxiliary model with parameter theta. Let phi be randomly sampled from a given density over the known parameter space. Monte Carlo methods can be used to draw simulated data and compute the corresponding estimate of theta, say theta_tilde. A large set of tuples (phi, theta_tilde) can be generated in this manner. Nonparametric methods may be used to fit the function E(phi|theta_tilde=a), using these tuples. It is proposed to estimate phi using the fitted E(phi|theta_tilde=theta_hat), where theta_hat is the auxiliary estimate, using the real sample data. This is a consistent and asymptotically normally distributed estimator, under certain assumptions. Monte Carlo results for dynamic panel data and vector autoregressions show that this estimator can have very attractive small sample properties. Confidence intervals can be constructed using the quantiles of the phi for which theta_tilde is close to theta_hat. Such confidence intervals are found to have very accurate coverage.
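
The procedure in the abstract above can be sketched on a toy problem. Everything specific here is an assumption chosen for illustration and not taken from the paper: the model (observations drawn from Normal(phi, 1)), the auxiliary estimate (the sample mean), the sampling density for phi (Uniform(-2, 2)), and the nonparametric fit (Nadaraya-Watson kernel regression with a fixed bandwidth).

```python
import math
import random


def auxiliary_estimate(phi, n_obs, rng):
    # Toy model (assumption): observations ~ Normal(phi, 1).
    # The auxiliary estimate theta_tilde is the sample mean of the simulated data.
    return sum(rng.gauss(phi, 1.0) for _ in range(n_obs)) / n_obs


def kernel_regression(xs, ys, x0, bandwidth):
    # Nadaraya-Watson estimate of E(y | x = x0) with a Gaussian kernel.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def simulated_estimate(theta_hat, n_obs, n_draws=2000, bandwidth=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: sample phi from a density over the parameter space
    # (assumed here to be Uniform(-2, 2)).
    phis = [rng.uniform(-2.0, 2.0) for _ in range(n_draws)]
    # Step 2: simulate data for each phi and compute theta_tilde.
    theta_tildes = [auxiliary_estimate(phi, n_obs, rng) for phi in phis]
    # Step 3: fit E(phi | theta_tilde = a) and evaluate it at a = theta_hat.
    return kernel_regression(theta_tildes, phis, theta_hat, bandwidth)


if __name__ == "__main__":
    # Stand-in for the real sample: 100 observations generated with phi = 0.5.
    data_rng = random.Random(1)
    sample = [data_rng.gauss(0.5, 1.0) for _ in range(100)]
    theta_hat = sum(sample) / len(sample)
    print(simulated_estimate(theta_hat, n_obs=len(sample)))
```

In this toy setting the auxiliary estimate is already a good estimator of phi, so the fitted value stays close to theta_hat. The approach is more interesting when theta_hat is a biased or indirect function of phi, which this sketch does not attempt to reproduce.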

Relevance:

20.00%

Abstract:

This report summarizes the final-year project for the degree in Computer Engineering (Enginyeria Superior d’Informàtica). It explains the main reasons that motivated the project and gives examples that illustrate the resulting application. In this case, the software attempts to address the current need for Ground Truth data for text segmentation algorithms on complex color images. All the processes are explained in the different chapters, starting from the problem definition, planning, requirements and design, through to the presentation of the program's results and the resulting Ground Truth data.

Relevance:

20.00%

Abstract:

Consumer reviews, opinions and shared experiences in the use of a product are a powerful source of information about consumer preferences that can be used in recommender systems. Despite the importance and value of such information, there is no comprehensive mechanism that formalizes the opinion selection and retrieval process and the utilization of retrieved opinions, owing to the difficulty of extracting information from text data. In this paper, a new recommender system built on consumer product reviews is proposed. A prioritizing mechanism is developed for the system. The proposed approach is illustrated using the case study of a recommender system for digital cameras.

Relevance:

20.00%

Abstract:

Knowledge extraction from the logs generated by a web server using data mining techniques.

Relevance:

20.00%

Abstract:

This project carries out research both on finding predictors via clustering techniques and on reviewing free data mining software. The research is based on a case study in which, in addition to the free KDD software used by the scientific community, a new free tool for pre-processing the data is presented. The predictors are intended for the e-learning domain, as the data from which they are inferred are student grades from different e-learning environments. Through our case study, not only are clustering algorithms tested but additional goals are also proposed.