77 resultados para Wordnet
Resumo:
In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. Thus, we use two different ontologies of semantic classes (WordNet domain and Basic Level Concepts) in order to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms taking into account the weights of the original query terms and their relations in WordNet with respect to the new ones (which have demonstrated to improve the results). The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP employing the WordNet term weighting approach.
Resumo:
La presente herramienta informática constituye un software que es capaz concebir una red semántica con los siguientes recursos: WordNet versión 1.6 y 2.0, WordNet Affects versión 1.0 y 1.1, WordNet Domain versión 2.0, SUMO, Semantic Classes y Senti WordNet versión 3.0, todos integrados y relacionados en una única base de conocimiento. Utilizando estos recursos, ISR-WN cuenta con funcionalidades añadidas que permiten la exploración de dicha red de un modo simple aplicando funciones tanto como de recorrido como de búsquedas textuales. Mediante la interrogación de dicha red semántica es posible obtener información para enriquecer textos, como puede ser obtener las definiciones de aquellas palabras que son de uso común en determinados Dominios en general, dominios emocionales, y otras conceptualizaciones, además de conocer de un determinado sentido de una palabra su valoración proporcionada por el recurso SentiWordnet de positividad, negatividad y objetividad sentimental. Toda esta información puede ser utilizada en tareas de procesamiento del lenguaje natural como: • Desambiguación del Sentido de las Palabras, • Detección de la Polaridad Sentimental • Análisis Semántico y Léxico para la obtención de conceptos relevantes en una frase según el tipo de recurso implicado. Esta herramienta tiene como base el idioma inglés y se encuentra disponible como una aplicación de Windows la cual dispone de un archivo de instalación el cual despliega en el ordenador de residencia las librerías necesarias para su correcta utilización. Además de la interfaz de usuario ofrecida, esta herramienta puede ser utilizada como API (Application Programming Interface) por otras aplicaciones.
Resumo:
Some basic points from the automated creation of a Bulgarian WordNet – an analogue of the Princeton WordNet, are treated. The used computer tools, the received results and their estimation are discussed. A side effect from the proposed approach is the receiving of patterns for the Bulgarian syntactic analyzer.
Resumo:
In recent years, there has been an increas-ing interest in learning a distributed rep-resentation of word sense. Traditional context clustering based models usually require careful tuning of model parame-ters, and typically perform worse on infre-quent word senses. This paper presents a novel approach which addresses these lim-itations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural networks. The initialized word sense embeddings are used by a context clustering based model to generate the distributed representations of word senses. Our learned represen-tations outperform the publicly available embeddings on 2 out of 4 metrics in the word similarity task, and 6 out of 13 sub tasks in the analogical reasoning task.
Resumo:
Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.
Resumo:
Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.
Resumo:
The cross-sections of the Social Web and the Semantic Web has put folksonomy in the spot light for its potential in overcoming knowledge acquisition bottleneck and providing insight for "wisdom of the crowds". Folksonomy which comes as the results of collaborative tagging activities has provided insight into user's understanding about Web resources which might be useful for searching and organizing purposes. However, collaborative tagging vocabulary poses some challenges since tags are freely chosen by users and may exhibit synonymy and polysemy problem. In order to overcome these challenges and boost the potential of folksonomy as emergence semantics we propose to consolidate the diverse vocabulary into a consolidated entities and concepts. We propose to extract a tag ontology by ontology learning process to represent the semantics of a tagging community. This paper presents a novel approach to learn the ontology based on the widely used lexical database WordNet. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers and users’ tagging behavior together. We provide empirical evaluations by using the semantic information contained in the ontology in a tag recommendation experiment. The results show that by using the semantic relationships on the ontology the accuracy of the tag recommender has been improved.
Resumo:
Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. One of the most popular web personalization systems is recommender systems. In recommender systems choosing user information that can be used to profile users is very crucial for user profiling. In Web 2.0, one facility that can help users organize Web resources of their interest is user tagging systems. Exploring user tagging behavior provides a promising way for understanding users’ information needs since tags are given directly by users. However, free and relatively uncontrolled vocabulary makes the user self-defined tags lack of standardization and semantic ambiguity. Also, the relationships among tags need to be explored since there are rich relationships among tags which could provide valuable information for us to better understand users. In this paper, we propose a novel approach for learning tag ontology based on the widely used lexical database WordNet for capturing the semantics and the structural relationships of tags. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers and users’ tagging behavior together. To personalize further, clustering of users is performed to generate a more accurate ontology for a particular group of users. In order to evaluate the usefulness of the tag ontology, we use the tag ontology in a pilot tag recommendation experiment for improving the recommendation performance by exploiting the semantic information in the tag ontology. The initial result shows that the personalized information has improved the accuracy of the tag recommendation.
Resumo:
FinnWordNet is a wordnet for Finnish that complies with the format of the Princeton WordNet (PWN) (Fellbaum, 1998). It was built by translating the PrincetonWordNet 3.0 synsets into Finnish by human translators. It is open source and contains 117000 synsets. The Finnish translations were inserted into the PWN structure resulting in a bilingual lexical database. In natural language processing (NLP), wordnets have been used for infusing computers with semantic knowledge assuming that humans already have a sufficient amount of this knowledge. In this paper we present a case study of using wordnets as an electronic dictionary. We tested whether native Finnish speakers benefit from using a wordnet while completing English sentence completion tasks. We found that using either an English wordnet or a bilingual English Finnish wordnet significantly improves performance in the task. This should be taken into account when setting standards and comparing human and computer performance on these tasks.
Resumo:
This research shows a new approach and development of a design methodology, based on the perspective of meanings. In this study the design process is explored as a development of the structure of meanings. The processes of search and evaluation of meanings form the foundations of developing this structure. In order to facilitate the use and operation of the meanings, the WordNet lexical database and an existing visualization of WordNet — Visuwords — is used for the process of meaning search. The basic tool used for evaluation process is the WordNet::Similarity software, measuring the relatedness of meanings in the database. In this way it is measuring the degree of interconnections between different meanings. This kind of search and evaluation techniques are later on incorporated into our methodology of the structure of meanings to support the design process. The measures of relatedness of meanings are developed as convergence criteria for application in the processes of evaluation. Further on, the methodology for the structure of meanings developed here is used to construct meanings in a verification of product design. The steps of the design methodology, including the search and evaluation processes involved in developing the structure of the meanings, are elucidated. The choices, made by the designer in terms of meanings are supported by consequent searches and evaluations of meanings to be implemented in the designed product. In conclusion, the paper presents directions for developing and further extensions of the proposed design methodology.
Resumo:
This paper describes three novel techniques to automatically evaluate sentence extract summaries. Two of these techniques called FuSE and DeFuSE evaluate the quality of the generated extract summary based on the degree of similarity to the model summary. They use a fuzzy set theoretic basis to generate a match score. DeFuSE is an enhancement to FuSE and uses WordNet based hypernymy structures to detect similarity between sentences at abstracted levels. The third technique focuses on quantifying the quality of an extract summary based on the difficulty in generating such a summary. Advantages of these techniques are described with examples.
Resumo:
The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observed from query words alone. Accurate identification of Intent from query words remains a challenging problem primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded due to lack of representative training sample. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using tree structured distribution which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporated WordNet to extend the system capabilities to queries which contain words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.
Resumo:
In this article we describe the methodology developed for the semiautomatic annotation of EPEC-RolSem, a Basque corpus labeled at predicate level following the PropBank-VerbNet model. The methodology presented is the product of detailed theoretical study of the semantic nature of verbs in Basque and of their similarities and differences with verbs in other languages. As part of the proposed methodology, we are creating a Basque lexicon on the PropBank-VerbNet model that we have named the Basque Verb Index (BVI). Our work thus dovetails the general trend toward building lexicons from tagged corpora that is clear in work conducted for other languages. EPEC-RolSem and BVI are two important resources for the computational semantic processing of Basque; as far as the authors are aware, they are also the first resources of their kind developed for Basque. In addition, each entry in BVI is linked to the corresponding verb-entry in well-known resources like PropBank, VerbNet, WordNet, Levin’s Classification and FrameNet. We have also implemented several automatic processes to aid in creating and annotating the BVI, including processes designed to facilitate the task of manual annotation.
Resumo:
158 p. : graf.
Resumo:
El objetivo del proyecto es crear una aplicación Android usando la base de conocimiento multilingüe Multilingual Central Repository 3.0 (MCR 3.0).