995 results for Semantic classes


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Topic detection and tracking (TDT) is an area of information retrieval research that focuses on news events. The problems TDT deals with relate to segmenting news text into cohesive stories, detecting something new and previously unreported, tracking the development of a previously reported event, and grouping together news stories that discuss the same event. The performance of traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems. It has been difficult to make the distinction between same and similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural language temporal expressions occurring in the text, and use them to anchor the rest of the terms onto the time-line. When comparing documents for event-based similarity, we look not only at matching terms, but also at how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system. 
The news reflects changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experiment results. We run experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10-30% depending on the experimental setup. The gain in spotting new events remained lower, around 3-4%. Anchoring the text to a time-line based on the temporal expressions gave a further 10% increase in the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.
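
The class-wise comparison described in this abstract can be illustrated with a minimal sketch. It assumes each document has already been split into per-class term counts; the class names, the weights, and the use of cosine similarity as the default per-class measure are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classwise_similarity(doc_a, doc_b, weights, measures=None):
    """Weighted combination of per-class similarities.

    doc_a, doc_b: dict mapping a semantic class name to a Counter of terms.
    weights: dict mapping a class name to a non-negative weight (assumed values).
    measures: optional dict mapping a class name to its own similarity
              function, e.g. a taxonomy-aware measure for locations;
              classes without an entry fall back to cosine.
    """
    measures = measures or {}
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = 0.0
    for cls, weight in weights.items():
        sim = measures.get(cls, cosine)
        score += weight * sim(doc_a.get(cls, Counter()), doc_b.get(cls, Counter()))
    return score / total


# Two stories that share a location but not their general terms.
story_a = {"locations": Counter({"helsinki": 1}), "terms": Counter({"election": 1})}
story_b = {"locations": Counter({"helsinki": 1}), "terms": Counter({"storm": 1})}
half = classwise_similarity(story_a, story_b, {"locations": 1.0, "terms": 1.0})
```

Passing a different function under `measures["locations"]` is where an ontology-based measure would plug in; the temporal anchoring mentioned in the abstract is not modelled here.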

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. We use two different ontologies of semantic classes (WordNet Domains and Basic Level Concepts) to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms that takes into account the weights of the original query terms and their WordNet relations to the new ones, an approach that has been shown to improve the results. These approaches were evaluated in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP for the WordNet term weighting approach.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by the ongoing success of convolutional networks in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot’s behaviour during navigation tasks. The system is made available to the community as a ROS module.
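
As a rough illustration of how a Bayesian filter can enforce temporal coherence over per-frame class scores, the sketch below runs one step of a discrete Bayes filter. The transition model (a single hypothetical `stay_prob` with the remaining mass spread evenly over the other classes) and the example numbers are assumptions; the abstract does not specify the filter used in the actual system.

```python
def bayes_filter_step(belief, likelihood, stay_prob=0.9):
    """One predict/update step of a discrete Bayes filter over place classes.

    belief: previous posterior over classes (list of floats summing to 1).
    likelihood: per-class classifier scores for the current frame.
    stay_prob: assumed probability that the place class does not change
               between consecutive frames (hypothetical parameter).
    """
    n = len(belief)
    if n == 1:
        return [1.0]
    switch = (1.0 - stay_prob) / (n - 1)
    mass = sum(belief)
    # Predict: mostly stay in the same class, occasionally switch.
    predicted = [stay_prob * b + switch * (mass - b) for b in belief]
    # Update: weight the prediction by the current observation.
    posterior = [p * l for p, l in zip(predicted, likelihood)]
    norm = sum(posterior)
    if norm == 0:
        return predicted  # uninformative observation, keep the prediction
    return [x / norm for x in posterior]


# Uniform prior over two classes, then one frame favouring the first class.
updated = bayes_filter_step([0.5, 0.5], [0.8, 0.2])
```

With a high `stay_prob`, a single noisy frame cannot flip the estimate on its own, which is the smoothing effect that temporal coherence refers to.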

Relevance:

70.00% 70.00%

Publisher:

Abstract:

A straightforward computation of the list of the words (the 'tail words' of the list) that are distributionally most similar to a given word (the 'head word' of the list) leads to the question: how semantically similar to the head word are the tail words; that is, how similar are their meanings to its meaning? And can we do better? The experiment was done on the nearly 18,000 most frequent nouns in a Finnish newsgroup corpus. These nouns are considered to be distributionally similar to the extent that they occur in the same direct dependency relations with the same nouns, adjectives and verbs. The extent of the similarity of their computational representations is quantified with the information radius. The semantic classification of head-tail pairs is intuitive; some tail words seem to be semantically similar to the head word, some do not. Each such pair is also associated with a number of further distributional variables. Individually, their overlap for the semantic classes is large, but the trained classification-tree models have some success in using combinations to predict the semantic class. The training data consists of a random sample of 400 head-tail pairs with the tail word ranked among the 20 distributionally most similar to the head word, excluding names. The models are then tested on a random sample of another 100 such pairs. The best success rates range from 70% to 92% of the test pairs, where a success means that the model predicted my intuitive semantic class of the pair. This seems somewhat promising when distributional similarity is used to capture semantically similar words. This analysis also includes a general discussion of several different similarity formulas, arranged in three groups: those that apply to sets with graded membership, those that apply to the members of a vector space, and those that apply to probability mass functions.
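
For readers unfamiliar with the information radius, the sketch below computes it for two probability mass functions in its common form, the sum of the Kullback-Leibler divergences of each distribution from their average (twice the Jensen-Shannon divergence). The base-2 logarithm and the toy distributions over dependency contexts are assumptions; the abstract does not state which normalization the study used.

```python
import math


def kl_to_mean(p, mean):
    """Kullback-Leibler divergence D(p || mean) in bits, skipping zero entries."""
    return sum(pi * math.log2(pi / mean[k]) for k, pi in p.items() if pi > 0)


def information_radius(p, q):
    """Information radius between two probability mass functions.

    p, q: dicts mapping a context (e.g. a dependency relation plus word)
          to its probability; missing keys count as probability 0.
    Returns a value in [0, 2] bits: 0 for identical distributions,
    2 for distributions with no shared context.
    """
    keys = set(p) | set(q)
    mean = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return kl_to_mean(p, mean) + kl_to_mean(q, mean)


# Hypothetical context distributions for a head word and a tail word.
head = {"obj_of:read": 0.6, "mod_by:old": 0.4}
tail = {"obj_of:read": 0.5, "mod_by:new": 0.5}
distance = information_radius(head, tail)
```

Smaller values mean more similar distributions, so ranking tail words by ascending information radius yields the "distributionally most similar" list for a head word.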

Relevance:

70.00% 70.00%

Publisher:

Abstract:

There exists an enormous gap between low-level visual features and high-level semantic information, and the accuracy of content-based image classification and retrieval depends greatly on the description of low-level visual features. Taking this into consideration, a novel texture and edge descriptor is proposed in this paper, which can be represented with a histogram. Furthermore, by seamlessly combining the color, texture and edge histograms, the images are grouped into semantic classes using a support vector machine (SVM). Experimental results show that the combined descriptor is more discriminative than other feature descriptors such as Gabor texture.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

In this paper we present the enrichment of the Integration of Semantic Resources based in WordNet (ISR-WN Enriched). This new proposal improves on the previous one, in which several semantic resources such as SUMO, WordNet Domains and WordNet Affects were related, by adding other semantic resources such as Semantic Classes and SentiWordNet. First, the paper describes the architecture of this proposal, explaining the particularities of each integrated resource. After that, we analyze some problems related to the mappings between different versions and how we solve them. Moreover, we show the advantages that this kind of tool can provide to different applications of Natural Language Processing. In this regard, we demonstrate that the integration of semantic resources makes it possible to acquire a multidimensional view in the analysis of natural language.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Scene classification based on latent Dirichlet allocation (LDA) is a more general modeling method known as a bag of visual words, in which the construction of a visual vocabulary is a crucial quantization process to ensure success of the classification. A framework is developed using the following new aspects: Gaussian mixture clustering for the quantization process, the use of an integrated visual vocabulary (IVV), which is built as the union of all centroids obtained from the separate quantization process of each class, and the usage of some features, including edge orientation histogram, CIELab color moments, and gray-level co-occurrence matrix (GLCM). The experiments are conducted on IKONOS images with six semantic classes (tree, grassland, residential, commercial/industrial, road, and water). The results show that the use of an IVV increases the overall accuracy (OA) by 11 to 12% and 6% when it is implemented on the selected and all features, respectively. The selected features of CIELab color moments and GLCM provide a better OA than the implementation over CIELab color moment or GLCM as individuals. The latter increases the OA by only ∼2 to 3%. Moreover, the results show that the OA of LDA outperforms the OA of C4.5 and naive Bayes tree by ∼20%. © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE) [DOI: 10.1117/1.JRS.8.083690]

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Using the technique of multiple distinctive collexeme analysis, this paper seeks to determine the verbs that are distinctively associated with the non-finite verb slot of English periphrastic causative constructions. Not only does the analysis reveal that the various causative constructions are attracted to essentially different verbs, but by examining how these verbs fall into semantic classes, it also hints at subtle differences in meaning between the constructions. In addition, the paper shows how the technique of multiple distinctive collexeme analysis can be usefully combined with other, complementary methods, and briefly discusses a number of factors which influence the results of multiple distinctive collexeme analysis and should therefore ideally be taken into account.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This software tool builds a semantic network from the following resources: WordNet versions 1.6 and 2.0, WordNet Affects versions 1.0 and 1.1, WordNet Domain version 2.0, SUMO, Semantic Classes and Senti WordNet version 3.0, all integrated and related in a single knowledge base. Using these resources, ISR-WN provides added functionality for exploring the network in a simple way, through both traversal functions and textual searches. By querying this semantic network it is possible to obtain information for enriching texts, such as the definitions of words commonly used in particular domains in general, emotional domains, and other conceptualizations, as well as the positivity, negativity and objectivity scores that the SentiWordnet resource assigns to a given word sense. All this information can be used in natural language processing tasks such as: • Word Sense Disambiguation, • Sentiment Polarity Detection, • Semantic and Lexical Analysis for obtaining the relevant concepts in a sentence according to the type of resource involved. The tool is based on the English language and is available as a Windows application with an installation file that deploys the libraries needed for its correct operation on the host computer. In addition to the user interface offered, the tool can be used as an API (Application Programming Interface) by other applications.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This paper introduces a quantitative method for identifying newly emerging word forms in large time-stamped corpora of natural language and then describes an analysis of lexical emergence in American social media using this method, based on a multi-billion-word corpus of Tweets collected between October 2013 and November 2014. In total, 29 emerging word forms, which represent various semantic classes, grammatical parts of speech, and word formation processes, were identified through this analysis. These 29 forms are then examined from various perspectives in order to begin to better understand the process of lexical emergence.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The proposal of teaching-service integration based on work experience brings a challenge to the professionals involved in health services: to combine their healthcare practice with the preparation of new professionals in accordance with the national health model. In Recife, the assistance network is known as a school network, since it makes all its health equipment available to Higher Education Institutions, in particular through professionals who work as preceptors, making this activity an important component of the services network. The objective of the present study was to analyze the preceptorship experience in Multidisciplinary Residencies in Health from the perspective of health professionals. This is a qualitative descriptive study involving physicians, dentists, and nurses who have worked as preceptors for at least two years in multidisciplinary residencies linked to two Higher Education Institutions. A semistructured interview was used as the research instrument, and data were processed using the software Alceste 4.9. Results indicated four semantic classes, which were divided into two axes: Axis 1, composed of class 4, and Axis 2, composed of classes 3, 2 and 1. Categorization considered the relation between classes. It was observed that in class 4 work overload is a dilemma for professional participation in preceptorship. This is noted in the words manage, time, patient, give, and complicated. However, it is also observed that preceptorship involves positive learning and teaching actions, reinforced by the words say, explain, and discuss. Class 2 shows preceptorship as an exchange of experience, a positive moment that provides a theoretical update to the preceptor, associated with the professional practices performed by the preceptor-student pair in health services and communities. From this perspective, everyone benefits, since preceptorship is structured according to dynamic aspects of knowledge, experienced in settings permeated by people's health needs. 
In class 3, the potential of this practice is shown, and personal commitment is the main reason for acting as a preceptor in this network of education and care, demonstrated in the words reason, formation, to like, and professionals. Last but not least, class 1 suggests the importance of preceptorship as one of the strategies to create the National Policy of Humanization, based on teaching-service-community integration, observed in the words arrives, university, fundamental, manner, partnership, service, and student. In addition, it identifies perspectives and challenges for the improvement of preceptorship in health services. Integrating teaching and service can enhance the proposed changes to the healthcare model practiced in services, but this relation is still superficial. The preceptor is an actor in action, playing real-life roles, and is therefore essential to training that matches the profile advocated in the proposal: a professional who is capable of learning to learn.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Professional master's programs were created in Brazil in the 1990s in response to social changes in the world of work, and they aim to train high-level professionals with a distinct profile for various activities of society and the productive sector. They are the most innovative modality of graduate studies in Brazil and therefore still lack a legitimized identity, which raises the need for discussion to gather further information and outline the characteristics of this postgraduate modality. The aim is to build new understandings of their peculiarities, starting from the perspective of students of the Professional Master's in Family Health of the Northeast Network for Training in Family Health, and not only according to the similarities and differences with the academic master's. This study aims to understand the meanings that the master's students attribute to their training in that course. This is a qualitative, exploratory study. The subjects are 100 students in training in 2013, distributed among the six nucleating institutions of the Northeast Network for Training in Family Health. To collect information, documentary research was conducted in the institutional records of all students, as well as interviews. We interviewed 15 students, distributed among the six nucleating institutions. The recorded interviews were transcribed and resulted in two analytical corpora, which were subsequently submitted to the Alceste © 4.9 software for identification of semantic classes. It can be concluded that the course provided a redefinition of professional practices in the Family Health Strategy, considering the organizational context of primary care in the Northeast and the specifics of health work. Despite the students' difficulties in mastering research methods and the active methodology of problem-based learning itself, the course effectively contributed to the improvement of work processes in primary care, valuing teamwork and allowing the acquisition of new scientific knowledge.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The challenges of maintaining a building such as the Sydney Opera House are immense and depend upon a vast array of information. The value of information can be enhanced by its currency, its accessibility, and the ability to correlate data sets (integration of information sources). A building information model correlated with various information sources related to the facility is used here as the definition of a digital facility model. Such a digital facility model would give transparent and integrated access to an array of datasets and would support Facility Management processes. In order to construct such a digital facility model, two state-of-the-art information and communication technologies are considered: an internationally standardized building information model called the Industry Foundation Classes (IFC), and a variety of advanced communication and integration technologies often referred to as the Semantic Web, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL). This paper reports on some technical aspects of developing a digital facility model, focusing on the Sydney Opera House. The proposed digital facility model enables IFC data to participate in an ontology-driven, service-oriented software environment. A proof-of-concept prototype has been developed, demonstrating that IFC information can be used together with the Sydney Opera House’s specific data sources by means of semantic web ontologies.