41 resultados para Annotation informatisée
em Universidad de Alicante
Resumo:
Automatic video segmentation plays a vital role in sports videos annotation. This paper presents a fully automatic and computationally efficient algorithm for analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation. These investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses some low-level feature video processing algorithms to extract the shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used for clustering the seed distributions of the dominant colours to homogeneous regions. Using a simple tracking method a classification of these regions to active or static is performed. The efficiency of the proposed framework is demonstrated over a standard video benchmark with numerous types of sport events and the experimental results show that our algorithm can be used with high accuracy for automatic annotation of active regions for sport videos.
Resumo:
This paper presents a semi-parametric Algorithm for parsing football video structures. The approach works on a two interleaved based process that closely collaborate towards a common goal. The core part of the proposed method focus perform a fast automatic football video annotation by looking at the enhance entropy variance within a series of shot frames. The entropy is extracted on the Hue parameter from the HSV color system, not as a global feature but in spatial domain to identify regions within a shot that will characterize a certain activity within the shot period. The second part of the algorithm works towards the identification of dominant color regions that could represent players and playfield for further activity recognition. Experimental Results shows that the proposed football video segmentation algorithm performs with high accuracy.
Resumo:
Aportaciones sobre la investigación de los destinos turísticos litorales mediterráneos, vertidas en el marco las jornadas de intercambio y transferencia de resultados celebradas en mayo de 2010, con la participación del Grupo de Investigación sobre Sostenibilidad y Territorio (GIST) de la Universidad de les Illes Balears, el Grupo de Investigación en Análisis Territorial y Estudios Turísticos (GRATET) de la Universidad Rovira i Virgili y el Grupo de Investigación en Planificación y Gestión Sostenible del Turismo de la Universidad de Alicante. Durante la mismas se debatieron y avanzaron planteamientos teóricos, metodológicos y aplicados acerca de la implantación territorial del turismo en el litoral Mediterráneo español.
Resumo:
The importance of the new textual genres such as blogs or forum entries is growing in parallel with the evolution of the Social Web. This paper presents two corpora of blog posts in English and in Spanish, annotated according to the EmotiBlog annotation scheme. Furthermore, we created 20 factual and opinionated questions for each language and also the Gold Standard for their answers in the corpus. The purpose of our work is to study the challenges involved in a mixed fact and opinion question answering setting by comparing the performance of two Question Answering (QA) systems as far as mixed opinion and factual setting is concerned. The first one is open domain, while the second one is opinion-oriented. We evaluate separately the two systems in both languages and propose possible solutions to improve QA systems that have to process mixed questions.
Resumo:
This paper shows a system about the recognition of temporal expressions in Spanish and the resolution of their temporal reference. For the identification and recognition of temporal expressions we have based on a temporal expression grammar and for the resolution on an inference engine, where we have the information necessary to do the date operation based on the recognized expressions. For further information treatment, the output is proposed by means of XML tags in order to add standard information of the resolution obtained. Different kinds of annotation of temporal expressions are explained in another articles [WILSON2001][KATZ2001]. In the evaluation of our proposal we have obtained successful results.
Resumo:
This paper shows an empirical study about the anaphoric accessibility space in Spanish dialogues. According to this study, antecedents of pronominal and adjectival anaphors can almost always (95.9%) be found in the noun phrases set taken from spaces defined using a structure based on adjacency pairs. Furthermore, a proposal of a reliable annotation scheme for Spanish dialogues is presented in order to define this anaphoric accessibility space. Using this annotation scheme, anaphora resolution algorithms can locate the adequate set of anaphor antecedent candidates.
Resumo:
Preliminary research demonstrated the EmotiBlog annotated corpus relevance as a Machine Learning resource to detect subjective data. In this paper we compare EmotiBlog with the JRC Quotes corpus in order to check the robustness of its annotation. We concentrate on its coarse-grained labels and carry out a deep Machine Learning experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC Quotes corpus demonstrating the EmotiBlog validity as a resource for the SA task.
Resumo:
The development of the Web 2.0 led to the birth of new textual genres such as blogs, reviews or forum entries. The increasing number of such texts and the highly diverse topics they discuss make blogs a rich source for analysis. This paper presents a comparative study on open domain and opinion QA systems. A collection of opinion and mixed fact-opinion questions in English is defined and two Question Answering systems are employed to retrieve the answers to these queries. The first one is generic, while the second is specific for emotions. We comparatively evaluate and analyze the systems’ results, concluding that opinion Question Answering requires the use of specific resources and methods.
Resumo:
The exponential growth of the subjective information in the framework of the Web 2.0 has led to the need to create Natural Language Processing tools able to analyse and process such data for multiple practical applications. They require training on specifically annotated corpora, whose level of detail must be fine enough to capture the phenomena involved. This paper presents EmotiBlog – a fine-grained annotation scheme for subjectivity. We show the manner in which it is built and demonstrate the benefits it brings to the systems using it for training, through the experiments we carried out on opinion mining and emotion detection. We employ corpora of different textual genres –a set of annotated reported speech extracted from news articles, the set of news titles annotated with polarity and emotion from the SemEval 2007 (Task 14) and ISEAR, a corpus of real-life self-expressed emotion. We also show how the model built from the EmotiBlog annotations can be enhanced with external resources. The results demonstrate that EmotiBlog, through its structure and annotation paradigm, offers high quality training data for systems dealing both with opinion mining, as well as emotion detection.
Resumo:
This paper presents the first version of EmotiBlog, an annotation scheme for emotions in non-traditional textual genres such as blogs or forums. We collected a corpus composed by blog posts in three languages: English, Spanish and Italian and about three topics of interest. Subsequently, we annotated our collection and carried out the inter-annotator agreement and a ten-fold cross-validation evaluation, obtaining promising results. The main aim of this research is to provide a finer-grained annotation scheme and annotated data that are essential to perform evaluation focused on checking the quality of the created resources.
Resumo:
This paper presents a preliminary study in which Machine Learning experiments applied to Opinion Mining in blogs have been carried out. We created and annotated a blog corpus in Spanish using EmotiBlog. We evaluated the utility of the features labelled firstly carrying out experiments with combinations of them and secondly using the feature selection techniques, we also deal with several problems, such as the noisy character of the input texts, the small size of the training set, the granularity of the annotation scheme and the language object of our study, Spanish, with less resource than English. We obtained promising results considering that it is a preliminary study.
Resumo:
EmotiBlog is a corpus labelled with the homonymous annotation schema designed for detecting subjectivity in the new textual genres. Preliminary research demonstrated its relevance as a Machine Learning resource to detect opinionated data. In this paper we compare EmotiBlog with the JRC corpus in order to check the EmotiBlog robustness of annotation. For this research we concentrate on its coarse-grained labels. We carry out a deep ML experimentation also with the inclusion of lexical resources. The results obtained show a similarity with the ones obtained with the JRC demonstrating the EmotiBlog validity as a resource for the SA task.
Resumo:
In this paper a multilingual method for event ordering based on temporal expression resolution is presented. This method has been implemented through the TERSEO system which consists of three main units: temporal expression recognizing, resolution of the coreference introduced by these expressions, and event ordering. By means of this system, chronological information related to events can be extracted from documental databases. This information is automatically added to the documental database in order to allow its use by question answering systems in those cases referring to temporality. The system has been evaluated obtaining results of 91 % precision and 71 % recall. For this, a blind evaluation process has been developed guaranteing a reliable annotation process that was measured through the kappa factor.
Resumo:
The McCabe-Thiele method is a classical approximate graphical method for the conceptual design of binary distillation columns which is still widely used, mainly for didactical purposes, though it is also valuable for quick preliminary calculations. Nevertheless, no complete description of the method has been found and situations such as different thermal feed conditions, multiple feeds, possibilities to extract by-products or to add or remove heat, are not always considered. In the present work we provide a systematic analysis of such situations by developing the generalized equations for: a) the operating lines (OL) of each sector, and b) the changeover line that provides the connection between two consecutive trays of the corresponding sectors separated by a lateral stream of feed, product, or a heat removal or addition.
Resumo:
IARG-AnCora tiene como objetivo la anotación con papeles temáticos de los argumentos implícitos de las nominalizaciones deverbales en el corpus AnCora. Estos corpus servirán de base para los sistemas de etiquetado automático de roles semánticos basados en técnicas de aprendizaje automático. Los analizadores semánticos son componentes básicos en las aplicaciones actuales de las tecnologías del lenguaje, en las que se quiere potenciar una comprensión más profunda del texto para realizar inferencias de más alto nivel y obtener así mejoras cualitativas en los resultados.