921 results for Processing wikipedia data


Relevance:

30.00%

Abstract:

The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as those related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained when using different settings of the system, as well as the error analysis performed after the competition, offered us some clear insights into the combination of techniques that best balances precision and recall. Contrary to our initial intuitions, we have also seen that the inclusion of specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers the system performance, while the use of topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis leads to satisfactory system performance, both in the monolingual setting and in a multilingual one.

Relevance:

30.00%

Abstract:

Robotics is one of the most active research areas, and building robots requires combining a large number of disciplines. Given these premises, one problem is the management of information from multiple heterogeneous sources. Each component, hardware or software, produces data of a different nature: temporal frequencies, processing needs, size, type, etc. Nowadays, technologies and software engineering paradigms such as service-oriented architectures are applied to solve this problem in other areas. This paper proposes the use of these technologies to implement a robotic control system based on services. This type of system will allow the integration and collaborative work of the different elements that make up a robotic system.

Relevance:

30.00%

Abstract:

The Gaia-ESO Survey is a large public spectroscopic survey that aims to derive radial velocities and fundamental parameters of about 10⁵ Milky Way stars in the field and in clusters. Observations are carried out with the multi-object optical spectrograph FLAMES, using simultaneously the medium-resolution (R ~ 20 000) GIRAFFE spectrograph and the high-resolution (R ~ 47 000) UVES spectrograph. In this paper we describe the methods and the software used for the data reduction, the derivation of the radial velocities, and the quality control of the FLAMES-UVES spectra. Data reduction has been performed using a workflow specifically developed for this project. This workflow runs the ESO public pipeline optimizing the data reduction for the Gaia-ESO Survey, automatically performs sky subtraction, barycentric correction and normalisation, and calculates radial velocities and a first guess of the rotational velocities. The quality control is performed using the output parameters from the ESO pipeline, by a visual inspection of the spectra and by the analysis of the signal-to-noise ratio of the spectra. Using the observations of the first 18 months, specifically targets observed multiple times at different epochs, stars observed with both GIRAFFE and UVES, and observations of radial velocity standards, we estimated the precision and the accuracy of the radial velocities. The statistical error on the radial velocities is σ ~ 0.4 km s⁻¹ and is mainly due to uncertainties in the zero point of the wavelength calibration. However, we found a systematic bias with respect to the GIRAFFE spectra (~0.9 km s⁻¹) and to the radial velocities of the standard stars (~0.5 km s⁻¹) retrieved from the literature. This bias will be corrected in future data releases, once a common zero point for all the set-ups and instruments used for the survey has been established.
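To give a sense of scale for the velocity errors quoted above, the following minimal sketch uses the classical (non-relativistic) Doppler relation v = c·Δλ/λ. It is only an illustration: the abstract does not say how the pipeline measures velocities, and the function names and the 500 nm rest wavelength are assumptions made here.

```python
# Illustrative only: non-relativistic Doppler relation, not the survey's method.
C_KM_S = 299792.458  # speed of light in km/s


def radial_velocity_kms(observed_nm, rest_nm):
    """Radial velocity implied by a shift of one spectral line (positive = receding)."""
    return C_KM_S * (observed_nm - rest_nm) / rest_nm


def wavelength_shift_nm(velocity_kms, rest_nm):
    """Wavelength shift that a given radial velocity produces at rest_nm."""
    return rest_nm * velocity_kms / C_KM_S


if __name__ == "__main__":
    rest = 500.0  # hypothetical rest wavelength in nm
    for label, v in [("statistical error", 0.4), ("bias vs GIRAFFE", 0.9)]:
        shift = wavelength_shift_nm(v, rest)
        print(f"{label}: {v} km/s corresponds to {shift:.2e} nm at {rest} nm")
```

Under these assumptions, a velocity of 0.4 km s⁻¹ corresponds to a shift of well under a thousandth of a nanometre at optical wavelengths, which is consistent with the statement that the wavelength-calibration zero point dominates the error budget.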

Relevance:

30.00%

Abstract:

3D sensors provide valuable information for mobile robotic tasks like scene classification or object recognition, but these sensors often produce noisy data that make it impossible to apply classical keypoint detection and feature extraction techniques. Therefore, noise removal and downsampling have become essential steps in 3D data processing. In this work, we propose the use of a 3D filtering and down-sampling technique based on a Growing Neural Gas (GNG) network. The GNG method is able to deal with outliers present in the input data. These features allow it to represent 3D spaces, obtaining an induced Delaunay Triangulation of the input space. Experiments show how state-of-the-art keypoint detectors improve their performance when using the GNG output representation as input data. Descriptors extracted on the improved keypoints achieve better matching in robotics applications such as 3D scene registration.

Relevance:

30.00%

Abstract:

This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.
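The abstract reports error reductions without defining the measure. A common convention, assumed here and not confirmed by the source, treats the error as 1 minus a score such as F1 and reports the relative drop in that error. The sketch below uses hypothetical scores chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def error_reduction(baseline_score, improved_score):
    """Relative error reduction, assuming error = 1 - score (scores in [0, 1])."""
    baseline_error = 1.0 - baseline_score
    improved_error = 1.0 - improved_score
    if baseline_error <= 0.0:
        raise ValueError("baseline has no error to reduce")
    return (baseline_error - improved_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical scores: a baseline at 0.80 and an improved system at 0.918.
    reduction = error_reduction(0.80, 0.918)
    print(f"relative error reduction: {reduction:.0%}")
```

Under this convention, a large relative error reduction can correspond to a modest absolute gain when the baseline score is already high.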

Relevance:

30.00%

Abstract:

In autumn 2012, the new release 05 (RL05) of monthly geopotential spherical harmonic Stokes coefficients (SC) from the GRACE (Gravity Recovery and Climate Experiment) mission was published. This release reduces the noise in high degree and order SC, but they still need to be filtered. One of the most common filtering procedures is the combination of decorrelation and Gaussian filters. Both are parameter-dependent and must be tuned by the user. Previous studies have analyzed the choice of parameters for the RL05 GRACE data for oceanic applications, and for RL04 data for global applications. This study updates the latter for RL05 data, extending the statistical analysis. The choice of the parameters of the decorrelation filter has been optimized to: (1) balance the noise reduction and the geophysical signal attenuation produced by the filtering process; (2) minimize the differences between GRACE and model-based data; (3) maximize the ratio of variability between continents and oceans. The Gaussian filter has been optimized following the latter criteria. In addition, an anisotropic filter, the fan filter, has been analyzed as an alternative to the Gaussian filter, producing better statistics.
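The Gaussian filter mentioned above is usually applied as a set of degree-dependent weights that multiply the Stokes coefficients. The sketch below computes such weights with the widely used recursion attributed to Jekeli, normalized so that the degree-0 weight is 1. The abstract does not state which formulation or radius the study used, so the function name, the 300 km radius and the mean Earth radius are assumptions for illustration.

```python
import math


def gaussian_weights(radius_km, max_degree, earth_radius_km=6371.0):
    """Degree-dependent Gaussian smoothing weights W_0..W_max_degree.

    radius_km is the smoothing half-width on the Earth's surface.
    Note: this forward recursion loses accuracy at high degree,
    so it is meant for illustration at low to moderate degrees.
    """
    b = math.log(2.0) / (1.0 - math.cos(radius_km / earth_radius_km))
    weights = [1.0]  # W_0, normalized to 1
    if max_degree >= 1:
        e = math.exp(-2.0 * b)
        weights.append((1.0 + e) / (1.0 - e) - 1.0 / b)  # W_1
    for l in range(1, max_degree):
        # W_{l+1} = -(2l + 1) / b * W_l + W_{l-1}
        weights.append(-(2 * l + 1) / b * weights[l] + weights[l - 1])
    return weights


if __name__ == "__main__":
    for degree, w in enumerate(gaussian_weights(300.0, 20)):
        print(degree, round(w, 4))
```

Each Stokes coefficient of degree l would then be multiplied by the weight for that degree, so higher degrees, where the noise is concentrated, are damped more strongly. Choosing the radius is the tuning step the abstract refers to: a larger radius removes more noise but also attenuates more geophysical signal.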