27 results for Topics Extraction
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Study produced from a stay at Xerox Research Centre Europe in Grenoble, France, between June and December 2006. The project translates English technical terms into Norwegian. It is asymmetric because linguistic resources are available only for English, not for Norwegian. Methods that check contiguity ("local reordering" and selective permutation) were developed and put into practice to improve the performance of an earlier tool. Contiguity means that when a word is translated into multiple words, those words must be adjacent in the sentence. In addition, a lookup table for the technical terms was built and integrated into a demonstration program.
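The contiguity constraint described in this abstract can be illustrated with a minimal sketch. It assumes a word alignment stored as a mapping from source word index to a list of target word indices; the function names and this data layout are hypothetical and are not taken from the study or its tool.

```python
def is_contiguous(target_positions):
    """Return True if the target positions form one unbroken run of indices."""
    positions = sorted(set(target_positions))
    if not positions:
        return True  # an unaligned source word violates nothing
    # An unbroken run of n distinct integers spans exactly n positions.
    return positions[-1] - positions[0] + 1 == len(positions)


def contiguity_violations(alignment):
    """List the source indices whose translations are not adjacent.

    `alignment` maps a source word index to the target word indices it is
    translated into (hypothetical layout, chosen for illustration).
    """
    return [src for src, tgt in alignment.items() if not is_contiguous(tgt)]


if __name__ == "__main__":
    # Source word 1 maps to target words 1 and 3, which are not adjacent.
    example = {0: [0], 1: [1, 3], 2: [2]}
    print(contiguity_violations(example))
```

A candidate translation would be rejected, or reordered, whenever the list of violations is non-empty.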
Abstract:
This work covers two aspects. First, it compares and summarizes the similarities and differences of state-of-the-art feature detectors and descriptors. Second, it presents a novel approach to detecting intestinal content (in particular bubbles) in capsule endoscopy images. Feature detectors and descriptors that are invariant to changes of perspective, scale, signal-to-noise ratio and lighting conditions are an important topic in current research, with a very wide range of possible applications. After analysing a selection of approaches presented in the literature, this work investigates their suitability for information extraction in capsule endoscopy images. Finally, a very well-performing detector of intestinal content in capsule endoscopy images is presented. Accurate detection of intestinal content is crucial for all kinds of machine learning approaches and other analyses of capsule endoscopy studies, because intestinal content occludes the field of view of the capsule camera and the affected frames therefore need to be excluded from analysis. As a by-product of this investigation, a Feature Analysis Tool with a graphical user interface is presented. It executes and compares the discussed feature detectors and descriptors on arbitrary images, with configurable parameters, and visualizes their output. The presented bubble classifier is also part of this tool, and if a ground truth is available (it can also be generated using this tool) a detailed visualization of the validation result is produced.
Abstract:
The two main alternative methods used to identify key sectors within the input-output approach, the Classical Multiplier method (CMM) and the Hypothetical Extraction method (HEM), are formally and empirically compared in this paper. Our findings indicate that the main distinction between the two approaches stems from the role of the internal effects. These internal effects are quantified under the CMM, while under the HEM only external impacts are considered. In our comparison we find, however, that CMM backward measures are more influenced by within-block effects than the proposed forward indices under this approach. The conclusions of this comparison allow us to develop a hybrid proposal that combines these two existing approaches. This hybrid model has the advantage of making it possible to distinguish and disaggregate external effects from those that are purely internal. This proposal is also of interest in terms of policy implications. Indeed, the hybrid approach may provide useful information for the design of "second best" stimulus policies that aim at a more balanced perspective between overall economy-wide impacts and their sectoral distribution.
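To make the two families of measures concrete, here is a minimal sketch using standard textbook definitions, not necessarily the exact indices of the paper. It assumes a technical coefficient matrix `A` and a final demand vector `f` (both illustrative), takes backward multipliers as column sums of the Leontief inverse, and implements one common extraction variant in which the whole row and column of the sector, and its final demand, are set to zero.

```python
import numpy as np


def leontief_inverse(A):
    """Return (I - A)^-1 for a square technical coefficient matrix A."""
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - A)


def backward_multipliers(A):
    """Classical backward multipliers: column sums of the Leontief inverse."""
    return leontief_inverse(A).sum(axis=0)


def extraction_loss(A, f, k):
    """Total output lost when sector k is hypothetically extracted.

    Assumed variant: row k and column k of A, and the final demand of
    sector k, are set to zero. Other variants keep some of these entries.
    """
    x_full = leontief_inverse(A) @ f
    A_ext = A.copy()
    A_ext[k, :] = 0.0
    A_ext[:, k] = 0.0
    f_ext = f.copy()
    f_ext[k] = 0.0
    x_ext = leontief_inverse(A_ext) @ f_ext
    return float(x_full.sum() - x_ext.sum())


if __name__ == "__main__":
    # Illustrative two-sector economy; the numbers are made up.
    A = np.array([[0.2, 0.3], [0.1, 0.4]])
    f = np.array([10.0, 20.0])
    print(backward_multipliers(A))
    print(extraction_loss(A, f, 0))
```

In this variant the loss includes the extracted sector's own output, so it mixes internal and external effects; a measure restricted to external impacts would subtract or exclude that part.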
Analysis and evaluation of techniques for the extraction of classes in the ontology learning process
Abstract:
This paper analyzes and evaluates, in the context of ontology learning, some techniques to identify and extract candidate terms for the classes of a taxonomy. In addition, this work points out some inconsistencies that may occur in the preprocessing of a text corpus, and proposes techniques to obtain good candidate terms for the classes of a taxonomy.
Abstract:
In this paper we present a description of the role of definitional verbal patterns for the extraction of semantic relations. Several studies show that semantic relations can be extracted from analytic definitions contained in machine-readable dictionaries (MRDs). In addition, definitions found in specialised texts are a good starting point to search for different types of definitions where other semantic relations occur. The extraction of definitional knowledge from specialised corpora represents another interesting approach for the extraction of semantic relations. Here, we present a descriptive analysis of definitional verbal patterns in Spanish and the first steps towards the development of a system for the automatic extraction of definitional knowledge.
Abstract:
Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce time and effort in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
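The general shape of a seed-based bootstrapping loop can be sketched as follows. The abstract does not say which linguistic cues are used, so this sketch assumes a simple and well-known one: adjectives joined by "and" share polarity, and adjectives joined by "but" have opposite polarity. The function name, the input format and the cue itself are illustrative assumptions, not the authors' algorithm.

```python
def bootstrap_polarity(seeds, pairs, max_iters=10):
    """Grow a polarity lexicon from seed adjectives.

    seeds: dict mapping adjective -> +1 (positive) or -1 (negative).
    pairs: list of (adjective_a, conjunction, adjective_b) tuples taken
           from a corpus (hypothetical input format).
    """
    lexicon = dict(seeds)
    for _ in range(max_iters):
        added = {}
        for a, conj, b in pairs:
            if conj not in ("and", "but"):
                continue  # only these two cues are modelled here
            sign = 1 if conj == "and" else -1
            for known, new in ((a, b), (b, a)):
                if known in lexicon and new not in lexicon and new not in added:
                    added[new] = sign * lexicon[known]
        if not added:
            break  # nothing new was labelled in this pass
        lexicon.update(added)
    return lexicon


if __name__ == "__main__":
    seeds = {"good": 1}
    pairs = [
        ("good", "and", "reliable"),
        ("reliable", "but", "slow"),
        ("slow", "and", "noisy"),
    ]
    print(bootstrap_polarity(seeds, pairs))
```

Each pass labels only adjectives linked to an already labelled one, so the lexicon grows outward from the seeds until no new adjective can be reached.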
Abstract:
This work briefly analyses the difficulties of adopting the Semantic Web and, in particular, proposes systems for gauging the current level of migration to the different technologies that make up the Semantic Web. It focuses on the presentation and description of two tools, DigiDocSpider and DigiDocMetaEdit, designed with the aim of verifying, evaluating, and promoting its implementation.
Abstract:
Monetary policy is conducted in an environment of uncertainty. This paper sets up a model where the central bank uses real-time data from the bond market together with standard macroeconomic indicators to estimate the current state of the economy more efficiently, while taking into account that its own actions influence what it observes. The timeliness of bond market data allows for quicker responses of monetary policy to disturbances compared to the case when the central bank has to rely solely on collected aggregate data. The information content of the term structure creates a link between the bond market and the macroeconomy that is novel to the literature. To quantify the importance of the bond market as a source of information, the model is estimated on data for the United States and Australia using Bayesian methods. The empirical exercise suggests that there is some information in the US term structure that helps the Federal Reserve to identify shocks to the economy on a timely basis. Australian bond prices seem to be less informative than their US counterparts, perhaps because Australia is a relatively small and open economy.
Abstract:
We report results from a randomized policy experiment designed to test whether increased audit risk deters rent extraction in local public procurement and service delivery in Brazil. Our estimates suggest that temporarily increasing annual audit risk by about 20 percentage points reduced the proportion of irregular local procurement processes by about 17 percentage points. This reduction was driven entirely by irregularities involving mismanagement or corruption. In contrast, we find no evidence that increased audit risk affected the quality of publicly provided preventive and primary health care services (measured based on user satisfaction surveys) or compliance with national regulations of the conditional cash transfer program "Bolsa Família".
Abstract:
We estimate the effect of state judiciary presence on rent extraction in Brazilian local governments. We measure rents as irregularities related to waste or corruption uncovered by auditors. Our unique dataset at the level of individual inspections allows us to separately examine extensive and intensive margins of rent extraction. The identification strategy is based on an institutional rule of state judiciary branches according to which prosecutors and judges tend to be assigned to the most populous among contiguous counties forming a judiciary district. Our research design exploits this rule by comparing counties that are largest in their district to counties with identical population size from other districts in the same state, where they are not the most populous. IV estimates suggest that state judiciary presence reduces the share of inspections with irregularities related to waste or corruption by about 10 percent or 0.3 standard deviations. In contrast, we find no effect on the intensive margin of rent extraction. Finally, our estimates suggest that judicial presence reduces rent extraction only for first-term mayors.
Abstract:
Two concentration methods for fast and routine determination of caffeine (using HPLC-UV detection) in surface water and wastewater are evaluated. Both methods are based on solid-phase extraction (SPE) concentration with octadecyl silica sorbents. A common "offline" SPE procedure shows that quantitative recovery of caffeine is obtained with 2 mL of a methanol-water elution mixture containing at least 60% methanol. The method detection limit is 0.1 μg L−1 when percolating 1 L samples through the cartridge. The development of an "online" SPE method based on a mini-SPE column, containing 100 mg of the same sorbent and directly connected to the HPLC system, allows the method detection limit to be decreased to 10 ng L−1 with a sample volume of 100 mL. The "offline" SPE method is applied to the analysis of caffeine in wastewater samples, whereas the "online" method is used for analysis in natural waters from streams receiving significant water intakes from local wastewater treatment plants.
Abstract:
Several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image and, finally, others are computed from the image where cloudy pixels are distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely, a total sky imager (TSI) and a whole sky imager (WSC), which are placed in two different areas of the world (Toowoomba, Australia; and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion on the future directions of this research is also presented, regarding both the use of other features and the use of other classification techniques.
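The three families of features named in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' feature set: the specific statistics, the high-frequency cutoff, and the red/blue ratio threshold used to separate cloudy from clear-sky pixels are all assumptions chosen for illustration, as is the function name `sky_features`.

```python
import numpy as np


def sky_features(rgb, rb_threshold=0.6):
    """Compute a few illustrative features from an H x W x 3 sky image."""
    rgb = np.asarray(rgb, dtype=float)
    red, blue = rgb[..., 0], rgb[..., 2]
    gray = rgb.mean(axis=2)

    # Statistical texture measures: mean and standard deviation of intensity.
    mean_intensity = float(gray.mean())
    std_intensity = float(gray.std())

    # Fourier feature: share of spectral power far from the zero frequency.
    power = np.abs(np.fft.fftshift(np.fft.fft2(gray - gray.mean()))) ** 2
    h, w = gray.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)
    total = power.sum()
    cutoff = min(h, w) / 4  # assumed boundary between low and high frequency
    high_freq = float(power[radius > cutoff].sum() / total) if total > 0 else 0.0

    # Cloud mask feature: pixels with a high red/blue ratio count as cloudy
    # (assumed rule; clear sky scatters more blue than red).
    ratio = red / np.maximum(blue, 1e-9)
    cloud_fraction = float((ratio > rb_threshold).mean())

    return {
        "mean_intensity": mean_intensity,
        "std_intensity": std_intensity,
        "high_freq_fraction": high_freq,
        "cloud_fraction": cloud_fraction,
    }


if __name__ == "__main__":
    # Synthetic uniform "blue sky" image, only to show the call.
    sky = np.zeros((8, 8, 3))
    sky[..., 0], sky[..., 1], sky[..., 2] = 60, 60, 180
    print(sky_features(sky))
```

The resulting dictionary could be turned into a feature vector and passed to any classifier; which features are most suitable is exactly the question the paper addresses.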
Abstract:
The current state of regional and urban science has been much discussed and a number of studies have speculated on possible future trends in the development of the discipline. However, there has been little empirical analysis of current publication patterns in regional and urban journals. This paper studies the kinds of topics, techniques and data used in articles published in nine top international journals during the 1990s with the aim of identifying current trends in this research field.