12 results for Mining extraction model
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
In this work, the contributions and influences of fossils on the toponymic heritage associated with karstic cavities in the Balearic Islands are compiled and studied from an ethnopalaeontological perspective. These are generally modern or recent microtoponyms in the process of popularization and/or traditionalization (neotoponyms), coined by the scientists who study the caves and/or by practitioners of the sport of speleology (topocultismes). A distinction can be drawn between first-order speleotoponyms (referring to entire cavities) and second-order ones (referring to a sector of a cavity). A first approach is also made to toponyms referring to fossil-coal extraction mines (anthracotoponyms).
Abstract:
Consider a model with parameter phi, and an auxiliary model with parameter theta. Let phi be randomly sampled from a given density over the known parameter space. Monte Carlo methods can be used to draw simulated data and compute the corresponding estimate of theta, say theta_tilde. A large set of tuples (phi, theta_tilde) can be generated in this manner. Nonparametric methods may be used to fit the function E(phi|theta_tilde=a) from these tuples. It is proposed to estimate phi by the fitted E(phi|theta_tilde=theta_hat), where theta_hat is the auxiliary estimate computed from the real sample data. Under certain assumptions, this estimator is consistent and asymptotically normally distributed. Monte Carlo results for dynamic panel data and vector autoregressions show that it can have very attractive small-sample properties. Confidence intervals can be constructed from the quantiles of the phi draws for which theta_tilde is close to theta_hat; such intervals are found to have very accurate coverage.
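The procedure lends itself to a compact Monte Carlo sketch. The following Python illustration assumes an AR(1) data-generating process with the short-sample OLS slope as the auxiliary estimator (the abstract fixes neither choice), a Gaussian-kernel fit of E(phi|theta_tilde), and a quantile interval from the phi draws near theta_hat; all names and tuning constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(phi, T=30):
    """Simulate a short AR(1) series; serves as the structural model here."""
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + rng.standard_normal()
    return y

def aux_estimate(y):
    # OLS of y_t on y_{t-1}: the auxiliary estimator, biased in short samples.
    return y[:-1] @ y[1:] / (y[:-1] @ y[:-1])

# 1) Draw phi from a density over the parameter space; simulate and estimate.
phis = rng.uniform(-0.95, 0.95, size=20000)
thetas = np.array([aux_estimate(simulate_ar1(p)) for p in phis])

# 2) Nadaraya-Watson fit of E(phi | theta_tilde = a) from the tuples.
def e_phi_given_theta(a, h=0.02):
    w = np.exp(-0.5 * ((thetas - a) / h) ** 2)
    return (w @ phis) / w.sum()

# 3) Evaluate at the real-sample auxiliary estimate and build a quantile CI.
theta_hat = aux_estimate(simulate_ar1(0.8))   # stand-in for the real sample
phi_est = e_phi_given_theta(theta_hat)
near = np.abs(thetas - theta_hat) < 0.02
ci = np.quantile(phis[near], [0.025, 0.975])
print(phi_est, ci)
```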
Abstract:
The two main alternative methods used to identify key sectors within the input-output approach, the Classical Multiplier method (CMM) and the Hypothetical Extraction method (HEM), are formally and empirically compared in this paper. Our findings indicate that the main distinction between the two approaches stems from the role of internal effects: these are quantified under the CMM, while under the HEM only external impacts are considered. In our comparison we find, however, that CMM backward measures are more influenced by within-block effects than the forward indices proposed under this approach. The conclusions of this comparison allow us to develop a hybrid proposal that combines the two existing approaches. This hybrid model has the advantage of making it possible to distinguish and disaggregate external effects from those that are purely internal. The proposal is also of interest in terms of policy implications: the hybrid approach may provide useful information for the design of 'second best' stimulus policies that aim at a more balanced perspective between overall economy-wide impacts and their sectoral distribution.
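A compact numerical sketch of the two measures, using a hypothetical three-sector coefficients matrix and one common complete-extraction variant of the HEM (the paper's exact formulation may differ):

```python
import numpy as np

A = np.array([[0.2, 0.3, 0.1],      # illustrative technical-coefficients matrix
              [0.1, 0.1, 0.4],
              [0.3, 0.2, 0.2]])
f = np.array([100.0, 150.0, 80.0])  # final demand
I = np.eye(3)
L = np.linalg.inv(I - A)            # Leontief inverse
x = L @ f                           # gross output

# CMM: backward linkage of sector j = column sum of the Leontief inverse,
# which includes the sector's own (internal) multiplier effect.
cmm_backward = L.sum(axis=0)

# HEM: hypothetically extract sector j (zero its row and column in A) and
# measure the economy-wide output loss, capturing only external effects.
def hem_loss(j):
    A_ext = A.copy()
    A_ext[j, :] = 0.0
    A_ext[:, j] = 0.0
    x_ext = np.linalg.inv(I - A_ext) @ f
    return x.sum() - x_ext.sum()

hem = np.array([hem_loss(j) for j in range(3)])
print(cmm_backward, hem)
```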
Abstract:
A predictive model based on Bayesian networks that identifies the patients at highest risk of hospital admission from a set of demographic and clinical attributes.
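A minimal sketch of such a model using the pgmpy library; the network structure, attributes, and toy records below are hypothetical, as the abstract does not specify them:

```python
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Tiny synthetic record set: demographic/clinical attributes plus outcome.
df = pd.DataFrame({
    'age':          ['senior', 'adult', 'senior', 'adult', 'senior', 'adult'],
    'prior_visits': ['many', 'few', 'many', 'few', 'few', 'many'],
    'admitted':     ['yes', 'no', 'yes', 'no', 'no', 'yes'],
})

# Structure: both attributes are parents of the admission node.
model = BayesianNetwork([('age', 'admitted'), ('prior_visits', 'admitted')])
model.fit(df, estimator=MaximumLikelihoodEstimator)

# Rank a patient's admission risk via the posterior P(admitted | evidence).
posterior = VariableElimination(model).query(
    variables=['admitted'], evidence={'age': 'senior', 'prior_visits': 'many'})
print(posterior)
```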
Analysis and evaluation of techniques for the extraction of classes in the ontology learning process
Abstract:
This paper analyzes and evaluates, in the context of ontology learning, some techniques to identify and extract candidate terms for the classes of a taxonomy. In addition, this work points out some inconsistencies that may occur in the preprocessing of the text corpus, and proposes techniques to obtain good candidate terms for taxonomy classes.
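As an illustration only (the paper's specific techniques are not reproduced here), a frequency-based sketch of candidate-term extraction with consistent corpus preprocessing:

```python
import re
from collections import Counter

STOP = {'the', 'a', 'an', 'of', 'and', 'in', 'to', 'is', 'are', 'this', 'that'}

def candidate_terms(corpus, min_freq=2):
    """Normalize the corpus, drop stopwords, and keep frequent terms as
    candidates for taxonomy classes."""
    # Consistent preprocessing (lowercasing, punctuation stripping) avoids
    # the corpus inconsistencies the paper warns about.
    tokens = re.findall(r'[a-z]+', corpus.lower())
    counts = Counter(t for t in tokens if t not in STOP and len(t) > 2)
    return [term for term, n in counts.most_common() if n >= min_freq]

docs = ("An ontology describes classes. Classes in the ontology form a "
        "taxonomy, and taxonomy classes are extracted from the corpus.")
print(candidate_terms(docs))
```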
Abstract:
The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks of PANACEA. The CAC, which is the first stage in the PANACEA pipeline for building Language Resources, adopts an efficient and distributed methodology to crawl for web documents with rich textual content in specific languages and predefined domains. The CAC includes modules that can acquire parallel data from sites with in-domain content available in more than one language. In order to extrinsically evaluate the CAC methodology, we conducted several experiments that used crawled parallel corpora for the identification and extraction of parallel sentences using sentence alignment. The corpora were then successfully used for domain adaptation of Machine Translation systems.
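The sentence-alignment step can be illustrated with a crude greedy length-based pairing, a simplified stand-in for Gale-Church style alignment; the CAC's actual aligner and thresholds are not specified in the abstract.

```python
def align_sentences(src_sents, tgt_sents, max_ratio=1.5):
    """Greedy 1:1 pairing of sentences whose character lengths are similar,
    a rough proxy for translation equivalence in a crawled document pair."""
    pairs = []
    i = j = 0
    while i < len(src_sents) and j < len(tgt_sents):
        ls, lt = len(src_sents[i]), len(tgt_sents[j])
        ratio = max(ls, lt) / max(1, min(ls, lt))
        if ratio <= max_ratio:
            pairs.append((src_sents[i], tgt_sents[j]))
            i += 1
            j += 1
        elif ls > lt:          # target side likely split or noisy: skip it
            j += 1
        else:
            i += 1
    return pairs

src = ["The platform crawls the web.", "It extracts textual content."]
tgt = ["La plataforma rastreja la web.", "Extreu contingut textual."]
print(align_sentences(src, tgt))
```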
Abstract:
The automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce the time and effort spent in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
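A minimal sketch of such a bootstrap, using conjunction cues ("and" tends to preserve polarity, "but" to flip it); the cue set, corpus, and seeds are invented, and the paper's actual linguistic cues may differ.

```python
import re

# Toy corpus and seed polar adjectives.
corpus = ("The room was clean and spacious but noisy. "
          "The staff were friendly and helpful. "
          "The street was noisy but cheap.")
seeds_pos, seeds_neg = {'clean', 'friendly'}, {'noisy'}

def bootstrap(pos, neg, text, iterations=5):
    """Iteratively propagate polarity labels across conjoined adjectives."""
    pos, neg = set(pos), set(neg)
    # Lookahead keeps overlapping "x and y" / "x but y" pairs.
    pairs = re.findall(r'(\w+) (and|but) (?=(\w+))', text.lower())
    for _ in range(iterations):
        for a, cue, b in pairs:
            for x, y in ((a, b), (b, a)):
                if x in pos:
                    (pos if cue == 'and' else neg).add(y)
                elif x in neg:
                    (neg if cue == 'and' else pos).add(y)
    return pos, neg

print(bootstrap(seeds_pos, seeds_neg, corpus))
```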
Abstract:
Monetary policy is conducted in an environment of uncertainty. This paper sets up a model where the central bank uses real-time data from the bond market together with standard macroeconomic indicators to estimate the current state of the economy more efficiently, while taking into account that its own actions influence what it observes. The timeliness of bond market data allows for quicker responses of monetary policy to disturbances compared to the case when the central bank has to rely solely on collected aggregate data. The information content of the term structure creates a link between the bond market and the macroeconomy that is novel to the literature. To quantify the importance of the bond market as a source of information, the model is estimated on data for the United States and Australia using Bayesian methods. The empirical exercise suggests that there is some information in the US term structure that helps the Federal Reserve to identify shocks to the economy on a timely basis. Australian bond prices seem to be less informative than their US counterparts, perhaps because Australia is a relatively small and open economy.
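The signal-extraction machinery behind such an exercise can be illustrated with one step of a generic linear Kalman filter; the state, matrices, and "bond yield" observable below are hypothetical and do not correspond to the paper's estimated model.

```python
import numpy as np

def kalman_update(x, P, y, H, R, F, Q):
    """One predict-update step of a linear Kalman filter."""
    # Predict the latent state of the economy forward one period.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update with whatever indicators y arrived this period (H selects them).
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical 2-state economy observed through a single bond-market yield.
x, P = np.zeros(2), np.eye(2)
F, Q = np.array([[0.9, 0.1], [0.0, 0.8]]), 0.1 * np.eye(2)
H, R = np.array([[1.0, 0.5]]), np.array([[0.05]])
x, P = kalman_update(x, P, np.array([0.3]), H, R, F, Q)
print(x)
```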
Abstract:
Exclusive J/Psi electroproduction is studied in the framework of analytic S-matrix theory. The differential and integrated elastic cross sections are calculated using the modified dual amplitude with Mandelstam analyticity model. The model is applied to the description of the available experimental data and proves to be valid in a wide region of the kinematical variables s, t, and Q². Our amplitude can also be used as a universal background parametrization for the extraction of tiny resonance signals.
Abstract:
Background: Oscillatory activity, which can be separated into background and oscillatory-burst pattern activities, is supposed to be representative of local synchronies of neural assemblies. Oscillatory burst events should consequently play a specific functional role, distinct from background EEG activity, especially in cognitive tasks (e.g. working memory tasks), binding mechanisms and perceptual dynamics (e.g. visual binding), or in clinical contexts (e.g. effects of brain disorders). However, extracting oscillatory events in single trials, with a reliable and consistent method, is not a simple task. Results: In this work we propose a user-friendly stand-alone toolbox that fits, in a reasonable time, a bump time-frequency model to the wavelet representations of a set of signals. The software is provided with a Matlab toolbox which can compute the wavelet representations before automatically calling the stand-alone application. Conclusion: The tool is publicly available as freeware at http://www.bsp.brain.riken.jp/bumptoolbox/toolbox_home.html
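A rough Python stand-in for the first stage (the toolbox itself is Matlab-based): computing the wavelet time-frequency map on which bumps would then be fitted, here with PyWavelets and an invented test signal.

```python
import numpy as np
import pywt

# A noisy 10 Hz oscillatory burst embedded in background activity.
fs = 250                                   # assumed sampling rate (Hz)
t = np.arange(0, 2, 1 / fs)
burst = np.sin(2 * np.pi * 10 * t) * np.exp(-((t - 1) ** 2) / 0.02)
signal = burst + 0.3 * np.random.randn(t.size)

# Continuous wavelet transform: the time-frequency representation on which
# parametric "bumps" would be fitted to high-power regions.
scales = np.arange(2, 64)
coefs, freqs = pywt.cwt(signal, scales, 'morl', sampling_period=1 / fs)
power = np.abs(coefs) ** 2                 # candidate bumps = local maxima here
print(power.shape, freqs[power.max(axis=1).argmax()])
```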
Abstract:
Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to the New York Times' Blogging Heads opinion blog. The social network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" one person has over another. The states of the Influence Model encode audio/visual features automatically extracted from our videos using state-of-the-art algorithms. Our results are reported in terms of the accuracy of audio/visual data fusion for speaker segmentation and of the centrality measures used to characterize the extracted social network.
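A small sketch of the final analysis step: building a directed weighted graph and computing centrality measures with networkx. The speakers and influence weights are invented, standing in for the Influence Model's output.

```python
import networkx as nx

# Directed graph of a hypothetical dyadic-interaction network; edge weights
# stand in for the estimated "influence" one speaker has over another.
G = nx.DiGraph()
G.add_weighted_edges_from([
    ('alice', 'bob', 0.7), ('bob', 'alice', 0.3),
    ('alice', 'carol', 0.6), ('carol', 'bob', 0.4),
])

# Centrality measures characterize who drives the conversation network.
in_influence = dict(G.in_degree(weight='weight'))
pagerank = nx.pagerank(G, weight='weight')
print(in_influence)
print(pagerank)
```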
Abstract:
The ability to recognize a shape is linked to figure-ground (FG) organization. Cell preferences appear to be correlated across contrast-polarity reversals and mirror reversals of polygon displays, but not so much across FG reversals. Here we present a network structure which explains both shape coding by simulated IT cells and the suppression of responses to FG-reversed stimuli. In our model, FG segregation is achieved before shape discrimination, which is itself evidenced by the difference in spiking onsets of a pair of output cells. The studied example also includes feature extraction and illustrates a classification of binary images depending on the dominance of vertical or horizontal borders.
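A toy version of the border-dominance classification mentioned in the last sentence, using simple pixel differences rather than the paper's spiking network; the image and threshold are invented for illustration.

```python
import numpy as np

def dominant_orientation(img):
    """Classify a binary image by whether horizontal or vertical borders
    dominate, counted as intensity changes between adjacent pixels."""
    horiz = np.abs(np.diff(img.astype(int), axis=0)).sum()  # row-to-row changes
    vert = np.abs(np.diff(img.astype(int), axis=1)).sum()   # column-to-column changes
    return 'horizontal' if horiz > vert else 'vertical'

img = np.zeros((8, 8), dtype=int)
img[2:6, :] = 1          # a wide bar: its borders run horizontally
print(dominant_orientation(img))
```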