Biblioteca Digital

955 resultados para Temporal Information Extraction

PCA Tomography: how to extract information from data cubes

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Astronomy has evolved almost exclusively by the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which one combines both techniques simultaneously, producing images with spectral resolution. To extract information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method of analysis of data cube (data from single field observations, containing two spatial and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in the form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data on to these coordinates produce images we will call tomograms. The association of the tomograms (images) to eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvector`s orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purpose only, to the central region of the low ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.

Content extraction: Identifying the main content in HTML documents

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Except the article forming the main content most HTML documents on the WWW contain additional contents such as navigation menus, design elements or commercial banners. In the context of several applications it is necessary to draw the distinction between main and additional content automatically. Content extraction and template detection are the two approaches to solve this task. This thesis gives an extensive overview of existing algorithms from both areas. It contributes an objective way to measure and evaluate the performance of content extraction algorithms under different aspects. These evaluation measures allow to draw the first objective comparison of existing extraction solutions. The newly introduced content code blurring algorithm overcomes several drawbacks of previous approaches and proves to be the best content extraction algorithm at the moment. An analysis of methods to cluster web documents according to their underlying templates is the third major contribution of this thesis. In combination with a localised crawling process this clustering analysis can be used to automatically create sets of training documents for template detection algorithms. As the whole process can be automated it allows to perform template detection on a single document, thereby combining the advantages of single and multi document algorithms.

Named Entity Extraction: la piattaforma Gate/Annie

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In questo lavoro si introducono i concetti di base di Natural Language Processing, soffermandosi su Information Extraction e analizzandone gli ambiti applicativi, le attività principali e la differenza rispetto a Information Retrieval. Successivamente si analizza il processo di Named Entity Recognition, focalizzando l’attenzione sulle principali problematiche di annotazione di testi e sui metodi per la valutazione della qualità dell’estrazione di entità. Infine si fornisce una panoramica della piattaforma software open-source di language processing GATE/ANNIE, descrivendone l’architettura e i suoi componenti principali, con approfondimenti sugli strumenti che GATE offre per l'approccio rule-based a Named Entity Recognition.

Temporal PageRank

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The our reality is characterized by a constant progress and, to follow that, people need to stay up to date on the events. In a world with a lot of existing news, search for the ideal ones may be difficult, because the obstacles that make it arduous will be expanded more and more over time, due to the enrichment of data. In response, a great help is given by Information Retrieval, an interdisciplinary branch of computer science that deals with the management and the retrieval of the information. An IR system is developed to search for contents, contained in a reference dataset, considered relevant with respect to the need expressed by an interrogative query. To satisfy these ambitions, we must consider that most of the developed IR systems rely solely on textual similarity to identify relevant information, defining them as such when they include one or more keywords expressed by the query. The idea studied here is that this is not always sufficient, especially when it's necessary to manage large databases, as is the web. The existing solutions may generate low quality responses not allowing, to the users, a valid navigation through them. The intuition, to overcome these limitations, has been to define a new concept of relevance, to differently rank the results. So, the light was given to Temporal PageRank, a new proposal for the Web Information Retrieval that relies on a combination of several factors to increase the quality of research on the web. Temporal PageRank incorporates the advantages of a ranking algorithm, to prefer the information reported by web pages considered important by the context itself in which they reside, and the potential of techniques belonging to the world of the Temporal Information Retrieval, exploiting the temporal aspects of data, describing their chronological contexts. In this thesis, the new proposal is discussed, comparing its results with those achieved by the best known solutions, analyzing its strengths and its weaknesses.

Performance on auditory and visual temporal informationprocessing is related to psychometric intelligence

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The present study investigated the relationship between psychometric intelligence and temporal resolution power (TRP) as simultaneously assessed by auditory and visual psychophysical timing tasks. In addition, three different theoretical models of the functional relationship between TRP and psychometric intelligence as assessed by means of the Adaptive Matrices Test (AMT) were developed. To test the validity of these models, structural equation modeling was applied. Empirical data supported a hierarchical model that assumed auditory and visual modality-specific temporal processing at a first level and amodal temporal processing at a second level. This second-order latent variable was substantially correlated with psychometric intelligence. Therefore, the relationship between psychometric intelligence and psychophysical timing performance can be explained best by a hierarchical model of temporal information processing.

Adaptive spatio-temporal filter for low-cost camera depth maps

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper we present an adaptive spatio-temporal filter that aims to improve low-cost depth camera accuracy and stability over time. The proposed system is composed by three blocks that are used to build a reliable depth map of static scenes. An adaptive joint-bilateral filter is used to obtain consistent depth maps by jointly considering depth and video information and by adapting its parameters to different levels of estimated noise. Kalman filters are used to reduce the temporal random fluctuations of the measurements. Finally an interpolation algorithm is used to obtain consistent depth maps in the regions where the depth information is not available. Results show that this approach allows to considerably improve the depth maps quality by considering spatio-temporal information and by adapting its parameters to different levels of noise.

Efficient spatio-temporal hole filling strategy for Kinect depth maps

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper we present an efficient hole filling strategy that improves the quality of the depth maps obtained with the Microsoft Kinect device. The proposed approach is based on a joint-bilateral filtering framework that includes spatial and temporal information. The missing depth values are obtained applying iteratively a joint-bilateral filter to their neighbor pixels. The filter weights are selected considering three different factors: visual data, depth information and a temporal-consistency map. Video and depth data are combined to improve depth map quality in presence of edges and homogeneous regions. Finally, the temporal-consistency map is generated in order to track the reliability of the depth measurements near the hole regions. The obtained depth values are included iteratively in the filtering process of the successive frames and the accuracy of the hole regions depth values increases while new samples are acquired and filtered

Context-sensitive synaptic plasticity and temporal-to-spatial transformations in hippocampal slices

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Hippocampal slices are used to show that, as a temporal input pattern of activity flows through a neuronal layer, a temporal-to-spatial transformation takes place. That is, neurons can respond selectively to the first or second of a pair of input pulses, thus transforming different temporal patterns of activity into the activity of different neurons. This is demonstrated using associative long-term potentiation of polysynaptic CA1 responses as an activity-dependent marker: by depolarizing a postsynaptic CA1 neuron exclusively with the first or second of a pair of pulses from the dentate gyrus, it is possible to “tag” different subpopulations of CA3 neurons. This technique allows sampling of a population of neurons without recording simultaneously from multiple neurons. Furthermore, it reflects a biologically plausible mechanism by which single neurons may develop selective responses to time-varying stimuli and permits the induction of context-sensitive synaptic plasticity. These experimental results support the view that networks of neurons are intrinsically able to process temporal information and that it is not necessary to invoke the existence of internal clocks or delay lines for temporal processing on the time scale of tens to hundreds of milliseconds.

Recognizing and tagging temporal expressions in Spanish

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper shows a system about the recognition of temporal expressions in Spanish and the resolution of their temporal reference. For the identification and recognition of temporal expressions we have based on a temporal expression grammar and for the resolution on an inference engine, where we have the information necessary to do the date operation based on the recognized expressions. For further information treatment, the output is proposed by means of XML tags in order to add standard information of the resolution obtained. Different kinds of annotation of temporal expressions are explained in another articles [WILSON2001][KATZ2001]. In the evaluation of our proposal we have obtained successful results.

Applying semantic knowledge to the automatic processing of temporal expressions and events in natural language

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.

How do world-class cricket batsmen anticipate a bowler's intention?

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Four experiments are reported that examine the ability of cricket batsmen of different skill levels to pick up advance information to anticipate the type and length of balls bowled by swing and spin bowlers. The information available upon which to make the predictive judgements was manipulated through a combination of temporal occlusion of the display and selective occlusion or presentation of putative anticipatory cues. In addition to a capability to pick up advance information from the same cues used by intermediate and low-skilled players, highly skilled players demonstrated the additional, unique capability to pick up advance information from some specific early cues (especially bowling hand and arm cues) to which the less skilled players were not attuned. The acquisition of expert perceptual-motor skill appears to involve not only refinement of information extraction but also progression to the use of earlier, kinematically relevant sources of information.

Biomedical events extraction using the hidden vector state model

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Objective: Biomedical events extraction concerns about events describing changes on the state of bio-molecules from literature. Comparing to the protein-protein interactions (PPIs) extraction task which often only involves the extraction of binary relations between two proteins, biomedical events extraction is much harder since it needs to deal with complex events consisting of embedded or hierarchical relations among proteins, events, and their textual triggers. In this paper, we propose an information extraction system based on the hidden vector state (HVS) model, called HVS-BioEvent, for biomedical events extraction, and investigate its capability in extracting complex events. Methods and material: HVS has been previously employed for extracting PPIs. In HVS-BioEvent, we propose an automated way to generate abstract annotations for HVS training and further propose novel machine learning approaches for event trigger words identification, and for biomedical events extraction from the HVS parse results. Results: Our proposed system achieves an F-score of 49.57% on the corpus used in the BioNLP'09 shared task, which is only 2.38% lower than the best performing system by UTurku in the BioNLP'09 shared task. Nevertheless, HVS-BioEvent outperforms UTurku's system on complex events extraction with 36.57% vs. 30.52% being achieved for extracting regulation events, and 40.61% vs. 38.99% for negative regulation events. Conclusions: The results suggest that the HVS model with the hierarchical hidden state structure is indeed more suitable for complex event extraction since it could naturally model embedded structural context in sentences.

Ontology-based protein-protein interactions extraction from literature using the hidden vector state model

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper proposes a novel framework of incorporating protein-protein interactions (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It is built upon the existing work on relation extraction using the Hidden Vector State (HVS) model. The HVS model belongs to the category of statistical learning methods. It can be trained directly from un-annotated data in a constrained way whilst at the same time being able to capture the underlying named entity relationships. However, it is difficult to incorporate background knowledge or non-local information into the HVS model. This paper proposes to represent the HVS model as a conditionally trained undirected graphical model in which non-local features derived from PPI ontology through inference would be easily incorporated. The seamless fusion of ontology inference with statistical learning produces a new paradigm to information extraction.

Browsing for information by highlighting automatically generated annotations:a user study and evaluation

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The realization of the Semantic Web is constrained by a knowledge acquisition bottleneck, i.e. the problem of how to add RDF mark-up to the millions of ordinary web pages that already exist. Information Extraction (IE) has been proposed as a solution to the annotation bottleneck. In the task based evaluation reported here, we compared the performance of users without access to annotation, users working with annotations which had been produced from manually constructed knowledge bases, and users working with annotations augmented using IE. We looked at retrieval performance, overlap between retrieved items and the two sets of annotations, and usage of annotation options. Automatically generated annotations were found to add value to the browsing experience in the scenario investigated. Copyright 2005 ACM.

Re-expressing business processes information from corporate documents into controlled language

Relevância:

90.00% 90.00%

Publicador:

«
1
2
3
4
5
6
7
8
...
63
64
»