989 results for Automatic term extraction


Relevance:

100.00%

Publisher:

Abstract:

Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. Of the large number of methodologies available in the literature, only a few are able to handle both single- and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches on two different corpora and show that the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well on the Genia corpus (a standard life science corpus). This indicates that the choice and design of the corpus have a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion of the terminology in certain domains. As a result, algorithms that ignore single-word terms may cause problems for tasks built on top of ATR. Effective ATR systems also need to take into account both unstructured text and structured aspects, which means that information extraction techniques need to be integrated into the term recognition process.
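
The voting step can be illustrated with a small sketch: each base ATR algorithm contributes a ranked list of candidate terms, and candidates are re-ranked by summing reciprocal-rank votes. This is a minimal illustration under assumed data, not the paper's exact weighting scheme; the three ranked lists below are invented placeholders.

```python
from collections import defaultdict

def vote_rank(rankings):
    """Combine several ranked term lists by summing reciprocal-rank votes.

    rankings: list of lists, each ordered best-first by one ATR algorithm.
    Returns candidate terms ordered by their combined (voted) score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, term in enumerate(ranking, start=1):
            scores[term] += 1.0 / position      # better rank -> larger vote
    return sorted(scores, key=scores.get, reverse=True)

# Invented output of three hypothetical ATR algorithms on the same corpus.
ranked_by_algo = [
    ["ontology learning", "semantic search", "term"],
    ["term", "ontology learning", "voting mechanism"],
    ["ontology learning", "term", "semantic search"],
]
print(vote_rank(ranked_by_algo))
```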

Relevance:

100.00%

Publisher:

Abstract:

We present a new method for term extraction from a domain-relevant corpus using natural language processing, for the purposes of semi-automatic ontology learning. The literature shows that topical words occur in bursts. We find that the ranking of extracted terms is insensitive to the choice of population model, but that calculating frequencies relative to burst size rather than document length in words yields significantly different results.
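
The contrast between the two normalisations can be sketched as follows. The burst boundaries are supplied by hand here, whereas the actual method detects bursts with a statistical model, and the toy document is invented.

```python
def relative_frequencies(tokens, term, bursts):
    """Compare term frequency normalised by document length vs. by burst size.

    tokens: the document as a list of words.
    bursts: list of (start, end) token indices in which the term is bursty.
    """
    count = tokens.count(term)
    freq_doc = count / len(tokens)                        # relative to document length
    burst_len = sum(end - start for start, end in bursts)
    burst_count = sum(tokens[s:e].count(term) for s, e in bursts)
    freq_burst = burst_count / burst_len                  # relative to burst size
    return freq_doc, freq_burst

tokens = ("the gene expression profile shows gene regulation "
          "of the target gene in later sections no mention").split()
print(relative_frequencies(tokens, "gene", bursts=[(1, 11)]))
```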

Relevance:

90.00%

Publisher:

Abstract:

Automated feature extraction and correspondence determination is an extremely important problem in the face recognition community, as it often forms the foundation of the normalisation and database construction phases of many recognition and verification systems. This paper presents a completely automatic feature extraction system based upon a modified volume descriptor. These features form a stable descriptor for faces and are used in a reversible jump Markov chain Monte Carlo correspondence algorithm to automatically determine the correspondences that exist between faces. The developed system is invariant to changes in pose and occlusion, and results indicate that it is also robust to the minor face deformations that may be present with variations in expression.
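
A full reversible-jump sampler also proposes changes in the number of correspondences; the sketch below is a deliberately simplified, fixed-dimension Metropolis variant that only swaps assignments, included to illustrate the stochastic search over face-feature correspondences. The feature coordinates and the squared-distance cost are synthetic placeholders, not the paper's volume-descriptor-based likelihood.

```python
import math
import random

def correspondence_cost(assign, feats_a, feats_b):
    """Sum of squared distances between matched feature locations."""
    return sum((feats_a[i][0] - feats_b[j][0]) ** 2 +
               (feats_a[i][1] - feats_b[j][1]) ** 2
               for i, j in enumerate(assign))

def metropolis_match(feats_a, feats_b, iters=5000, temperature=4.0):
    """Stochastic search for a feature correspondence (simplified,
    fixed-dimension stand-in for a reversible-jump sampler)."""
    assign = list(range(len(feats_b)))                     # initial guess: identity
    cost = correspondence_cost(assign, feats_a, feats_b)
    for _ in range(iters):
        i, j = random.sample(range(len(assign)), 2)
        proposal = assign[:]
        proposal[i], proposal[j] = proposal[j], proposal[i]  # swap two matches
        new_cost = correspondence_cost(proposal, feats_a, feats_b)
        # accept with Metropolis probability exp(-(new - old) / T)
        if new_cost < cost or random.random() < math.exp((cost - new_cost) / temperature):
            assign, cost = proposal, new_cost
    return assign, cost

face_a = [(10, 12), (40, 15), (25, 40)]                    # e.g. eyes and nose tip
face_b = [(26, 41), (11, 13), (41, 16)]                    # same features, reordered
print(metropolis_match(face_a, face_b))
```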

Relevance:

90.00%

Publisher:

Abstract:

A building information model (BIM) provides a rich representation of a building's design. However, there are many challenges in getting construction-specific information out of a BIM, limiting the usability of BIM for construction and other downstream processes. This paper describes a novel approach that utilizes ontology-based feature modeling, automatic feature extraction based on ifcXML, and query processing to extract information relevant to construction practitioners from a given BIM. The feature ontology generically represents construction-specific information that is useful for a broad range of construction management functions. The software prototype uses the ontology to transform the designer-focused BIM into a construction-specific feature-based model (FBM). The formal query methods operate on the FBM to further help construction users quickly extract the necessary information from a BIM. Our tests demonstrate that this approach provides a richer representation of construction-specific information than existing BIM tools.
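
A minimal sketch of the query side, assuming the construction-specific features have already been extracted from ifcXML into a feature-based model serialised as plain XML; the element and attribute names are hypothetical, not the prototype's actual feature ontology or query language.

```python
import xml.etree.ElementTree as ET

# Hypothetical feature-based model (FBM) extracted from an ifcXML design model.
FBM_XML = """
<fbm>
  <feature type="opening" element="Wall-12" width_mm="900" level="2"/>
  <feature type="penetration" element="Slab-03" width_mm="150" level="2"/>
  <feature type="opening" element="Wall-07" width_mm="2100" level="1"/>
</fbm>
"""

def query_features(xml_text, feature_type, level):
    """Return the features of a given type on a given building level of the FBM."""
    root = ET.fromstring(xml_text)
    return [f.attrib for f in root.findall("feature")
            if f.get("type") == feature_type and f.get("level") == str(level)]

print(query_features(FBM_XML, "opening", level=2))
```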

Relevance:

90.00%

Publisher:

Abstract:

This paper describes a preprocessing module for improving the performance of a Spanish into Spanish Sign Language (Lengua de Signos Española: LSE) translation system when dealing with sparse training data. This preprocessing module replaces Spanish words with associated tags. The list of Spanish words (vocabulary) and associated tags used by this module is computed automatically by considering the signs that have the highest probability of being the translation of each Spanish word. This automatic tag extraction has been compared to a manual strategy, achieving almost the same improvement. In this analysis, several alternatives for dealing with non-relevant words have been studied; non-relevant words are Spanish words not assigned to any sign. The preprocessing module has been incorporated into two well-known statistical translation architectures: a phrase-based system and a Statistical Finite State Transducer (SFST). The system has been developed for a specific application domain: the renewal of Identity Documents and Driver's Licenses. In order to evaluate the system, a parallel corpus made up of 4080 Spanish sentences and their LSE translations has been used. The evaluation results revealed a significant performance improvement when this preprocessing module was included. In the phrase-based system, the proposed module raised BLEU (Bilingual Evaluation Understudy) from 73.8% to 81.0% and the human evaluation score from 0.64 to 0.83. In the case of the SFST, BLEU increased from 70.6% to 78.4% and the human evaluation score from 0.65 to 0.82.
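
The automatic tag extraction can be sketched as follows: each Spanish word is mapped to the sign with the highest lexical translation probability and that sign is used as the word's tag, while words not confidently assigned to any sign fall back to a non-relevant tag. The probability table, the threshold, and the tag names below are invented for illustration.

```python
def build_tag_map(translation_probs, threshold=0.5):
    """Map each Spanish word to the sign with the highest translation probability."""
    tag_map = {}
    for word, sign_probs in translation_probs.items():
        sign, prob = max(sign_probs.items(), key=lambda kv: kv[1])
        if prob > threshold:
            tag_map[word] = sign
    return tag_map

def preprocess(sentence, tag_map, non_relevant="NON_RELEVANT"):
    """Replace Spanish words with their tags before statistical translation."""
    return [tag_map.get(word, non_relevant) for word in sentence.lower().split()]

# Invented word -> {sign: probability} table for illustration.
probs = {
    "renovar": {"RENOVAR": 0.9, "CAMBIAR": 0.1},
    "carnet":  {"CARNET": 0.8, "DOCUMENTO": 0.2},
    "por":     {"RENOVAR": 0.05},        # weakly aligned -> treated as non-relevant
}
tag_map = build_tag_map(probs, threshold=0.5)
print(preprocess("Quiero renovar el carnet", tag_map))
```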

Relevance:

80.00%

Publisher:

Abstract:

The most difficult operation in flood inundation mapping using optical flood images is to separate fully inundated areas from ‘wet’ areas where trees and houses are partly covered by water. This is a typical instance of the mixed-pixel problem. A number of automatic information extraction image classification algorithms have been developed over the years for flood mapping using optical remote sensing images. Most classification algorithms assign each pixel to the class label with the greatest likelihood. However, these hard classification methods often fail to generate a reliable flood inundation map because of the presence of mixed pixels in the images. To solve the mixed-pixel problem, advanced image processing techniques are adopted, and linear spectral unmixing is one of the most popular soft classification techniques used for mixed-pixel analysis. The good performance of linear spectral unmixing depends on two important issues: the method of selecting endmembers and the method of modelling the endmembers for unmixing. This paper presents an improvement in the adaptive selection of an endmember subset for each pixel in spectral unmixing for reliable flood mapping. Using a fixed set of endmembers to unmix all pixels in an entire image might cause overestimation of the endmember spectra residing in a mixed pixel and hence reduce the performance of spectral unmixing. In contrast, applying an adaptively estimated subset of endmembers for each pixel can decrease the residual error in the unmixing results and provide a reliable output. The paper also shows that the proposed method can improve the accuracy of conventional linear unmixing methods and is easy to apply. Three different linear spectral unmixing methods were applied to test the improvement in unmixing results. Experiments were conducted on three sets of Landsat-5 TM images of three different flood events in Australia to examine the method under different flooding conditions, and satisfactory flood mapping outcomes were achieved.
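
A minimal sketch of per-pixel linear unmixing with an adaptively selected endmember subset. Here the subset is chosen by keeping the endmembers with the largest coefficients from a first, unconstrained fit, which merely stands in for the paper's selection criterion; the endmember spectra and the mixed pixel are synthetic.

```python
import numpy as np

def unmix_pixel(pixel, endmembers, k=2):
    """Unmix one pixel using an adaptively selected subset of k endmembers.

    pixel: (bands,) reflectance vector; endmembers: (n_endmembers, bands).
    Returns abundance fractions over all endmembers (zeros for unused ones).
    """
    full, *_ = np.linalg.lstsq(endmembers.T, pixel, rcond=None)
    subset = np.argsort(full)[-k:]                     # adaptive subset for this pixel
    coeffs, *_ = np.linalg.lstsq(endmembers[subset].T, pixel, rcond=None)
    coeffs = np.clip(coeffs, 0.0, None)
    total = coeffs.sum()
    if total > 0:
        coeffs = coeffs / total                        # enforce sum-to-one abundances
    abundances = np.zeros(len(endmembers))
    abundances[subset] = coeffs
    return abundances

# Synthetic endmember spectra (4 bands): open water, flooded vegetation, dry soil.
E = np.array([[0.05, 0.04, 0.03, 0.02],
              [0.04, 0.08, 0.30, 0.15],
              [0.12, 0.15, 0.25, 0.30]])
mixed_pixel = 0.7 * E[0] + 0.3 * E[1]                  # mostly water, partly vegetation
print(unmix_pixel(mixed_pixel, E, k=2))                # ~[0.7, 0.3, 0.0]
```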

Relevance:

80.00%

Publisher:

Abstract:

Multiple flame-flame interactions in premixed combustion are investigated using direct numerical simulations of twin turbulent V-flames for a range of turbulence intensities and length scales. Interactions are identified using a novel automatic feature extraction (AFE) technique based on data registration using the dual-tree complex wavelet transform. Information on the time, position, and type of interactions, and their influence on the flame area, is extracted using AFE. Characteristic length and time scales for the interactions are identified. The effect of interactions on the flame brush is quantified through a global stretch rate, defined as the sum of flamelet stretch and interaction stretch contributions. The effects of each interaction type are discussed. It is found that the magnitudes of the fluctuations in flamelet and interaction stretch are comparable, and a qualitative sensitivity to turbulence length scale is found for one interaction type. Implications for modeling are discussed. © 2013 Taylor and Francis Group, LLC.
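
Using the standard definition of flame stretch as the fractional rate of change of flame surface area A (a textbook identity, not a formula quoted from the paper), the decomposition above reads

    (1/A) dA/dt = K_global = K_flamelet + K_interaction

so the evolution of flame area carries one contribution from flamelet stretch and a separate contribution from flame-flame interactions.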

Relevance:

80.00%

Publisher:

Abstract:

The influence of Lewis number on turbulent premixed flame interactions is investigated using automatic feature extraction (AFE) applied to high-resolution flame simulation data. Premixed turbulent twin V-flames under identical turbulence conditions are simulated at global Lewis numbers of 0.4, 0.8, 1.0, and 1.2. Information on the position, frequency, and magnitude of the interactions is compared, and the sensitivity of the results to the sampling interval is discussed. It is found that both the frequency and the magnitude of normal-type interactions increase with decreasing Lewis number. Counter-normal-type interactions become more likely as the Lewis number increases. The variation in both the frequency and the magnitude of the interactions is found to be caused by large-scale changes in flame wrinkling resulting from differences in the thermo-diffusive stability of the flames. During flame interactions, thermo-diffusive effects are found to be insignificant due to the separation of time scales. © 2013 Taylor and Francis Group, LLC.

Relevance:

80.00%

Publisher:

Abstract:

Sport video data is growing rapidly as a result of maturing digital technologies that support digital video capture, faster data processing, and large-scale storage. However, (1) semi-automatic content extraction and annotation, (2) scalable indexing models, and (3) effective retrieval and browsing still pose the most challenging problems for maximizing the usage of large video databases. This article presents the findings from a comprehensive work that proposes a scalable and extensible sports video retrieval system with two major contributions in the area of sports video indexing and retrieval. The first contribution is a new sports video indexing model that utilizes a semi-schema-based indexing scheme on top of an Object-Relationship approach. This indexing model is scalable and extensible, as it enables gradual index construction supported by the ongoing development of future content extraction algorithms. The second contribution is a set of novel queries, based on XQuery, which generate dynamic and user-oriented summaries and event structures. The proposed sports video retrieval system has been fully implemented and populated with soccer, tennis, swimming, and diving video. The system has been evaluated with 20 users to demonstrate and confirm its feasibility and benefits. The experimental sports genres were specifically selected to represent the four main categories of the sports domain: period-, set-point-, time (race)-, and performance-based sports. Thus, the proposed system should be generic and robust for all types of sports.
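
The retrieval system itself uses XQuery over its semi-schema-based index; the sketch below uses Python with XPath instead, purely to illustrate the kind of user-oriented event query such an index supports. The index fragment and its tag names are invented.

```python
import xml.etree.ElementTree as ET

# Invented fragment of a semi-schema-based soccer video index.
INDEX = """
<match sport="soccer">
  <event type="goal"   start="00:12:03" end="00:12:40" player="A. Silva"/>
  <event type="corner" start="00:25:10" end="00:25:55" player="J. Costa"/>
  <event type="goal"   start="01:07:21" end="01:08:02" player="A. Silva"/>
</match>
"""

def event_summary(index_xml, event_type):
    """Build a user-oriented summary: clip boundaries for one event type."""
    root = ET.fromstring(index_xml)
    return [(e.get("start"), e.get("end"), e.get("player"))
            for e in root.findall(f"event[@type='{event_type}']")]

print(event_summary(INDEX, "goal"))
```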

Relevance:

80.00%

Publisher:

Abstract:

An overview is given of the possibility of controlling the status of circuit breakers (CBs) in a substation with the use of a knowledge base that relates some of the operating magnitudes, mixing status variables with time variables and fuzzy sets. It is shown that even when not all of the magnitudes to be controlled can be included in the analysis, it is possible to control the desired status while supervising some important magnitudes such as voltage, power factor, and harmonic distortion, as well as the present status.
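
A minimal sketch of the kind of fuzzy rule such a knowledge base could encode, with invented membership functions over voltage and total harmonic distortion; it is not the paper's actual rule base or variable set.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal fuzzy membership function."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def open_breaker_degree(voltage_pu, thd_percent):
    """Degree to which the rule 'IF voltage is high AND distortion is high
    THEN open the breaker' fires (min used as the fuzzy AND operator)."""
    voltage_high = trapezoid(voltage_pu, 1.05, 1.10, 1.50, 1.60)
    thd_high = trapezoid(thd_percent, 5.0, 8.0, 50.0, 60.0)
    return min(voltage_high, thd_high)

print(open_breaker_degree(voltage_pu=1.08, thd_percent=9.0))   # partial activation
```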

Relevance:

80.00%

Publisher:

Abstract:

Several lines of research on road extraction have been carried out over the last six years by the Photogrammetry and Computer Vision Research Group (GP-F&VC - Grupo de Pesquisa em Fotogrametria e Visão Computacional). Several semi-automatic road extraction methodologies have been developed, including sequential and optimization techniques. The GP-F&VC has also been developing fully automatic methodologies for road extraction. This paper presents an overview of the GP-F&VC research on road extraction from digital images, along with examples of results obtained with the developed methodologies.

Relevance:

80.00%

Publisher:

Abstract:

The purpose of this paper is to introduce a methodology for semi-automatic road extraction from aerial digital image pairs using dynamic programming and epipolar geometry. The method uses both images, from which each road feature pair is extracted. The operator identifies the corresponding road features and selects sparse seed points along them. After all road pairs have been extracted, epipolar geometry is applied to determine the automatic point-to-point correspondence between each pair of corresponding features. Finally, each corresponding road pair is georeferenced by photogrammetric intersection. Experiments were carried out with rural aerial images. The results led to the conclusion that the methodology is robust and efficient, even in the presence of shadows from trees and buildings or other irregularities.
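
The epipolar step can be sketched as follows: for a point on a road feature in the left image, the fundamental matrix gives the epipolar line in the right image, and the corresponding point is taken as the extracted right-image road point closest to that line. The fundamental matrix and the road polyline below are placeholders, not real calibration data.

```python
import numpy as np

def epipolar_match(point_left, F, road_right):
    """Find the right-image road point closest to the epipolar line of point_left.

    point_left: (x, y) in the left image; F: 3x3 fundamental matrix;
    road_right: (N, 2) array of points on the extracted right-image road.
    """
    x = np.array([point_left[0], point_left[1], 1.0])
    a, b, c = F @ x                                   # epipolar line ax + by + c = 0
    pts = np.hstack([road_right, np.ones((len(road_right), 1))])
    dists = np.abs(pts @ np.array([a, b, c])) / np.hypot(a, b)
    return road_right[np.argmin(dists)]

# Placeholder fundamental matrix and right-image road polyline.
F = np.array([[0.0, -1e-4, 0.02],
              [1e-4,  0.0, -0.03],
              [-0.02, 0.03, 1.0]])
road_right = np.array([[100.0, 205.0], [120.0, 212.0], [140.0, 221.0]])
print(epipolar_match((118.0, 210.0), F, road_right))
```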

Relevance:

80.00%

Publisher:

Abstract:

This paper proposes a methodology for edge detection in digital images using the Canny detector combined with a priori focusing of the edge structure by nonlinear anisotropic diffusion via a partial differential equation (PDE). This strategy aims at minimizing the effect of the well-known duality of the Canny detector, under which it is not possible to simultaneously improve the insensitivity to image noise and the localization precision of detected edges. Anisotropic diffusion via the PDE is used to focus the edge structure a priori because of its notable ability to smooth the image selectively, leaving the homogeneous regions strongly smoothed while largely preserving the physical edges, i.e., those that actually correspond to objects present in the image. The solution to the above duality consists in applying the Canny detector at a fine Gaussian scale, but only along the edge regions focused by the anisotropic diffusion process. The results show that the method is appropriate for applications involving automatic feature extraction, since it allows high-precision localization of thinned edges, which are usually related to objects present in the image. © Nauka/Interperiodica 2006.
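
A minimal sketch of the pipeline: a basic Perona-Malik-style anisotropic diffusion followed by OpenCV's Canny detector. The conductance function, the parameter values, and the fact that Canny is applied to the whole focused image rather than only along the focused edge regions are simplifications of the method described above, and the test image is synthetic.

```python
import numpy as np
import cv2

def anisotropic_diffusion(img, iterations=20, kappa=30.0, gamma=0.15):
    """Perona-Malik-style diffusion: smooths homogeneous regions, preserves edges."""
    u = img.astype(np.float32)
    for _ in range(iterations):
        # finite differences towards the four neighbours
        n = np.roll(u, -1, axis=0) - u
        s = np.roll(u, 1, axis=0) - u
        e = np.roll(u, -1, axis=1) - u
        w = np.roll(u, 1, axis=1) - u
        # conductance shrinks across strong gradients, so physical edges survive
        flux = sum(np.exp(-(d / kappa) ** 2) * d for d in (n, s, e, w))
        u += gamma * flux
    return np.clip(u, 0, 255).astype(np.uint8)

# Synthetic noisy test image: a bright square on a dark background.
img = np.full((128, 128), 40, np.uint8)
img[32:96, 32:96] = 200
noisy = np.clip(img + np.random.normal(0, 20, img.shape), 0, 255).astype(np.uint8)

focused = anisotropic_diffusion(noisy)            # a priori edge focusing
edges = cv2.Canny(focused, 50, 150)               # fine-scale Canny on the focused image
print(int(edges.sum() / 255), "edge pixels detected")
```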

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

Graduate Program in Computer Science - IBILCE