9 resultados para Feature Descriptors

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a filter-based algorithm for feature selection. The filter is based on the partitioning of the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results involving ten datasets illustrate the performance of the developed algorithm, which in general has obtained competitive results in terms of classification accuracy when compared to state of the art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter May be preferred over their counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capturing shallow linguistic information. Complex background knowledge, such as semantic relationships, are typically either not used, or used in specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks; while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provide a way for making substantial progress in the field of WSD.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Successful classification, information retrieval and image analysis tools are intimately related with the quality of the features employed in the process. Pixel intensities, color, texture and shape are, generally, the basis from which most of the features are Computed and used in such fields. This papers presents a novel shape-based feature extraction approach where an image is decomposed into multiple contours, and further characterized by Fourier descriptors. Unlike traditional approaches we make use of topological knowledge to generate well-defined closed contours, which are efficient signatures for image retrieval. The method has been evaluated in the CBIR context and image analysis. The results have shown that the multi-contour decomposition, as opposed to a single shape information, introduced a significant improvement in the discrimination power. (c) 2008 Elsevier B.V. All rights reserved,

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We introduce a flexible technique for interactive exploration of vector field data through classification derived from user-specified feature templates. Our method is founded on the observation that, while similar features within the vector field may be spatially disparate, they share similar neighborhood characteristics. Users generate feature-based visualizations by interactively highlighting well-accepted and domain specific representative feature points. Feature exploration begins with the computation of attributes that describe the neighborhood of each sample within the input vector field. Compilation of these attributes forms a representation of the vector field samples in the attribute space. We project the attribute points onto the canonical 2D plane to enable interactive exploration of the vector field using a painting interface. The projection encodes the similarities between vector field points within the distances computed between their associated attribute points. The proposed method is performed at interactive rates for enhanced user experience and is completely flexible as showcased by the simultaneous identification of diverse feature types.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a parallel hardware architecture for image feature detection based on the Scale Invariant Feature Transform algorithm and applied to the Simultaneous Localization And Mapping problem. The work also proposes specific hardware optimizations considered fundamental to embed such a robotic control system on-a-chip. The proposed architecture is completely stand-alone; it reads the input data directly from a CMOS image sensor and provides the results via a field-programmable gate array coupled to an embedded processor. The results may either be used directly in an on-chip application or accessed through an Ethernet connection. The system is able to detect features up to 30 frames per second (320 x 240 pixels) and has accuracy similar to a PC-based implementation. The achieved system performance is at least one order of magnitude better than a PC-based solution, a result achieved by investigating the impact of several hardware-orientated optimizations oil performance, area and accuracy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work presents a novel approach in order to increase the recognition power of Multiscale Fractal Dimension (MFD) techniques, when applied to image classification. The proposal uses Functional Data Analysis (FDA) with the aim of enhancing the MFD technique precision achieving a more representative descriptors vector, capable of recognizing and characterizing more precisely objects in an image. FDA is applied to signatures extracted by using the Bouligand-Minkowsky MFD technique in the generation of a descriptors vector from them. For the evaluation of the obtained improvement, an experiment using two datasets of objects was carried out. A dataset was used of characters shapes (26 characters of the Latin alphabet) carrying different levels of controlled noise and a dataset of fish images contours. A comparison with the use of the well-known methods of Fourier and wavelets descriptors was performed with the aim of verifying the performance of FDA method. The descriptor vectors were submitted to Linear Discriminant Analysis (LDA) classification method and we compared the correctness rate in the classification process among the descriptors methods. The results demonstrate that FDA overcomes the literature methods (Fourier and wavelets) in the processing of information extracted from the MFD signature. In this way, the proposed method can be considered as an interesting choice for pattern recognition and image classification using fractal analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies for feature selection in the context of pattern recognition. The known approaches for this problem are branch-and-bound algorithms and heuristics that explore partially the search space. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the others known by exploring the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm that is based on the representation and exploration of the search space by new lattice properties proven here. Several experiments, with well known public data, indicate the superiority of the proposed method to the sequential floating forward selection (SFFS), which is a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method got better or equal results in similar or even smaller computational time. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Some sesquiterpene lactones (SLs) are the active compounds of a great number of traditionally medicinal plants from the Asteraceae family and possess considerable cytotoxic activity. Several studies in vitro have shown the inhibitory activity against cells derived from human carcinoma of the nasopharynx (KB). Chemical studies showed that the cytotoxic activity is due to the reaction of alpha,beta-unsaturated carbonyl structures of the SLs with thiols, such as cysteine. These studies support the view that SLs inhibit tumour growth by selective alkylation of growth-regulatory biological macromolecules, such as key enzymes, which control cell division, thereby inhibiting a variety of cellular functions, which directs the cells into apoptosis. In this study we investigated a set of 55 different sesquiterpene lactones, represented by 5 skeletons (22 germacranolides, 6 elemanolides, 2 eudesmanolides, 16 guaianolides and nor-derivatives and 9 pseudoguaianolides), in respect to their cytotoxic properties. The experimental results and 3D molecular descriptors were submitted to Kohonen self-organizing map (SOM) to classify (training set) and predict (test set) the cytotoxic activity. From the obtained results, it was concluded that only the geometrical descriptors showed satisfactory values. The Kohonen map obtained after training set using 25 geometrical descriptors shows a very significant match, mainly among the inactive compounds (similar to 84%). Analyzing both groups, the percentage seen is high (83%). The test set shows the highest match, where 89% of the substances had their cytotoxic activity correctly predicted. From these results, important properties for the inhibition potency are discussed for the whole dataset and for subsets of the different structural skeletons. (C) 2008 Elsevier Masson SAS. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Various significant anti-HCV and cytotoxic sesquiterpene lactones (SLs) have been characterized. In this work, the chemometric tool Principal Component Analysis (PCA) was applied to two sets of SLs and the variance of the biological activity was explored. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. The calculations were performed using VolSurf program. For anti-HCV activity, PC1 (First Principal Component) explained 30.3% and PC2 (Second Principal Component) explained 26.5% of matrix total variance, while for cytotoxic activity, PC1 explained 30.9% and PC2 explained 15.6% of the total variance. The formalism employed generated good exploratory and predictive results and we identified some structural features, for both sets, important to the suitable biological activity and pharmacokinetic profile.