78 resultados para KNN
Resumo:
中国计算机学会
Resumo:
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) query in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN query. The BID employs one bit to represent each feature vector of point and the number of bit-difference is used to prune the further points. To facilitate real dataset which is typically skewed, we enhance the BID mechanism with clustering, cluster adapted bitcoder and dimensional weight, named the BID⁺. Extensive experiments are conducted to show that our proposed method yields significant performance advantages over the existing index structures on both real life and synthetic high-dimensional datasets.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
La obtención de materiales monofásicos con respuesta ferroeléctrica y (anti-)ferromagnética simultánea y acoplada resulta problemática debido a limitaciones intrínsecas de tipo físico, estructural y electrónico. En este sentido una alternativa más realista, y en cierto modo con mayor flexibilidad a la hora de diseñar futuros dispositivos multiferroicos, consiste en preparar materiales compuestos en los cuales el acoplamiento magnetoeléctrico se puede alcanzar explotando los efectos interfaciales entre fases disimilares. Tal es el caso de los materiales compuestos basados en BaTiO3 (fase ferroeléctrica) y NiFe2O4 (fase magnética), que ya se han empezado a preparar fundamentalmente por medio de técnicas de deposición altamente energéticas. Sin embargo de cara a su aplicación práctica, sería interesante poder preparar esos materiales por métodos más sostenibles y menos costosos. De acuerdo con ello, en este trabajo se presenta un estudio preliminar en torno a la evolución microestructural experimentada por los materiales basados en NiFe2O4-BaTiO3 cuando son preparados mediante una técnica de procesamiento suave en disolución como es la síntesis hidrotermal. En concreto se ha analizado la influencia que diversos parámetros característicos del procesamiento hidrotermal pueden tener sobre la generación y distribución de fases e interfases durante la posterior consolidación térmica de estos materiales compuestos.
Resumo:
Prototype Selection (PS) algorithms allow a faster Nearest Neighbor classification by keeping only the most profitable prototypes of the training set. In turn, these schemes typically lower the performance accuracy. In this work a new strategy for multi-label classifications tasks is proposed to solve this accuracy drop without the need of using all the training set. For that, given a new instance, the PS algorithm is used as a fast recommender system which retrieves the most likely classes. Then, the actual classification is performed only considering the prototypes from the initial training set belonging to the suggested classes. Results show that this strategy provides a large set of trade-off solutions which fills the gap between PS-based classification efficiency and conventional kNN accuracy. Furthermore, this scheme is not only able to, at best, reach the performance of conventional kNN with barely a third of distances computed, but it does also outperform the latter in noisy scenarios, proving to be a much more robust approach.
Resumo:
In the current Information Age, data production and processing demands are ever increasing. This has motivated the appearance of large-scale distributed information. This phenomenon also applies to Pattern Recognition so that classic and common algorithms, such as the k-Nearest Neighbour, are unable to be used. To improve the efficiency of this classifier, Prototype Selection (PS) strategies can be used. Nevertheless, current PS algorithms were not designed to deal with distributed data, and their performance is therefore unknown under these conditions. This work is devoted to carrying out an experimental study on a simulated framework in which PS strategies can be compared under classical conditions as well as those expected in distributed scenarios. Our results report a general behaviour that is degraded as conditions approach to more realistic scenarios. However, our experiments also show that some methods are able to achieve a fairly similar performance to that of the non-distributed scenario. Thus, although there is a clear need for developing specific PS methodologies and algorithms for tackling these situations, those that reported a higher robustness against such conditions may be good candidates from which to start.
Resumo:
This paper describes a novel framework for facial expression recognition from still images by selecting, optimizing and fusing ‘salient’ Gabor feature layers to recognize six universal facial expressions using the K nearest neighbor classifier. The recognition comparisons with all layer approach using JAFFE and Cohn-Kanade (CK) databases confirm that using ‘salient’ Gabor feature layers with optimized sizes can achieve better recognition performance and dramatically reduce computational time. Moreover, comparisons with the state of the art performances demonstrate the effectiveness of our approach.
Resumo:
Finding and labelling semantic features patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult; the rapidly increasing volume of online documents makes a bottleneck in finding meaningful textual patterns. Aiming to deal with these issues, we propose an unsupervised documnent labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was promisingly evaluated by compared with typical machine learning methods including SVMs, Rocchio, and kNN.
Resumo:
To enhance the performance of the k-nearest neighbors approach in forecasting short-term traffic volume, this paper proposed and tested a two-step approach with the ability of forecasting multiple steps. In selecting k-nearest neighbors, a time constraint window is introduced, and then local minima of the distances between the state vectors are ranked to avoid overlappings among candidates. Moreover, to control extreme values’ undesirable impact, a novel algorithm with attractive analytical features is developed based on the principle component. The enhanced KNN method has been evaluated using the field data, and our comparison analysis shows that it outperformed the competing algorithms in most cases.
Resumo:
Field robots often rely on laser range finders (LRFs) to detect obstacles and navigate autonomously. Despite recent progress in sensing technology and perception algorithms, adverse environmental conditions, such as the presence of smoke, remain a challenging issue for these robots. In this paper, we investigate the possibility to improve laser-based perception applications by anticipating situations when laser data are affected by smoke, using supervised learning and state-of-the-art visual image quality analysis. We propose to train a k-nearest-neighbour (kNN) classifier to recognise situations where a laser scan is likely to be affected by smoke, based on visual data quality features. This method is evaluated experimentally using a mobile robot equipped with LRFs and a visual camera. The strengths and limitations of the technique are identified and discussed, and we show that the method is beneficial if conservative decisions are the most appropriate.
Resumo:
Existing crowd counting algorithms rely on holistic, local or histogram based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram based methods, and to compare various image features and regression models. A K-fold cross validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central datasets. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram based features; optimal performance is observed using all image features except for textures; and that GPR outperforms linear, KNN and NN regression
Resumo:
Samples of Forsythia suspensa from raw (Laoqiao) and ripe (Qingqiao) fruit were analyzed with the use of HPLC-DAD and the EIS-MS techniques. Seventeen peaks were detected, and of these, twelve were identified. Most were related to the glucopyranoside molecular fragment. Samples collected from three geographical areas (Shanxi, Henan and Shandong Provinces), were discriminated with the use of hierarchical clustering analysis (HCA), discriminant analysis (DA), and principal component analysis (PCA) models, but only PCA was able to provide further information about the relationships between objects and loadings; eight peaks were related to the provinces of sample origin. The supervised classification models-K-nearest neighbor (KNN), least squares support vector machines (LS-SVM), and counter propagation artificial neural network (CP-ANN) methods, indicated successful classification but KNN produced 100% classification rate. Thus, the fruit were discriminated on the basis of their places of origin.