993 results for K-NN query


Relevance:

80.00%

Publisher:

Abstract:

This paper introduces an algorithm that uses boosting to learn a distance measure for multiclass k-nearest neighbor classification. Given a family of distance measures as input, AdaBoost is used to learn a weighted distance measure that is a linear combination of the input measures. The proposed method can be seen both as a novel way to learn a distance measure from data and as a novel way to apply boosting to multiclass recognition problems without requiring output codes. In our approach, multiclass recognition of objects is reduced to a single binary recognition task defined on triples of objects. Preliminary experiments with eight UCI datasets yield no clear winner among our method, boosting using output codes, and k-NN classification using an unoptimized distance measure. Our algorithm did achieve lower error rates on some of the datasets, which indicates that, in some domains, it may lead to better results than existing methods.
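As a rough illustration of the idea in this abstract, the sketch below plugs a weighted linear combination of base distance measures into a plain k-NN vote. It is not the authors' algorithm: the weights are fixed by hand rather than learned with AdaBoost, and the names `combined_distance`, `knn_predict`, and the toy data are assumptions made for the example.

```python
from collections import Counter

def combined_distance(x, y, measures, weights):
    """Weighted linear combination of several base distance measures."""
    return sum(w * d(x, y) for d, w in zip(measures, weights))

def knn_predict(query, points, labels, k, measures, weights):
    """Majority vote among the k training points closest to the query."""
    order = sorted(
        range(len(points)),
        key=lambda i: combined_distance(query, points[i], measures, weights),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Two illustrative base measures: absolute difference along each coordinate.
measures = [
    lambda x, y: abs(x[0] - y[0]),
    lambda x, y: abs(x[1] - y[1]),
]

# Toy training set (hypothetical, not from the paper).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = ["a", "a", "b", "b"]

# Hand-picked weights; in the paper these would be learned by boosting.
weights = [1.0, 0.0]
prediction = knn_predict((1.0, 0.5), points, labels, 3, measures, weights)
```

With these weights only the first coordinate matters, so the two points at x = 0 dominate the vote for a query at x = 1.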

Relevance:

80.00%

Publisher:

Abstract:

Intrinsic and extrinsic speaker normalization methods are systematically compared using a neural network (fuzzy ARTMAP) and L1 and L2 K-Nearest Neighbor (K-NN) categorizers trained and tested on disjoint sets of speakers of the Peterson-Barney vowel database. Intrinsic methods include one nonscaled, four psychophysical scales (bark, bark with end-correction, mel, ERB), and three log scales, each tested on four combinations of F0, F1, F2, F3. Extrinsic methods include four speaker adaptation schemes, each combined with the 32 intrinsic methods: centroid subtraction across all frequencies (CS), centroid subtraction for each frequency (CSi), linear scale (LS), and linear transformation (LT). ARTMAP and K-NN show similar trends, with K-NN performing better but requiring about ten times as much memory. The optimal intrinsic normalization method is bark scale, or bark with end-correction, using the differences between all frequencies (Diff All). The order of performance for the extrinsic methods is LT, CSi, LS, and CS, with fuzzy ARTMAP performing best using bark scale with Diff All, and K-NN choosing psychophysical measures for all except CSi.
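The distinction between L1 and L2 K-NN categorizers can be made concrete with a small sketch. The code below is only an illustration on made-up two-dimensional points, not the vowel data or the setup used in the study; `minkowski` and `nearest_label` are hypothetical helper names. It shows that the nearest neighbor can change depending on whether the city-block (L1) or Euclidean (L2) distance is used.

```python
def minkowski(x, y, p):
    """Minkowski distance: p=1 gives L1 (city-block), p=2 gives L2 (Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def nearest_label(query, points, labels, p):
    """1-NN: return the label of the training point closest to the query."""
    best = min(range(len(points)), key=lambda i: minkowski(query, points[i], p))
    return labels[best]

# Toy points (assumed for illustration only).
points = [(3.0, 3.0), (0.0, 5.0)]
labels = ["A", "B"]
query = (0.0, 0.0)

l1_choice = nearest_label(query, points, labels, p=1)  # L1: 6 vs 5
l2_choice = nearest_label(query, points, labels, p=2)  # L2: about 4.24 vs 5
```

Under L1 the point (0, 5) is closer to the origin, while under L2 the point (3, 3) is closer, so the two categorizers disagree on this query.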

Relevance:

80.00%

Publisher:

Abstract:

In this paper we propose a generalisation of the k-nearest neighbour (k-NN) retrieval method based on an error function using distance metrics in the solution and problem space. It is an interpolative method which is proposed to be effective for sparse case bases. The method applies equally to nominal, continuous and mixed domains, and does not depend upon an embedding n-dimensional space. In continuous Euclidean problem domains, the method is shown to be a generalisation of Shepard's interpolation method. We term the retrieval algorithm the Generalised Shepard Nearest Neighbour (GSNN) method. A novel aspect of GSNN is that it provides a general method for interpolation over nominal solution domains. The performance of the retrieval method is examined with reference to the Iris classification problem, and to a simulated sparse nominal value test problem. The introduction of a solution-space metric is shown to outperform conventional nearest-neighbour methods on sparse case bases.
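For orientation, the classic Shepard interpolation that GSNN is said to generalise can be sketched as inverse-distance weighting over known points. This is the standard continuous-domain formulation, not the GSNN method itself; the function name `shepard_interpolate`, the default `power=2.0`, and the sample data are assumptions for the example.

```python
def shepard_interpolate(query, points, values, power=2.0):
    """Classic Shepard (inverse-distance-weighted) interpolation.

    Each known value is weighted by 1 / distance**power, so nearer
    points contribute more to the estimate at the query location.
    """
    num = 0.0
    den = 0.0
    for p, v in zip(points, values):
        d = sum((a - b) ** 2 for a, b in zip(query, p)) ** 0.5
        if d == 0.0:
            # The query coincides with a known point: return its value exactly.
            return v
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

# Two known one-dimensional cases (hypothetical data).
points = [(0.0,), (2.0,)]
values = [0.0, 10.0]

midpoint = shepard_interpolate((1.0,), points, values)
near_left = shepard_interpolate((0.5,), points, values)
```

At the midpoint both cases receive equal weight, while a query nearer the left case is pulled strongly toward its value.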

Relevance:

80.00%

Publisher:

Abstract:

This master's thesis presents a new system for the recognition of human actions from a video sequence. The system uses, as input, a video sequence taken by a static camera. A binary segmentation of the video sequence is first obtained by a learning algorithm in order to detect and extract the different people from the background. To recognize an action, the system then exploits a set of prototypes generated by an MDS-based dimensionality reduction technique from two different points of view in the video sequence. This dimensionality reduction technique, applied according to two different viewpoints, allows us to model each human action of the training base with a set of prototypes (supposed to be similar for each class) represented in a low-dimensional non-linear space. The prototypes, extracted according to the two viewpoints, are fed to a K-NN classifier which allows us to identify the human action that takes place in the video sequence. The experiments with our model on the Weizmann dataset of human actions provide interesting results compared to other state-of-the-art (and often more complicated) methods. These experiments show first the sensitivity of our model for each viewpoint and its effectiveness in recognizing the different actions, with a variable but satisfactory recognition rate, and then the results obtained by the fusion of these two points of view, which allows us to achieve a high recognition rate.

Relevance:

80.00%

Publisher:

Abstract:

In recent years there has been an apparent shift in research from content-based image retrieval (CBIR) to automatic image annotation in order to bridge the gap between low-level features and high-level semantics of images. Automatic Image Annotation (AIA) techniques facilitate extraction of high-level semantic concepts from images by machine learning techniques. Many AIA techniques use feature analysis as the first step to identify the objects in the image. However, high-dimensional image features degrade the performance of the system. This paper describes and evaluates an automatic image annotation framework which uses SURF descriptors to select the right number of features and the right features for annotation. The proposed framework uses a hybrid approach in which k-means clustering is used in the training phase and fuzzy K-NN classification in the annotation phase. The performance of the system is evaluated using standard metrics.
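The abstract does not say which fuzzy K-NN variant the framework uses. As a hedged sketch, the code below follows the commonly cited formulation in which each of the k nearest neighbors votes with weight 1 / d^(2/(m-1)) and the result is a membership degree per class rather than a single label. Crisp training labels, the fuzzifier `m=2.0`, the function name `fuzzy_knn`, and the toy annotation labels are all assumptions.

```python
def fuzzy_knn(query, points, labels, k=3, m=2.0):
    """Return class membership degrees for the query (they sum to 1).

    Assumes crisp training labels and inverse-distance weights with
    fuzzifier m > 1; neighbors closer to the query count for more.
    """
    # Keep the k (distance, label) pairs closest to the query.
    neighbors = sorted(
        (sum((a - b) ** 2 for a, b in zip(query, p)) ** 0.5, lab)
        for p, lab in zip(points, labels)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    memberships = {}
    total = 0.0
    for d, lab in neighbors:
        w = 1.0 / max(d, 1e-12) ** exponent  # guard against zero distance
        memberships[lab] = memberships.get(lab, 0.0) + w
        total += w
    return {lab: w / total for lab, w in memberships.items()}

# Toy one-dimensional features with hypothetical annotation labels.
points = [(0.0,), (3.0,), (4.0,)]
labels = ["sky", "sea", "sea"]
degrees = fuzzy_knn((1.0,), points, labels, k=3)
```

Here a crisp 3-NN majority vote would pick "sea" (two of three neighbors), whereas the distance-weighted memberships favor "sky" because that neighbor is much closer.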

Relevance:

80.00%

Publisher:

Abstract:

One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences in worldwide databases. Gene expression in prokaryotes initiates when the RNA polymerase enzyme interacts with DNA regions called promoters. The main regulatory elements of the transcription process are located in these regions. Despite the improvement of in vitro techniques for molecular biology analysis, characterizing and identifying a great number of promoters in a genome is a complex task. The main drawback is the absence of a large set of promoters from which to identify conserved patterns among species. Hence, an in silico method to predict them in any species is a challenge. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques such as Naïve Bayes, Decision Trees, Support Vector Machines, Neural Networks, Voted Perceptron, PART, k-NN and ensemble approaches (Bagging and Boosting) on the task of predicting Bacillus subtilis promoters. To do so, we first built two data sets of promoter and nonpromoter sequences for B. subtilis and a hybrid one. To evaluate the ML methods, a cross-validation procedure is applied. Good results were obtained with ML methods such as SVM and Naïve Bayes using B. subtilis. However, we have not reached good results on the hybrid database.

Relevance:

80.00%

Publisher:

Abstract:

The objective of research in artificial intelligence is to enable the computer to execute functions that humans perform using knowledge and reasoning. This work was developed in the area of machine learning, the branch of artificial intelligence concerned with the design and development of algorithms and techniques that allow computational learning. The objective of this work is to analyze a feature selection method for ensemble systems. The proposed method belongs to the filter approach to feature selection; it uses the variance and Spearman correlation to rank the features and uses reward and punishment strategies to measure the importance of each feature for the identification of the classes. For each ensemble, several different configurations were used, which varied from hybrid (homogeneous) to non-hybrid (heterogeneous) ensemble structures. They were submitted to five combining methods (voting, sum, weighted sum, multilayer Perceptron and naïve Bayes), which were applied to six distinct databases (real and artificial). The classifiers applied during the experiments were k-nearest neighbor, multilayer Perceptron, naïve Bayes and decision tree. Finally, the performance of the ensembles was analyzed comparatively, using no feature selection method, using the original filter-approach feature selection method, and using the proposed method. For this comparison, a statistical test was applied, which demonstrated a significant improvement in the precision of the ensembles.

Relevance:

80.00%

Publisher:

Abstract:

Weather conditions are decisive for agricultural production; precipitation, in particular, can be cited as the most influential factor because of its direct relation to the water balance. Accordingly, agrometeorological models, which are based on crop responses to weather conditions, are increasingly used to estimate agricultural yields. Because of the difficulty of obtaining data to feed such models, precipitation estimation methods that use images from the spectral channels of meteorological satellites have been employed for this purpose. This work aims to use the optimum-path forest pattern classifier to correlate information available in the infrared spectral channel of the GOES-12 meteorological satellite with the reflectivity obtained by the IPMET/UNESP radar located in the municipality of Bauru, with a view to developing a model for detecting the occurrence of precipitation. The experiments compared four classification algorithms: artificial neural networks (ANN), k-nearest neighbors (k-NN), support vector machines (SVM) and optimum-path forest (OPF). The last of these achieved the best results in both efficiency and accuracy.

Relevance:

80.00%

Publisher:

Abstract:

Breast cancer is the most common cancer among women. In CAD systems, several studies have investigated the wavelet transform as a multiresolution analysis tool for texture analysis, whose outputs can serve as inputs to a classifier. For classification, the polynomial classifier has been used because it provides a single model for the optimal separation of classes and treats this model as the solution to the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed based on three different wavelet functions: Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to label the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristic (ROC) curve is used to evaluate the performance of the proposed system. Our system is evaluated using 360 digitized mammograms from the DDSM database, and the result shows that the algorithm has an area under the ROC curve Az of 0.98 ± 0.03. The polynomial classifier performed better than the other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

Soil organic matter (SOM) constitutes an important reservoir of terrestrial carbon and can be considered an alternative for atmospheric carbon storage, contributing to global warming mitigation. Soil management can favor atmospheric carbon incorporation into SOM or its release from SOM to the atmosphere. Thus, the evaluation of the humification degree (HD), which is an indication of the recalcitrance of SOM, can provide an estimation of the capacity of carbon sequestration by soils under various managements. The HD of SOM can be estimated by using various analytical techniques, including fluorescence spectroscopy. In the present work, the potential of laser-induced breakdown spectroscopy (LIBS) to estimate the HD of SOM was evaluated for the first time. Intensities of emission lines of Al, Mg and Ca from LIBS spectra that correlate with fluorescence emissions determined by the laser-induced fluorescence spectroscopy (LIFS) reference technique were used to obtain a multivariate calibration model based on the k-nearest neighbor (k-NN) method. The values predicted by the proposed model (A-LIBS) showed strong correlation with LIFS results, with a Pearson's coefficient of 0.87. The HD of SOM obtained after normalizing A-LIBS by total carbon in the sample showed a strong correlation with that determined by LIFS (0.94), thus suggesting the great potential of LIBS for this novel application. © 2014 Elsevier B.V. All rights reserved.

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores which uses only music notation to determine the author. The steps of the proposed system are the following. First, the music sheet is preprocessed to obtain a music score without the staff lines. Afterwards, four different methods for generating texture images from music symbols are applied. Every approach uses a different spatial variation when combining the music symbols to generate the textures. Finally, Gabor filters and Grey-scale Co-occurrence matrices are used to obtain the features. The classification is performed using a k-NN classifier based on Euclidean distance. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving encouraging identification rates.

Relevance:

80.00%

Publisher:

Abstract:

Tumor necrosis factor (TNF)-Receptor Associated Factors (TRAFs) are a family of signal transducer proteins. TRAF6 is a unique member of this family in that it is involved not only in the TNF superfamily but also in the toll-like receptor (TLR)/IL-1R (TIR) superfamily. The formation of the complex consisting of Receptor Activator of Nuclear Factor κ B (RANK) with its ligand (RANKL) results in the recruitment of TRAF6, which activates the NF-κB, JNK and MAP kinase pathways. TRAF6 is critical in signaling leading to the release of various growth factors in bone, and promotes osteoclastogenesis. TRAF6 has also been implicated as an oncogene in lung cancer and as a target in multiple myeloma. In the hope of developing small-molecule inhibitors of the TRAF6-RANK interaction, multiple steps were carried out. Computational predictions of hot spot residues in the protein-protein interaction of TRAF6 and RANK were examined. Three methods were used: Robetta, KFC2, and HotPoint, each of which uses a different methodology to determine whether a residue is a hot spot. These hot spot predictions were used as the basis for resolving the binding site for in silico high-throughput screening using GOLD and the MyriaScreen database of drug/lead-like compounds. Computationally intensive molecular dynamics simulations highlighted the binding mechanism and TRAF6 structural changes upon hit binding. Compounds identified as hits were verified using a GST pull-down assay, comparing inhibition to a RANK decoy peptide. Since many drugs fail due to lack of efficacy or to toxicity, predictive models were developed for the evaluation of the LD50 and bioavailability of our TRAF6 hits, and these models can be used for other drugs and small-molecule therapeutics as well.
Datasets of compounds and their corresponding bioavailability and LD50 values were curated, QSAR models were built from molecular descriptors of these compounds using the k-nearest neighbor (k-NN) method, and the quality of these models was assessed by cross-validation.

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a stress detection system based on fuzzy logic and two physiological signals: Galvanic Skin Response and Heart Rate. Instead of providing a global stress classification, this approach creates individual stress templates, gathering the behaviour of individuals under situations with different degrees of stress. The proposed method is able to detect stress properly with a rate of 99.5%, evaluated on a database of 80 individuals. This result improves on former approaches in the literature and on well-known machine learning techniques such as SVM, k-NN, GMM and Linear Discriminant Analysis. Finally, the proposed method is highly suitable for real-time applications.

Relevance:

80.00%

Publisher:

Abstract:

This paper proposes a stress detection system based on fuzzy logic and the physiological signals heart rate and galvanic skin response. The main contribution of this method lies in the creation of a stress template that collects the behaviour of previous signals under situations with a different level of stress in each individual. The creation of this template provides an accuracy of 99.5% in stress detection, improving on the results obtained by current pattern recognition techniques such as GMM, k-NN, SVM or Fisher Linear Discriminant. In addition, this system can be embedded in security systems to detect critical situations at access points such as cross-border control. Furthermore, its applications can be extended to other fields such as vehicle driver state-of-mind management, medicine or sport training.