175 results for Fuzzy K Nearest Neighbor
Abstract:
The molecular and metal profile fingerprints were obtained from a complex substance, Atractylis chinensis DC, a traditional Chinese medicine (TCM), using high performance liquid chromatography (HPLC) and inductively coupled plasma atomic emission spectroscopy (ICP-AES). This substance was used in this work as an example of a complex biological material that has found application as a TCM. Such TCM samples are traditionally processed by the Bran, Cut, Fried and Swill methods, and were collected from five provinces in China. The data matrices obtained from the two types of analysis produced two principal component biplots, which showed that the HPLC fingerprint data were discriminated on the basis of the methods for processing the raw TCM, while the metal analysis data grouped according to geographical origin. When the two data matrices were combined into a single two-way matrix, the resulting biplot showed a clear separation on the basis of the HPLC fingerprints. Importantly, within each grouping the objects separated according to their geographical origin, and they ranked in approximately the same order in each group. This result suggested that, by using such an approach, it is possible to derive improved characterisation of complex TCM materials on the basis of the two kinds of analytical data. In addition, two supervised pattern recognition methods, K-nearest neighbors (KNN) and linear discriminant analysis (LDA), were successfully applied to the individual data matrices, thus supporting the PCA approach.
Abstract:
This paper presents a travel time prediction model and evaluates its performance and transferability. Advanced Travelers Information Systems (ATIS) are gaining more and more importance, increasing the need for accurate, timely and useful information to the travelers. Travel time information quantifies the traffic condition in an easy to understand way for the users. The proposed travel time prediction model is based on an efficient use of nearest neighbor search. The model is calibrated for optimal performance using Genetic Algorithms. Results indicate better performance by using the proposed model than the presently used naïve model.
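As a rough illustration of prediction by nearest neighbor search, the sketch below averages the travel times of the k most similar historical records. The record layout, the Euclidean distance, the toy data, and the function name `knn_predict_travel_time` are assumptions made for this example, not details taken from the paper; the genetic-algorithm calibration mentioned in the abstract is not shown.

```python
import math

def knn_predict_travel_time(history, query, k=3):
    """Predict travel time as the mean over the k most similar past records.

    history: list of (feature_vector, travel_time) pairs (hypothetical layout).
    query:   feature vector describing the current traffic conditions.
    """
    if not 1 <= k <= len(history):
        raise ValueError("k must be between 1 and len(history)")
    # Rank stored records by Euclidean distance to the query (an assumed metric).
    ranked = sorted(history, key=lambda rec: math.dist(rec[0], query))
    nearest = ranked[:k]
    return sum(travel_time for _, travel_time in nearest) / k

# Toy data: (flow in veh/min, speed in km/h) -> travel time in minutes.
history = [((10, 60), 5.0), ((12, 58), 5.5), ((30, 25), 12.0), ((32, 22), 13.0)]
print(knn_predict_travel_time(history, (11, 59), k=2))  # 5.25
```

In a setting like the one described, quantities such as k or per-feature weights would be natural candidates for calibration.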
Abstract:
Cell invasion involves a population of cells which are motile and proliferative. Traditional discrete models of proliferation involve agents depositing daughter agents on nearest-neighbor lattice sites. Motivated by time-lapse images of cell invasion, we propose and analyze two new discrete proliferation models in the context of an exclusion process with an undirected motility mechanism. These discrete models are related to a family of reaction-diffusion equations and can be used to make predictions over a range of scales appropriate for interpreting experimental data. The new proliferation mechanisms are biologically relevant and mathematically convenient as the continuum-discrete relationship is more robust for the new proliferation mechanisms relative to traditional approaches.
Abstract:
Recent algorithms for monocular motion capture (MoCap) estimate weak-perspective camera matrices between images using a small subset of approximately-rigid points on the human body (i.e. the torso and hip). A problem with this approach, however, is that these points are often close to coplanar, causing canonical linear factorisation algorithms for rigid structure from motion (SFM) to become extremely sensitive to noise. In this paper, we propose an alternative solution to weak-perspective SFM based on a convex relaxation of graph rigidity. We demonstrate the success of our algorithm on both synthetic and real world data, allowing for much improved solutions to markerless MoCap problems on human bodies. Finally, we propose an approach to solve the two-fold ambiguity over bone direction using a k-nearest neighbour kernel density estimator.
Abstract:
Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear regression and Gaussian process regression. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed on a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.
Abstract:
Field robots often rely on laser range finders (LRFs) to detect obstacles and navigate autonomously. Despite recent progress in sensing technology and perception algorithms, adverse environmental conditions, such as the presence of smoke, remain a challenging issue for these robots. In this paper, we investigate the possibility to improve laser-based perception applications by anticipating situations when laser data are affected by smoke, using supervised learning and state-of-the-art visual image quality analysis. We propose to train a k-nearest-neighbour (kNN) classifier to recognise situations where a laser scan is likely to be affected by smoke, based on visual data quality features. This method is evaluated experimentally using a mobile robot equipped with LRFs and a visual camera. The strengths and limitations of the technique are identified and discussed, and we show that the method is beneficial if conservative decisions are the most appropriate.
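A minimal sketch of a kNN decision of this kind follows, assuming each training sample pairs a vector of image quality features with a smoke/no-smoke label. The feature values, the `min_votes` threshold, and the function name `knn_smoke_flag` are hypothetical and only illustrate how a vote threshold below a strict majority yields a more conservative flag; the paper's actual features and decision rule are not given in the abstract.

```python
import math

def knn_smoke_flag(train, features, k=3, min_votes=2):
    """Return True if at least `min_votes` of the k nearest labelled samples
    are marked as smoke-affected.

    train: list of (feature_vector, is_smoke) pairs (hypothetical layout).
    Lowering min_votes makes the flag more conservative (raised more readily).
    """
    # Sort labelled samples by Euclidean distance to the query features.
    ranked = sorted(train, key=lambda sample: math.dist(sample[0], features))
    votes = sum(1 for _, is_smoke in ranked[:k] if is_smoke)
    return votes >= min_votes

# Toy image-quality features (e.g. contrast, sharpness), invented for illustration.
train = [((0.9, 0.8), False), ((0.85, 0.9), False), ((0.8, 0.7), False),
         ((0.3, 0.2), True), ((0.2, 0.25), True)]
print(knn_smoke_flag(train, (0.9, 0.85)))  # False
print(knn_smoke_flag(train, (0.25, 0.2)))  # True
```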
Abstract:
We consider a discrete agent-based model on a one-dimensional lattice and a two-dimensional square lattice, where each agent is a dimer occupying two sites. Agents move by vacating one occupied site in favor of a nearest-neighbor site and obey either a strict simple exclusion rule or a weaker constraint that permits partial overlaps between dimers. Using indicator variables and careful probability arguments, a discrete-time master equation for these processes is derived systematically within a mean-field approximation. In the continuum limit, nonlinear diffusion equations that describe the average agent occupancy of the dimer population are obtained. In addition, we show that multiple species of interacting subpopulations give rise to advection-diffusion equations. Averaged discrete simulation data compares very well with the solution to the continuum partial differential equation models. Since many cell types are elongated rather than circular, this work offers insight into population-level behavior of collective cellular motion.
Abstract:
A discrete agent-based model on a periodic lattice of arbitrary dimension is considered. Agents move to nearest-neighbor sites by a motility mechanism accounting for general interactions, which may include volume exclusion. The partial differential equation describing the average occupancy of the agent population is derived systematically. A diffusion equation arises for all types of interactions and is nonlinear except for the simplest interactions. In addition, multiple species of interacting subpopulations give rise to an advection-diffusion equation for each subpopulation. This work extends and generalizes previous specific results, providing a construction method for determining the transport coefficients in terms of a single conditional transition probability, which depends on the occupancy of sites in an influence region. These coefficients characterize the diffusion of agents in a crowded environment in biological and physical processes.
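The simplest special case of such a model, a one-dimensional periodic lattice with unbiased nearest-neighbor moves and strict volume exclusion, can be sketched as follows. The update schedule (one randomly chosen agent per attempt, as many attempts per step as there are agents) and the function name `simulate_exclusion` are assumptions for illustration; the general interactions treated in the paper are not modelled here.

```python
import random

def simulate_exclusion(length, n_agents, steps, seed=0):
    """Simple exclusion process on a periodic 1D lattice.

    Returns the set of occupied sites after `steps` time steps.
    """
    rng = random.Random(seed)
    occupied = set(rng.sample(range(length), n_agents))
    for _ in range(steps):
        # One time step: n_agents sequential move attempts.
        for _ in range(n_agents):
            site = rng.choice(sorted(occupied))
            # Pick the left or right nearest neighbor; wrap around at the ends.
            target = (site + rng.choice((-1, 1))) % length
            # Volume exclusion: the move is aborted if the target is occupied.
            if target not in occupied:
                occupied.remove(site)
                occupied.add(target)
    return occupied

final_sites = simulate_exclusion(length=50, n_agents=20, steps=100)
print(len(final_sites))  # 20: agents are conserved and never share a site
```

Averaging the occupancy of each site over many such runs gives the kind of discrete data that a continuum diffusion equation would be compared against.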
Abstract:
Age-related Macular Degeneration (AMD) is one of the major causes of vision loss and blindness in the ageing population. Currently, there is no cure for AMD; however, early detection and subsequent treatment may prevent severe vision loss or slow the progression of the disease. AMD can be classified into two types: dry and wet AMD. People with macular degeneration are mostly affected by dry AMD. Early symptoms of AMD are the formation of drusen and yellow pigmentation. These lesions are identified by manual inspection of fundus images by ophthalmologists. This is a time consuming and tiresome process, and hence an automated AMD screening tool can aid clinicians in their diagnosis significantly. This study proposes an automated dry AMD detection system using various entropies (Shannon, Kapur, Renyi and Yager), Higher Order Spectra (HOS) bispectra features, Fractional Dimension (FD), and Gabor wavelet features extracted from greyscale fundus images. The features are ranked using t-test, Kullback–Leibler Divergence (KLD), Chernoff Bound and Bhattacharyya Distance (CBBD), Receiver Operating Characteristics (ROC) curve-based and Wilcoxon ranking methods in order to select optimum features, and are classified into normal and AMD classes using Naive Bayes (NB), k-Nearest Neighbour (k-NN), Probabilistic Neural Network (PNN), Decision Tree (DT) and Support Vector Machine (SVM) classifiers. The performance of the proposed system is evaluated using private (Kasturba Medical Hospital, Manipal, India), Automated Retinal Image Analysis (ARIA) and STructured Analysis of the Retina (STARE) datasets. The proposed system yielded the highest average classification accuracies of 90.19%, 95.07% and 95% with 42, 54 and 38 optimal ranked features using the SVM classifier for the private, ARIA and STARE datasets respectively.
This automated AMD detection system can be used for mass fundus image screening and aid clinicians by making better use of their expertise on selected images that require further examination.
Abstract:
Existing crowd counting algorithms rely on holistic, local or histogram based features to capture crowd properties. Regression is then employed to estimate the crowd size. Insufficient testing across multiple datasets has made it difficult to compare and contrast different methodologies. This paper presents an evaluation across multiple datasets to compare holistic, local and histogram based methods, and to compare various image features and regression models. A K-fold cross validation protocol is followed to evaluate the performance across five public datasets: UCSD, PETS 2009, Fudan, Mall and Grand Central datasets. Image features are categorised into five types: size, shape, edges, keypoints and textures. The regression models evaluated are: Gaussian process regression (GPR), linear regression, K nearest neighbours (KNN) and neural networks (NN). The results demonstrate that local features outperform equivalent holistic and histogram based features; optimal performance is observed using all image features except for textures; and that GPR outperforms linear, KNN and NN regression.
Abstract:
Modularity has been suggested to be connected to evolvability because a higher degree of independence among parts allows them to evolve as separate units. Recently, the Escoufier RV coefficient has been proposed as a measure of the degree of integration between modules in multivariate morphometric datasets. However, it has been shown, using randomly simulated datasets, that the value of the RV coefficient depends on sample size. Also, so far there is no statistical test for the difference in the RV coefficient between a priori defined groups of observations. Here, we (1), using a rarefaction analysis, show that the value of the RV coefficient depends on sample size also in real geometric morphometric datasets; (2) propose a permutation procedure to test for the difference in the RV coefficient between a priori defined groups of observations; (3) show, through simulations, that such a permutation procedure has an appropriate Type I error; (4) suggest that a rarefaction procedure could be used to obtain sample-size-corrected values of the RV coefficient; and (5) propose a nearest-neighbor procedure that could be used when studying the variation of modularity in geographic space. The approaches outlined here, readily extendable to non-morphometric datasets, allow study of the variation in the degree of integration between a priori defined modules. A Java application – that will allow performance of the proposed test using a software with graphical user interface – has also been developed and is available at the Morphometrics at Stony Brook Web page (http://life.bio.sunysb.edu/morph/).
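The idea of a permutation test for the difference in the RV coefficient between two groups can be sketched as below. The RV coefficient is computed from column-centered blocks as the squared Frobenius norm of the cross-product matrix divided by the product of the Frobenius norms of the two within-block cross-product matrices. The permutation scheme (shuffling group membership of whole observations), the absolute-difference statistic, and all function names are assumptions for illustration; the abstract does not spell out the authors' exact procedure, and the sketch assumes every block keeps non-zero variance in every permuted group.

```python
import random

def _centered(rows):
    """Subtract the column means from a row-major matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def _cross_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices a and b."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            total += sum(x * y for x, y in zip(col_a, col_b)) ** 2
    return total

def rv_coefficient(block1, block2):
    """Escoufier RV coefficient between two blocks of variables."""
    x, y = _centered(block1), _centered(block2)
    return _cross_sq_norm(x, y) / (_cross_sq_norm(x, x) * _cross_sq_norm(y, y)) ** 0.5

def rv_difference_permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Each group is a list of (block1_row, block2_row) observations.

    Returns (observed |RV_a - RV_b|, permutation p-value).
    """
    def rv(group):
        return rv_coefficient([obs[0] for obs in group], [obs[1] for obs in group])

    observed = abs(rv(group_a) - rv(group_b))
    pooled = list(group_a) + list(group_b)
    rng = random.Random(seed)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign observations to groups at random
        perm_a, perm_b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(rv(perm_a) - rv(perm_b)) >= observed:
            count += 1
    return observed, count / (n_perm + 1)
```

Because the abstract reports that the RV coefficient depends on sample size, a comparison like this is easiest to interpret when the two groups are of similar size or have first been rarefied to a common size.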
Abstract:
Flos Chrysanthemum is a generic name for a particular group of edible plants, which also have medicinal properties. There are, in fact, twenty to thirty different cultivars, which are commonly used in beverages and for medicinal purposes. In this work, four Flos Chrysanthemum cultivars, Hangju, Taiju, Gongju, and Boju, were collected and chromatographic fingerprints were used to distinguish and assess these cultivars for quality control purposes. Chromatographic fingerprints contain chemical information but also often have baseline drifts and peak shifts, which complicate data processing; adaptive iteratively reweighted penalized least squares and correlation optimized warping were applied to correct the fingerprint peaks. The adjusted data were submitted to unsupervised and supervised pattern recognition methods. Principal component analysis was used to qualitatively differentiate the Flos Chrysanthemum cultivars. Partial least squares, continuum power regression, and K-nearest neighbors were used to predict the unknown samples. Finally, the elliptic joint confidence region method was used to evaluate the prediction ability of these models. The partial least squares and continuum power regression methods were shown to best represent the experimental results.
Abstract:
A novel near-infrared spectroscopy (NIRS) method has been researched and developed for the simultaneous analyses of the chemical components and associated properties of mint (Mentha haplocalyx Briq.) tea samples. The common analytes were: total polysaccharide content, total flavonoid content, total phenolic content, and total antioxidant activity. To resolve the NIRS data matrix for such analyses, least squares support vector machines was found to be the best chemometrics method for prediction, although it was closely followed by the radial basis function/partial least squares model. Interestingly, the commonly used partial least squares was unsatisfactory in this case. Additionally, principal component analysis and hierarchical cluster analysis were able to distinguish the mint samples according to their four geographical provinces of origin, and this was further facilitated with the use of the chemometrics classification methods: K-nearest neighbors, linear discriminant analysis, and partial least squares discriminant analysis. In general, given the potential savings with sampling and analysis time as well as with the costs of special analytical reagents required for the standard individual methods, NIRS offered a very attractive alternative for the simultaneous analysis of mint samples.
Abstract:
Frog protection has become increasingly essential due to the rapid decline of its biodiversity. Therefore, it is valuable to develop new methods for studying this biodiversity. In this paper, a novel feature extraction method is proposed based on perceptual wavelet packet decomposition for classifying frog calls in noisy environments. Pre-processing and syllable segmentation are first applied to the frog call. Then, a spectral peak track is extracted from each syllable if possible. Track duration, dominant frequency and oscillation rate are directly extracted from the track. With k-means clustering algorithm, the calculated dominant frequency of all frog species is clustered into k parts, which produce a frequency scale for wavelet packet decomposition. Based on the adaptive frequency scale, wavelet packet decomposition is applied to the frog calls. Using the wavelet packet decomposition coefficients, a new feature set named perceptual wavelet packet decomposition sub-band cepstral coefficients is extracted. Finally, a k-nearest neighbour (k-NN) classifier is used for the classification. The experiment results show that the proposed features can achieve an average classification accuracy of 97.45% which outperforms syllable features (86.87%) and Mel-frequency cepstral coefficients (MFCCs) feature (90.80%).
Abstract:
Results of a study designed to investigate the possibility of using the Si(111)-Ge(5×5) surface reconstruction as a template for In cluster growth are described. As with Si(111)-7×7, the In adatoms preferentially adsorb in the faulted half-unit cell, but on Si(111)-Ge(5×5) a richer variety of cluster geometries is found. In addition to the clusters that occupy the faulted half-unit cell, clusters that span two and four half-unit cells are found. The latter have a triangular shape spanning one unfaulted and three nearest-neighbor faulted half-unit cells. Triangular clusters in the opposite orientation were not found. Many of the faulted half-unit cells have a streaked appearance consistent with adatom mobility.