157 results for Least-squares support vector machine


Relevance: 100.00%

Abstract:

We describe an investigation into how Massey University’s Pollen Classifynder can accelerate the understanding of pollen and its role in nature. The Classifynder is an imaging microscopy system that can locate, image and classify slide-based pollen samples. Given the laboriousness of purely manual image acquisition and identification, it is vital to exploit assistive technologies like the Classifynder to enable acquisition and analysis of pollen samples. It is also vital that we understand the strengths and limitations of automated systems so that they can be used (and improved) to complement the strengths and weaknesses of human analysts to the greatest extent possible. This article reviews some of our experiences with the Classifynder system and our exploration of alternative classifier models to enhance both accuracy and interpretability. Our experiments in the pollen analysis problem domain have been based on samples from the Australian National University’s pollen reference collection (2,890 grains, 15 species) and images bundled with the Classifynder system (400 grains, 4 species). These samples have been represented using the Classifynder image feature set. We additionally work through a real-world case study where we assess the ability of the system to determine the pollen make-up of samples of New Zealand honey. In addition to the Classifynder’s native neural network classifier, we have evaluated linear discriminant, support vector machine, decision tree and random forest classifiers on these data with encouraging results. Our hope is that our findings will help enhance the performance of future releases of the Classifynder and other systems for accelerating the acquisition and analysis of pollen samples.

Relevance: 100.00%

Abstract:

We describe an investigation into how Massey University's Pollen Classifynder can accelerate the understanding of pollen and its role in nature. The Classifynder is an imaging microscopy system that can locate, image and classify slide-based pollen samples. Given the laboriousness of purely manual image acquisition and identification, it is vital to exploit assistive technologies like the Classifynder to enable acquisition and analysis of pollen samples. It is also vital that we understand the strengths and limitations of automated systems so that they can be used (and improved) to complement the strengths and weaknesses of human analysts to the greatest extent possible. This article reviews some of our experiences with the Classifynder system and our exploration of alternative classifier models to enhance both accuracy and interpretability. Our experiments in the pollen analysis problem domain have been based on samples from the Australian National University's pollen reference collection (2890 grains, 15 species) and images bundled with the Classifynder system (400 grains, 4 species). These samples have been represented using the Classifynder image feature set. In addition to the Classifynder's native neural network classifier, we have evaluated linear discriminant, support vector machine, decision tree and random forest classifiers on these data with encouraging results. Our hope is that our findings will help enhance the performance of future releases of the Classifynder and other systems for accelerating the acquisition and analysis of pollen samples. © 2013 AIP Publishing LLC.

Relevance: 100.00%

Abstract:

Diabetic macular edema (DME) is one of the most common causes of visual loss among diabetes mellitus patients. Early detection and subsequent treatment may improve visual acuity. DME is mainly graded into non-clinically significant macular edema (NCSME) and clinically significant macular edema according to the location of hard exudates in the macula region. DME can be identified by manual examination of fundus images, but this is laborious and resource intensive. Hence, in this work, automated grading of DME is proposed using higher-order spectra (HOS) of Radon transform projections of the fundus images. We use third-order cumulants and bispectrum magnitude as features and compare their performance. They can capture subtle changes in the fundus image. Spectral regression discriminant analysis (SRDA) reduces the feature dimension, and the minimum redundancy maximum relevance method is used to rank the significant SRDA components. Ranked features are fed to various supervised classifiers, viz. Naive Bayes, AdaBoost and support vector machine, to discriminate the No DME, NCSME and clinically significant macular edema classes. The performance of our system is evaluated using the publicly available MESSIDOR dataset (300 images) and also verified with a local dataset (300 images). Our results show that HOS cumulants and bispectrum magnitude obtained an average accuracy of 95.56% and 94.39% for the MESSIDOR dataset and 95.93% and 93.33% for the local dataset, respectively.

Relevance: 100.00%

Abstract:

In vegetated environments, reliable obstacle detection remains a challenge for state-of-the-art methods, which are usually based on geometrical representations of the environment built from LIDAR and/or visual data. In practice, field robots could often safely traverse vegetation, thereby avoiding costly detours. However, vegetation is often mistakenly interpreted as an obstacle. Classifying vegetation is insufficient, since an obstacle might be hidden behind or within it. Some ultra-wideband (UWB) radars can penetrate vegetation to help distinguish actual obstacles from obstacle-free vegetation. However, these sensors provide noisy and low-accuracy data. Therefore, in this work we address the problem of reliable traversability estimation in vegetation by augmenting LIDAR-based traversability mapping with UWB radar data. A sensor model is learned from experimental data using a support vector machine to convert the radar data into occupancy probabilities. These are then fused with LIDAR-based traversability data. The resulting augmented traversability maps retain the fine resolution of LIDAR-based maps but prevent safely traversable foliage from being interpreted as an obstacle. We validate the approach experimentally using sensors mounted on two different mobile robots, navigating in two different environments.
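The abstract does not say how the radar classifier output becomes an occupancy probability or how the two sources are fused. As a rough sketch only, the following assumes a Platt-style sigmoid on the SVM decision score and a standard log-odds occupancy update; the function names and the parameters `a`, `b` and `prior` are hypothetical and would have to be fitted or chosen for a real sensor.

```python
import math


def score_to_probability(score, a=-1.0, b=0.0):
    """Platt-style sigmoid mapping an SVM decision score to a probability.

    a and b are placeholder values (assumption); in practice they are
    fitted on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fuse(p_lidar, p_radar, prior=0.5):
    """Combine two occupancy estimates for one map cell in log-odds form.

    Assumes the two measurements are conditionally independent.
    """
    total = log_odds(p_lidar) + log_odds(p_radar) - log_odds(prior)
    return 1.0 / (1.0 + math.exp(-total))


# A cell that LIDAR sees as occupied (foliage) but radar sees as mostly free
# ends up with a lower fused occupancy than LIDAR alone would give.
fused = fuse(p_lidar=0.9, p_radar=0.2)
```

With this update rule, a confident "free" reading from the radar pulls down a LIDAR "occupied" reading, which is the qualitative behaviour the abstract describes for traversable foliage.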

Relevance: 100.00%

Abstract:

Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for the five types are 93.25% and 96.61% via the jackknife test and independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php

Relevance: 100.00%

Abstract:

Rolling-element bearing failures are the most frequent problems in rotating machinery; they can be catastrophic and cause major downtime. Hence, providing advance failure warning and precise fault detection in such components is pivotal and cost-effective. The vast majority of past research has focused on signal processing and spectral analysis for fault diagnostics in rotating components. In this study, a data mining approach using a machine learning technique called anomaly detection (AD) is presented. This method employs classification techniques to discriminate between defect examples. Two features, kurtosis and Non-Gaussianity Score (NGS), are extracted to develop anomaly detection algorithms. The performance of the developed algorithms was examined using real data from a test-to-failure bearing. Finally, the application of anomaly detection is compared with a popular method, the Support Vector Machine (SVM), to investigate the sensitivity and accuracy of this approach and its ability to detect anomalies in early stages.
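To make the kurtosis feature concrete, here is a minimal sketch that computes Pearson kurtosis over non-overlapping windows of a vibration signal and flags windows above a threshold. The window size, the threshold of 3.0 (the value for a Gaussian signal) and the function names are illustrative assumptions, not the paper's algorithm, and the Non-Gaussianity Score is not reproduced here.

```python
def kurtosis(window):
    """Pearson kurtosis (fourth standardized moment); about 3 for Gaussian data."""
    n = len(window)
    mean = sum(window) / n
    m2 = sum((v - mean) ** 2 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n
    if m2 == 0:
        return 0.0  # constant window: no spread, treat as non-impulsive
    return m4 / (m2 ** 2)


def flag_windows(signal, size, threshold=3.0):
    """Return start indices of non-overlapping windows whose kurtosis exceeds threshold."""
    flagged = []
    for start in range(0, len(signal) - size + 1, size):
        if kurtosis(signal[start:start + size]) > threshold:
            flagged.append(start)
    return flagged


# Two smooth windows followed by one window containing a single sharp impulse,
# the kind of spike a localized bearing defect tends to produce.
signal = [-1, 1] * 10 + [0] * 9 + [10]
suspicious = flag_windows(signal, size=10)
```

Kurtosis is sensitive to isolated spikes, which is why it is a common early indicator for impulsive faults; a fixed threshold is the simplest possible decision rule and stands in for the learned detector.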

Relevance: 100.00%

Abstract:

In this paper, we aim to predict protein structural classes for low-homology data sets based on predicted secondary structures. We propose a new and simple kernel method, named SSEAKSVM, to predict protein structural classes. The secondary structures of all protein sequences are obtained using the tool PSIPRED, and a linear kernel based on secondary structure element alignment scores is then constructed for training a support vector machine classifier without parameter tuning. Our method SSEAKSVM was evaluated on two low-homology datasets, 25PDB and 1189, with sequence homology of 25% and 40%, respectively. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies on these two data sets are 86.3% and 84.5%, respectively, which are higher than those obtained by other existing methods. In particular, our method achieves higher accuracies (88.1% and 88.5%) for differentiating the α + β class and the α/β class compared to other methods. This suggests that our method is valuable for predicting protein structural classes, particularly for low-homology protein sequences. The source code of the method in this paper can be downloaded at http://math.xtu.edu.cn/myphp/math/research/source/SSEAK_source_code.rar.
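The jackknife test mentioned here is leave-one-out evaluation: each sample is held out in turn, the classifier is trained on the rest, and the held-out sample is predicted. The sketch below shows only that evaluation loop; the nearest-neighbour classifier is a stand-in chosen to keep the example self-contained (an assumption), not the SVM with the alignment-score kernel used in the paper.

```python
def jackknife_accuracy(samples, labels, train_and_predict):
    """Leave-one-out accuracy: hold out each sample once and predict it."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier (not an SVM): label of the closest training sample."""
    def squared_distance(j):
        return sum((a - b) ** 2 for a, b in zip(train_x[j], query))
    return train_y[min(range(len(train_x)), key=squared_distance)]


# Toy one-dimensional feature vectors for two well-separated classes.
samples = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,), (1.2,)]
labels = ["alpha", "alpha", "alpha", "beta", "beta", "beta"]
accuracy = jackknife_accuracy(samples, labels, nearest_neighbour)
```

Any function with the same `(train_x, train_y, query)` signature can be passed in, so an SVM trained on a precomputed kernel could replace the stand-in without changing the loop.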

Relevance: 100.00%

Abstract:

Frogs have received increasing attention as effective indicators of environmental change. Therefore, it is important to monitor and assess frogs. With the development of sensor techniques, large volumes of audio data (including frog calls) have been collected and need to be analysed. After transforming the audio data into its spectrogram representation using the short-time Fourier transform, visual inspection of this representation motivates us to use image processing techniques for analysing audio data. An acoustic event detection (AED) method is first applied to the spectrograms to detect acoustic events, from which ridges are extracted. Three feature sets, Mel-frequency cepstral coefficients (MFCCs), the AED feature set and the ridge feature set, are then used for frog call classification with a support vector machine classifier. Fifteen frog species widespread in Queensland, Australia, are selected to evaluate the proposed method. The experimental results show that the ridge feature set achieves an average classification accuracy of 74.73%, which outperforms the MFCCs (38.99%) and the AED feature set (67.78%).

Relevance: 100.00%

Abstract:

Over the past few decades, frog species have been experiencing dramatic decline around the world. The reasons for this decline include habitat loss, invasive species and climate change. To better understand the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classification of Australian frogs by family, genus and species, covering three families, eleven genera and eighty-five species collected from Queensland, Australia. For each frog species, six instances are selected, from which ten acoustic features are calculated. Then, the multicollinearity between the ten features is studied in order to select non-correlated features for subsequent analysis. A decision tree (DT) classifier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machine (SVM) classifier is used for the multi-level classification, with the three most important acoustic features at each level. Our experimental results indicate that using different acoustic feature sets can successfully classify frogs at different levels, and that the average classification accuracy can reach 85.6%, 86.1% and 56.2% for family, genus and species respectively.
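One simple way to screen for correlated features is a pairwise Pearson correlation filter, sketched below. This is a proxy for a full multicollinearity analysis and is not claimed to be the procedure used in the study; the feature names, the toy values and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def select_uncorrelated(features, threshold=0.9):
    """Greedily keep features whose |r| with every already-kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns: the second is the first in different units,
# so it carries no extra information and should be dropped.
features = {
    "duration_s": [1, 2, 3, 4],
    "duration_ms": [1000, 2000, 3000, 4000],
    "dominant_freq": [3, 1, 4, 2],
}
selected = select_uncorrelated(features)
```

The greedy pass depends on the order in which features are listed; the first feature of a correlated group is the one retained.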

Relevance: 100.00%

Abstract:

Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates – an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/nocancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
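The cascaded architecture and the F-measure can be sketched in a few lines. In the sketch below the keyword rules are hypothetical stand-ins for the trained SVM classifiers, and the function names are illustrative; only the two-level control flow and the metric definition are meant to reflect the description above.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure: harmonic mean of precision and recall."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def cascade(text, is_cancer, type_scorers):
    """Level 1 decides cancer / no cancer; level 2 picks the best-scoring type."""
    if not is_cancer(text):
        return None
    scores = {code: scorer(text) for code, scorer in type_scorers.items()}
    return max(scores, key=scores.get)


# Keyword rules standing in for trained classifiers (assumption, for illustration).
def is_cancer(text):
    return "carcinoma" in text


type_scorers = {
    "C34": lambda text: 1.0 if "lung" in text else 0.0,    # ICD-10: bronchus and lung
    "C50": lambda text: 1.0 if "breast" in text else 0.0,  # ICD-10: breast
}

predicted = cascade("metastatic carcinoma of lung", is_cancer, type_scorers)
```

Because the second level only sees certificates that pass the first, errors at level 1 cap the recall achievable at level 2, which is one reason the two levels are evaluated separately.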

Relevance: 100.00%

Abstract:

Age estimation from facial images is receiving increasing attention for applications such as age-based access control and age-adaptive targeted marketing. Since even humans can be misled by the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation which is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group, followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may happen in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on the FG-NET Aging and MORPH Album 2 datasets, and a mean absolute error (MAE) of 4.50 and 5.86 years was achieved respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets, and an MAE of 5.20 years was achieved. Furthermore, we have also compared the age estimation made by humans with the proposed approach, and the results show that the machine outperforms humans. The proposed approach is competitive with the current state of the art and provides additional robustness to blur, lighting and expression variance brought about by the local phase features.
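The overlapping age ranges and the MAE metric can be illustrated as follows. The sketch assumes that each group's regressor is trained on samples from a range widened by a fixed margin on both sides; the group boundaries, the 5-year margin and the function names are hypothetical, and the SVM and SVR models themselves are omitted.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference, in years, between true and predicted ages."""
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def widened_range(group, margin):
    """Extend an age group (low, high) by margin years on both sides."""
    low, high = group
    return (max(0, low - margin), high + margin)


def training_subset(ages, group, margin):
    """Indices of samples a group's regressor would be trained on."""
    low, high = widened_range(group, margin)
    return [i for i, age in enumerate(ages) if low <= age <= high]


# With a 5-year margin, the regressor for the 20-39 group also sees subjects
# aged 17 and 42, so a subject misclassified across a boundary can still be
# assigned a plausible age.
ages = [10, 17, 25, 42, 50]
indices = training_subset(ages, group=(20, 39), margin=5)
mae = mean_absolute_error([20, 30, 40], [22, 27, 40])
```

A wider margin tolerates more classification errors near boundaries but makes each regressor less specialised, so the margin is a tuning choice.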

Relevance: 100.00%

Abstract:

Environmental changes have put great pressure on biological systems, leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored with the aid of acoustic sensors deployed in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time-consuming. Therefore it is essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of the clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated to produce the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings and field recordings of 8 frog species from James Cook University recordings). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.
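The k-means step on dominant frequencies can be sketched in one dimension. This is a plain Lloyd's-algorithm sketch with a deterministic initialisation chosen here for reproducibility (an assumption); the frequency values are made up, and the later step of turning centroids into a wavelet packet frequency scale is not shown.

```python
def kmeans_1d(values, k, iterations=100):
    """Cluster scalar values with Lloyd's algorithm and return sorted centroids."""
    ordered = sorted(values)
    # Deterministic start: evenly spaced order statistics (assumption, not the paper's choice).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


# Hypothetical dominant frequencies (Hz) from syllables of three call types.
dominant_freqs = [500, 510, 490, 2000, 2010, 1990, 4000, 4010, 3990]
centres = kmeans_1d(dominant_freqs, k=3)
```

The resulting centroids mark the frequency regions where the calls in the dataset concentrate, which is the information an adaptive frequency scale needs.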

Relevance: 100.00%

Abstract:

The past several years have seen significant advances in the development of computational methods for the prediction of the structure and interactions of coiled-coil peptides. These methods are generally based on pairwise correlations of amino acids, helical propensity, thermal melts and the energetics of sidechain interactions, as well as statistical patterns based on Hidden Markov Model (HMM) and Support Vector Machine (SVM) techniques. These methods are complemented by a number of public databases that contain sequences, motifs, domains and other details of coiled-coil structures identified by various algorithms. Some of these computational methods have been developed to make predictions of coiled-coil structure on the basis of sequence information; however, predicting the oligomerisation state of these peptides remains largely an open question due to the dynamic behaviour of these molecules. This review focuses on existing in silico methods for the prediction of coiled-coil peptides of functional importance using sequence and/or three-dimensional structural data.

Relevance: 100.00%

Abstract:

Context: Pheochromocytomas and paragangliomas (PPGLs) are heritable neoplasms that can be classified into gene-expression subtypes corresponding to their underlying specific genetic drivers. Objective: This study aimed to develop a diagnostic and research tool (Pheo-type) capable of classifying PPGL tumors into gene-expression subtypes that could be used to guide and interpret genetic testing, determine surveillance programs, and aid in elucidation of PPGL biology. Design: A compendium of published microarray data representing 205 PPGL tumors was used for the selection of subtype-specific genes that were then translated to the Nanostring gene-expression platform. A support vector machine was trained on the microarray dataset and then tested on an independent Nanostring dataset representing 38 familial and sporadic cases of PPGL of known genotype (RET, NF1, TMEM127, MAX, HRAS, VHL, and SDHx). Different classifier models involving between three and six subtypes were compared for their discrimination potential. Results: A gene set of 46 genes and six endogenous controls was selected representing six known PPGL subtypes; RTK1–3 (RET, NF1, TMEM127, and HRAS), MAX-like, VHL, and SDHx. Of 38 test cases, 34 (90%) were correctly predicted to six subtypes based on the known genotype to gene-expression subtype association. Removal of the RTK2 subtype from training, characterized by an admixture of tumor and normal adrenal cortex, improved the classification accuracy (35/38). Consolidation of RTK and pseudohypoxic PPGL subtypes to four- and then three-class architectures improved the classification accuracy for clinical application. Conclusions: The Pheo-type gene-expression assay is a reliable method for predicting PPGL genotype using routine diagnostic tumor samples.

Relevance: 100.00%

Abstract:

The most difficult operation in flood inundation mapping using optical flood images is to map the ‘wet’ areas where trees and houses are partly covered by water. This is a typical instance of the problem of mixed pixels in the images. A number of automatic image classification algorithms for information extraction have been developed over the years for flood mapping using optical remote sensing images, with most labelling a pixel as a particular class. However, they often fail to generate reliable flood inundation mapping because of the presence of mixed pixels in the images. To solve this problem, spectral unmixing methods have been developed. In this thesis, methods for selecting endmembers and the method to model the primary classes for unmixing, the two most important issues in spectral unmixing, are investigated. We conduct comparative studies of three typical spectral unmixing algorithms: Partial Constrained Linear Spectral Unmixing, Multiple Endmember Selection Mixture Analysis and spectral unmixing using the Extended Support Vector Machine method. They are analysed and assessed by error analysis in flood mapping using MODIS, Landsat and WorldView-2 images. The Conventional Root Mean Square Error Assessment is applied to obtain errors for estimated fractions of each primary class. Moreover, a newly developed Fuzzy Error Matrix is used to obtain a clear picture of error distributions at the pixel level. This thesis shows that the Extended Support Vector Machine method is able to provide a more reliable estimation of fractional abundances and allows the use of a complete set of training samples to model a defined pure class. Furthermore, it can be applied to analysis of both pure and mixed pixels to provide integrated hard-soft classification results.
Our research also identifies and explores a serious drawback in relation to endmember selections in current spectral unmixing methods which apply fixed sets of endmember classes or pure classes for mixture analysis of every pixel in an entire image. However, as it is not accurate to assume that every pixel in an image must contain all endmember classes, these methods usually cause an over-estimation of the fractional abundances in a particular pixel. In this thesis, a subset of adaptive endmembers in every pixel is derived using the proposed methods to form an endmember index matrix. The experimental results show that using the pixel-dependent endmembers in unmixing significantly improves performance.
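As a small illustration of what estimating fractional abundances means, the sketch below unmixes a pixel into two endmembers under a linear mixing model with fractions constrained to lie in [0, 1] and sum to one, and scores the estimates with a root mean square error. It is a generic textbook case, not any of the three algorithms compared in the thesis, and the reflectance values and names are made up.

```python
import math


def unmix_two_endmembers(pixel, first, second):
    """Least-squares fractions of two endmember spectra in a mixed pixel.

    Model: pixel = f * first + (1 - f) * second, with f clipped to [0, 1].
    """
    diff = [a - b for a, b in zip(first, second)]
    numerator = sum((p - b) * d for p, b, d in zip(pixel, second, diff))
    denominator = sum(d * d for d in diff)
    fraction = min(1.0, max(0.0, numerator / denominator))
    return fraction, 1.0 - fraction


def rmse(estimated, reference):
    """Root mean square error between estimated and reference fractions."""
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(estimated))


# Hypothetical two-band reflectances for pure water and pure land, and a
# pixel that is an even mixture of the two.
water = [0.05, 0.02]
land = [0.25, 0.42]
mixed_pixel = [0.15, 0.22]
water_fraction, land_fraction = unmix_two_endmembers(mixed_pixel, water, land)
```

Forcing every pixel to be explained by a fixed set of endmembers is exactly the assumption the thesis questions; with per-pixel endmember subsets, a pixel containing no water would simply not include the water spectrum in its model.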