Biblioteca Digital

152 resultados para Rule-Based Classification

Semi-binary based video features for activity representation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Efficient and effective feature detection and representation is an important consideration when processing videos, and a large number of applications such as motion analysis, 3D scene understanding, tracking etc. depend on this. Amongst several feature description methods, local features are becoming increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational complexity, their performance is still too limited for real world applications. Furthermore, rapid increases in the uptake of mobile devices has increased the demand for algorithms that can run with reduced memory and computational requirements. In this paper we propose a semi binary based feature detectordescriptor based on the BRISK detector, which can detect and represent videos with significantly reduced computational requirements, while achieving comparable performance to the state of the art spatio-temporal feature descriptors. First, the BRISK feature detector is applied on a frame by frame basis to detect interest points, then the detected key points are compared against consecutive frames for significant motion. Key points with significant motion are encoded with the BRISK descriptor in the spatial domain and Motion Boundary Histogram in the temporal domain. This descriptor is not only lightweight but also has lower memory requirements because of the binary nature of the BRISK descriptor, allowing the possibility of applications using hand held devices.We evaluate the combination of detectordescriptor performance in the context of action classification with a standard, popular bag-of-features with SVM framework. Experiments are carried out on two popular datasets with varying complexity and we demonstrate comparable performance with other descriptors with reduced computational complexity.

Predicting fault-prone software modules with rank sum classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The detection and correction of defects remains among the most time consuming and expensive aspects of software development. Extensive automated testing and code inspections may mitigate their effect, but some code fragments are necessarily more likely to be faulty than others, and automated identification of fault prone modules helps to focus testing and inspections, thus limiting wasted effort and potentially improving detection rates. However, software metrics data is often extremely noisy, with enormous imbalances in the size of the positive and negative classes. In this work, we present a new approach to predictive modelling of fault proneness in software modules, introducing a new feature representation to overcome some of these issues. This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision and recall to optimise inspection effort to suit different testing environments. The method is evaluated using the NASA Metrics Data Program (MDP) data sets, and performance is compared with existing studies based on the Support Vector Machine (SVM) and Naïve Bayes (NB) Classifiers, and with our own comprehensive evaluation of these methods.

Local inter-session variability modelling for object classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Object classification is plagued by the issue of session variation. Session variation describes any variation that makes one instance of an object look different to another, for instance due to pose or illumination variation. Recent work in the challenging task of face verification has shown that session variability modelling provides a mechanism to overcome some of these limitations. However, for computer vision purposes, it has only been applied in the limited setting of face verification. In this paper we propose a local region based intersession variability (ISV) modelling approach, and apply it to challenging real-world data. We propose a region based session variability modelling approach so that local session variations can be modelled, termed Local ISV. We then demonstrate the efficacy of this technique on a challenging real-world fish image database which includes images taken underwater, providing significant real-world session variations. This Local ISV approach provides a relative performance improvement of, on average, 23% on the challenging MOBIO, Multi-PIE and SCface face databases. It also provides a relative performance improvement of 35% on our challenging fish image dataset.

Cell image classification using histograms, higher order statistics and adaboost

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A cell classification algorithm that uses first, second and third order statistics of pixel intensity distributions over pre-defined regions is implemented and evaluated. A cell image is segmented into 6 regions extending from a boundary layer to an inner circle. First, second and third order statistical features are extracted from histograms of pixel intensities in these regions. Third order statistical features used are one-dimensional bispectral invariants. 108 features were considered as candidates for Adaboost based fusion. The best 10 stage fused classifier was selected for each class and a decision tree constructed for the 6-class problem. The classifier is robust, accurate and fast by design.

Integrated approach for diagnostics and prognostics of HP LNG pump based on health state probability estimation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Effective machine fault prognostic technologies can lead to elimination of unscheduled downtime and increase machine useful life and consequently lead to reduction of maintenance costs as well as prevention of human casualties in real engineering asset management. This paper presents a technique for accurate assessment of the remnant life of machines based on health state probability estimation technique and historical failure knowledge embedded in the closed loop diagnostic and prognostic system. To estimate a discrete machine degradation state which can represent the complex nature of machine degradation effectively, the proposed prognostic model employed a classification algorithm which can use a number of damage sensitive features compared to conventional time series analysis techniques for accurate long-term prediction. To validate the feasibility of the proposed model, the five different level data of typical four faults from High Pressure Liquefied Natural Gas (HP-LNG) pumps were used for the comparison of intelligent diagnostic test using five different classification algorithms. In addition, two sets of impeller-rub data were analysed and employed to predict the remnant life of pump based on estimation of health state probability using the Support Vector Machine (SVM) classifier. The results obtained were very encouraging and showed that the proposed prognostics system has the potential to be used as an estimation tool for machine remnant life prediction in real life industrial applications.

Facilitating query decomposition in query language modeling by association rule mining using multiple sliding windows

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a novel framework to further advance the recent trend of using query decomposition and high-order term relationships in query language modeling, which takes into account terms implicitly associated with different subsets of query terms. Existing approaches, most remarkably the language model based on the Information Flow method are however unable to capture multiple levels of associations and also suffer from a high computational overhead. In this paper, we propose to compute association rules from pseudo feedback documents that are segmented into variable length chunks via multiple sliding windows of different sizes. Extensive experiments have been conducted on various TREC collections and our approach significantly outperforms a baseline Query Likelihood language model, the Relevance Model and the Information Flow model.

Comparison of intensity-based cut-points for the RT3 accelerometer in youth

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OBJECTIVES: To compare the classification accuracy of previously published RT3 accelerometer cut-points for youth using energy expenditure, measured via portable indirect calorimetry, as a criterion measure. DESIGN: Cross-sectional cross-validation study. METHODS: 100 children (mean age 11.2±2.8 years, 61% male) completed 12 standardized activities trials (3 sedentary, 5 lifestyle and 4 ambulatory) while wearing an RT3 accelerometer. V˙O2 was measured concurrently using the Oxycon Mobile portable calorimeter. Cut-points by Vanhelst (VH), Rowlands (RW), Chu (CH), Kavouras (KV) and the RT3 manufacturer (RT3M) were used to classify PA intensity as sedentary (SED), light (LPA), moderate (MPA) or vigorous (VPA). Classification accuracy was evaluated using the area under the Receiver Operating Characteristic curve (ROC-AUC) and weighted Kappa (κ). RESULTS: For moderate-to-vigorous PA (MVPA), VH, KV and RW exhibited excellent accuracy classification (ROC-AUC≥0.90), while the CH and RT3M exhibited good classification accuracy (ROC-AUC>0.80). Classification accuracy for LPA was fair to poor (ROC-AUC<0.76). For SED, VH exhibited excellent classification accuracy (ROC-AUC>0.90), while RW, CH, and RT3M exhibited good classification accuracy (ROC-AUC>0.80). Kappa statistics ranged from 0.67 (VH) to 0.55 (CH). CONCLUSIONS: All cut-points provided acceptable classification accuracy for SED and MVPA, but limited accuracy for LPA. On the basis of classification accuracy over all four levels of intensity, the use of the VH cut-points is recommended.

Random projections on manifolds of symmetric positive definite matrices for image classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent advances suggest that encoding images through Symmetric Positive Definite (SPD) matrices and then interpreting such matrices as points on Riemannian manifolds can lead to increased classification performance. Taking into account manifold geometry is typically done via (1) embedding the manifolds in tangent spaces, or (2) embedding into Reproducing Kernel Hilbert Spaces (RKHS). While embedding into tangent spaces allows the use of existing Euclidean-based learning algorithms, manifold shape is only approximated which can cause loss of discriminatory information. The RKHS approach retains more of the manifold structure, but may require non-trivial effort to kernelise Euclidean-based learning algorithms. In contrast to the above approaches, in this paper we offer a novel solution that allows SPD matrices to be used with unmodified Euclidean-based learning algorithms, with the true manifold shape well-preserved. Specifically, we propose to project SPD matrices using a set of random projection hyperplanes over RKHS into a random projection space, which leads to representing each matrix as a vector of projection coefficients. Experiments on face recognition, person re-identification and texture classification show that the proposed approach outperforms several recent methods, such as Tensor Sparse Coding, Histogram Plus Epitome, Riemannian Locality Preserving Projection and Relational Divergence Classification.

Improved image set classification via joint sparse approximated nearest subspaces

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters to represent different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to have resemblance to the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).

Brain decoding based on functional magnetic resonance imaging using machine learning : a comparative study

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Brain decoding of functional Magnetic Resonance Imaging data is a pattern analysis task that links brain activity patterns to the experimental conditions. Classifiers predict the neural states from the spatial and temporal pattern of brain activity extracted from multiple voxels in the functional images in a certain period of time. The prediction results offer insight into the nature of neural representations and cognitive mechanisms and the classification accuracy determines our confidence in understanding the relationship between brain activity and stimuli. In this paper, we compared the efficacy of three machine learning algorithms: neural network, support vector machines, and conditional random field to decode the visual stimuli or neural cognitive states from functional Magnetic Resonance data. Leave-one-out cross validation was performed to quantify the generalization accuracy of each algorithm on unseen data. The results indicated support vector machine and conditional random field have comparable performance and the potential of the latter is worthy of further investigation.

A pilot study on affective classification of facial images for emerging news topics

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The proliferation of news reports published in online websites and news information sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three categorized emotions, positive, neutral and negative, based on facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation has been observed for neutral and negative. The system can be directly used for applications, such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.

Untangling the evidence : introducing an empirical model for evidence-based library and information practice

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Introduction This research is the first to investigate the experiences of teacher-librarians as evidence-based practice. An empirically derived model is presented in this paper. Method This qualitative study utilised the expanded critical incident approach, and investigated the real-life experiences of fifteen Australian teacher-librarians, through semi-structured interviews and inductive data analysis. Data collection utilised semi-structured interviews, on-site observations, journaling and the rubric for contextual information. These approaches allowed each of the interviewees to tell their own story and provided richness to the data. Analysis The analysis involved two types of data categorisation: binary and thematic. Binary classification was used to identify factual details. Thematic analysis involved categorising the emerging themes. Results An empirically derived model for evidence-based practice was devised and associated critical findings identified. The results demonstrate that evidence-based practice for teacher-librarians is a holistic practice. It is not a linear, step-by-step process. Conclusions This study is significant for teacher-librarians and library and information professionals as it provides new understanding of evidence-based practice.

Locality-sensitive hashing for protein classification

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Determination of sequence similarity is a central issue in computational biology, a problem addressed primarily through BLAST, an alignment based heuristic which has underpinned much of the analysis and annotation of the genomic era. Despite their success, alignment-based approaches scale poorly with increasing data set size, and are not robust under structural sequence rearrangements. Successive waves of innovation in sequencing technologies – so-called Next Generation Sequencing (NGS) approaches – have led to an explosion in data availability, challenging existing methods and motivating novel approaches to sequence representation and similarity scoring, including adaptation of existing methods from other domains such as information retrieval. In this work, we investigate locality-sensitive hashing of sequences through binary document signatures, applying the method to a bacterial protein classification task. Here, the goal is to predict the gene family to which a given query protein belongs. Experiments carried out on a pair of small but biologically realistic datasets (the full protein repertoires of families of Chlamydia and Staphylococcus aureus genomes respectively) show that a measure of similarity obtained by locality sensitive hashing gives highly accurate results while offering a number of avenues which will lead to substantial performance improvements over BLAST..

Deriving a preference-based utility measure for cancer patients from the European Organisation for the Research and Treatment of Cancer's Quality of Life Questionnaire C30 : a confirmatory versus exploratory approach

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background Multi attribute utility instruments (MAUIs) are preference-based measures that comprise a health state classification system (HSCS) and a scoring algorithm that assigns a utility value to each health state in the HSCS. When developing a MAUI from a health-related quality of life (HRQOL) questionnaire, first a HSCS must be derived. This typically involves selecting a subset of domains and items because HRQOL questionnaires typically have too many items to be amendable to the valuation task required to develop the scoring algorithm for a MAUI. Currently, exploratory factor analysis (EFA) followed by Rasch analysis is recommended for deriving a MAUI from a HRQOL measure. Aim To determine whether confirmatory factor analysis (CFA) is more appropriate and efficient than EFA to derive a HSCS from the European Organisation for the Research and Treatment of Cancer’s core HRQOL questionnaire, Quality of Life Questionnaire (QLQ-C30), given its well-established domain structure. Methods QLQ-C30 (Version 3) data were collected from 356 patients receiving palliative radiotherapy for recurrent/metastatic cancer (various primary sites). The dimensional structure of the QLQ-C30 was tested with EFA and CFA, the latter informed by the established QLQ-C30 structure and views of both patients and clinicians on which are the most relevant items. Dimensions determined by EFA or CFA were then subjected to Rasch analysis. Results CFA results generally supported the proposed QLQ-C30 structure (comparative fit index =0.99, Tucker–Lewis index =0.99, root mean square error of approximation =0.04). EFA revealed fewer factors and some items cross-loaded on multiple factors. Further assessment of dimensionality with Rasch analysis allowed better alignment of the EFA dimensions with those detected by CFA. Conclusion CFA was more appropriate and efficient than EFA in producing clinically interpretable results for the HSCS for a proposed new cancer-specific MAUI. Our findings suggest that CFA should be recommended generally when deriving a preference-based measure from a HRQOL measure that has an established domain structure.

Injury narrative text classification : a preliminary study

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Description of a patient's injuries is recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes probabilistic method to build classifiers for mapping. In this paper, we focus on providing guidance on the selection of a classification method. We build a number of classifiers belonging to different classification families such as decision tree, probabilistic, neural networks, and instance-based, ensemble-based and kernel-based linear classifiers. An extensive pre-processing is carried out to ensure the quality of data and, in hence, the quality classification outcome. The records with a null entry in injury description are removed. The misspelling correction process is carried out by finding and replacing the misspelt word with a soundlike word. Meaningful phrases have been identified and kept, instead of removing the part of phrase as a stop word. The abbreviations appearing in many forms of entry are manually identified and only one form of abbreviations is used. Clustering is utilised to discriminate between non-frequent and frequent terms. This process reduced the number of text features dramatically from about 28,000 to 5000. The medical narrative text injury dataset, under consideration, is composed of many short documents. The data can be characterized as high-dimensional and sparse, i.e., few features are irrelevant but features are correlated with one another. Therefore, Matrix factorization techniques such as Singular Value Decomposition (SVD) and Non Negative Matrix Factorization (NNMF) have been used to map the processed feature space to a lower-dimensional feature space. Classifiers with these reduced feature space have been built. In experiments, a set of tests are conducted to reflect which classification method is best for the medical text classification. The Non Negative Matrix Factorization with Support Vector Machine method can achieve 93% precision which is higher than all the tested traditional classifiers. We also found that TF/IDF weighting which works well for long text classification is inferior to binary weighting in short document classification. Another finding is that the Top-n terms should be removed in consultation with medical experts, as it affects the classification performance.

«
1
2
3
4
5
6
7
8
9
10
11
»