393 results for Features selection
Abstract:
This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using alternate SVM feature sets to the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying feature spaces to provide added robustness.
Abstract:
Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in the literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences: the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A′) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A′ of 96.11%. This result suggests that the small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.
Abstract:
Facial expression is an important channel for human communication and can be applied in many real applications. One critical step for facial expression recognition (FER) is to accurately extract emotional features. Current approaches to FER in static images have not fully considered and utilized the features of facial element and muscle movements, which represent static and dynamic, as well as geometric and appearance characteristics of facial expressions. This paper proposes an approach to address this limitation using ‘salient’ distance features, which are obtained by extracting patch-based 3D Gabor features, selecting the ‘salient’ patches, and performing patch matching operations. The experimental results demonstrate a high correct recognition rate (CRR), significant performance improvements due to the consideration of facial element and muscle movements, promising results under face registration errors, and fast processing time. The comparison with the state-of-the-art performance confirms that the proposed approach achieves the highest CRR on the JAFFE database and is among the top performers on the Cohn-Kanade (CK) database.
Abstract:
Single particle analysis (SPA) coupled with high-resolution electron cryo-microscopy is emerging as a powerful technique for the structure determination of membrane protein complexes and soluble macromolecular assemblies. Current estimates suggest that ∼10⁴–10⁵ particle projections are required to attain a 3 Å resolution 3D reconstruction (symmetry dependent). Selecting this number of molecular projections differing in size, shape and symmetry is a rate-limiting step for the automation of 3D image reconstruction. Here, we present SwarmPS, a feature-rich, GUI-based software package to manage large-scale, semi-automated particle picking projects. The software provides cross-correlation and edge-detection algorithms. Algorithm-specific parameters are transparently and automatically determined through user interaction with the image, rather than by trial and error. Other features include multiple image handling (∼10²), local and global particle selection options, interactive image freezing, automatic particle centering, and full manual override to correct false positives and negatives. SwarmPS is user friendly, flexible, extensible, fast, and capable of exporting boxed-out projection images, or particle coordinates, compatible with downstream image processing suites.
Abstract:
Continuous user authentication with keystroke dynamics uses character sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. Contemporary feature selection approaches do not guarantee selecting frequently typed features, which may lead to a less accurate statistical representation of the user. Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistical feature selection techniques that mitigate the limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent time; and n-graphs that have large time variance among users. We use Gunetti’s keystroke dataset and the k-means clustering algorithm for our experiments. The results show that among the proposed techniques, the most-frequent feature selection technique can effectively find user-representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent n-graphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.
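The first technique in the abstract above, selecting the most frequently occurring n-graphs, can be sketched as follows. This is only an illustrative sketch: the function names, the tuple representation of an n-graph, and the toy typing sessions are assumptions, and the paper's actual pipeline (timing features, Gunetti's dataset, k-means clustering) is not reproduced.

```python
from collections import Counter


def extract_ngraphs(keys, n=2):
    """Return every run of n consecutive keys (an n-graph) as a tuple."""
    return [tuple(keys[i:i + n]) for i in range(len(keys) - n + 1)]


def most_frequent_ngraphs(sessions, n=2, top_k=3):
    """Count n-graphs over all typing sessions and keep the top_k most frequent."""
    counts = Counter()
    for keys in sessions:
        counts.update(extract_ngraphs(keys, n))
    return [ngraph for ngraph, _ in counts.most_common(top_k)]


# Hypothetical toy sessions: each session is the sequence of typed characters.
sessions = [list("the then"), list("other the")]
selected = most_frequent_ngraphs(sessions, n=2, top_k=2)
print(selected)
```

Ranking by raw frequency favours n-graphs for which many timing samples exist, which is the property the abstract argues existing selection approaches do not guarantee.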
Abstract:
The widespread development of decision support systems (DSS) in construction indicates that software evaluation has become more important than before. However, most research in the construction discipline has not attempted to assess DSS usability. Consequently, little is known about how to properly evaluate a DSS for a specific problem. In this paper, we present a practical framework that can guide DSS evaluation. It focuses on how to evaluate software designed specifically for the consultant selection problem. The framework features two main components: Sub-system Validation and Face Validation. Two case studies of consultant selection at the Malaysian Department of Irrigation and Drainage were integrated in this framework. Inter-disciplinary areas such as Software Engineering, Human Computer Interaction (HCI) and Construction Project Management underpin the discussion of the paper. It is anticipated that this work can foster better DSS development and quality decision making that accurately meets the client’s expectations and needs.
Abstract:
Feature extraction and selection are critical processes in developing facial expression recognition (FER) systems. While many algorithms have been proposed for these processes, a direct comparison between texture, geometry and their fusion, as well as between multiple selection algorithms, has not been reported for spontaneous FER. This paper addresses this issue by proposing a unified framework for a comparative study on the widely used texture (LBP, Gabor and SIFT) and geometric (FAP) features, using Adaboost, mRMR and SVM feature selection algorithms. Our experiments on the Feedtum and NVIE databases demonstrate the benefits of fusing geometric and texture features, where SIFT+FAP shows the best performance, while mRMR outperforms Adaboost and SVM. In terms of computational time, LBP and Gabor perform better than SIFT. The optimal combination of SIFT+FAP+mRMR also exhibits state-of-the-art performance.
Abstract:
A favourable scaffold for bone tissue engineering should have desired characteristic features, such as adequate mechanical strength and three-dimensional open porosity, which guarantee a suitable environment for tissue regeneration. The design of complex structures such as bone scaffolds is a challenge for investigators. One of the aims is to achieve the best possible ratio of mechanical strength to degradation rate. In this paper we attempt to use numerical modelling to evaluate material properties for designing bone tissue engineering scaffolds fabricated via the fused deposition modelling technique. For our studies the standard genetic algorithm was used, which is an efficient method of discrete optimization. For the fused deposition modelling scaffold, each individual strut is scrutinized for its role in the architecture and the structural support it provides, and its contribution to the overall scaffold was studied. The goal of the study was to create a numerical tool that could help to acquire the desired behaviour of tissue engineered scaffolds, and our results showed that this could be achieved efficiently by using different materials for individual struts. To represent the many ways in which scaffold mechanical function loss could proceed, an exemplary set of different desirable scaffold stiffness loss functions was chosen. © 2012 John Wiley & Sons, Ltd.
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset. Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours. Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
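The outlier idea behind COPA can be illustrated with a short sketch. It assumes the commonly described form of the transform: each gene's expression values are centred on the median and scaled by the median absolute deviation (MAD), and the gene is then scored by a high percentile of the transformed values. Scoring under-expressed outliers with the mirrored low percentile is an assumption made here for illustration; the abstract does not specify mCOPA's refinements, and the function names and sample values below are hypothetical.

```python
from statistics import median


def copa_transform(values):
    """Centre values on the median and scale by the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; the gene has too little variation to scale")
    return [(v - med) / mad for v in values]


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def outlier_scores(values, q=0.95):
    """Return (over-expression score, under-expression score) for one gene."""
    transformed = sorted(copa_transform(values))
    return quantile(transformed, q), quantile(transformed, 1 - q)


# Hypothetical expression values for one gene across ten samples;
# the last sample is strongly over-expressed.
gene = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
over, under = outlier_scores(gene)
print(over, under)
```

Because the median and MAD are robust to a few extreme samples, a small group of outlying tumours produces a large score without inflating the scale, which is why this style of scoring can surface genes that a mean-based differential expression test would miss.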
Abstract:
Physical and chemical properties of biodiesel are influenced by structural features of the fatty acids, such as chain length, degree of unsaturation and branching of the carbon chain. This study investigated whether microalgal fatty acid profiles are suitable for biodiesel characterization and species selection through Preference Ranking Organisation Method for Enrichment Evaluation (PROMETHEE) and Graphical Analysis for Interactive Assistance (GAIA) analysis. Fatty acid methyl ester (FAME) profiles were used to calculate the likely key chemical and physical properties of the biodiesel [cetane number (CN), iodine value (IV), cold filter plugging point, density, kinematic viscosity, higher heating value] of nine microalgal species (this study) and twelve species from the literature, selected for their suitability for cultivation in subtropical climates. An equal-parameter-weighted PROMETHEE-GAIA analysis ranked Nannochloropsis oculata, Extubocellulus sp. and Biddulphia sp. highest; these were the only species meeting the EN14214 and ASTM D6751-02 biodiesel standards, except for the double bond limit in EN14214. Chlorella vulgaris outranked N. oculata when the twelve microalgae from the literature were included. Culture growth phase (stationary) and, to a lesser extent, nutrient provision affected CN and IV values of N. oculata due to lower eicosapentaenoic acid (EPA) contents. Application of a polyunsaturated fatty acid (PUFA) weighting to saturation led to a lower ranking of species exceeding the double bond EN14214 thresholds. In summary, CN, IV, C18:3 and double bond limits were the strongest drivers in equal biodiesel parameter-weighted PROMETHEE analysis.
Abstract:
Clustering identities in a broadcast video is a useful task to aid in video annotation and retrieval. Quality-based frame selection is a crucial task in video face clustering, both to improve the clustering performance and to reduce the computational cost. We present a framework that selects the highest-quality frames available in a video for face clustering. This frame selection technique is based on low-level and high-level features (face symmetry, sharpness, contrast and brightness) to select the highest-quality facial images available in a face sequence for clustering. We also consider the temporal distribution of the faces to ensure that selected faces are taken at times distributed throughout the sequence. Normalized feature scores are fused, and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face clustering system. We present a news video database to evaluate the clustering system performance. Experiments on the newly created news database show that the proposed method selects the best-quality face images in the video sequence, resulting in improved clustering performance.
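The score fusion and temporally distributed selection described above can be sketched as follows. The abstract does not state how scores are normalized, fused, or spread over time, so everything specific here is an assumption: min-max normalization, an unweighted mean as the fusion rule, one pick per equal-length segment, and the toy per-frame scores.

```python
def min_max(values):
    """Rescale a list of raw scores to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def fuse_scores(features):
    """Average the normalized per-frame scores of all quality features.

    features maps a feature name to its list of raw per-frame scores.
    """
    normalized = [min_max(scores) for scores in features.values()]
    n_frames = len(normalized[0])
    return [sum(col[i] for col in normalized) / len(normalized)
            for i in range(n_frames)]


def select_frames(fused, k):
    """Split the sequence into k contiguous segments and keep the best frame of each."""
    n = len(fused)
    picks = []
    for s in range(k):
        start, end = s * n // k, (s + 1) * n // k
        if start < end:
            picks.append(max(range(start, end), key=lambda i: fused[i]))
    return picks


# Hypothetical raw quality scores for a six-frame face sequence.
features = {
    "sharpness": [0.2, 0.9, 0.4, 0.8, 0.1, 0.7],
    "symmetry": [0.5, 0.6, 0.9, 0.3, 0.2, 0.8],
}
fused = fuse_scores(features)
print(select_frames(fused, k=2))
```

Picking one frame per segment is a simple way to trade a little quality for temporal coverage; taking the global top-k instead would risk selecting several near-identical consecutive frames.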
Abstract:
Online reviews have become increasingly important in the decision-making process. In recent years, the problem of identifying useful reviews for users has attracted significant attention. For instance, in order to select reviews that focus on a particular feature, researchers proposed a method which extracts all associated words of this feature as the relevant information to evaluate and find appropriate reviews. However, the extraction of associated words is not very accurate due to the noise in free review text, and this affects the overall performance negatively. In this paper, we propose a method to select reviews according to a given feature by using a review model generated from a domain ontology called a product feature taxonomy. The proposed review model provides relevant information about the hierarchical relationships of the features in the review, which captures the review characteristics accurately. Our experimental results based on a real-world review dataset show that our approach is able to effectively improve the review selection performance according to the given criteria.
Abstract:
Selection of features that will permit accurate pattern classification is a difficult task. However, if a particular data set is represented by discrete-valued features, it becomes possible to determine empirically the contribution that each feature makes to the discrimination between classes. This paper extends the discrimination bound method so that both the maximum and average discrimination expected on unseen test data can be estimated. These estimation techniques are the basis of a backwards elimination algorithm that can be used to rank features in order of their discriminative power. Two problems are used to demonstrate this feature selection process: classification of the Mushroom Database, and a real-world, pregnancy-related medical risk prediction task, the assessment of risk of perinatal death.
Abstract:
We propose expected attainable discrimination (EAD) as a measure to select discrete-valued features for reliable discrimination between two classes of data. EAD is an average of the area under the ROC curves obtained when a simple histogram probability density model is trained and tested on many random partitions of a data set. EAD can be incorporated into various stepwise search methods to determine promising subsets of features, particularly when misclassification costs are difficult or impossible to specify. Experimental application to the problem of risk prediction in pregnancy is described.
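The definition of EAD given above (an average of ROC areas over many random train/test partitions of a histogram model) can be sketched for a single discrete feature. The abstract gives no further detail, so several choices here are assumptions: the Laplace-smoothed class-conditional histograms, the likelihood-ratio score, the 50/50 split, the number of partitions, and all function names.

```python
import random
from collections import Counter


def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a positive outscores a negative."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def histogram_model(xs, ys, levels):
    """Fit smoothed class-conditional histograms; return a likelihood-ratio scorer."""
    counts = {0: Counter(), 1: Counter()}
    for x, y in zip(xs, ys):
        counts[y][x] += 1
    totals = {c: sum(counts[c].values()) + len(levels) for c in (0, 1)}

    def score(x):
        p1 = (counts[1][x] + 1) / totals[1]
        p0 = (counts[0][x] + 1) / totals[0]
        return p1 / p0

    return score


def expected_attainable_discrimination(xs, ys, n_partitions=50,
                                       test_fraction=0.5, seed=0):
    """Average test-set AUC of the histogram model over random partitions."""
    rng = random.Random(seed)
    levels = sorted(set(xs))
    idx = list(range(len(xs)))
    aucs = []
    for _ in range(n_partitions):
        rng.shuffle(idx)
        cut = int(len(idx) * (1 - test_fraction))
        train, test = idx[:cut], idx[cut:]
        score = histogram_model([xs[i] for i in train],
                                [ys[i] for i in train], levels)
        pos = [score(xs[i]) for i in test if ys[i] == 1]
        neg = [score(xs[i]) for i in test if ys[i] == 0]
        if pos and neg:  # AUC is undefined if a class is absent from the test set
            aucs.append(auc(pos, neg))
    if not aucs:
        raise ValueError("no partition contained both classes in the test set")
    return sum(aucs) / len(aucs)


# Hypothetical data: one feature that equals the class label, one constant feature.
labels = [0] * 20 + [1] * 20
informative = list(labels)
constant = ["a"] * 40
print(expected_attainable_discrimination(informative, labels))
print(expected_attainable_discrimination(constant, labels))
```

Because the score is a ranking quantity, no misclassification costs or decision threshold are needed, which matches the abstract's motivation for using ROC area when costs are hard to specify.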
Abstract:
User-generated information such as online reviews has become increasingly significant for customers in the decision-making process. Meanwhile, as the volume of online reviews proliferates, there is a pressing demand to help users tackle the information overload problem. In order to extract useful information from an overwhelming number of reviews, considerable work has been proposed, such as review summarization and review selection. In particular, to avoid redundant information, researchers attempt to select a small set of reviews to represent the entire review corpus by preserving its statistical properties (e.g., opinion distribution). However, one significant drawback of the existing works is that they only measure the utility of the extracted reviews as a whole without considering the quality of each individual review. As a result, the set of chosen reviews may contain low-quality ones even if its statistical properties are close to those of the original review corpus, which is not preferred by users. In this paper, we propose a review selection method which takes review quality into consideration during the selection process. Specifically, we examine the relationships between product features based upon a domain ontology to capture the review characteristics, and use these to select reviews that are of good quality and also preserve the opinion distribution. Our experimental results based on real-world review datasets demonstrate that our proposed approach is feasible and able to effectively improve the performance of review selection.