74 resultados para Feature Descriptors
Resumo:
This paper proposes max separation clustering (MSC), a new non-hierarchical clustering method used for feature extraction from optical emission spectroscopy (OES) data for plasma etch process control applications. OES data is high dimensional and inherently highly redundant with the result that it is difficult if not impossible to recognize useful features and key variables by direct visualization. MSC is developed for clustering variables with distinctive patterns and providing effective pattern representation by a small number of representative variables. The relationship between signal-to-noise ratio (SNR) and clustering performance is highlighted, leading to a requirement that low SNR signals be removed before applying MSC. Experimental results on industrial OES data show that MSC with low SNR signal removal produces effective summarization of the dominant patterns in the data.
Resumo:
N-gram analysis is an approach that investigates the structure of a program using bytes, characters, or text strings. A key issue with N-gram analysis is feature selection amidst the explosion of features that occurs when N is increased. The experiments within this paper represent programs as operational code (opcode) density histograms gained through dynamic analysis. A support vector machine is used to create a reference model, which is used to evaluate two methods of feature reduction, which are 'area of intersect' and 'subspace analysis using eigenvectors.' The findings show that the relationships between features are complex and simple statistics filtering approaches do not provide a viable approach. However, eigenvector subspace analysis produces a suitable filter.
Resumo:
Call control features (e.g., call-divert, voice-mail) are primitive options to which users can subscribe off-line to personalise their service. The configuration of a feature subscription involves choosing and sequencing features from a catalogue and is subject to constraints that prevent undesirable feature interactions at run-time. When the subscription requested by a user is inconsistent, one problem is to find an optimal relaxation, which is a generalisation of the feedback vertex set problem on directed graphs, and thus it is an NP-hard task. We present several constraint programming formulations of the problem. We also present formulations using partial weighted maximum Boolean satisfiability and mixed integer linear programming. We study all these formulations by experimentally comparing them on a variety of randomly generated instances of the feature subscription problem.
Resumo:
Literature data on the toxicity of chlorophenols for three luminescent bacteria (Vibrio fischeri, and the lux-marked Pseudomonas fluorescens 10586s pUCD607 and Burkholderia spp. RASC c2 (Tn4431)) have been analyzed in relation to a set of computed molecular physico-chemical properties. The quantitative structure-toxicity relationships of the compounds in each species showed marked differences when based upon semi-empirical molecular-orbital molecular and atom based properties. For mono-, di- and tri-chlorophenols multiple linear regression analysis of V. fischeri toxicity showed a good correlation with the solvent accessible surface area and the charge on the oxygen atom. This correlation successfully predicted the toxicity of the heavily chlorinated phenols, suggesting in V. fischeri only one overall mechanism is present for all chlorophenols. Good correlations were also found for RASC c2 with molecular properties, such as the surface area and the nucleophilic super-delocalizability of the oxygen. In contrast the best QSTR for P. fluorescens contained the 2nd order connectivity index and ELUMO suggesting a different, more reactive mechanism. Cross-species correlations were examined, and between V. fischeri and RASC c2 the inclusion of the minimum value of the nucleophilic susceptibility on the ring carbons produced good results. Poorer correlations were found with P. fluorescens highlighting the relative similarity of V. fischeri and RASC c2, in contrast to that of P. fluorescens.
Resumo:
This paper considers the separation and recognition of overlapped speech sentences assuming single-channel observation. A system based on a combination of several different techniques is proposed. The system uses a missing-feature approach for improving crosstalk/noise robustness, a Wiener filter for speech enhancement, hidden Markov models for speech reconstruction, and speaker-dependent/-independent modeling for speaker and speech recognition. We develop the system on the Speech Separation Challenge database, involving a task of separating and recognizing two mixing sentences without assuming advanced knowledge about the identity of the speakers nor about the signal-to-noise ratio. The paper is an extended version of a previous conference paper submitted for the challenge.
Resumo:
Bayesian probabilistic analysis offers a new approach to characterize semantic representations by inferring the most likely feature structure directly from the patterns of brain activity. In this study, infinite latent feature models [1] are used to recover the semantic features that give rise to the brain activation vectors when people think about properties associated with 60 concrete concepts. The semantic features recovered by ILFM are consistent with the human ratings of the shelter, manipulation, and eating factors that were recovered by a previous factor analysis. Furthermore, different areas of the brain encode different perceptual and conceptual features. This neurally-inspired semantic representation is consistent with some existing conjectures regarding the role of different brain areas in processing different semantic and perceptual properties. © 2012 Springer-Verlag.