84 results for Feature extraction
Abstract:
We analyze the spectral zero-crossing rate (SZCR) properties of transient signals and show that SZCR contains accurate localization information about the transient. For a train of pulses containing transient events, the SZCR computed on a sliding window basis is useful in locating the impulse locations accurately. We present the properties of SZCR on standard stylized signal models and then show how it may be used to estimate the epochs in speech signals. We also present comparisons with some state-of-the-art techniques that are based on the group-delay function. Experiments on real speech show that the proposed SZCR technique is better than other group-delay-based epoch detectors. In the presence of noise, a comparison with the zero-frequency filtering technique (ZFF) and Dynamic programming projected Phase-Slope Algorithm (DYPSA) showed that performance of the SZCR technique is better than DYPSA and inferior to that of ZFF. For highpass-filtered speech, where ZFF performance suffers drastically, the identification rates of SZCR are better than those of DYPSA.
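As a rough illustration of why zero crossings in the spectrum carry localization information, the sketch below counts sign changes in the real part of the DFT of a single impulse. The abstract does not define how SZCR is computed, so the use of the real part, the plain DFT, and the helper names `dft`, `spectral_zero_crossings` and `impulse` are assumptions made for this example only.

```python
import cmath

def dft(x):
    """Plain O(N^2) discrete Fourier transform; adequate for short windows."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def spectral_zero_crossings(window):
    """Count sign changes in the real part of the window's DFT (assumed definition)."""
    re = [c.real for c in dft(window)]
    return sum(1 for a, b in zip(re, re[1:]) if a * b < 0)

def impulse(n, position):
    """Unit impulse of length n located at sample index `position`."""
    return [1.0 if t == position else 0.0 for t in range(n)]

N = 101  # odd length, so no DFT bin lands exactly on a zero of the cosine
counts = {n0: spectral_zero_crossings(impulse(N, n0)) for n0 in (3, 7, 12, 20)}
print(counts)
```

For an impulse at offset n0, the real part of the DFT is cos(2πk·n0/N), so the number of sign changes grows linearly with the offset of the impulse inside the window. Sliding the window over a pulse train would then trace out the pulse positions, which is the property the abstract relies on.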
Abstract:
Classification of a large document collection involves dealing with a huge feature space where each distinct word is a feature. In such an environment, classification is a costly task in terms of both running time and computing resources. Furthermore, it does not guarantee optimal results, because considering every feature for classification is likely to cause overfitting. In such a context, feature selection is inevitable. This work analyses feature selection methods, explores the relations among them and attempts to find a minimal subset of features that is discriminative for document classification.
Abstract:
In this paper, we present a methodology for identifying the best features from a large feature space. In a high-dimensional feature space, nearest neighbor search becomes meaningless and suffers from both quality and performance issues. Because many data mining algorithms rely on nearest neighbor search, we need to select relevant features instead of searching over all of them. We propose feature selection using Non-negative Matrix Factorization (NMF) and its application to nearest neighbor search. A recent clustering algorithm based on Locally Consistent Concept Factorization (LCCF) achieves better document clustering quality by using the local geometrical and discriminating structure of the data. Using our feature selection method, we show a further improvement in clustering performance.
Abstract:
Outlier detection in high dimensional categorical data has been a problem of much interest due to the extensive use of qualitative features for describing the data across various application areas. Though there exist various established methods for dealing with the dimensionality aspect through feature selection on numerical data, the categorical domain is actively being explored. As outlier detection is generally considered as an unsupervised learning problem due to lack of knowledge about the nature of various types of outliers, the related feature selection task also needs to be handled in a similar manner. This motivates the need to develop an unsupervised feature selection algorithm for efficient detection of outliers in categorical data. Addressing this aspect, we propose a novel feature selection algorithm based on the mutual information measure and the entropy computation. The redundancy among the features is characterized using the mutual information measure for identifying a suitable feature subset with less redundancy. The performance of the proposed algorithm in comparison with the information gain based feature selection shows its effectiveness for outlier detection. The efficacy of the proposed algorithm is demonstrated on various high-dimensional benchmark data sets employing two existing outlier detection methods.
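The two quantities named in this abstract, entropy and mutual information, can be computed for categorical columns as sketched below. The greedy rule in `select_features` (visit features by decreasing entropy, reject any whose mutual information with an already kept feature exceeds a threshold) is an illustrative assumption, not the algorithm proposed in the paper, and the toy data and threshold are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(column):
    """Shannon entropy (bits) of a categorical column."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())

def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two categorical columns."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

def select_features(columns, max_redundancy):
    """Greedy pass (illustrative rule, not the paper's): visit features in
    order of decreasing entropy and keep one only if its mutual information
    with every feature already kept is at most max_redundancy."""
    order = sorted(columns, key=lambda name: entropy(columns[name]), reverse=True)
    kept = []
    for name in order:
        if all(mutual_information(columns[name], columns[k]) <= max_redundancy
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: "colour_copy" is a relabelled duplicate of "colour".
data = {
    "colour": ["r", "g", "b", "r", "g", "b", "r", "g"],
    "colour_copy": ["R", "G", "B", "R", "G", "B", "R", "G"],
    "size": ["s", "s", "s", "s", "l", "l", "l", "l"],
}
selected = select_features(data, max_redundancy=0.5)
print(selected)
```

Because the procedure uses no class labels, it stays unsupervised, which matches the setting described above. The duplicate column shares all of its information with the original and is dropped, while the nearly independent column is kept.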
Abstract:
Clustering has been the most popular method for data exploration. Clustering partitions the data set into sub-partitions based on some measure, such as a distance measure, and each partition carries its own significant information. A number of algorithms have been explored for this purpose; one such algorithm is Particle Swarm Optimization (PSO), a population-based heuristic search technique derived from swarm intelligence. In this paper we present an improved version of Particle Swarm Optimization in which each feature of the data set is given significance by adding random weights, which also minimizes distortions in the dataset, if any. The performance of the proposed algorithm is evaluated using some benchmark datasets from the Machine Learning Repository. The experimental results show that our proposed methodology performs significantly better than in previously performed experiments.
Abstract:
Identifying symmetry in scalar fields is a recent area of research in scientific visualization and computer graphics communities. Symmetry detection techniques based on abstract representations of the scalar field use only limited geometric information in their analysis. Hence they may not be suited for applications that study the geometric properties of the regions in the domain. On the other hand, methods that accumulate local evidence of symmetry through a voting procedure have been successfully used for detecting geometric symmetry in shapes. We extend such a technique to scalar fields and use it to detect geometrically symmetric regions in synthetic as well as real-world datasets. Identifying symmetry in the scalar field can significantly improve visualization and interactive exploration of the data. We demonstrate different applications of the symmetry detection method to scientific visualization: query-based exploration of scalar fields, linked selection in symmetric regions for interactive visualization, and classification of geometrically symmetric regions and its application to anomaly detection.
Abstract:
We consider the problem of finding the best features for value function approximation in reinforcement learning and develop an online algorithm to optimize the mean square Bellman error objective. For any given feature value, our algorithm performs gradient search in the parameter space via a residual gradient scheme and, on a slower timescale, also performs gradient search in the Grassmann manifold of features. We present a proof of convergence of our algorithm. We show empirical results using our algorithm as well as a similar algorithm that uses temporal difference learning in place of the residual gradient scheme for the faster timescale updates.
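The faster-timescale step can be illustrated with a standard residual gradient update for a linear value function V(s) = θ·φ(s) with the features held fixed. This is a minimal sketch under stated assumptions: transitions are deterministic (so no double sampling is needed), the features are one-hot, and the toy chain, step size and function names are hypothetical. The slower feature search on the Grassmann manifold is not shown.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_gradient(transitions, n_features, gamma, alpha, sweeps):
    """Minimise the squared Bellman error of V(s) = theta . phi(s).

    transitions: list of (phi, reward, phi_next) with a deterministic next
    state; phi_next is all zeros for a terminal state.
    """
    theta = [0.0] * n_features
    for _ in range(sweeps):
        for phi, reward, phi_next in transitions:
            # Bellman residual for this transition
            delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
            # Gradient of 0.5 * delta**2 w.r.t. theta is delta * (gamma*phi_next - phi)
            for i in range(n_features):
                theta[i] -= alpha * delta * (gamma * phi_next[i] - phi[i])
    return theta

# Toy chain s0 -> s1 -> terminal, reward 1 per step, one-hot features (assumed).
chain = [([1.0, 0.0], 1.0, [0.0, 1.0]),
         ([0.0, 1.0], 1.0, [0.0, 0.0])]
theta = residual_gradient(chain, n_features=2, gamma=0.9, alpha=0.1, sweeps=2000)
print(theta)
```

With a discount of 0.9 the true values of this chain are V(s1) = 1 and V(s0) = 1 + 0.9 = 1.9, and the weights approach these values. Replacing the update direction `gamma*phi_next - phi` with `-phi` alone would give the temporal difference variant mentioned at the end of the abstract.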
Abstract:
Data clustering groups data so that data which are similar to each other are in the same group and data which are dissimilar to each other are in different groups. Since clustering is generally a subjective activity, it is possible to get different clusterings of the same data depending on the need. This paper attempts to find the best clustering of the data by carrying out feature selection and using only the selected features for clustering. Particle Swarm Optimization (PSO) is used for clustering, with feature selection carried out simultaneously. The performance of the proposed algorithm is evaluated on some benchmark data sets. The experimental results show that the proposed methodology outperforms previous approaches such as basic PSO and K-means for the clustering problem.
Abstract:
Acoustic signal variation and female preference for different signal components constitute the prerequisite framework to study the mechanisms of sexual selection that shape acoustic communication. Despite several studies of acoustic communication in crickets, information on both male calling song variation in the field and female preference in the same system is lacking for most species. Previous studies on acoustic signal variation either were carried out on populations maintained in the laboratory or did not investigate signal repeatability. We therefore used repeatability analysis to quantify variation in the spectral, temporal and amplitudinal characteristics of the male calling song of the field cricket Plebeiogryllus guttiventris in a wild population, at two temporal scales, within and across nights. Carrier frequency (CF) was the most repeatable character across nights, whereas chirp period (CP) had low repeatability across nights. We investigated whether female preferences were more likely to be based on features with high (CF) or low (CP) repeatability. Females showed no consistent preferences for CF but were significantly more attracted towards signals with short CPs. The attractiveness of lower CP calls disappeared, however, when traded off with sound pressure level (SPL). SPL was the only acoustic feature that was significantly positively correlated with male body size. Since relative SPL affects female phonotaxis strongly and can vary unpredictably based on male spacing, our results suggest that even strong female preferences for acoustic features may not necessarily translate into greater advantage for males possessing these features in the field. (C) 2013 The Association for the Study of Animal Behaviour. Published by Elsevier Ltd. All rights reserved.
Abstract:
Epoch is defined as the instant of significant excitation within a pitch period of voiced speech. Epoch extraction continues to attract the interest of researchers because of its significance in speech analysis. Existing high performance epoch extraction algorithms require either dynamic programming techniques or a priori information about the average pitch period. An algorithm without such requirements is proposed based on the integrated linear prediction residual (ILPR), which resembles the voice source signal. Half wave rectified and negated ILPR (or Hilbert transform of ILPR) is used as the pre-processed signal. A new non-linear temporal measure named the plosion index (PI) has been proposed for detecting 'transients' in the speech signal. An extension of PI, called the dynamic plosion index (DPI), is applied to the pre-processed signal to estimate the epochs. The proposed DPI algorithm is validated using six large databases which provide simultaneous EGG recordings. Creaky and singing voice samples are also analyzed. The algorithm has been tested for its robustness in the presence of additive white and babble noise and on simulated telephone quality speech. The performance of the DPI algorithm is found to be comparable to or better than that of five state-of-the-art techniques for the experiments considered.
Abstract:
The Variable Endmember Constrained Least Square (VECLS) technique is proposed to account for endmember variability in the linear mixture model by incorporating the variance for each class, the signals of which vary from pixel to pixel due to changes in urban land cover (LC) structures. VECLS is first tested with computer-simulated three-class endmembers over four bands, with small, medium and large variability and at three different spatial resolutions. The technique is next validated with real datasets of IKONOS, Landsat ETM+ and MODIS. The results show that correlation between actual and estimated proportion is higher by an average of 0.25 for the artificial datasets compared to a situation where variability is not considered. With IKONOS, Landsat ETM+ and MODIS data, the average correlation increased by 0.15 for 2 and 3 classes and by 0.19 for 4 classes, when compared to single endmember per class. (C) 2013 COSPAR. Published by Elsevier Ltd. All rights reserved.
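For orientation, the baseline that VECLS is compared against (a single endmember per class in the linear mixture model) can be sketched for the two-class case, where the constrained least-squares fraction has a closed form. This is not the VECLS technique itself: it ignores per-class variance, and the four-band reflectance values and the function name `unmix_two_endmembers` are hypothetical.

```python
def unmix_two_endmembers(pixel, e1, e2):
    """Fraction of endmember e1 in `pixel` under the linear mixture model
    pixel ~ a*e1 + (1-a)*e2, with 0 <= a <= 1 (least-squares fit)."""
    diff = [p - q for p, q in zip(e1, e2)]
    num = sum((x - q) * d for x, q, d in zip(pixel, e2, diff))
    den = sum(d * d for d in diff)
    a = num / den
    # Clipping enforces non-negativity; sum-to-one holds by construction.
    return min(1.0, max(0.0, a))

# Hypothetical four-band reflectances for two land-cover classes.
vegetation = [0.05, 0.08, 0.04, 0.50]
built_up = [0.20, 0.22, 0.25, 0.30]
mixed = [0.3 * v + 0.7 * b for v, b in zip(vegetation, built_up)]
fraction = unmix_two_endmembers(mixed, vegetation, built_up)
print(round(fraction, 3))
```

With a fixed spectrum per class, any pixel-to-pixel change in the class signal is absorbed into the estimated fraction, which is the error source that incorporating class variance is meant to reduce.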
Abstract:
Non-human primate populations, besides responding appropriately to naturally occurring challenges, also need to cope with anthropogenic factors such as environmental pollution, resource depletion, and habitat destruction. Populations and individuals are likely to show considerable variations in food extraction abilities, with some populations and individuals more efficient than others at exploiting a set of resources. In this study, we examined among urban free-ranging bonnet macaques, Macaca radiata (a) local differences in food extraction abilities, (b) between-individual variation and within-individual consistency in problem-solving success and the underlying problem-solving characteristics, and (c) behavioral patterns associated with higher efficiency in food extraction. When presented with novel food extraction tasks, the urban macaques, having more frequent exposure to novel physical objects in their surroundings, extracted food material from PET bottles and also solved another food extraction task (i.e., extracting an orange from a wire mesh box) more often than those living under more natural conditions. Adults solved the tasks more frequently than juveniles, and females more frequently than males. Both solution-technique and problem-solving characteristics varied across individuals but remained consistent within each individual across the successive presentations of PET bottles. The macaques that solved the tasks showed less within-individual variation in their food extraction behavior than those that failed to solve the tasks. A few macaques appropriately modified their problem-solving behavior in accordance with the task requirements and solved the modified versions of the tasks without trial-and-error learning.
These observations are ecologically relevant - they demonstrate considerable local differences in food extraction abilities, between-individual variation and within-individual consistency in food extraction techniques among free-ranging bonnet macaques, possibly affecting the species' local adaptability and resilience to environmental changes.
Abstract:
The direct and accurate determination of heteronuclear couplings ($^{n}J_{HX}$, X = $^{19}$F, $^{31}$P) from the one-dimensional $^{1}$H NMR spectrum is severely hampered by the simultaneous presence of a large number of $^{n}J_{HH}$ couplings. The present study demonstrates the utility of the pure shift NMR approach for spectral simplification, and precise and direct measurement of heteronuclear couplings. As a consequence of refocusing of homonuclear couplings ($^{n}J_{HH}$) by the pure shift NMR, only heteronuclear couplings ($^{n}J_{HX}$) appear as simple multiplets at the resonance position of each chemically non-equivalent proton, enabling their direct measurement from the 1D $^{1}$H spectrum. The experiment is demonstrated on a number of molecules containing either $^{19}$F or $^{31}$P, where $^{n}J_{HF}$ and $^{n}J_{HP}$ could be precisely measured in a straightforward manner. The distinct advantage of the experiment is demonstrated on molecules containing more than one fluorine atom, where most of the available NMR experiments fail or have restricted utility.
Abstract:
Segregating the dynamics of gate-bias-induced threshold voltage shift, and in particular charge trapping, in thin film transistors (TFTs) based on time constants provides insight into the different mechanisms underlying TFT instability. In this Letter we develop a representation of the time constants and model the magnitude of charge trapped in the form of an equivalent density of created trap states. This representation is extracted from the Fourier spectrum of the dynamics of charge trapping. Using amorphous In-Ga-Zn-O TFTs as an example, the charge trapping was modeled within an energy range of $\Delta E_t \approx 0.3$ eV and with a density of states distribution $D_t(E_{t-j}) = D_{t0} \exp(-\Delta E_t/kT)$ with $D_{t0} = 5.02 \times 10^{11}$ cm$^{-2}$ eV$^{-1}$. Such a model is useful for developing simulation tools for circuit design. (C) 2014 AIP Publishing LLC.
Abstract:
The accurate solution of 3D full-wave Method of Moments (MoM) on an arbitrary mesh of a package-board structure does not guarantee accuracy, since the discretizations may not be fine enough to capture rapid spatial changes in the solution variable. At the same time, uniform over-meshing on the entire structure generates a large number of solution variables and therefore requires an unnecessarily large matrix solution. In this work, a suitable refinement criterion for MoM-based electromagnetic package-board extraction is proposed and the advantages of the adaptive strategy are demonstrated from both accuracy and speed perspectives.