903 results for Pattern classification
Abstract:
This paper presents the design and development of a piezoelectric polyvinylidene fluoride (PVDF) thin-film nasal sensor that monitors the human respiration pattern (RP) from each nostril simultaneously. The thin-film PVDF nasal sensor is designed in a cantilever beam configuration. Two cantilevers are mounted on a spectacle frame so that the air flow from each nostril impinges on the sensor and bends the cantilever beams. The voltage signal produced by the air-flow-induced dynamic piezoelectric effect yields the corresponding RP. A group of 23 healthy, awake human subjects was studied. The RP, in terms of respiratory rate (RR) and changes in respiratory air flow, obtained from the developed PVDF nasal sensor was compared with the RP obtained from a respiratory inductance plethysmograph (RIP) device. The mean RR of the developed nasal sensor (19.65 ± 4.1) and of the RIP (19.57 ± 4.1) were found to be almost the same (difference not significant, p > 0.05), with a correlation coefficient of 0.96 (p < 0.0001). Any change in the RIP pattern was accompanied by the same amount of change in the pattern of the PVDF nasal sensor, with k = 0.815 indicating strong agreement between the PVDF nasal sensor and RIP respiratory air-flow patterns. The developed sensor is simple in design, non-invasive and patient friendly, and hence shows promise for routine clinical use. The preliminary results show that this new method can have various applications in respiratory monitoring and diagnosis.
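The abstract does not define the agreement statistic k. Assuming it denotes Cohen's kappa computed over categorical labels (for example, a per-epoch label describing how the air-flow pattern changed according to each device), a minimal sketch of that statistic could look like this. The function name `cohens_kappa` and the label values are illustrative and do not come from the paper.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both methods agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each method's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods always emit the same single label
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-epoch labels from the two devices.
sensor = ["up", "up", "down", "down"]
rip = ["up", "up", "down", "up"]
print(cohens_kappa(sensor, rip))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is commonly preferred over plain percent agreement when comparing two measurement methods.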
Abstract:
Frequent episode discovery is a popular framework for pattern discovery from sequential data. It has found many applications in domains such as alarm management in telecommunication networks, fault analysis in manufacturing plants and prediction of user behavior in web click streams. In this paper, we address the discovery of serial episodes. In the episode context, there are multiple ways to quantify the frequency of an episode. Most current algorithms for episode discovery under the various frequencies are apriori-based level-wise methods, which essentially perform a breadth-first search of the pattern space. However, there are currently no depth-first methods of pattern discovery in the frequent episode framework under many of the frequency definitions. In this paper, we try to bridge this gap. We provide new depth-first algorithms for serial episode discovery under the non-overlapped and total frequencies. Under non-overlapped frequency, we present algorithms that can handle span and gap constraints on episode occurrences. Under total frequency, we present an algorithm that can handle a span constraint. We provide proofs of correctness for the proposed algorithms and demonstrate their effectiveness through extensive simulations. We also give detailed run-time comparisons with the existing apriori-based methods and illustrate scenarios in which the proposed pattern-growth algorithms perform better than their apriori counterparts. (C) 2013 Elsevier B.V. All rights reserved.
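To make the notion of non-overlapped frequency concrete, the sketch below counts non-overlapped occurrences of a single serial episode in an event sequence with one greedy left-to-right scan. It is not the depth-first discovery algorithm of the paper, and it ignores span and gap constraints; the function name and the event encoding (a plain list of event types in time order) are assumptions made for illustration.

```python
def non_overlapped_count(events, episode):
    """Count non-overlapped occurrences of a serial episode.

    events  -- iterable of event types, already ordered by time
    episode -- sequence of event types that must occur in this order
    """
    if not episode:
        raise ValueError("episode must contain at least one event type")
    count = 0
    position = 0  # index of the next episode element we are waiting for
    for event in events:
        if event == episode[position]:
            position += 1
            if position == len(episode):
                # One full occurrence completed; start looking for the next
                # one strictly after it, so occurrences cannot overlap.
                count += 1
                position = 0
    return count


print(non_overlapped_count("ABACBCABC", "ABC"))
```

The scan always completes the earliest-ending occurrence before starting the next one, which is what makes the greedy count maximal for serial episodes when no time constraints are imposed.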
Abstract:
Myopathies are muscular diseases in which muscle fibers degenerate due to many factors such as nutrient deficiency, infection and mutations in myofibrillar etc. The objective of this study is to identify biomarkers that distinguish various muscle mutants in Drosophila (fruit fly) using Raman spectroscopy. A Principal Components based Linear Discriminant Analysis (PC-LDA) classification model yielding >95% accuracy was developed to classify these mutants, which represent various myopathies, according to their physiopathology.
Abstract:
In this paper, we propose a simple and effective approach to classifying H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution is computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in describing the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to represent the video as a histogram of HOMV keywords obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
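The abstract does not give the details of the HOMV descriptor. As a rough illustration of the idea of an orientation histogram over motion vectors, the sketch below bins the direction of each (dx, dy) vector from one space-time cube into equal angular sectors. The bin count, the unweighted voting, the skipping of zero vectors and the normalization are all assumptions, not the authors' design.

```python
import math


def homv(motion_vectors, n_bins=8):
    """Normalized orientation histogram of (dx, dy) motion vectors."""
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for dx, dy in motion_vectors:
        if dx == 0 and dy == 0:
            continue  # a zero vector has no orientation
        # Map the angle from (-pi, pi] to [0, 2*pi).
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Guard against rounding that lands exactly on 2*pi.
        index = min(int(angle / bin_width), n_bins - 1)
        hist[index] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical motion vectors from one cube.
print(homv([(1, 0.1), (1, 0.2), (0.1, 1), (0, 0)]))
```

In a bag-of-features pipeline, vectors like this one would be collected from many cubes, clustered with k-means, and each video would then be described by how often each cluster appears.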
Abstract:
Sparse representation based classification (SRC) is one of the most successful methods developed in recent times for face recognition. Optimal projection for sparse representation based classification (OPSRC) [1] provides a dimensionality reduction map that is supposed to give optimum performance for the SRC framework. However, the computational complexity of this method is too high. Here, we propose a new projection technique using the data scatter matrix that is computationally superior to the optimal projection method, with classification accuracy comparable to that of OPSRC. The performance of the proposed approach is benchmarked on various publicly available face databases.
Abstract:
In this paper, we consider the setting of the pattern maximum likelihood (PML) problem studied by Orlitsky et al. We present a well-motivated heuristic algorithm for deciding when the PML distribution of a given pattern is uniform. The algorithm is based on the concept of a "uniform threshold". This is a threshold at which the uniform distribution exhibits an interesting phase transition in the PML problem, going from being a local maximum to being a local minimum.
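For readers unfamiliar with the PML setting, the sketch below shows what a pattern is (each symbol replaced by the order of its first appearance) and how likely a pattern is when symbols are drawn independently from a uniform distribution on k symbols. It does not implement the heuristic or the uniform threshold from the paper; the function names are illustrative.

```python
import math


def pattern(sequence):
    """Replace each symbol by the index of its first appearance (1-based)."""
    first_seen = {}
    result = []
    for symbol in sequence:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        result.append(first_seen[symbol])
    return result


def uniform_pattern_probability(pat, k):
    """Probability of a pattern under i.i.d. draws from k equally likely symbols."""
    n = len(pat)
    m = max(pat) if pat else 0  # number of distinct symbols in the pattern
    # There are k * (k - 1) * ... * (k - m + 1) sequences with this pattern,
    # each of probability k ** -n; math.perm returns 0 when m > k.
    return math.perm(k, m) / k ** n


print(pattern("abracadabra"))
print(uniform_pattern_probability([1, 2, 1], 2))
```

The PML distribution is the distribution that maximizes this kind of pattern probability; the question studied in the abstract is when that maximizer is a uniform distribution.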
Abstract:
The maximum entropy approach to classification is very well studied in applied statistics and machine learning, and almost all the methods that exist in the literature are discriminative in nature. In this paper, we introduce a maximum entropy classification method with feature selection for high-dimensional data, such as text datasets, that is generative in nature. To tackle the curse of dimensionality of large datasets, we employ the conditional independence assumption (Naive Bayes) and simultaneously perform feature selection by enforcing a "maximum discrimination" between the estimated class conditional densities. For two-class problems, the proposed method uses Jeffreys (J) divergence to discriminate the class conditional densities. To extend our method to the multi-class case, we propose a completely new approach that considers a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate the conditional densities of multiple classes. To reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)) based on the AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to the case of multiple distributions. As for theoretical justification, we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. The performance of the proposed algorithms, together with a comparative study, is demonstrated on high-dimensional text and gene expression datasets, showing that our methods scale well to such data.
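The two divergences named in the abstract have standard definitions for discrete distributions, sketched below. This is only the textbook form: Jeffreys divergence as the symmetrized Kullback-Leibler divergence, and Jensen-Shannon divergence as the weighted average divergence of each distribution from their mixture. The modified JS(GM) divergence from the paper is not reproduced here, and the function names and uniform default weights are assumptions.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); q must be positive where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jeffreys(p, q):
    """Jeffreys (J) divergence: the symmetrized KL divergence of two distributions."""
    return kl(p, q) + kl(q, p)


def jensen_shannon(dists, weights=None):
    """Jensen-Shannon divergence of several distributions (uniform weights by default)."""
    if weights is None:
        weights = [1.0 / len(dists)] * len(dists)
    size = len(dists[0])
    # Weighted arithmetic-mean mixture of all distributions.
    mixture = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(size)]
    return sum(w * kl(d, mixture) for w, d in zip(weights, dists))


p = [0.5, 0.5]
q = [0.25, 0.75]
print(jeffreys(p, q))
print(jensen_shannon([p, q, [0.1, 0.9]]))
```

Jeffreys divergence requires both distributions to be strictly positive on the same support (for example, after smoothing), whereas Jensen-Shannon divergence is always finite because every distribution is compared with a mixture that covers its support.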
Abstract:
Elastic net regularizers have shown much promise in designing sparse classifiers for linear classification. In this work, we propose an alternating optimization approach to solve the dual problems of elastic net regularized linear Support Vector Machines (SVMs) and logistic regression (LR). One of the sub-problems turns out to be a simple projection. The other sub-problem can be solved using dual coordinate descent methods developed for non-sparse L2-regularized linear SVMs and LR, without altering their iteration complexity and convergence properties. Experiments on very large datasets indicate that the proposed dual coordinate descent-projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.
Abstract:
Establishing functional relationships between multi-domain protein sequences is a non-trivial task. Traditionally, delineating the functional assignment and relationships of proteins requires domain assignments as a prerequisite. This process is sensitive to alignment quality and domain definitions. In multi-domain proteins, the quality of alignments is poor for multiple reasons. We report the correspondence between the classification of proteins represented as full-length gene products and their functions. Our approach differs fundamentally from traditional methods in that it does not perform the classification at the level of domains. Our method is based on an alignment-free local matching score (LMS) computation at the amino-acid sequence level, followed by hierarchical clustering. As there are no gold standards for full-length protein sequence classification, we resorted to Gene Ontology and domain-architecture based similarity measures to assess our classification. The final clusters obtained using LMS show high functional and domain-architectural similarities. Comparison of the current method with alignment-based approaches at both the domain and the full-length protein level showed the superiority of the LMS scores. Using this method, we have recreated objective relationships among different protein kinase sub-families and also classified immunoglobulin-containing proteins, for which sub-family definitions do not currently exist. This method can be applied to any set of protein sequences and hence will be instrumental in the analysis of large numbers of full-length protein sequences.
Abstract:
Classification of the pharmacologic activity of a chemical compound is an essential step in any drug discovery process. We develop two new atom-centered fragment descriptors (vertex indices): one based solely on topological considerations without discriminating atom or bond types, and another based on topological and electronic features. We also assess their usefulness by devising a method to rank and classify molecules with regard to their antibacterial activity. The classification performance of our method is found to be superior to that of two previous studies on large heterogeneous datasets for hit finding and hit-to-lead studies, even though we use far fewer parameters. It is found that for hit-finding studies, topological features (simple graph) alone provide significant discriminating power, and that for the hit-to-lead process, a small but consistent improvement can be made by additionally including electronic features (colored graph). Our approach is simple, interpretable and suitable for the design of molecules, as we do not use any physicochemical properties. The singular use of the vertex index as a descriptor, novel range-based feature extraction and rigorous statistical validation are the key elements of this study.
Abstract:
Group VB and VIB M-Si systems are considered to show an interesting pattern in the diffusion of components with the change in atomic number in a particular group (M = V, Nb, Ta or M = Mo, W, respectively). Mainly two phases, MSi2 and M5Si3, are considered for this discussion. Except for Ta-silicides, the activation energy for the integrated diffusion of MSi2 is always lower than that of M5Si3. In both phases, the relative mobilities, measured by the ratio of the tracer diffusion coefficients, decrease with increasing atomic number in the given group. If determined at the same homologous temperature, the interdiffusion coefficients increase with the atomic number of the refractory metal in the MSi2 phases and decrease in the M5Si3 phases. This behaviour reflects the basic changes in the defect concentrations on different sublattices with a change in the atomic number of the refractory components.
Abstract:
Head pose classification from surveillance images acquired with distant, large field-of-view cameras is difficult, as faces are captured at low resolution and have a blurred appearance. Domain adaptation approaches are useful for transferring knowledge from the training (source) data to the test (target) data when the two have different attributes, minimizing target data labeling efforts in the process. This paper examines the use of transfer learning for efficient multi-view head pose classification with minimal target training data under three challenging situations: (i) where the range of head poses in the source and target images is different, (ii) where source images capture a stationary person while target images capture a moving person whose facial appearance varies under motion due to changing perspective and scale, and (iii) a combination of (i) and (ii). On the whole, the presented methods represent novel transfer learning solutions employed in the context of multi-view head pose classification. We demonstrate through extensive experimental validation that the proposed solutions considerably outperform the state of the art. Finally, this work presents the DPOSE dataset, compiled for benchmarking head pose classification performance with moving persons and to aid behavioral understanding applications.
Abstract:
Differential mobility analyzers (DMAs) are commonly used to generate monodisperse nanoparticle aerosols. Commercial DMAs operate at quasi-atmospheric pressures and are therefore not designed to be vacuum-tight. In certain particle synthesis methods, a vacuum-compatible DMA is required as a process step for producing high-purity metallic particles. A vacuum-tight radial DMA (RDMA) has been developed and tested at low pressures. Its performance has been evaluated using a commercial NANO-DMA as the reference. The performance of this low-pressure RDMA (LP-RDMA), in terms of the width of its transfer function, is found to be comparable with that of other NANO-DMAs at atmospheric pressure and is almost independent of the pressure down to 30 mbar. It is shown that the LP-RDMA can be used for the classification of nanometer-sized particles (5-20 nm) under low-pressure conditions (30 mbar), and it has been successfully applied to nanoparticles produced by ablating FeNi at low pressures.
Abstract:
Thiolases are enzymes involved in lipid metabolism. Thiolases remove the acetyl-CoA moiety from 3-ketoacyl-CoAs in the degradative reaction. They can also catalyze the reverse Claisen condensation reaction, which is the first step of biosynthetic processes such as the biosynthesis of sterols and ketone bodies. In humans, six distinct thiolases have been identified. Each of these thiolases differs from the others with respect to sequence, oligomeric state, substrate specificity and subcellular localization. Four sequence fingerprints, identifying catalytic loops of thiolases, have been described. In this study, genome searches of two mycobacterial species (Mycobacterium tuberculosis and Mycobacterium smegmatis) were carried out using the six human thiolase sequences as queries. Eight and thirteen different thiolase sequences were identified in M. tuberculosis and M. smegmatis, respectively. In addition, thiolase-like proteins (one encoded in the Mtb genome and two in the Msm genome) were found. The purpose of this study is to classify these mostly uncharacterized thiolases and thiolase-like proteins. Several other sequences obtained by searches of genome databases of bacteria, mammals and the parasitic protist family Trypanosomatidae were included in the analysis. Thiolase-like proteins were also found in the trypanosomatid genomes, but not in those of mammals. In order to study the phylogenetic relationships at a high confidence level, additional thiolase sequences were included, such that a total of 130 thiolase and thiolase-like protein sequences were used for the multiple sequence alignment. The resulting phylogenetic tree identifies 12 classes of sequences, each possessing a characteristic set of sequence fingerprints for the catalytic loops. From this analysis, it is now possible to assign the mycobacterial thiolases to corresponding homologues in other kingdoms of life. The results of this bioinformatics analysis also show interesting differences between the distributions of M. tuberculosis and M. smegmatis thiolases over the 12 different classes. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Designing a robust algorithm for visual object tracking has been a challenging task for many years. There are trackers in the literature that are reasonably accurate in many tracking scenarios, but most of them are computationally expensive. This narrows their applicability, as many tracking applications demand real-time response. In this paper, we present a tracker based on random ferns. Tracking is posed as a classification problem, and classification is done using ferns. We use ferns because they rely on binary features and are extremely fast at both training and classification compared with other classification algorithms. Our experiments show that the proposed tracker performs well on some of the most challenging tracking datasets and executes much faster than one of the state-of-the-art trackers, without much difference in tracking accuracy.
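To illustrate why ferns are fast, here is a minimal sketch of a generic random-ferns classifier over binary feature vectors. Each fern reads a few randomly chosen binary features, turns them into a table index, and keeps per-class counts for that index; classification adds log-probabilities across ferns. This is a generic version of the technique, not the tracker described in the abstract, and the class name, parameters and toy data are hypothetical.

```python
import math
import random


class RandomFerns:
    """Semi-naive Bayes classifier built from small groups of binary features."""

    def __init__(self, n_features, n_classes, n_ferns=10, fern_size=4, seed=0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        # Each fern is a fixed list of randomly chosen feature indices.
        self.ferns = [[rng.randrange(n_features) for _ in range(fern_size)]
                      for _ in range(n_ferns)]
        # counts[fern][class][leaf], initialised to 1 (Laplace smoothing).
        self.counts = [[[1] * (2 ** fern_size) for _ in range(n_classes)]
                       for _ in range(n_ferns)]

    @staticmethod
    def _leaf(fern, x):
        # Concatenate the selected binary features into one integer index.
        index = 0
        for feature in fern:
            index = (index << 1) | (1 if x[feature] else 0)
        return index

    def update(self, x, label):
        """Training is a single counter increment per fern."""
        for f, fern in enumerate(self.ferns):
            self.counts[f][label][self._leaf(fern, x)] += 1

    def predict(self, x):
        """Return the class with the highest summed log-probability."""
        scores = [0.0] * self.n_classes
        for f, fern in enumerate(self.ferns):
            leaf = self._leaf(fern, x)
            for c in range(self.n_classes):
                row = self.counts[f][c]
                scores[c] += math.log(row[leaf] / sum(row))
        return max(range(self.n_classes), key=scores.__getitem__)


# Toy data: class 0 is "background", class 1 is "object" (hypothetical).
model = RandomFerns(n_features=8, n_classes=2)
for _ in range(20):
    model.update([0] * 8, 0)
    model.update([1] * 8, 1)
print(model.predict([1] * 8))
```

In a tracking-by-classification setting, the binary features would typically come from simple image tests on a candidate patch, and the classifier would be updated online as new object and background patches are observed.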