208 results for Functional Classification Trees
Abstract:
This paper presents an efficient approach to modeling and classifying vehicles using their magnetic signatures. A database was created from magnetic signatures collected over a wide range of vehicles (cars). A sensor-dependent approach called the Magnetic Field Angle Model is proposed for modeling the obtained magnetic signature. Based on the data model, we present a novel method to extract the feature vector from the magnetic signature. For classification, a linear support vector machine is used to classify the vehicles based on the extracted feature vectors.
Abstract:
Fundamental gap renormalization due to electronic polarization is a basic phenomenon in molecular crystals. Despite its ubiquity and importance, all conventional approaches within density-functional theory completely fail to capture it, even qualitatively. Here, we present a new screened range-separated hybrid functional, which, through judicious introduction of the scalar dielectric constant, quantitatively captures polarization-induced gap renormalization, as demonstrated on the prototypical organic molecular crystals of benzene, pentacene, and C60. This functional is predictive, as it contains system-specific adjustable parameters that are determined from first principles, rather than from empirical considerations.
Abstract:
This paper presents classification, representation and extraction of deformation features in sheet-metal parts. The thickness is constant for these shape features and hence these are also referred to as constant thickness features. The deformation feature is represented as a set of faces with a characteristic arrangement among the faces. Deformation of the base-sheet or forming of material creates Bends and Walls with respect to a base-sheet or a reference plane. These are referred to as Basic Deformation Features (BDFs). Compound deformation features having two or more BDFs are defined as characteristic combinations of Bends and Walls and represented as a graph called Basic Deformation Features Graph (BDFG). The graph, therefore, represents a compound deformation feature uniquely. The characteristic arrangement of the faces and type of bends belonging to the feature decide the type and nature of the deformation feature. Algorithms have been developed to extract and identify deformation features from a CAD model of sheet-metal parts. The proposed algorithm does not require folding and unfolding of the part as intermediate steps to recognize deformation features. Representations of typical features are illustrated and results of extracting these deformation features from typical sheet metal parts are presented and discussed. (C) 2013 Elsevier Ltd. All rights reserved.
Abstract:
Myopathies are muscular diseases in which muscle fibers degenerate due to factors such as nutrient deficiency, infection, and myofibrillar mutations. The objective of this study is to identify biomarkers that distinguish various muscle mutants in Drosophila (fruit fly) using Raman spectroscopy. A Principal Components based Linear Discriminant Analysis (PC-LDA) classification model yielding >95% accuracy was developed to classify the different mutants, which represent various myopathies, according to their physiopathology.
Abstract:
We study consistency properties of surrogate loss functions for general multiclass classification problems, defined by a general loss matrix. We extend the notion of classification calibration, which has been studied for binary and multiclass 0-1 classification problems (and for certain other specific learning problems), to the general multiclass setting, and derive necessary and sufficient conditions for a surrogate loss to be classification calibrated with respect to a loss matrix in this setting. We then introduce the notion of the classification calibration dimension of a multiclass loss matrix, which measures the smallest "size" of a prediction space for which it is possible to design a convex surrogate that is classification calibrated with respect to the loss matrix. We derive both upper and lower bounds on this quantity, and use these results to analyze various loss matrices. In particular, as one application, we provide a different route from the recent result of Duchi et al. (2010) for analyzing the difficulty of designing "low-dimensional" convex surrogates that are consistent with respect to pairwise subset ranking losses. We anticipate that the classification calibration dimension may prove to be a useful tool in the study and design of surrogate losses for general multiclass learning problems.
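As a small illustration of what "defined by a general loss matrix" means, the sketch below computes the prediction that minimizes expected loss for a given class-probability vector. Roughly speaking, a calibrated surrogate is one whose minimizer can be decoded back to this prediction. The function names and the example matrices (including the hypothetical "abstain" column) are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def expected_losses(p: Sequence[float], loss: Sequence[Sequence[float]]) -> List[float]:
    """Expected loss of each prediction t: sum over labels y of p[y] * loss[y][t]."""
    n_predictions = len(loss[0])
    return [
        sum(p[y] * loss[y][t] for y in range(len(p)))
        for t in range(n_predictions)
    ]


def bayes_prediction(p: Sequence[float], loss: Sequence[Sequence[float]]) -> int:
    """Index of the prediction with the smallest expected loss under p."""
    risks = expected_losses(p, loss)
    return min(range(len(risks)), key=risks.__getitem__)


# Rows are true labels, columns are predictions.
ZERO_ONE = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]

# Hypothetical variant with a fourth prediction, "abstain", at fixed cost 0.4.
WITH_ABSTAIN = [row + [0.4] for row in ZERO_ONE]

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    print(bayes_prediction(p, ZERO_ONE))      # most probable class
    print(bayes_prediction(p, WITH_ABSTAIN))  # abstaining is cheaper here
```

Note that the number of predictions (columns) need not equal the number of labels (rows), which is why the prediction space can have its own "size".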
Abstract:
We consider the problem of developing privacy-preserving machine learning algorithms in a distributed multiparty setting. Here different parties own different parts of a data set, and the goal is to learn a classifier from the entire data set without any party revealing any information about the individual data points it owns. Pathak et al. [7] recently proposed a solution to this problem in which each party learns a local classifier from its own data, and a third party then aggregates these classifiers in a privacy-preserving manner using a cryptographic scheme. The generalization performance of their algorithm is sensitive to the number of parties and the relative fractions of data owned by the different parties. In this paper, we describe a new differentially private algorithm for the multiparty setting that uses a stochastic gradient descent based procedure to directly optimize the overall multiparty objective rather than combining classifiers learned from optimizing local objectives. The algorithm achieves a slightly weaker form of differential privacy than that of [7], but provides improved generalization guarantees that do not depend on the number of parties or the relative sizes of the individual data sets. Experimental results corroborate our theoretical findings.
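The contrast drawn above, optimizing one shared objective by stochastic gradient steps instead of averaging locally trained classifiers, can be sketched as follows. This is only a structural illustration under loud assumptions: the logistic loss, the Laplace noise, the step-size schedule, and all names are hypothetical choices, and the noise scale is not calibrated to any privacy guarantee, so the code should not be read as the paper's algorithm.

```python
import math
import random
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label in {-1, +1})


def laplace(rng: random.Random, scale: float) -> float:
    """Laplace(0, scale) sample as the difference of two exponential draws."""
    if scale == 0.0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def local_noisy_gradient(w: List[float], example: Example, reg: float,
                         noise_scale: float, rng: random.Random) -> List[float]:
    """Computed by the party that owns `example`; only this vector is shared."""
    x, y = example
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coef = -y / (1.0 + math.exp(min(margin, 50.0)))  # logistic-loss derivative
    return [coef * xj + reg * wj + laplace(rng, noise_scale)
            for xj, wj in zip(x, w)]


def noisy_multiparty_sgd(parties: Sequence[Sequence[Example]], dim: int,
                         epochs: int = 50, lr: float = 0.1, reg: float = 0.01,
                         noise_scale: float = 0.0, seed: int = 0) -> List[float]:
    """SGD on the objective summed over all parties' data (illustrative only)."""
    rng = random.Random(seed)
    # Index examples by (party, position) so raw data stays with its owner.
    order = [(i, j) for i, party in enumerate(parties) for j in range(len(party))]
    w = [0.0] * dim
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i, j in order:
            step += 1
            eta = lr / math.sqrt(step)
            grad = local_noisy_gradient(w, parties[i][j], reg, noise_scale, rng)
            w = [wk - eta * gk for wk, gk in zip(w, grad)]
    return w
```

Because every update targets the same global objective, the result does not depend on how the examples are split among parties, which mirrors the motivation stated in the abstract.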
Abstract:
Transductive SVM (TSVM) is a well-known semi-supervised large margin learning method for binary text classification. In this paper we extend this method to multi-class and hierarchical classification problems. We point out that determining the labels of unlabeled examples with fixed classifier weights is a linear programming problem, and we devise an efficient technique for solving it. The method is applicable to general loss functions. We demonstrate the value of the new method using large margin loss on a number of multi-class and hierarchical classification datasets. For maxent loss we show empirically that our method is better than expectation regularization/constraint and posterior regularization methods, and competitive with the version of the entropy regularization method that uses label constraints.
Abstract:
Double helical structures of DNA and RNA are mostly determined by base pair stacking interactions, which give them base sequence-directed features, such as small roll values for the purine-pyrimidine steps. Earlier attempts to characterize stacking interactions were mostly restricted to calculations on fiber diffraction geometries or on structures optimized with ab initio calculations; these lacked the variation in geometry needed to comment on the unusually large roll values observed for the AU/AU base pair step in crystal structures of RNA double helices. We have generated a stacking energy hyperspace by modeling geometries with variations along the important degrees of freedom, roll and slide, which were chosen via statistical analysis as maximally sequence dependent. Corresponding energy contours were constructed by several quantum chemical methods including dispersion corrections. This analysis established the most suitable methods for stacked base pair systems, even though the number of atoms in a base pair step precludes a very high level of theory. All the methods predict a negative roll value and near-zero slide to be most favorable for the purine-pyrimidine steps, in agreement with Calladine's steric clash based rule. Successive base pairs in RNA are always linked by a sugar-phosphate backbone with C3-endo sugars, and this demands a C1-C1 distance of about 5.4 Å along the chains. Adding an energy penalty term for deviation of the C1-C1 distance from the mean value to the recent DFT-D functionals, specifically B97X-D, appears to predict a reliable energy contour for the AU/AU step. Such a distance-based penalty also improves energy contours for the other purine-pyrimidine sequences. (c) 2013 Wiley Periodicals, Inc. Biopolymers 101: 107-120, 2014.
Abstract:
The evolutionary diversity of the HSP70 gene family at the genetic level has generated complex structural variations leading to altered functional specificity and mode of regulation in different cellular compartments. Using Saccharomyces cerevisiae as a model system to better understand the global functional cooperativity between Hsp70 paralogs, we have dissected the differences in functional properties at the biochemical level between the mitochondrial heat shock protein 70 (mtHsp70) Ssc1 and an uncharacterized paralog, Ssc3. Based on the evolutionary origin of Ssc3 and its high degree of sequence homology with Ssc1, it has been proposed that the two have a close functional overlap in the mitochondrial matrix. Surprisingly, our results demonstrate that there is no functional cross-talk between the Ssc1 and Ssc3 paralogs. The lack of in vivo functional overlap is due to the altered conformation and significantly lower stability of Ssc3. The substrate-binding domain of Ssc3 showed poor affinity toward mitochondrial client proteins and Tim44 due to its open conformation in the ADP-bound state. In addition, the nucleotide-binding domain of Ssc3 showed altered regulation by the Mge1 co-chaperone due to a high degree of conformational plasticity, which strongly promotes aggregation. Ssc3 also possesses a dysfunctional inter-domain interface, rendering it unable to perform functions similar to those of generic Hsp70s. Moreover, we have identified the critical amino acid sequence of Ssc1 and Ssc3 that can "make or break" mtHsp70 chaperone function. Together, our analysis provides the first evidence that the nucleotide-binding domain of mtHsp70s plays a critical role in determining the functional specificity among paralogs and orthologs across kingdoms.
Abstract:
Internal mobility of the two-domain molecule of ribosome recycling factor (RRF) is known to be important for its action. Mycobacterium tuberculosis RRF does not complement E. coli for its deficiency of RRF (in the presence of E. coli EF-G alone). The crystal structure had revealed higher rigidity of the M. tuberculosis RRF due to the presence of additional salt bridges between domains. Two inter-domain salt bridges and one between the linker region and the domain containing C-terminal residues were disrupted by appropriate mutations. Except for a C-terminal deletion mutant, all mutants showed RRF activity in E. coli when M. tuberculosis EF-G was also co-expressed. The crystal structures of the point mutants, of the C-terminal deletion mutant, and of the protein grown in the presence of a detergent were determined. The increased mobility resulting from the disruption of the salt bridge involving the hinge region allows the appropriate mutant to weakly complement E. coli for its deficiency of RRF even in the absence of simultaneous expression of the mycobacterial EF-G. The loss of activity of the C-terminal deletion mutant appears to be partly due to the rigidification of the molecule consequent to changes in the hinge region.
Abstract:
Sensory receptors determine the type and the quantity of information available for perception. Here, we quantified and characterized the information transferred by primary afferents in the rat whisker system using neural system identification. Quantification of "how much" information is conveyed by primary afferents, using the direct method (DM), a classical information theoretic tool, revealed that primary afferents transfer huge amounts of information (up to 529 bits/s). Information theoretic analysis of instantaneous spike-triggered kinematic stimulus features was used to gain functional insight into "what" is coded by primary afferents. Amongst the kinematic variables tested (position, velocity, and acceleration), primary afferent spikes encoded velocity best. The other two variables contributed to information transfer, but only if combined with velocity. We further revealed three additional characteristics that play a role in information transfer by primary afferents. Firstly, primary afferent spikes show a preference for well separated multiple stimuli (i.e., well separated sets of combinations of the three instantaneous kinematic variables). Secondly, neurons are sensitive to short strips of the stimulus trajectory (up to 10 ms pre-spike time), and thirdly, they show spike patterns (precise doublet and triplet spiking). In order to deal with these complexities, we used a flexible probabilistic neuron model fitting mixtures of Gaussians to the spike-triggered stimulus distributions, which quantitatively captured the contribution of the mentioned features and allowed us to achieve a full functional analysis of the total information rate indicated by the DM. We found that instantaneous position, velocity, and acceleration explained about 50% of the total information rate. Adding a 10 ms pre-spike interval of stimulus trajectory achieved 80-90%. The final 10-20% was found to be due to non-linear coding by spike bursts.
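The direct method mentioned above estimates an information rate as the entropy of response "words" over the whole recording minus the average entropy of words at a fixed stimulus time across repeated trials. A minimal sketch follows, assuming responses have already been cut into discrete words; the function names and toy data are hypothetical, and the bias corrections and extrapolations to infinite data and word length used in practice are omitted.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy_bits(symbols: Sequence[Hashable]) -> float:
    """Plug-in (maximum-likelihood) entropy of a symbol sequence, in bits."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def direct_method_rate(words: Sequence[Sequence[Hashable]],
                       word_duration_s: float) -> float:
    """Information rate in bits/s.

    words[trial][t] is the response word at stimulus time bin t on one
    repetition of the same stimulus.
    """
    n_bins = len(words[0])
    # Total entropy: distribution of words pooled over all times and trials.
    h_total = entropy_bits([w for trial in words for w in trial])
    # Noise entropy: variability across trials at each fixed time, averaged.
    h_noise = sum(
        entropy_bits([trial[t] for trial in words]) for t in range(n_bins)
    ) / n_bins
    return (h_total - h_noise) / word_duration_s


if __name__ == "__main__":
    # Perfectly repeatable responses: all word entropy is information.
    reliable = [["01", "10", "01", "10"]] * 3
    print(direct_method_rate(reliable, word_duration_s=0.01))
```

When responses are identical on every trial the noise entropy is zero, so the estimate equals the total word entropy divided by the word duration.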