93 resultados para Classification, Decimal
Resumo:
Background:Overwhelming majority of the Serine/Threonine protein kinases identified by gleaning archaeal and eubacterial genomes could not be classified into any of the well known Hanks and Hunter subfamilies of protein kinases. This is owing to the development of Hanks and Hunter classification scheme based on eukaryotic protein kinases which are highly divergent from their prokaryotic homologues. A large dataset of prokaryotic Serine/Threonine protein kinases recognized from genomes of prokaryotes have been used to develop a classification framework for prokaryotic Ser/Thr protein kinases. Methodology/Principal Findings: We have used traditional sequence alignment and phylogenetic approaches and clustered the prokaryotic kinases which represent 72 subfamilies with at least 4 members in each. Such a clustering enables classification of prokaryotic Ser/Thr kinases and it can be used as a framework to classify newly identified prokaryotic Ser/Thr kinases. After series of searches in a comprehensive sequence database we recognized that 38 subfamilies of prokaryotic protein kinases are associated to a specific taxonomic level. For example 4, 6 and 3 subfamilies have been identified that are currently specific to phylum proteobacteria, cyanobacteria and actinobacteria respectively. Similarly subfamilies which are specific to an order, sub-order, class, family and genus have also been identified. In addition to these, we also identify organism-diverse subfamilies. Members of these clusters are from organisms of different taxonomic levels, such as archaea, bacteria, eukaryotes and viruses.Conclusion/Significance: Interestingly, occurrence of several taxonomic level specific subfamilies of prokaryotic kinases contrasts with classification of eukaryotic protein kinases in which most of the popular subfamilies of eukaryotic protein kinases occur diversely in several eukaryotes. Many prokaryotic Ser/Thr kinases exhibit a wide variety of modular organization which indicates a degree of complexity and protein-protein interactions in the signaling pathways in these microbes.
Resumo:
This paper aims at evaluating the methods of multiclass support vector machines (SVMs) for effective use in distance relay coordination. Also, it describes a strategy of supportive systems to aid the conventional protection philosophy in combating situations where protection systems have maloperated and/or information is missing and provide selective and secure coordinations. SVMs have considerable potential as zone classifiers of distance relay coordination. This typically requires a multiclass SVM classifier to effectively analyze/build the underlying concept between reach of different zones and the apparent impedance trajectory during fault. Several methods have been proposed for multiclass classification where typically several binary SVM classifiers are combined together. Some authors have extended binary SVM classification to one-step single optimization operation considering all classes at once. In this paper, one-step multiclass classification, one-against-all, and one-against-one multiclass methods are compared for their performance with respect to accuracy, number of iterations, number of support vectors, training, and testing time. The performance analysis of these three methods is presented on three data sets belonging to training and testing patterns of three supportive systems for a region and part of a network, which is an equivalent 526-bus system of the practical Indian Western grid.
Resumo:
In this paper. we propose a novel method using wavelets as input to neural network self-organizing maps and support vector machine for classification of magnetic resonance (MR) images of the human brain. The proposed method classifies MR brain images as either normal or abnormal. We have tested the proposed approach using a dataset of 52 MR brain images. Good classification percentage of more than 94% was achieved using the neural network self-organizing maps (SOM) and 98% front support vector machine. We observed that the classification rate is high for a Support vector machine classifier compared to self-organizing map-based approach.
Resumo:
Background: Protein phosphorylation is a generic way to regulate signal transduction pathways in all kingdoms of life. In many organisms, it is achieved by the large family of Ser/Thr/Tyr protein kinases which are traditionally classified into groups and subfamilies on the basis of the amino acid sequence of their catalytic domains. Many protein kinases are multidomain in nature but the diversity of the accessory domains and their organization are usually not taken into account while classifying kinases into groups or subfamilies. Methodology: Here, we present an approach which considers amino acid sequences of complete gene products, in order to suggest refinements in sets of pre-classified sequences. The strategy is based on alignment-free similarity scores and iterative Area Under the Curve (AUC) computation. Similarity scores are computed by detecting common patterns between two sequences and scoring them using a substitution matrix, with a consistent normalization scheme. This allows us to handle full-length sequences, and implicitly takes into account domain diversity and domain shuffling. We quantitatively validate our approach on a subset of 212 human protein kinases. We then employ it on the complete repertoire of human protein kinases and suggest few qualitative refinements in the subfamily assignment stored in the KinG database, which is based on catalytic domains only. Based on our new measure, we delineate 37 cases of potential hybrid kinases: sequences for which classical classification based entirely on catalytic domains is inconsistent with the full-length similarity scores computed here, which implicitly consider multi-domain nature and regions outside the catalytic kinase domain. We also provide some examples of hybrid kinases of the protozoan parasite Entamoeba histolytica. Conclusions: The implicit consideration of multi-domain architectures is a valuable inclusion to complement other classification schemes. The proposed algorithm may also be employed to classify other families of enzymes with multidomain architecture.
Resumo:
Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problem using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. It gives a sparse solution directly unlike the SVR based algorithm. Also, the hyperparameters are estimated easily without resorting to expensive cross-validation technique. Use of sparse GPR model helps in making the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.
Resumo:
Equilibrium sediment volume tests are conducted on field soils to classify them based on their degree of expansivity and/or to predict the liquid limit of soils. The present technical paper examines different equilibrium sediment volume tests, critically evaluating each of them. It discusses the settling behavior of fine-grained soils during the soil sediment formation to evolve a rationale for conducting the latest version of equilibrium sediment volume test. Probable limitations of equilibrium sediment volume test and the possible solution to overcome the same have also been indicated.
Resumo:
An explicit construction of all the homogeneous holomorphic Hermitian vector bundles over the unit disc D is given. It is shown that every such vector bundle is a direct sum of irreducible ones. Among these irreducible homogeneous holomorphic Hermitian vector bundles over D, the ones corresponding to operators in the Cowen-Douglas class B-n(D) are identified. The classification of homogeneous operators in B-n(D) is completed using an explicit realization of these operators. We also show how the homogeneous operators in B-n(D) split into similarity classes. (C) 2011 Elsevier Inc. All rights reserved.
Resumo:
This paper studies the problem of constructing robust classifiers when the training is plagued with uncertainty. The problem is posed as a Chance-Constrained Program (CCP) which ensures that the uncertain data points are classified correctly with high probability. Unfortunately such a CCP turns out to be intractable. The key novelty is in employing Bernstein bounding schemes to relax the CCP as a convex second order cone program whose solution is guaranteed to satisfy the probabilistic constraint. Prior to this work, only the Chebyshev based relaxations were exploited in learning algorithms. Bernstein bounds employ richer partial information and hence can be far less conservative than Chebyshev bounds. Due to this efficient modeling of uncertainty, the resulting classifiers achieve higher classification margins and hence better generalization. Methodologies for classifying uncertain test data points and error measures for evaluating classifiers robust to uncertain data are discussed. Experimental results on synthetic and real-world datasets show that the proposed classifiers are better equipped to handle data uncertainty and outperform state-of-the-art in many cases.
Resumo:
Our ability to infer the protein quaternary structure automatically from atom and lattice information is inadequate, especially for weak complexes, and heteromeric quaternary structures. Several approaches exist, but they have limited performance. Here, we present a new scheme to infer protein quaternary structure from lattice and protein information, with all-around coverage for strong, weak and very weak affinity homomeric and heteromeric complexes. The scheme combines naive Bayes classifier and point group symmetry under Boolean framework to detect quaternary structures in crystal lattice. It consistently produces >= 90% coverage across diverse benchmarking data sets, including a notably superior 95% coverage for recognition heteromeric complexes, compared with 53% on the same data set by current state-of-the-art method. The detailed study of a limited number of prediction-failed cases offers interesting insights into the intriguing nature of protein contacts in lattice. The findings have implications for accurate inference of quaternary states of proteins, especially weak affinity complexes.
Resumo:
A general analysis of squeezing transformations for two-mode systems is given based on the four-dimensional real symplectic group Sp(4, R). Within the framework of the unitary (metaplectic) representation of this group, a distinction between compact photon-number-conserving and noncompact photon-number-nonconserving squeezing transformations is made. We exploit the U(2) invariant squeezing criterion to divide the set of all squeezing transformations into a two-parameter family of distinct equivalence classes with representative elements chosen for each class. Familiar two-mode squeezing transformations in the literature are recognized in our framework and seen to form a set of measure zero. Examples of squeezed coherent and thermal states are worked out. The need to extend the heterodyne detection scheme to encompass all of U(2) is emphasized, and known experimental situations where all U(2) elements can be reproduced are briefly described.
Resumo:
Three classification techniques, namely, K-means Cluster Analysis (KCA), Fuzzy Cluster Analysis (FCA), and Kohonen Neural Networks (KNN) were employed to group 25 microwatersheds of Kherthal watershed, Rajasthan into homogeneous groups for formulating the basis for suitable conservation and management practices. Ten parameters, mainly, morphological, namely, drainage density (D-d), bifurcation ratio (R-b), stream frequency (F-u), length of overland flow (L-o), form factor (R-f), shape factor (B-s), elongation ratio (R-e), circulatory ratio (R-c), compactness coefficient (C-c) and texture ratio (T) are used for the classification. Optimal number of groups is chosen, based on two cluster validation indices Davies-Bouldin and Dunn's. Comparative analysis of various clustering techniques revealed that 13 microwatersheds out of 25 are commonly suggested by KCA, FCA and KNN i.e., 52%; 17 microwatersheds out of 25 i.e., 68% are commonly suggested by KCA and FCA whereas these are 16 out of 25 in FCA and KNN (64%) and 15 out of 25 in KNN and CA (60%). It is observed from KNN sensitivity analysis that effect of various number of epochs (1000, 3000, 5000) and learning rates (0.01, 0.1-0.9) on total squared error values is significant even though no fixed trend is observed. Sensitivity analysis studies revealed that microwatershecls have occupied all the groups even though their number in each group is different in case of further increase in the number of groups from 5 to 6, 7 and 8. (C) 2010 International Association of Hydro-environment Engineering and Research, Asia Pacific Division. Published by Elsevier B.V. All rights reserved.
Resumo:
In this paper, we show that it is possible to reduce the complexity of Intra MB coding in H.264/AVC based on a novel chance constrained classifier. Using the pairs of simple mean-variances values, our technique is able to reduce the complexity of Intra MB coding process with a negligible loss in PSNR. We present an alternate approach to address the classification problem which is equivalent to machine learning. Implementation results show that the proposed method reduces encoding time to about 20% of the reference implementation with average loss of 0.05 dB in PSNR.
Resumo:
Part classification and coding is still considered as laborious and time-consuming exercise. Keeping in view, the crucial role, which it plays, in developing automated CAPP systems, the attempts have been made in this article to automate a few elements of this exercise using a shape analysis model. In this study, a 24-vector directional template is contemplated to represent the feature elements of the parts (candidate and prototype). Various transformation processes such as deformation, straightening, bypassing, insertion and deletion are embedded in the proposed simulated annealing (SA)-like hybrid algorithm to match the candidate part with their prototype. For a candidate part, searching its matching prototype from the information data is computationally expensive and requires large search space. However, the proposed SA-like hybrid algorithm for solving the part classification problem considerably minimizes the search space and ensures early convergence of the solution. The application of the proposed approach is illustrated by an example part. The proposed approach is applied for the classification of 100 candidate parts and their prototypes to demonstrate the effectiveness of the algorithm. (C) 2003 Elsevier Science Ltd. All rights reserved.
Resumo:
Land cover (LC) refers to what is actually present on the ground and provide insights into the underlying solution for improving the conditions of many issues, from water pollution to sustainable economic development. One of the greatest challenges of modeling LC changes using remotely sensed (RS) data is of scale-resolution mismatch: that the spatial resolution of detail is less than what is required, and that this sub-pixel level heterogeneity is important but not readily knowable. However, many pixels consist of a mixture of multiple classes. The solution to mixed pixel problem typically centers on soft classification techniques that are used to estimate the proportion of a certain class within each pixel. However, the spatial distribution of these class components within the pixel remains unknown. This study investigates Orthogonal Subspace Projection - an unmixing technique and uses pixel-swapping algorithm for predicting the spatial distribution of LC at sub-pixel resolution. Both the algorithms are applied on many simulated and actual satellite images for validation. The accuracy on the simulated images is ~100%, while IRS LISS-III and MODIS data show accuracy of 76.6% and 73.02% respectively. This demonstrates the relevance of these techniques for applications such as urban-nonurban, forest-nonforest classification studies etc.