94 results for hierarchical clustering
Abstract:
Structural information over the entire course of binding interactions, based on analyses of energy landscapes, is described; it provides a framework for understanding the events involved in biomolecular recognition. The conformational dynamics underlying malectin's exquisite selectivity for diglucosylated N-glycan (Dig-N-glycan), a highly flexible oligosaccharide comprising numerous dihedral torsion angles, are described as an example. A novel approach based on hierarchical sampling is proposed for acquiring the metastable molecular conformations that constitute low-energy minima, in order to understand the structural features involved in biological recognition. Four variants of principal component analysis were employed recursively in both Cartesian space and dihedral-angle space, each characterized by free energy landscapes, to select the most stable conformational substates. Subsequently, a k-means clustering algorithm was applied for geometric separation of the major native state to acquire a final ensemble of metastable conformers. A comparison of malectin complexes was then performed to characterize their conformational properties. Analyses of stereochemical metrics and other concerted binding events revealed surface complementarity, cooperative and bidentate hydrogen bonds, water-mediated hydrogen bonds, and carbohydrate-aromatic interactions, including CH-pi and stacking interactions, involved in this recognition. Additionally, a striking structural transition from loop to beta-strands in the malectin CRD upon specific binding to Dig-N-glycan is observed. The interplay of these binding events in malectin and Dig-N-glycan supports an extended conformational selection model as the underlying binding mechanism.
Abstract:
T-cell responses in humans are initiated by the binding of a peptide antigen to a human leukocyte antigen (HLA) molecule. The peptide-HLA complex then recruits an appropriate T cell, leading to cell-mediated immunity. More than 2000 HLA class-I alleles are known in humans, and they vary only in their peptide-binding grooves. The polymorphism they exhibit enables them to bind a wide range of peptide antigens from diverse sources. HLA molecules and peptides present a complex molecular recognition pattern, as many peptides bind to a given allele and a given peptide can be recognized by many alleles. A powerful grouping scheme that not only provides an insightful classification, but is also capable of dissecting the physicochemical basis of recognition specificity is necessary to address this complexity. We present a hierarchical classification of 2010 class-I alleles by using a systematic divisive clustering method. All-pair distances of alleles were obtained by comparing binding pockets in the structural models. By varying the similarity thresholds, a multilevel classification was obtained, with 7 supergroups, each further subclassifying to yield 72 groups. An independent clustering performed based only on similarities in their epitope pools correlated highly with pocket-based clustering. Physicochemical feature combinations that best explain the basis of clustering are identified. Mutual information calculated for the set of peptide ligands enables identification of binding site residues contributing to peptide specificity. The grouping of HLA molecules achieved here will be useful for rational vaccine design, understanding disease susceptibilities and predicting risk of organ transplants.
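The idea of obtaining a multilevel classification by varying a similarity threshold can be illustrated with a small sketch. It is not the divisive method used in the paper: it assumes a simple threshold-linkage grouping (items are joined whenever their distance is at or below the threshold), and the names `threshold_groups` and `multilevel` as well as the toy distance matrix are hypothetical.

```python
def threshold_groups(dist, threshold):
    """Group indices connected by chains of pairwise distances <= threshold."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def multilevel(dist, coarse, fine):
    """Split items into supergroups (coarse threshold), then each into groups (fine)."""
    result = []
    for supergroup in threshold_groups(dist, coarse):
        # Distance sub-matrix restricted to the members of this supergroup.
        sub = [[dist[i][j] for j in supergroup] for i in supergroup]
        result.append([[supergroup[k] for k in g]
                       for g in threshold_groups(sub, fine)])
    return result


# Toy example: five items placed on a line (an assumption for illustration only).
positions = [0, 1, 5, 6, 20]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
levels = multilevel(toy_dist, coarse=4, fine=1)
```

With a looser threshold the items fall into a few supergroups, and re-applying a tighter threshold inside each supergroup yields the nested second level.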
Abstract:
The work reported here was motivated by a desire to verify the existence of structure, specifically MP-rich clusters induced by sodium bromide (NaBr), in the ternary liquid mixture 3-methylpyridine (MP) + water (W) + NaBr. We present small-angle X-ray scattering (SAXS) measurements in this mixture. These measurements were obtained at room temperature (approximately 298 K) in the one-phase region (below the relevant lower consolute points, T_L) at different values of X (i.e., X = 0.02 - 0.17), where X is the weight fraction of NaBr in the mixture. The cluster-size distribution, estimated on the assumption that the clusters are spherical, shows systematic behaviour in that the peak of the distribution shifts towards larger values of cluster radius as X increases. The largest spatial extent of the clusters (approximately 4.5 nm) is seen at X = 0.17. Data analysis assuming arbitrary shapes and sizes of clusters gives a limiting value of cluster size (approximately 4.5 nm) that is not very sensitive to X. It is suggested that the cluster size determined may not be the same as the usual critical-point fluctuations far removed from the critical point (T_L). The influence of the additional length scale due to clustering is discussed from the standpoint of crossover from Ising to mean-field critical behaviour when moving away from T_L.
Abstract:
We analyse the fault-tolerant parameters and topological properties of a hierarchical network of hypercubes. We take a close look at the Extended Hypercube (EH) and the Hyperweave (HW) architectures and also compare them with other popular architectures. These two architectures have low diameter and constant degree of connectivity making it possible to expand these networks without affecting the existing configuration. A scheme for incrementally expanding this network is also presented. We also look at the performance of the ASCEND/DESCEND class of algorithms on these architectures.
Abstract:
In this paper, a multistage evolutionary scheme is proposed for clustering in a large database, such as speech data. This is achieved by clustering a small subset of the entire sample set in each stage and treating the cluster centroids so obtained as samples, together with another subset of samples not considered previously, as input data to the next stage. This is continued until the whole sample set is exhausted. The clustering is accomplished by constructing a fuzzy similarity matrix and using the fuzzy techniques proposed here. The technique is illustrated by an efficient scheme for voiced-unvoiced-silence classification of speech.
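The staged procedure in this abstract (cluster a subset, then feed the resulting centroids into the next stage together with fresh samples) can be sketched as follows. This is only an illustration under stated assumptions: plain k-means stands in for the paper's fuzzy similarity matrix technique, which the abstract does not specify, and the function names, `batch_size` parameter, and toy data are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def multistage_centroids(samples, k, batch_size):
    """Cluster one batch per stage, carrying the centroids forward as samples."""
    carried = []
    for start in range(0, len(samples), batch_size):
        stage_input = carried + samples[start:start + batch_size]
        carried = kmeans(stage_input, min(k, len(stage_input)))
    return carried


# Toy data: two well-separated groups, processed four samples at a time.
toy_samples = [(0, 0), (0, 1), (1, 0), (10, 10),
               (10, 11), (11, 10), (0.5, 0.5), (10.5, 10.5)]
final_centroids = sorted(multistage_centroids(toy_samples, k=2, batch_size=4))
```

Each stage only ever holds one batch plus `k` carried centroids, which is what keeps the memory footprint small for a large sample set. Note that in this sketch a carried centroid counts as a single sample, so it does not retain the weight of the points it summarizes.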
Abstract:
Learning automata arranged in a two-level hierarchy are considered. The automata operate in a stationary random environment and update their action probabilities according to the linear reward-ε-penalty algorithm at each level. Unlike some hierarchical systems previously proposed, no information transfer exists from one level to another, and yet the hierarchy possesses good convergence properties. Using weak-convergence concepts, it is shown that for large time and small values of the parameters in the algorithm, the evolution of the optimal path probability can be represented by a diffusion whose parameters can be computed explicitly.
Abstract:
Systems of learning automata have been studied by various researchers to evolve useful strategies for decision making under uncertainty. Considered in this paper is a class of hierarchical systems of learning automata where the system gets responses from its environment at each level of the hierarchy. A classification of such sequential learning tasks based on the complexity of the learning problem is presented. It is shown that none of the existing algorithms can perform in the most general type of hierarchical problem. An algorithm for learning the globally optimal path in this general setting is presented, and its convergence is established. This algorithm needs information transfer from the lower levels to the higher levels. Using the methodology of estimator algorithms, this model can be generalized to accommodate other kinds of hierarchical learning tasks.
Abstract:
An approach is presented for hierarchical control of an ammonia reactor, which is a key unit process in a nitrogen fertilizer complex. The aim of the control system is to ensure safe operation of the reactor around the optimal operating point in the face of process variable disturbances and parameter variations. The four different layers perform the functions of regulation, optimization, adaptation, and self-organization. The simulation for this proposed application is conducted on an AD511 hybrid computer in which the AD5 analog processor is used to represent the process and the PDP-11/35 digital computer is used for the implementation of control laws. Simulation results relating to the different layers are presented.
Abstract:
In this paper the notion of conceptual cohesiveness is made precise and used to group objects semantically, based on a knowledge structure called a ‘cohesion forest’. A set of axioms is proposed that the generated clusters should satisfy in order to be meaningful.
Abstract:
A learning automaton operating in a random environment updates its action probabilities on the basis of the reactions of the environment, so that asymptotically it chooses the optimal action. When the number of actions is large the automaton becomes slow because there are too many updates to be made at each instant. A hierarchical system of such automata with assured ε-optimality is suggested to overcome that problem. The learning algorithm for the hierarchical system turns out to be a simple modification of the absolutely expedient algorithm known in the literature. The parameters of the algorithm at each level in the hierarchy depend only on the parameters and the action probabilities of the previous level. It follows that, to minimize the number of updates per cycle, each automaton in the hierarchy need have only two or three actions.
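How a hierarchy keeps the number of probability updates per cycle small can be seen in a short sketch. It assumes a two-level tree and a plain linear reward-inaction update with one fixed step size at both levels; the paper's modified algorithm, whose parameters depend on the previous level, is not reproduced here. The class name, function name, and reward probabilities are hypothetical.

```python
import random


class Automaton:
    """Learning automaton with a linear reward-inaction update (assumed scheme)."""

    def __init__(self, n_actions, step):
        self.p = [1.0 / n_actions] * n_actions
        self.step = step

    def choose(self, rng):
        return rng.choices(range(len(self.p)), weights=self.p)[0]

    def reward(self, action):
        # Shift probability mass toward the rewarded action; penalties change nothing.
        self.p = [(1.0 - self.step) * q for q in self.p]
        self.p[action] += self.step


def run_hierarchy(reward_probs, cycles=20000, step=0.01, seed=0):
    """reward_probs[i][j] is the reward probability of the path (i, j)."""
    rng = random.Random(seed)
    top = Automaton(len(reward_probs), step)
    lower = [Automaton(len(row), step) for row in reward_probs]
    for _ in range(cycles):
        i = top.choose(rng)        # first level picks a second-level automaton
        j = lower[i].choose(rng)   # that automaton picks the final action
        if rng.random() < reward_probs[i][j]:
            # Only the automata on the chosen path are updated.
            top.reward(i)
            lower[i].reward(j)
    return top, lower


# Hypothetical environment with four final actions; path (1, 0) is the best.
top, lower = run_hierarchy([[0.2, 0.3], [0.9, 0.4]])
```

With four final actions arranged as two levels of two-action automata, each cycle touches at most two automata of two probabilities each, instead of one flat probability vector over all actions.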
Abstract:
An algorithm is described for developing a hierarchy among a set of elements having certain precedence relations. This algorithm, which is based on tracing a path through the graph, is easily implemented by a computer.
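One way to derive a hierarchy from precedence relations by tracing paths through the graph is sketched below. The abstract does not give the algorithm's details, so this is an assumed variant: each element is assigned a level equal to the length of the longest chain of predecessors leading to it, found by a depth-first trace. The function name and the input format (pairs `(a, b)` meaning `a` precedes `b`) are hypothetical.

```python
def hierarchy_levels(elements, precedes):
    """Map each element to its level; elements with no predecessors get level 0."""
    preds = {e: [] for e in elements}
    for a, b in precedes:
        preds[b].append(a)

    levels = {}
    on_path = set()  # elements on the path currently being traced

    def level(e):
        if e in levels:
            return levels[e]
        if e in on_path:
            raise ValueError("precedence relations contain a cycle")
        on_path.add(e)
        # One level below the deepest predecessor; -1 default gives level 0.
        levels[e] = 1 + max((level(p) for p in preds[e]), default=-1)
        on_path.discard(e)
        return levels[e]

    for e in elements:
        level(e)
    return levels


example_levels = hierarchy_levels(
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")],
)
```

Elements sharing a level have no precedence chain forcing one below the other, so the levels can be read directly as the tiers of the hierarchy. A cyclic set of relations has no consistent hierarchy, which is why the sketch raises an error in that case.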
Abstract:
The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure for the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced, which implicitly meet the need for feature selection in the context of clustering techniques.
Abstract:
A new clustering technique, based on the concept of immediate neighbourhood, with a novel capability to self-learn the number of clusters expected in the unsupervised environment, has been developed. The method compares favourably with other clustering schemes based on distance measures, both in terms of conceptual innovations and computational economy. Test implementation of the scheme using C-1 flight line training sample data in a simulated unsupervised mode has brought out the efficacy of the technique. The technique can easily be implemented as a front end to established pattern classification systems with supervised learning capabilities to derive unified learning systems capable of operating in both supervised and unsupervised environments. This makes the technique an attractive proposition in the context of remotely sensed earth resources data analysis, wherein it is essential to have such a unified learning system capability.