814 results for Hierarchical clustering model
Abstract:
T-cell responses in humans are initiated by the binding of a peptide antigen to a human leukocyte antigen (HLA) molecule. The peptide-HLA complex then recruits an appropriate T cell, leading to cell-mediated immunity. More than 2000 HLA class-I alleles are known in humans, and they vary only in their peptide-binding grooves. The polymorphism they exhibit enables them to bind a wide range of peptide antigens from diverse sources. HLA molecules and peptides present a complex molecular recognition pattern, as many peptides bind to a given allele and a given peptide can be recognized by many alleles. A powerful grouping scheme that not only provides an insightful classification, but is also capable of dissecting the physicochemical basis of recognition specificity is necessary to address this complexity. We present a hierarchical classification of 2010 class-I alleles by using a systematic divisive clustering method. All-pair distances of alleles were obtained by comparing binding pockets in the structural models. By varying the similarity thresholds, a multilevel classification was obtained, with 7 supergroups, each further subclassifying to yield 72 groups. An independent clustering performed based only on similarities in their epitope pools correlated highly with pocket-based clustering. Physicochemical feature combinations that best explain the basis of clustering are identified. Mutual information calculated for the set of peptide ligands enables identification of binding site residues contributing to peptide specificity. The grouping of HLA molecules achieved here will be useful for rational vaccine design, understanding disease susceptibilities and predicting risk of organ transplants.
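The idea of obtaining a multilevel classification by varying a similarity threshold over all-pair distances can be sketched as follows. This is only an illustration under stated assumptions: it groups items by connected components at each threshold (equivalent to cutting a single-linkage hierarchy), which is a swapped-in technique and not the divisive method used in the paper. The function names and the toy distance matrix are hypothetical and do not come from the HLA data.

```python
from typing import Dict, List, Sequence


def threshold_clusters(dist: Sequence[Sequence[float]], threshold: float) -> List[List[int]]:
    """Group items into connected components of the graph that links
    every pair whose distance is <= threshold (union-find)."""
    n = len(dist)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def multilevel(dist: Sequence[Sequence[float]], thresholds: Sequence[float]) -> Dict[float, List[List[int]]]:
    """Cluster at each threshold, loosest first, giving nested levels."""
    return {t: threshold_clusters(dist, t) for t in sorted(thresholds, reverse=True)}


# Hypothetical all-pair distances for five items (not real allele data).
toy_dist = [
    [0.0, 0.1, 0.4, 0.9, 0.9],
    [0.1, 0.0, 0.4, 0.9, 0.9],
    [0.4, 0.4, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.9, 0.2, 0.0],
]

for level, clusters in multilevel(toy_dist, [0.5, 0.25]).items():
    print(level, clusters)
```

A loose threshold plays the role of the supergroup level and a tighter one the role of the group level; every tight cluster is contained in exactly one loose cluster.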
Abstract:
Folding of Ubiquitin (Ub), a functionally important protein found in eukaryotic organisms, is investigated at low and neutral pH at different temperatures using simulations of the coarse-grained self-organized-polymer model with side chains (SOP-SC). The melting temperatures (T-m's), identified with the peaks in the heat capacity curves, decrease as pH decreases, in qualitative agreement with experiments. The calculated radius of gyration, showing dramatic variations with pH, is in excellent agreement with scattering experiments. At T-m, Ub folds in a two-state manner at low and neutral pH. Clustering analysis of the conformations sampled in equilibrium folding trajectories at T-m, with multiple transitions between the folded and unfolded states, shows a network of metastable states connecting the native and unfolded states. At low and neutral pH, Ub folds with high probability through a preferred set of conformations, resulting in a pH-dependent dominant folding pathway. Folding kinetics reveal that Ub assembly at low pH occurs by multiple pathways involving a combination of nucleation-collapse and diffusion-collision mechanisms. The mechanism by which Ub folds is dictated by the stability of the key secondary structural elements responsible for establishing long-range contacts and collapse of Ub. The nucleation-collapse mechanism holds if the stability of these elements is marginal, as would be the case at elevated temperatures. If the lifetimes associated with these structured microdomains are on the order of hundreds of microseconds, then Ub folding follows the diffusion-collision mechanism with intermediates, many of which coincide with those found in equilibrium. Folding at neutral pH is a sequential process with a populated intermediate resembling that sampled at equilibrium. The transition state structures, obtained using a P-fold analysis, are homogeneous and globular, with most of the secondary and tertiary structures being native-like.
Many of our findings for both the thermodynamics and kinetics of folding are not only in agreement with experiments but also provide missing details not resolvable in standard experiments. The key prediction that the folding mechanism varies dramatically with pH is amenable to experimental tests.
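The criterion of identifying the melting temperature with the peak of the heat capacity curve can be illustrated with a minimal sketch. It assumes the standard fluctuation relation C_v = (<E^2> - <E>^2) / (k_B T^2); whether the authors estimated the heat capacity exactly this way is an assumption. The energy samples, the unit choice, and the function names are made up for illustration.

```python
from statistics import pvariance
from typing import Dict, List

# Boltzmann constant in kcal/(mol K); the unit choice is an assumption.
K_B = 0.0019872


def heat_capacity(energies: List[float], temperature: float) -> float:
    """Estimate C_v = (<E^2> - <E>^2) / (k_B T^2) from energy samples
    collected at a single temperature."""
    return pvariance(energies) / (K_B * temperature ** 2)


def melting_temperature(samples: Dict[float, List[float]]) -> float:
    """Return the sampled temperature at which the estimated C_v is largest."""
    return max(samples, key=lambda t: heat_capacity(samples[t], t))


# Hypothetical energy samples: narrow distributions away from the transition,
# a bimodal (folded/unfolded) distribution near it.
toy_samples = {
    300.0: [-10.0, -10.5, -9.5],
    350.0: [-10.0, 0.0, -10.0, 0.0],
    400.0: [0.0, 0.5, -0.5],
}

print(melting_temperature(toy_samples))
```

The bimodal energy distribution at the middle temperature gives the largest variance, which is why two-state folding shows up as a heat capacity peak.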
Abstract:
We show that a film of a suspension of polymer grafted nanoparticles on a liquid substrate can be employed to create two-dimensional nanostructures with a remarkable variation in the pattern length scales. The presented experiments also reveal the emergence of concentration-dependent bimodal patterns as well as re-entrant behaviour that involves length scales due to dewetting and compositional instabilities. The experimental observations are explained through a gradient dynamics model consisting of coupled evolution equations for the height of the suspension film and the concentration of polymer. Using a Flory-Huggins free energy functional for the polymer solution, we show in a linear stability analysis that the thin film undergoes dewetting and/or compositional instabilities depending on the concentration of the polymer in the solution. We argue that the formation via 'hierarchical self-assembly' of various functional nanostructures observed in different systems can be explained as resulting from such an interplay of instabilities.
Abstract:
We present a stochastic simulation technique for subset selection in time series models, based on the use of indicator variables with the Gibbs sampler within a hierarchical Bayesian framework. As an example, the method is applied to the selection of subset linear AR models, in which only significant lags are included. Joint sampling of the indicators and parameters is found to speed convergence. We discuss the possibility of model mixing where the model is not well determined by the data, and the extension of the approach to include non-linear model terms.
Abstract:
A shear-lag model is used to study the mechanical properties of bone-like hierarchical materials. The relationship between the overall effective modulus and the number of hierarchy levels is obtained. The result is compared with those based on the tension-shear chain model and on finite element simulation. It is shown that all three models can be used to describe the mechanical behavior of the hierarchical material when the number of hierarchy levels is small. As the number of hierarchy levels increases, the shear-lag result remains consistent with the finite element result. However, the tension-shear chain model leads to an opposite trend. The position of the transition point depends on the fraction of hard phase, the aspect ratio, and the modulus ratio of hard phase to soft phase. The flaw tolerance size and strength of hierarchical materials are further discussed on the basis of the shear-lag analysis.
Abstract:
Forest mapping over mountainous terrain is difficult because of high relief. Although digital elevation models (DEMs) are often useful to improve mapping accuracy, high-quality DEMs are seldom available over large areas, especially in developing countries.
Abstract:
MOTIVATION: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct but often complementary information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured through parameters that describe the agreement among the datasets. RESULTS: Using a set of six artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real Saccharomyces cerevisiae datasets. In the two-dataset case, we show that MDI's performance is comparable with the present state-of-the-art. We then move beyond the capabilities of current approaches and integrate gene expression, chromatin immunoprecipitation-chip and protein-protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques, as well as to non-integrative approaches, demonstrate that MDI is competitive, while also providing information that would be difficult or impossible to extract using other methods.
Abstract:
This paper aims to solve the fault tolerant control problem of a wind turbine benchmark. A hierarchical controller with model predictive pre-compensators, a global model predictive controller and a supervisory controller is proposed. In the model predictive pre-compensator, an extended Kalman Filter is designed to estimate the system states and various fault parameters. Based on the estimation, a group of model predictive controllers are designed to compensate the fault effects for each component of the wind turbine. The global MPC is used to schedule the operation of the components and exploit potential system-level redundancies. Extensive simulations of various fault conditions show that the proposed controller has small transients when faults occur and uses smoother and smaller generator torque and pitch angle inputs than the default controller. This paper shows that MPC can be a good candidate for fault tolerant controllers, especially the one with an adaptive internal model combined with a parameter estimation and update mechanism, such as an extended Kalman Filter. © 2012 IFAC.
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
Abstract:
Clustering behavior is studied in a model of integrate-and-fire oscillators with excitatory pulse coupling. When considering a population of identical oscillators, the main result is a proof of global convergence to a phase-locked clustered behavior. The robustness of this clustering behavior is then investigated in a population of nonidentical oscillators by studying the transition from total clustering to the absence of clustering as the group coherence decreases. A robust intermediate situation of partial clustering, characterized by few oscillators traveling among nearly phase-locked clusters, is of particular interest. The analysis complements earlier studies of synchronization in a closely related model. © 2008 American Institute of Physics.
Abstract:
Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled. The true number of clusters in the data is often unknown, and most models require this parameter as an input. Dirichlet process mixture models are appealing as they can infer the number of clusters from the data. However, these models do not deal with high-dimensional data well and can encounter difficulties in inference. We present a novel nonparametric Bayesian kernel-based method to cluster data points without the need to prespecify the number of clusters or to model the complicated densities from which data points are assumed to be generated. The key insight is to use determinants of submatrices of a kernel matrix as a measure of how close together a set of points is. We explore some theoretical properties of the model and derive a natural Gibbs-based algorithm with MCMC hyperparameter learning. The model is implemented on a variety of synthetic and real-world data sets.
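The key insight, that the determinant of a kernel submatrix reflects how close together a set of points is, can be illustrated with a small sketch. This is not the model from the paper: the RBF kernel, the length scale, the helper names, and the toy points are all assumptions chosen for illustration. For a kernel with unit diagonal, tightly grouped points give nearly identical rows and a determinant near 0, while well-separated points give a nearly diagonal matrix and a determinant near 1.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def rbf_kernel(x: Point, y: Point, length_scale: float = 1.0) -> float:
    """Squared-exponential kernel (an assumed choice of kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def kernel_matrix(points: Sequence[Point], length_scale: float = 1.0) -> List[List[float]]:
    """Kernel submatrix restricted to the given subset of points."""
    return [[rbf_kernel(p, q, length_scale) for q in points] for p in points]


def determinant(matrix: Sequence[Sequence[float]]) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# Hypothetical one-dimensional point sets.
tight = [(0.0,), (0.1,), (0.2,)]
spread = [(0.0,), (5.0,), (10.0,)]

det_tight = determinant(kernel_matrix(tight))
det_spread = determinant(kernel_matrix(spread))
print(det_tight, det_spread)
```

A smaller determinant therefore signals a more compact candidate cluster, which is the sense in which it can serve as a closeness measure.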
Abstract:
John Warren and Chris Topping (2004). A trait specific model of competition in a spatially structured plant community. Ecological Modelling, 180, pp. 477-485. RAE2008
Abstract:
A system is described that tracks moving objects in a video dataset so as to extract a representation of the objects' 3D trajectories. The system then finds hierarchical clusters of similar trajectories in the video dataset. Objects' motion trajectories are extracted via an EKF formulation that provides each object's 3D trajectory up to a constant factor. To increase accuracy when occlusions occur, multiple tracking hypotheses are followed. For trajectory-based clustering and retrieval, a modified version of edit distance, called longest common subsequence (LCSS), is employed. Similarities are computed between projections of trajectories on coordinate axes. Trajectories are grouped using an agglomerative clustering algorithm. To check the validity of the approach, experiments using real data were performed.
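The combination of an LCSS similarity with agglomerative clustering can be sketched as below. This is a simplified illustration, not the system described: it compares a single coordinate projection, treats two samples as matching when they differ by at most a tolerance `eps`, omits any time-window constraint the paper's modified LCSS may use, and uses average linkage as an assumed merge rule. All names and the toy trajectories are hypothetical.

```python
from typing import Callable, List, Sequence


def lcss_length(a: Sequence[float], b: Sequence[float], eps: float) -> int:
    """Length of the longest common subsequence, where two samples
    match if they differ by at most eps (dynamic programming)."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_distance(a: Sequence[float], b: Sequence[float], eps: float) -> float:
    """1 - LCSS / min(length): 0 for fully matching projections, 1 for none."""
    return 1.0 - lcss_length(a, b, eps) / min(len(a), len(b))


def agglomerate(
    items: Sequence[Sequence[float]],
    dist_fn: Callable[[Sequence[float], Sequence[float]], float],
    n_clusters: int,
) -> List[List[int]]:
    """Average-linkage agglomerative clustering: start from singletons and
    repeatedly merge the two clusters with the smallest mean pairwise distance."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                total = sum(dist_fn(items[i], items[j])
                            for i in clusters[x] for j in clusters[y])
                d = total / (len(clusters[x]) * len(clusters[y]))
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical x-axis projections of four trajectories.
toy_tracks = [
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [0.1, 1.1, 2.1, 2.9, 4.1],
    [10.0, 9.0, 8.0, 7.0, 6.0],
    [10.2, 9.1, 7.9, 7.1, 6.0],
]

groups = agglomerate(toy_tracks, lambda a, b: lcss_distance(a, b, eps=0.3), 2)
print(groups)
```

Recording the order of merges instead of stopping at a fixed cluster count would give the full hierarchy of trajectory clusters.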
Abstract:
We present what we believe to be the first thorough characterization of live streaming media content delivered over the Internet. Our characterization of over five million requests spanning a 28-day period is done at three increasingly granular levels, corresponding to clients, sessions, and transfers. Our findings support two important conclusions. First, we show that the nature of interactions between users and objects is fundamentally different for live versus stored objects. Access to stored objects is user driven, whereas access to live objects is object driven. This reversal of active/passive roles of users and objects leads to interesting dualities. For instance, our analysis underscores a Zipf-like profile for user interest in a given object, which is to be contrasted to the classic Zipf-like popularity of objects for a given user. Also, our analysis reveals that transfer lengths are highly variable and that this variability is due to the stickiness of clients to a particular live object, as opposed to structural (size) properties of objects. Second, based on observations we make, we conjecture that the particular characteristics of live media access workloads are likely to be highly dependent on the nature of the live content being accessed. In our study, this dependence is clear from the strong temporal correlations we observed in the traces, which we attribute to the synchronizing impact of live content on access characteristics. Based on our analyses, we present a model for live media workload generation that incorporates many of our findings, and which we implement in GISMO [19].
Abstract:
Most associative memory models perform one-level mapping between predefined sets of input and output patterns and are unable to represent hierarchical knowledge. Complex AI systems allow hierarchical representation of concepts, but generally do not have learning capabilities. In this paper, a memory model is proposed which forms a concept hierarchy by learning sample relations between concepts. All concepts are represented in a concept layer. Relations between a concept and its defining lower-level concepts are chunked as cognitive codes represented in a coding layer. By updating memory contents in the concept layer through code firing in the coding layer, the system is able to perform an important class of commonsense reasoning, namely recognition and inheritance.