869 results for height partition clustering
Abstract:
In this article, we present a novel application of a quantum clustering (QC) technique to objectively cluster the conformations, sampled by molecular dynamics simulations performed on different ligand bound structures of the protein. We further portray each conformational population in terms of dynamically stable network parameters which beautifully capture the ligand induced variations in the ensemble in atomistic detail. The conformational populations thus identified by the QC method and verified by network parameters are evaluated for different ligand bound states of the protein pyrrolysyl-tRNA synthetase (DhPylRS) from D. hafniense. The ligand/environment induced re-distribution of protein conformational ensembles forms the basis for understanding several important biological phenomena such as allostery and enzyme catalysis. The atomistic level characterization of each population in the conformational ensemble in terms of the re-orchestrated networks of amino acids is a challenging problem, especially when the changes are minimal at the backbone level. Here we demonstrate that the QC method is sensitive to such subtle changes and is able to cluster MD snapshots which are similar at the side-chain interaction level. Although we have applied these methods on simulation trajectories of a modest time scale (20 ns each), we emphasize that our methodology provides a general approach towards an objective clustering of large-scale MD simulation data and may be applied to probe multistate equilibria at higher time scales, and to problems related to protein folding for any protein or protein-protein/RNA/DNA complex of interest with a known structure.
Abstract:
Among the carbon allotropes, carbyne chains stand out as very light and highly accessible for sorption. Hydrogen adsorption on a calcium-decorated carbyne chain was studied using ab initio density functional calculations. The estimated surface area of carbyne is four times larger than that of graphene, which makes carbyne attractive as a storage scaffold medium. Furthermore, calculations show that a Ca-decorated carbyne can adsorb up to 6 H2 molecules per Ca atom with a binding energy of ~0.2 eV, desirable for reversible storage, and the hydrogen storage capacity can exceed ~8 wt %. Unlike recently reported transition metal-decorated carbon nanostructures, which suffer from metal clustering that diminishes the storage capacity, the clustering of Ca atoms on carbyne is energetically unfavorable. The thermodynamics of adsorption of H2 molecules on the Ca atom was also investigated using the equilibrium grand partition function.
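A grand partition function treatment of adsorption on a single site can be sketched as follows. This is a minimal illustration, not the authors' calculation: the function name `mean_occupancy`, the single energy per occupation number, the unit degeneracies, and the chemical potential values are all assumptions; only the limit of 6 H2 per Ca atom and the ~0.2 eV binding energy come from the abstract.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def mean_occupancy(mu, temperature, energies, degeneracies=None):
    """Average number of H2 molecules adsorbed on one site.

    mu           : chemical potential of the H2 gas in eV (assumed input).
    temperature  : temperature in K.
    energies     : energies[n-1] is the adsorption energy per molecule (eV,
                   negative when bound) with n molecules on the site.
    degeneracies : optional configurational degeneracy for each n (default 1).
    """
    n_max = len(energies)
    if degeneracies is None:
        degeneracies = [1.0] * n_max
    kt = K_B * temperature
    # Weight of the empty site (n = 0) is 1.
    weights = [1.0]
    for n in range(1, n_max + 1):
        weights.append(degeneracies[n - 1] * math.exp(n * (mu - energies[n - 1]) / kt))
    grand_partition = sum(weights)
    return sum(n * w for n, w in enumerate(weights)) / grand_partition


# Hypothetical inputs: up to 6 molecules, each bound by 0.2 eV.
binding = [-0.2] * 6
high_pressure = mean_occupancy(mu=-0.1, temperature=300.0, energies=binding)
low_pressure = mean_occupancy(mu=-0.4, temperature=300.0, energies=binding)
```

When the chemical potential lies above the adsorption energy the site fills up, and when it lies well below the site empties, which is the behaviour that makes a binding energy in this range attractive for reversible storage.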
Abstract:
Emerging high-dimensional data mining applications need to find interesting clusters embedded in arbitrarily aligned subspaces of lower dimensionality. It is difficult to cluster high-dimensional data objects when they are sparse and skewed. Updates are quite common in dynamic databases and are usually processed in batch mode. In very large dynamic databases, it is necessary to perform incremental cluster analysis on the updates only. We present an incremental clustering algorithm for subspace clustering in very high dimensions, which handles both insertions and deletions of data points in the backend databases.
Abstract:
Delineation of homogeneous precipitation regions (regionalization) is necessary for investigating frequency and spatial distribution of meteorological droughts. The conventional methods of regionalization use statistics of precipitation as attributes to establish homogeneous regions. Therefore they cannot be used to form regions in ungauged areas, and they may not be useful to form meaningful regions in areas having sparse rain gauge density. Further, validation of the regions for homogeneity in precipitation is not possible, since the use of the precipitation statistics to form regions and subsequently to test the regional homogeneity is not appropriate. To alleviate this problem, an approach based on fuzzy cluster analysis is presented. It allows delineation of homogeneous precipitation regions in data sparse areas using large scale atmospheric variables (LSAV), which influence precipitation in the study area, as attributes. The LSAV, location parameters (latitude, longitude and altitude) and seasonality of precipitation are suggested as features for regionalization. The approach allows independent validation of the identified regions for homogeneity using statistics computed from the observed precipitation. Further it has the ability to form regions even in ungauged areas, owing to the use of attributes that can be reliably estimated even when no at-site precipitation data are available. The approach was applied to delineate homogeneous annual rainfall regions in India, and its effectiveness is illustrated by comparing the results with those obtained using rainfall statistics, regionalization based on hard cluster analysis, and meteorological sub-divisions in India. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
We use the H I scale height data along with the H I rotation curve as constraints to probe the shape and density profile of the dark matter halos of M31 (Andromeda) and the superthin, low surface brightness (LSB) galaxy UGC 07321. We model the galaxy as a two-component system of gravitationally coupled stars and gas subjected to the force field of a dark matter halo. For M31, we get a flattened halo, which is required to match the outer galactic H I scale height data, with our best-fit axis ratio (0.4) lying at the most oblate end of the distributions obtained from cosmological simulations. For UGC 07321, our best-fit halo core radius is only slightly larger than the stellar disc scale length, indicating that the halo is important even at small radii in this LSB galaxy. The high value of the gas velocity dispersion required to match the scale height data can explain the low star-formation rate of this galaxy.
Abstract:
Over the past few years, studies of cultured neuronal networks have opened up avenues for understanding the ion channels, receptor molecules, and synaptic plasticity that may form the basis of learning and memory. Hippocampal neurons from rats are dissociated and cultured on a surface containing a grid of 64 electrodes. The signals from these 64 electrodes are acquired using a fast data acquisition system, MED64 (Alpha MED Sciences, Japan), at a sampling rate of 20 K samples with a precision of 16 bits per sample. A few minutes of acquired data run into a few hundred megabytes. Data processing for the neural analysis is highly compute-intensive because of this data volume. The major processing requirements include noise removal, pattern recovery, pattern matching, and clustering. In order to interface a neuronal colony to the physical world, these computations need to be performed in real time. A single processor, such as a desktop computer, may not be adequate to meet these computational requirements. Parallel computing can satisfy the real-time computational requirements of a neuronal system that interacts with the external world while increasing the flexibility and scalability of the application. In this work, we developed a parallel neuronal system using a multi-node digital signal processing system. With 8 processors, the system is able to compute and map incoming signals, segmented over a period of 200 ms, into an action in a trained cluster system in real time.
Abstract:
The electrical transport properties of InN/GaN heterostructure based Schottky junctions were studied over a wide temperature range of 200-500 K. The barrier height and the ideality factor were calculated from current-voltage (I-V) characteristics based on thermionic emission (TE), and found to be temperature dependent. The barrier height was found to increase and the ideality factor to decrease with increasing temperature. The observed temperature dependence of the barrier height indicates that the Schottky barrier height is inhomogeneous in nature at the heterostructure interface. Such inhomogeneous behavior was modeled by assuming the existence of a Gaussian distribution of barrier heights at the heterostructure interface. (C) 2011 Elsevier Ltd. All rights reserved.
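For reference, the standard textbook relations behind this kind of analysis can be written as below. This is a sketch of the usual thermionic emission and Gaussian barrier models, not necessarily the exact expressions used by the authors; the symbols $A$ (diode area), $A^{*}$ (effective Richardson constant), $\bar{\Phi}_b$ (mean barrier height) and $\sigma_s$ (standard deviation of the distribution) are assumptions that do not appear in the abstract.

```latex
% Thermionic emission (TE) current with ideality factor n:
I = I_0 \left[ \exp\!\left( \frac{qV}{n k T} \right) - 1 \right],
\qquad
I_0 = A A^{*} T^{2} \exp\!\left( -\frac{q \Phi_b}{k T} \right)

% Apparent barrier height for a Gaussian distribution of barrier heights
% with mean \bar{\Phi}_b and standard deviation \sigma_s:
\Phi_{ap} = \bar{\Phi}_b - \frac{q \sigma_s^{2}}{2 k T}
```

In the Gaussian model the apparent barrier height rises toward the mean value as the temperature increases, which matches the trend reported above.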
Abstract:
Advertisements (ads) are the main source of revenue for television (TV) broadcasters. Because TV reaches a large audience, it is an effective medium for advertising products and services. With the emergence of digital TV, it is important for broadcasters to provide an intelligent service that accounts for dimensions such as program features, ad features, viewers’ interests, and sponsors’ preferences. We present an automatic ad recommendation algorithm that selects a set of ads by considering these dimensions and semantically matches them with programs. Features of the ad video are captured in terms of annotations, which are grouped into a number of predefined semantic categories using a categorization technique. A fuzzy categorical data clustering technique is applied to the categorized data to select the ads best suited to a particular program. Since the same ad can be recommended for more than one program depending on multiple parameters, fuzzy clustering is well suited to ad recommendation. The relative fuzzy score, called the “degree of membership”, calculated for each ad indicates the membership of that ad in the different program clusters. The algorithm was evaluated subjectively by 10 different people and rated with a high success score.
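To make the idea of a degree of membership concrete, here is a minimal sketch that uses the standard fuzzy c-means membership formula on numeric distances. It is an assumption for illustration only: the abstract describes a fuzzy categorical clustering technique whose exact scoring is not given, and the function name `fuzzy_memberships` and the distances are hypothetical.

```python
def fuzzy_memberships(distances, m=2.0):
    """Return fuzzy c-means style memberships of one item in each cluster.

    distances : distance from the item to each cluster centre (all > 0).
    m         : fuzzifier (> 1); larger values give softer memberships.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for d_i in distances:
        # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
        total = sum((d_i / d_k) ** exponent for d_k in distances)
        memberships.append(1.0 / total)
    return memberships


# Hypothetical distances from one ad to three program clusters.
ad_to_programs = [1.0, 2.0, 4.0]
u = fuzzy_memberships(ad_to_programs)
```

The memberships sum to one, so a single ad can belong partly to several program clusters, with the closest cluster receiving the largest share.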
Abstract:
Support Vector Clustering has gained reasonable attention from researchers in exploratory data analysis because of its firm theoretical foundation in statistical learning theory. The hard partitioning of the data set achieved by Support Vector Clustering may not be acceptable in real-world scenarios. Rough Support Vector Clustering is an extension of Support Vector Clustering that attains a soft partitioning of the data set. However, the quadratic programming problem involved in Rough Support Vector Clustering makes it computationally expensive for large datasets. In this paper, we propose the Rough Core Vector Clustering algorithm, a computationally efficient realization of Rough Support Vector Clustering. Here the Rough Support Vector Clustering problem is formulated as an approximate Minimum Enclosing Ball problem and is solved using an approximate Minimum Enclosing Ball finding algorithm. Experiments on several large multi-class datasets, such as Forest cover type and other multi-class datasets taken from the LIBSVM page, show that the proposed strategy is efficient and finds meaningful soft cluster abstractions, which provide better generalization performance than the SVM classifier.
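The flavour of an approximate Minimum Enclosing Ball computation can be illustrated with the simple iterative scheme of Badoiu and Clarkson, which repeatedly moves the centre toward the farthest point. This is a swapped-in illustration in plain Euclidean space, not the kernelized algorithm of the paper; the function name `approx_meb` and the sample points are hypothetical.

```python
import math


def approx_meb(points, eps=0.1):
    """Approximate minimum enclosing ball of a list of points.

    Runs ceil(1 / eps^2) steps; each step moves the centre a shrinking
    fraction of the way toward the point currently farthest from it.
    Returns (centre, radius), where the ball contains every input point.
    """
    centre = list(points[0])
    iterations = math.ceil(1.0 / eps ** 2)
    for i in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(p, centre))
        step = 1.0 / (i + 1)
        centre = [c + step * (f - c) for c, f in zip(centre, farthest)]
    # Use the largest distance as the radius so that all points are enclosed.
    radius = max(math.dist(p, centre) for p in points)
    return centre, radius


# Four points whose exact enclosing ball has centre (1, 0) and radius 1.
sample = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, -1.0)]
centre, radius = approx_meb(sample, eps=0.1)
```

Each step touches only the current farthest point, which is why ball-based formulations of this kind can avoid solving a full quadratic program over all data points.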
Abstract:
Applications in various domains often lead to very large and frequently high-dimensional data. Successful algorithms must avoid the curse of dimensionality while remaining computationally efficient. Finding useful patterns in large datasets has attracted considerable interest recently. The primary goal of the paper is to implement an efficient hybrid tree-based clustering method built on CF-Tree and KD-Tree, and to combine the clustering methods with KNN classification. The implementation must address accuracy, space, and time requirements. We will evaluate the time and space efficiency, data input order sensitivity, and clustering quality through several experiments.
Abstract:
This paper presents a novel Second Order Cone Programming (SOCP) formulation for large scale binary classification tasks. Assuming that the class conditional densities are mixture distributions, where each component of the mixture has a spherical covariance, the second order statistics of the components can be estimated efficiently using clustering algorithms like BIRCH. For each cluster, the second order moments are used to derive a second order cone constraint via a Chebyshev-Cantelli inequality. This constraint ensures that any data point in the cluster is classified correctly with a high probability. This leads to a large margin SOCP formulation whose size depends on the number of clusters rather than the number of training data points. Hence, the proposed formulation scales well for large datasets when compared to the state-of-the-art classifiers, Support Vector Machines (SVMs). Experiments on real world and synthetic datasets show that the proposed algorithm outperforms SVM solvers in terms of training time and achieves similar accuracies.
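The step from second order moments to a cone constraint can be sketched as follows. This is the generic form of the argument, assuming a linear classifier $w^{\top}x + b$, a cluster with mean $\mu$, covariance $\Sigma$ and label $y \in \{-1, +1\}$, and a target probability $\eta$; these symbols are introduced here for illustration, and the exact formulation in the paper (for example its margin and slack terms) may differ.

```latex
% Requirement: a random point X from the cluster is classified correctly
% with probability at least \eta.
\Pr\left( y\,(w^{\top} X + b) \ge 0 \right) \ge \eta

% A sufficient condition, obtained from the Chebyshev-Cantelli inequality:
y\,(w^{\top} \mu + b) \ge \kappa \sqrt{w^{\top} \Sigma\, w},
\qquad \kappa = \sqrt{\frac{\eta}{1 - \eta}}

% With a spherical covariance \Sigma = \sigma^{2} I this becomes a
% second order cone constraint in (w, b):
y\,(w^{\top} \mu + b) \ge \kappa\, \sigma\, \lVert w \rVert_{2}
```

There is one such constraint per cluster rather than per data point, which is why the problem size depends on the number of clusters.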
Abstract:
The ultrastructural functions of the electron-dense glycopeptidolipid-containing outermost layer (OL), the arabinogalactan-mycolic acid-containing electron-transparent layer (ETL), and the electron-dense peptidoglycan layer (PGL) of the mycobacterial cell wall in septal growth and constriction are not clear. Therefore, using transmission electron microscopy, we studied the participation of the three layers in septal growth and constriction in the fast-growing saprophytic species Mycobacterium smegmatis and the slow-growing pathogenic species Mycobacterium xenopi and Mycobacterium tuberculosis in order to document the processes in a comprehensive and comparative manner and to find out whether the processes are conserved across different mycobacterial species. A complete septal partition is formed first by the fresh synthesis of the septal PGL (S-PGL) and septal ETL (S-ETL) from the envelope PGL (E-PGL) in M. smegmatis and M. xenopi. The S-ETL is not continuous with the envelope ETL (E-ETL) due to the presence of the E-PGL between them. The E-PGL disappears, and the S-ETL becomes continuous with the E-ETL, when the OL begins to grow and invaginate into the S-ETL for constriction. However, in M. tuberculosis, the S-PGL and S-ETL grow from the E-PGL and E-ETL, respectively, without a separation between the E-ETL and S-ETL by the E-PGL, in contrast to the process in M. smegmatis and M. xenopi. Subsequent growth and invagination of the OL into the S-ETL of the septal partition initiates and completes septal constriction in M. tuberculosis. A model for the conserved sequential process of mycobacterial septation, in which the formation of a complete septal partition is followed by constriction, is presented. The probable physiological significance of the process is discussed. The ultrastructural features of septation and constriction in mycobacteria are unusually different from those in the well-studied organisms Escherichia coli and Bacillus subtilis.
Abstract:
Three-component self-assembly of a cis-blocked 90° Pd(II) acceptor with a mixture of a tetraimidazole and a linear dipyridyl donor self-discriminated into an unusual Pd8 molecular swing (1) and a Pd6 molecular boat (2), which are characterized by single-crystal X-ray diffraction analysis; their ability to bind C60 in solution is established by fluorescence titration.
Abstract:
Writing the hindered rotor (hr) partition function as the trace of $\hat{\rho} = e^{-\beta \hat{H}_{hr}}$, we approximate it by the sum of contributions from a set of points in position space. The contribution of the density matrix from each point is approximated by performing a local harmonic expansion around it. The highlight of this method is that it can be easily extended to multidimensional systems. The local harmonic expansion leads to a breakdown of the method at low temperatures. In order to calculate the partition function at low temperatures, we suggest a matrix multiplication procedure. The results obtained using these methods closely agree with the exact partition function at all temperature ranges. Our method bypasses the evaluation of eigenvalues and eigenfunctions and evaluates the density matrix for internal rotation directly. We also suggest a procedure to account for the antisymmetry of the total wavefunction within the same approach. (C) 2012 Elsevier B.V. All rights reserved.
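In position representation, the quantities involved can be written as below. This is a sketch under the assumption of a single torsion angle $\phi$ with period $2\pi$; the second relation is the general semigroup property of the density matrix, which is the usual basis of matrix multiplication schemes, and the procedure in the paper may differ in detail.

```latex
% Partition function as the trace of the density matrix:
Q_{hr}(\beta) = \mathrm{Tr}\,\hat{\rho}
             = \int_{0}^{2\pi} d\phi \; \rho(\phi, \phi; \beta),
\qquad \hat{\rho} = e^{-\beta \hat{H}_{hr}}

% A low-temperature density matrix built from two higher-temperature ones:
\rho(\phi, \phi'; \beta)
  = \int_{0}^{2\pi} d\phi'' \; \rho(\phi, \phi''; \beta/2)\, \rho(\phi'', \phi'; \beta/2)
```

On a grid of angles the second relation turns into an ordinary matrix product, so a density matrix that is accurate at high temperature can be squared repeatedly to reach lower temperatures.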