786 results for Incremental Clustering
Abstract:
This paper is concerned with tensor clustering with the assistance of dimensionality reduction approaches. A class of formulations for tensor clustering is introduced based on tensor Tucker decomposition models. In this formulation, an extra tensor mode is formed by a collection of tensors of the same dimensions and then used to assist a Tucker decomposition in order to achieve data dimensionality reduction. We design two types of clustering models for the tensors, a PCA Tensor Clustering model and a Non-negative Tensor Clustering model, by utilizing different regularizations. The tensor clustering can thus be solved by an optimization method based on the alternating coordinate scheme. Interestingly, our experiments show that the proposed models yield comparable or even better performance compared to most recent clustering algorithms based on matrix factorization.
Abstract:
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
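To illustrate the kind of 2D map described in this abstract, the following is a minimal sketch of a plain Self Organizing Map in pure Python. It is not FLSOM and not the BioDICE code. The function names (`train_som`, `best_matching_unit`), the grid size, the linear decay schedules, and the toy data are all assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((wi - xi) ** 2 for wi, xi in zip(weights[node], x)),
    )


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a plain SOM (hypothetical helper, not FLSOM) on a list of vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # One weight vector per grid node, initialised uniformly at random.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy data (assumed): two well-separated groups in three dimensions.
group_a = [[0.05 * k, 0.02 * k, 0.03 * k] for k in range(5)]
group_b = [[1.0 - 0.05 * k, 1.0 - 0.02 * k, 1.0 - 0.03 * k] for k in range(5)]
som = train_som(group_a + group_b)
cell_a = best_matching_unit(som, [0.0, 0.0, 0.0])
cell_b = best_matching_unit(som, [1.0, 1.0, 1.0])
```

Each input is assigned to the grid cell of its best matching unit, so similar inputs tend to land on the same or neighbouring cells, which is what makes the resulting 2D map usable for visual exploration.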
Abstract:
This study has investigated serial (temporal) clustering of extra-tropical cyclones simulated by 17 climate models that participated in CMIP5. Clustering was estimated by calculating the dispersion (ratio of variance to mean) of 30 December-February counts of Atlantic storm tracks passing near each grid point. Results from single historical simulations of 1975-2005 were compared to those from historical ERA40 reanalyses from 1958-2001 and single future model projections of 2069-2099 under the RCP4.5 climate change scenario. Models were generally able to capture the broad features in reanalyses reported previously: underdispersion/regularity (i.e. variance less than mean) in the western core of the Atlantic storm track surrounded by overdispersion/clustering (i.e. variance greater than mean) to the north and south and over western Europe. Regression of counts onto North Atlantic Oscillation (NAO) indices revealed that much of the overdispersion in the historical reanalyses and model simulations can be accounted for by NAO variability. Future changes in dispersion were generally found to be small and not consistent across models. The overdispersion statistic, for any 30 year sample, is prone to large amounts of sampling uncertainty that obscures the climate change signal. For example, the projected increase in dispersion for storm counts near London in the CNRMCM5 model is 0.1 compared to a standard deviation of 0.25. Projected changes in the mean and variance of NAO are insufficient to create changes in overdispersion that are discernible above natural sampling variations.
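The dispersion statistic in this abstract is the ratio of variance to mean of seasonal counts, which can be sketched in a few lines of Python. The function name `dispersion_index`, the choice of the sample variance, and the two count series are assumptions made for illustration. They are not the study's data or code.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the variance-to-mean ratio of a sequence of counts.

    Values below 1 indicate underdispersion (regularity) and values
    above 1 indicate overdispersion (clustering).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion is undefined when the mean count is zero")
    # Sample variance (n - 1 denominator); this choice is an assumption.
    return variance(counts) / m


# Invented seasonal storm counts at two hypothetical grid points.
regular = [5, 5, 6, 5, 5, 6, 5, 5]      # nearly constant from season to season
clustered = [0, 12, 1, 0, 15, 0, 2, 12]  # quiet seasons mixed with very active ones

regular_ratio = dispersion_index(regular)
clustered_ratio = dispersion_index(clustered)
```

Both series have the same mean count, so the difference in the ratio comes entirely from the season-to-season variance.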
Abstract:
ICT clusters have attracted much attention because of their rapid growth and their value for other economic activities. Using a nested multi-level model, we examine how conditions at the country level and at the city level affect ICT clustering activity in 227 cities across 22 European countries. We test for the influence of three country regulations (starting a business, registering property, enforcing contracts) and two city conditions (proximity to university, network density) on ICT clustering. We consider heterogeneity within the sector and study two types of ICT activities: ICT product firms and ICT content firms. Our results indicate that country conditions and city conditions each have idiosyncratic implications for ICT clustering, and further, that these can vary by activities in ICT products or ICT content manufacturing.
Abstract:
This study examines when “incremental” change is likely to trigger “discontinuous” change, using the lens of complex adaptive systems theory. Going beyond the simulations and case studies through which complex adaptive systems have been approached so far, we study the relationship between incremental organizational reconfigurations and discontinuous organizational restructurings using a large-scale database of U.S. Fortune 50 industrial corporations. We develop two types of escalation process in organizations: accumulation and perturbation. Under ordinary conditions, it is perturbation rather than the accumulation that is more likely to trigger subsequent discontinuous change. Consistent with complex adaptive systems theory, organizations are more sensitive to both accumulation and perturbation in conditions of heightened disequilibrium. Contrary to expectations, highly interconnected organizations are not more liable to discontinuous change. We conclude with implications for further research, especially the need to attend to the potential role of managerial design and coping when transferring complex adaptive systems theory from natural systems to organizational systems.
Abstract:
In this article, along with others, we take the position that the Null-Subject Parameter (NSP) (Chomsky 1981; Rizzi 1982) cluster of properties is narrower in scope than some originally contended. We test for the resetting of the NSP by English L2 learners of Spanish at the intermediate level, including poverty-of-the-stimulus knowledge of the Overt Pronoun Constraint (Montalbetti 1984). Our participants are tested before and after five months' residency in Spain in an effort to see if increased amounts of native exposure are particularly beneficial for parameter resetting. Although we demonstrate NSP resetting for some of the L2 learners, our data essentially demonstrate that even with the advent of time/exposure to native input, there is no immediate gainful effect for NSP resetting.
Abstract:
The interaction of C-type lectin receptor 2 (CLEC-2) on platelets with Podoplanin on lymphatic endothelial cells initiates platelet signaling events that are necessary for prevention of blood-lymph mixing during development. In the present study, we show that CLEC-2 signaling via Src family and Syk tyrosine kinases promotes platelet adhesion to primary mouse lymphatic endothelial cells at low shear. Using supported lipid bilayers containing mobile Podoplanin, we further show that activation of Src and Syk in platelets promotes clustering of CLEC-2 and Podoplanin. Clusters of CLEC-2-bound Podoplanin migrate rapidly to the center of the platelet to form a single structure. Fluorescence lifetime imaging demonstrates that molecules within these clusters are within 10 nm of one another and that the clusters are disrupted by inhibition of Src and Syk family kinases. CLEC-2 clusters are also seen in platelets adhered to immobilized Podoplanin using direct stochastic optical reconstruction microscopy. These findings provide mechanistic insight by which CLEC-2 signaling promotes adhesion to Podoplanin and regulation of Podoplanin signaling, thereby contributing to lymphatic vasculature development.
Abstract:
The present study investigates the parsing of pre-nominal relative clauses (RCs) in children for the first time with a real-time methodology that reveals moment-to-moment processing patterns as the sentence unfolds. A self-paced listening experiment with Turkish-speaking children (aged 5–8) and adults showed that both groups display a sign of processing cost in both subject and object RCs at different points through the flow of the utterance when integrating the cues that are uninformative (i.e., ambiguous in function) and that are structurally and probabilistically unexpected. Both groups show a processing facilitation as soon as the morphosyntactic dependencies are completed and parse the unbounded dependencies rapidly using the morphosyntactic cues rather than waiting for the clause-final filler. These findings show that five-year-old children show similar patterns to adults in processing the morphosyntactic cues incrementally and in forming expectations about the rest of the utterance on the basis of the probabilistic model of their language.
Abstract:
Clustering methods are increasingly being applied to residential smart meter data, providing a number of important opportunities for distribution network operators (DNOs) to manage and plan the low voltage networks. Clustering has a number of potential advantages for DNOs, including identifying suitable candidates for demand response and improving energy profile modelling. However, due to the high stochasticity and irregularity of household level demand, detailed analytics are required to define appropriate attributes to cluster. In this paper we present in-depth analysis of customer smart meter data to better understand peak demand and major sources of variability in their behaviour. We find four key time periods in which the data should be analysed and use this to form relevant attributes for our clustering. We present a finite mixture model based clustering where we discover 10 distinct behaviour groups describing customers based on their demand and their variability. Finally, using an existing bootstrapping technique we show that the clustering is reliable. To the authors' knowledge this is the first time in the power systems literature that the sample robustness of the clustering has been tested.
Abstract:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues concerns energy reduction and the balancing of energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks has been shown to be an efficient way to improve the network performance and has received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm specifically with mobile sink support for home automation networks. In this work, the network is divided into several clusters and cluster heads are selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short range communications. The ACO algorithm has been utilized in this work in order to find the optimal mobility trajectory for the mobile sink. Extensive simulation results from this research show that the proposed algorithm significantly improves home network performance when using mobile sinks in terms of energy consumption and network lifetime as compared to other routing algorithms currently deployed for home automation networks.
Abstract:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. Besides, the original high-dimensional feature vector contains noisy/redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method by simultaneously considering feature information and spatial structures. TLRR seeks the lowest rank representation over original spatial structures along all spatial directions. Sparse coding learns a dictionary along feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both spatial and feature spaces. TLRRSC can capture the global structure and inherent feature information of data well, and provide a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
Abstract:
Tensor clustering is an important tool that exploits intrinsically rich structures in real-world multiarray or tensor datasets. Often in dealing with those datasets, standard practice is to use subspace clustering that is based on vectorizing multiarray data. However, vectorization of tensorial data does not exploit complete structure information. In this paper, we propose a subspace clustering algorithm without adopting any vectorization process. Our approach is based on a novel heterogeneous Tucker decomposition model taking into account cluster membership information. We propose a new clustering algorithm that alternates between different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold, for which we investigate second order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm competes effectively with state-of-the-art clustering algorithms that are based on tensor factorization.