873 results for constrained clustering
Abstract:
ICT clusters have attracted much attention because of their rapid growth and their value for other economic activities. Using a nested multi-level model, we examine how conditions at the country level and at the city level affect ICT clustering activity in 227 cities across 22 European countries. We test for the influence of three country regulations (starting a business, registering property, enforcing contracts) and two city conditions (proximity to university, network density) on ICT clustering. We consider heterogeneity within the sector and study two types of ICT activities: ICT product firms and ICT content firms. Our results indicate that country conditions and city conditions each have idiosyncratic implications for ICT clustering, and further, that these can vary by activities in ICT products or ICT content manufacturing.
Abstract:
In this article, along with others, we take the position that the Null-Subject Parameter (NSP) (Chomsky 1981; Rizzi 1982) cluster of properties is narrower in scope than some originally contended. We test for the resetting of the NSP by English L2 learners of Spanish at the intermediate level, including poverty-of-the-stimulus knowledge of the Overt Pronoun Constraint (Montalbetti 1984). Our participants are tested before and after five months' residency in Spain in an effort to see if increased amounts of native exposure are particularly beneficial for parameter resetting. Although we demonstrate NSP resetting for some of the L2 learners, our data essentially demonstrate that even with the advent of time/exposure to native input, there is no immediate gainful effect for NSP resetting.
Abstract:
The interaction of C-type lectin receptor 2 (CLEC-2) on platelets with Podoplanin on lymphatic endothelial cells initiates platelet signaling events that are necessary for prevention of blood-lymph mixing during development. In the present study, we show that CLEC-2 signaling via Src family and Syk tyrosine kinases promotes platelet adhesion to primary mouse lymphatic endothelial cells at low shear. Using supported lipid bilayers containing mobile Podoplanin, we further show that activation of Src and Syk in platelets promotes clustering of CLEC-2 and Podoplanin. Clusters of CLEC-2-bound Podoplanin migrate rapidly to the center of the platelet to form a single structure. Fluorescence lifetime imaging demonstrates that molecules within these clusters are within 10 nm of one another and that the clusters are disrupted by inhibition of Src and Syk family kinases. CLEC-2 clusters are also seen in platelets adhered to immobilized Podoplanin using direct stochastic optical reconstruction microscopy. These findings provide mechanistic insight by which CLEC-2 signaling promotes adhesion to Podoplanin and regulation of Podoplanin signaling, thereby contributing to lymphatic vasculature development.
Abstract:
Periocular recognition has recently become an active topic in biometrics. Typically it uses 2D image data of the periocular region. This paper is the first description of combining 3D shape structure with 2D texture. A simple and effective technique using iterative closest point (ICP) was applied for 3D periocular region matching. It proved its strength for relatively unconstrained eye region capture, and does not require any training. Local binary patterns (LBP) were applied for 2D image based periocular matching. The two modalities were combined at the score-level. This approach was evaluated using the Bosphorus 3D face database, which contains large variations in facial expressions, head poses and occlusions. The rank-1 accuracy achieved from the 3D data (80%) was better than that for 2D (58%), and the best accuracy (83%) was achieved by fusing the two types of data. This suggests that significant improvements to periocular recognition systems could be achieved using the 3D structure information that is now available from small and inexpensive sensors.
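The score-level combination mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which normalization or fusion rule was used, so the min-max normalization, the weighted-sum rule, the `weight_3d` parameter, and all the distance values below are assumptions chosen only for illustration.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(dist_3d, dist_2d, weight_3d=0.5):
    """Weighted-sum fusion of two distance lists (lower means a better match)."""
    if len(dist_3d) != len(dist_2d):
        raise ValueError("both matchers must score the same gallery entries")
    norm_3d = min_max_normalize(dist_3d)
    norm_2d = min_max_normalize(dist_2d)
    return [weight_3d * a + (1.0 - weight_3d) * b
            for a, b in zip(norm_3d, norm_2d)]


def rank1_match(fused):
    """Index of the gallery entry with the smallest fused distance."""
    return min(range(len(fused)), key=fused.__getitem__)


# Hypothetical distances from one probe to three gallery subjects.
icp_dist = [0.8, 0.2, 0.5]      # stand-in for a 3D (ICP) alignment error
lbp_dist = [10.0, 30.0, 12.0]   # stand-in for a 2D (LBP) histogram distance
fused = fuse_scores(icp_dist, lbp_dist, weight_3d=0.6)
best = rank1_match(fused)
```

In this made-up example the 3D matcher alone prefers subject 1 and the 2D matcher alone prefers subject 0, while the fused score selects subject 2, which both matchers rank reasonably well.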
Abstract:
Clustering methods are increasingly being applied to residential smart meter data, providing a number of important opportunities for distribution network operators (DNOs) to manage and plan low voltage networks. Clustering has a number of potential advantages for DNOs, including identifying suitable candidates for demand response and improving energy profile modelling. However, due to the high stochasticity and irregularity of household-level demand, detailed analytics are required to define appropriate attributes to cluster. In this paper we present an in-depth analysis of customer smart meter data to better understand peak demand and major sources of variability in customer behaviour. We find four key time periods in which the data should be analysed and use these to form relevant attributes for our clustering. We present a finite mixture model based clustering in which we discover 10 distinct behaviour groups describing customers based on their demand and their variability. Finally, using an existing bootstrapping technique, we show that the clustering is reliable. To the authors' knowledge, this is the first time in the power systems literature that the sample robustness of the clustering has been tested.
Abstract:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues concerns energy reduction and the balancing of energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks has been shown to be an efficient way to improve network performance and has received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm with mobile sink support, designed specifically for home automation networks. In this work, the network is divided into several clusters and a cluster head is selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short-range communications. The ACO algorithm is used to find the optimal mobility trajectory for the mobile sink. Extensive simulation results show that, when mobile sinks are used, the proposed algorithm significantly improves home network performance in terms of energy consumption and network lifetime compared with other routing algorithms currently deployed for home automation networks.
Abstract:
Subspace clustering groups a set of samples from a union of several linear subspaces into clusters, so that the samples in the same cluster are drawn from the same linear subspace. In the majority of the existing work on subspace clustering, clusters are built based on feature information, while sample correlations in their original spatial structure are simply ignored. Besides, the original high-dimensional feature vector contains noisy/redundant information, and the time complexity grows exponentially with the number of dimensions. To address these issues, we propose a tensor low-rank representation (TLRR) and sparse coding-based (TLRRSC) subspace clustering method that simultaneously considers feature information and spatial structures. TLRR seeks the lowest rank representation over original spatial structures along all spatial directions. Sparse coding learns a dictionary along feature spaces, so that each sample can be represented by a few atoms of the learned dictionary. The affinity matrix used for spectral clustering is built from the joint similarities in both spatial and feature spaces. TLRRSC captures the global structure and inherent feature information of the data well, and provides a robust subspace segmentation from corrupted data. Experimental results on both synthetic and real-world data sets show that TLRRSC outperforms several established state-of-the-art methods.
Abstract:
Tensor clustering is an important tool that exploits intrinsically rich structures in real-world multiarray or tensor datasets. Often in dealing with those datasets, standard practice is to use subspace clustering that is based on vectorizing multiarray data. However, vectorization of tensorial data does not exploit complete structure information. In this paper, we propose a subspace clustering algorithm without adopting any vectorization process. Our approach is based on a novel heterogeneous Tucker decomposition model taking into account cluster membership information. We propose a new clustering algorithm that alternates between different modes of the proposed heterogeneous tensor model. All but the last mode have closed-form updates. Updating the last mode reduces to optimizing over the multinomial manifold, for which we investigate second-order Riemannian geometry and propose a trust-region algorithm. Numerical experiments show that our proposed algorithm competes effectively with state-of-the-art clustering algorithms that are based on tensor factorization.
Abstract:
In this paper, we develop a novel constrained recursive least squares algorithm for adaptively combining a set of given multiple models. With data available in an online fashion, the linear combination coefficients of submodels are adapted via the proposed algorithm. We propose to minimize the mean square error with a forgetting factor, and apply the sum-to-one constraint to the combination parameters. Moreover, an l1-norm constraint on the combination parameters is also applied with the aim of achieving sparsity of multiple models, so that only a subset of models may be selected into the final model. Then a weighted l2-norm is applied as an approximation to the l1-norm term. As such, at each time step, a closed-form solution of the model combination parameters is available. The contribution of this paper is to derive the proposed constrained recursive least squares algorithm, which is computationally efficient by exploiting matrix theory. The effectiveness of the approach has been demonstrated using both simulated and real time series examples.
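As a worked illustration of the sum-to-one constraint described in the abstract above, the following is a sketch of the standard batch solution for exponentially weighted least squares under a single linear equality constraint, obtained with a Lagrange multiplier. It is not the paper's recursive algorithm and it omits the l1/weighted-l2 sparsity term. The notation is assumed rather than taken from the source: x_i is the vector of submodel outputs at time i, y_i the target, \lambda the forgetting factor, and w the combination weights.

```latex
% Assumed notation (not from the abstract): x_i = submodel outputs at time i,
% y_i = target, \lambda = forgetting factor, w = combination weights.
\begin{aligned}
\min_{w}\; & J_t(w) = \sum_{i=1}^{t} \lambda^{t-i}\,\bigl(y_i - x_i^{\top} w\bigr)^2
  \quad \text{subject to } \mathbf{1}^{\top} w = 1, \\
R_t &= \sum_{i=1}^{t} \lambda^{t-i} x_i x_i^{\top}, \qquad
p_t = \sum_{i=1}^{t} \lambda^{t-i} x_i y_i, \qquad
\hat{w}_t = R_t^{-1} p_t, \\
w_t &= \hat{w}_t + R_t^{-1}\mathbf{1}\,
  \frac{1 - \mathbf{1}^{\top}\hat{w}_t}{\mathbf{1}^{\top} R_t^{-1}\mathbf{1}} .
\end{aligned}
```

Here \hat{w}_t is the unconstrained solution, and the second term is the smallest correction (in the metric induced by R_t) that makes the weights sum to one.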
Abstract:
Trypanosoma (Megatrypanum) theileri from cattle and trypanosomes of other artiodactyls form a clade of closely related species in analyses using ribosomal sequences. Analysis of polymorphic sequences of a larger number of trypanosomes from broader geographical origins is required to evaluate the clustering of isolates as suggested by previous studies. Here, we determined the sequences of the spliced leader (SL) genes of 21 isolates from cattle and 2 from water buffalo from distant regions of Brazil. Analysis of SL gene repeats revealed that the 5S rRNA gene is inserted within the intergenic region. Phylogeographical patterns inferred using SL sequences showed at least 5 major genotypes of T. theileri distributed in 2 strongly divergent lineages. Lineage TthI comprises genotypes IA and IB from buffalo and cattle, respectively, from the Southeast and Central regions, whereas genotype IC is restricted to cattle from the Southern region. Lineage TthII includes cattle genotypes IIA, which is restricted to the North and Northeast, and IIB, found in the Centre, West, North and Northeast. PCR-RFLP of SL genes revealed valuable markers for genotyping T. theileri. The results of this study emphasize the genetic complexity and corroborate the geographical structuring of T. theileri genotypes found in cattle.
Abstract:
We characterized 28 new isolates of Trypanosoma cruzi IIc (TCIIc) of mammals and triatomines from Northern to Southern Brazil, confirming the widespread distribution of this lineage. Phylogenetic analyses using cytochrome b and SSU rDNA sequences clearly separated TCIIc from TCIIa according to terrestrial and arboreal ecotopes of their preferential mammalian hosts and vectors. TCIIc was more closely related to TCIId/e, followed by TCIIa, and separated by large distances from TCIIb and TCI. Despite being indistinguishable by traditional genotyping and generally being assigned to Z3, we provide evidence that TCIIa from South America and TCIIa from North America correspond to independent lineages that circulate in distinct hosts and ecological niches. Armadillos, terrestrial didelphids and rodents, and domestic dogs were found infected by TCIIc in Brazil. We believe that, in Brazil, this is the first description of TCIIc from rodents and domestic dogs. Terrestrial triatomines of genera Panstrongylus and Triatoma were confirmed as vectors of TCIIc. Together, habitat, mammalian host and vector association corroborated the link between TCIIc and terrestrial transmission cycles/ecological niches. Analysis of ITS1 rDNA sequences disclosed clusters of TCIIc isolates in accordance with their geographic origin, independent of their host species. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Clustering is a difficult task: there is no single cluster definition and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms (e.g., MOCK, Multi-Objective Clustering with automatic K-determination, and MOCLE, Multi-Objective Clustering Ensemble) were proposed to tackle these problems. However, the output of such algorithms can often contain a high number of partitions, making it difficult for an expert to analyze all of them manually. In order to deal with this problem, we present two selection strategies, both based on the corrected Rand index, to choose a subset of solutions. To test them, they are applied to the set of solutions produced by MOCK and MOCLE in the context of several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both proposed selection strategies are very effective. They can significantly reduce the number of solutions and, at the same time, keep the quality and the diversity of the partitions in the original set of solutions. (C) 2010 Elsevier B.V. All rights reserved.
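The corrected Rand index that the abstract above names as the basis of its selection strategies is the chance-corrected agreement between two partitions, usually called the adjusted Rand index. The sketch below computes it from the contingency counts of two label lists. The function name and the example labels are illustrative, and the paper's actual selection strategies are not reproduced here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree

    # Pairs placed together in both partitions (contingency-table cells).
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs placed together within each partition separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs   # agreement expected by chance
    max_index = (sum_a + sum_b) / 2          # largest attainable agreement
    if max_index == expected:
        return 1.0  # both partitions are all-singletons or one single cluster
    return (sum_cells - expected) / (max_index - expected)


same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])      # relabelled copy
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])   # maximally crossed
```

A value of 1 means the two partitions are identical up to relabelling, and values near 0 mean agreement no better than chance. One plausible use, offered only as an assumption, is to compare partitions pairwise so that near-duplicates can be dropped from a large solution set.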
Abstract:
A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm that solves this formulation using GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm produced the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.