765 results for Grouping, clustering, campi, associazione
Abstract:
A new clustering technique, based on the concept of immediate neighbourhood, with a novel capability to self-learn the number of clusters expected in the unsupervized environment, has been developed. The method compares favourably with other clustering schemes based on distance measures, both in terms of conceptual innovations and computational economy. Test implementation of the scheme using C-l flight line training sample data in a simulated unsupervized mode has brought out the efficacy of the technique. The technique can easily be implemented as a front end to established pattern classification systems with supervized learning capabilities to derive unified learning systems capable of operating in both supervized and unsupervized environments. This makes the technique an attractive proposition in the context of remotely sensed earth resources data analysis wherein it is essential to have such a unified learning system capability.
Abstract:
This paper presents a statistical aircraft trajectory clustering approach aimed at discriminating between typical manned and expected unmanned traffic patterns. First, a resampled version of each trajectory is modelled using a mixture of Von Mises distributions (circular statistics). Second, the remodelled trajectories are globally aligned using tools from bioinformatics. Third, the alignment scores are used to cluster the trajectories using an iterative k-medoids approach and an appropriate distance function. The approach is then evaluated using synthetically generated unmanned aircraft flights combined with real air traffic position reports taken over a sector of Northern Queensland, Australia. Results suggest that the technique is useful in distinguishing between expected unmanned and manned aircraft traffic behaviour, as well as identifying some common conventional air traffic patterns.
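The third step above can be illustrated with a minimal k-medoids sketch. It assumes a symmetric pairwise distance matrix is already available (in the paper this would be derived from the alignment scores; here it is simply given as input). The function name `k_medoids`, the deterministic initialisation and the stopping rule are illustrative assumptions, not the authors' implementation.

```python
def k_medoids(dist, k, max_iter=100):
    """Cluster n items from a symmetric n x n distance matrix (list of lists).

    Returns (medoids, labels): medoids are item indices, labels[i] is the
    position in `medoids` of the medoid closest to item i.
    """
    n = len(dist)
    # Assumption: initialise with the first k items; real code would
    # usually pick starting medoids at random or with a seeding heuristic.
    medoids = list(range(k))

    def assign(meds):
        # Each item joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster happens to be empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises the total distance to its cluster members.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: no medoid changed
        medoids = new_medoids

    return medoids, assign(medoids)


# Toy example: six points on a line, forming two obvious groups.
values = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in values] for a in values]
toy_medoids, toy_labels = k_medoids(toy_dist, k=2)
```

Because only pairwise distances are needed, the same routine works for objects such as trajectories that have no natural mean, which is one common reason to prefer medoids over centroids.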
Abstract:
Partitional clustering algorithms, which partition the dataset into a pre-defined number of clusters, can be broadly classified into two types: algorithms which explicitly take the number of clusters as input and algorithms that take the expected size of a cluster as input. In this paper, we propose a variant of the k-means algorithm and prove that it is more efficient than standard k-means algorithms. An important contribution of this paper is the establishment of a relation between the number of clusters and the size of the clusters in a dataset through the analysis of our algorithm. We also demonstrate that the integration of this algorithm as a pre-processing step in classification algorithms reduces their running-time complexity.
Abstract:
The k-means algorithm is an extremely popular technique for clustering data. One of the major limitations of the k-means is that the time to cluster a given dataset D is linear in the number of clusters, k. In this paper, we employ height balanced trees to address this issue. Specifically, we make two major contributions, (a) we propose an algorithm, RACK (acronym for RApid Clustering using k-means), which takes time favorably comparable with the fastest known existing techniques, and (b) we prove an expected bound on the quality of clustering achieved using RACK. Our experimental results on large datasets strongly suggest that RACK is competitive with the k-means algorithm in terms of quality of clustering, while taking significantly less time.
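The linear dependence on k mentioned above comes from the assignment step of standard k-means, which compares every point with every centroid. The sketch below shows one plain iteration and counts the distance evaluations; it illustrates the baseline only and is not the RACK algorithm. The function names `assign` and `update` are illustrative assumptions.

```python
def assign(points, centroids):
    """Assign each point to its nearest centroid.

    Returns (labels, evals), where evals counts distance evaluations.
    The count is always len(points) * len(centroids), i.e. linear in k.
    """
    labels, evals = [], 0
    for p in points:
        best, best_d = 0, float("inf")
        for c, q in enumerate(centroids):
            d = sum((a - b) ** 2 for a, b in zip(p, q))  # squared Euclidean distance
            evals += 1
            if d < best_d:
                best, best_d = c, d
        labels.append(best)
    return labels, evals


def update(points, labels, centroids):
    """Move each centroid to the mean of its assigned points."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            new_centroids.append(old)  # leave an empty cluster's centroid unchanged
            continue
        dim = len(old)
        new_centroids.append(
            tuple(sum(m[j] for m in members) / len(members) for j in range(dim))
        )
    return new_centroids


pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
cents = [(0, 0), (10, 10)]
pt_labels, n_evals = assign(pts, cents)
new_cents = update(pts, pt_labels, cents)
```

Approaches that index the centroids in a tree aim to avoid examining all k centroids for every point; the details of how RACK does this are not given in the abstract.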
Abstract:
The keyword based search technique suffers from the problem of synonymic and polysemic queries. Current approaches address only the problem of synonymic queries, in which different queries might have the same information requirement. But the problem of polysemic queries, i.e., the same query having different intentions, still remains unaddressed. In this paper, we propose the notion of intent clusters, the members of which will have the same intention. We develop a clustering algorithm that uses the user session information in query logs, in addition to query URL entries, to identify clusters of queries having the same intention. The proposed approach has been studied through case examples from the actual log data from AOL, and the clustering algorithm is shown to be successful in discerning the user intentions.
Abstract:
Resistometric studies of isochronal and isothermal annealing of an Al-0.64 at.% Ag alloy have given a value of 0.13 ± 0.02 eV for the silver-vacancy binding energy and 0.55 ± 0.03 eV for the migration energy of solute atoms.
Abstract:
The influence of 0.03 and 0.08 at. % Ag additions on the clustering of Zn atoms in an Al-4.4 at. % Zn alloy has been studied by resistometry. The effect of quenching and ageing temperatures shows that the ageing-ratio method of calculating the vacancy-solute atom binding energy is not applicable to these alloys. Zone-formation in Al-Zn is unaffected by Ag additions, but the zone-reversion process seems to be influenced. Apparent vacancy-formation energies in the binary and ternary alloys have been used to evaluate the v-Ag atom binding energy as 0.21 eV. It is proposed that, Ag and Zn being similar in size, the relative vacancy binding results from valency effects, and that in Al-Zn-Ag alloys clusters of Zn and Ag may form simultaneously, unaffected by the presence of each other. © 1970 Chapman and Hall Ltd.
Abstract:
Isochronal and isothermal ageing experiments have been carried out to determine the influence of a 0.01 at. % addition of a second solute on the clustering rate in the quenched Al-4.4 a/o Zn alloy. The influence of quenching and ageing temperatures has been interpreted to obtain the apparent vacancy formation and vacancy migration energies in the various ternary alloys. Using a vacancy-aided clustering model the following values of binding free energy have been evaluated: Ce-0.18; Dy-0.24; Fe-0.18; Li-0.25; Mn-0.27; Nb-0.18; Pt-0.23; Sb-0.21; Si-0.30; Y-0.25; and Yb-0.23 (± 0.02 eV). These binding energy values refer to that between a solute atom and a single vacancy. The values of vacancy migration energy (c. 0.4 eV) and the experimental activation energy for solute diffusion (c. 1.1 eV) are unaffected by the presence of the ternary atoms in the Al-Zn alloy.
Abstract:
Al-4.4 a/o Zn and Al-4.4 a/o Zn alloys with Ag, Ce, Dy, Li, Nb, Pt, Y, or Yb additions have been investigated by resistometry in order to study the solute-vacancy interactions and clustering kinetics in these alloys. Solute-vacancy binding energies have been evaluated for all these elements using appropriate methods of evaluation. Ag and Dy additions yield some interesting results, which are discussed in the thesis. The solute-vacancy binding energy values obtained here are compared with other available values and discussed. A study of the type of interaction between vacancies and solute atoms indicates that the valency effect is more predominant than the elastic effect.
Abstract:
We view association of concepts as a complex network and present a heuristic for clustering concepts by taking into account the underlying network structure of their associations. Clusters generated from our approach are qualitatively better than clusters generated from the conventional spectral clustering mechanism used for graph partitioning.
Abstract:
This paper investigates the clustering pattern in the Finnish stock market. Using trading volume and time as factors capturing the clustering pattern in the market, the Keim and Madhavan (1996) and the Engle and Russell (1998) models provide the framework for the analysis. The descriptive and the parametric analyses provide evidence that an important determinant of the famous U-shape pattern in the market is the rate of information arrivals, as measured by large trading volumes and durations at the market open and close. Specifically: 1) the larger the trading volume, the greater the impact on prices both in the short and the long run, thus prices will differ across quantities; 2) large trading volume is a non-linear function of price changes in the long run; 3) arrival times are positively autocorrelated, indicating a clustering pattern; and 4) information arrivals as approximated by durations are negatively related to trading flow.
Abstract:
This paper discusses a method for scaling SVM with Gaussian kernel function to handle large data sets by using a selective sampling strategy for the training set. It employs a scalable hierarchical clustering algorithm to construct cluster indexing structures of the training data in the kernel induced feature space. These are then used for selective sampling of the training data for SVM to impart scalability to the training process. Empirical studies made on real world data sets show that the proposed strategy performs well on large data sets.
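Clustering "in the kernel induced feature space" is possible without computing the feature map explicitly, because distances there can be written in terms of kernel values alone. For a Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) we have k(x, x) = 1, so the squared feature-space distance is 2 - 2 k(x, y). The sketch below shows only this identity; the function names and the default `gamma` are illustrative assumptions, and the paper's clustering and sampling procedure is not reproduced here.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length numeric tuples."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def feature_space_distance(x, y, gamma=0.5):
    """Distance between the images of x and y in the kernel-induced feature space.

    Uses ||phi(x) - phi(y)||^2 = k(x, x) + k(y, y) - 2 k(x, y) = 2 - 2 k(x, y),
    since k(x, x) = 1 for the Gaussian kernel.
    """
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, 2.0 - 2.0 * gaussian_kernel(x, y, gamma)))


d_same = feature_space_distance((0.0, 0.0), (0.0, 0.0))
d_near = feature_space_distance((0.0, 0.0), (1.0, 0.0))
d_far = feature_space_distance((0.0, 0.0), (100.0, 100.0))
```

The distance is bounded above by the square root of 2, so points that are far apart in input space all look roughly equally distant in feature space; any distance-based clustering built on this quantity inherits that property.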