38 results for Elaborazione d’immagini, Microscopia, Istopatologia, Classificazione, K-means

in Deakin Research Online - Australia


Relevance:

100.00%

Abstract:

This paper describes a methodology for identifying moving obstacles by obtaining a reliable, sparse optical flow from image sequences. Given a sequence of images, we can detect two types of on-road vehicles: vehicles traveling in the opposite direction and vehicles traveling in the same direction. For both types, distinct feature points can be detected with the Shi and Tomasi corner detector. The pyramidal Lucas-Kanade method for optical flow calculation is then used to match the sparse feature set of one frame to the consecutive frame. By applying k-means clustering to a four-component feature vector, consisting of the coordinates of the feature point and the two components of the optical flow, we can calculate the centroids of the clusters and track the objects. The vehicles traveling in the opposite direction produce a diverging vector field, while vehicles traveling in the same direction produce a converging vector field.
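The clustering step in this abstract can be illustrated with a minimal sketch, assuming the feature points and their flow vectors have already been extracted. The sample values, the farthest-first seeding, and the function name `kmeans` are illustrative assumptions; the abstract does not say how seeds are chosen or whether the four components are rescaled.

```python
import math

def kmeans(points, k, iterations=50):
    """Lloyd's k-means on equal-length numeric tuples; returns (centroids, labels)."""
    # Farthest-first seeding: an assumption made here for determinism.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        # Update step: mean of each cluster (an empty cluster keeps its centroid).
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels

# Hypothetical feature vectors (x, y, u, v): pixel position plus optical-flow components.
features = [
    (100, 50, -3.0, 0.2), (104, 52, -3.2, 0.1), (98, 49, -2.9, 0.0),
    (300, 60, 1.0, -0.1), (305, 63, 1.2, 0.0), (298, 58, 0.9, 0.1),
]
centroids, labels = kmeans(features, k=2)
```

Because pixel coordinates are much larger than flow components in this toy data, position dominates the distance; whether and how to rescale the components is a design choice the abstract leaves open.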

Relevance:

100.00%

Abstract:

The k-means algorithm is a partitional clustering method. Over 60 years old, it has been successfully used for a variety of problems. The popularity of k-means is in large part a consequence of its simplicity and efficiency. In this paper we are inspired by these appealing properties of k-means in the development of a clustering algorithm which accepts the notion of "positively" and "negatively" labelled data. The goal is to discover the cluster structure of both positive and negative data in a manner which allows for the discrimination between the two sets. The usefulness of this idea is demonstrated practically on the problem of face recognition, where the task of learning the scope of a person's appearance should be done in a manner which allows this face to be differentiated from others.

Relevance:

100.00%

Abstract:

To alleviate traffic congestion and reduce the complexity of traffic control and management, it is necessary to divide traffic into sub-areas in a way that is effective for traffic planning. Some researchers have applied the K-Means algorithm to divide traffic sub-areas based on taxi trajectories. However, traditional K-Means algorithms face difficulties in processing large-scale Global Positioning System (GPS) trajectories of taxicabs because of restrictions on memory, I/O, and computing performance. This paper proposes a Parallel Traffic Sub-Areas Division (PTSD) method, based on the Parallel K-Means (PKM) algorithm, which consists of two stages. In the first stage, we develop a process to cluster traffic sub-areas based on the PKM algorithm. In the second stage, we identify the boundaries of traffic sub-areas on the basis of the clustering result. Using this method, we divide the traffic sub-areas of Beijing based on real-world GPS trajectories of taxicabs. The experiment and discussion show that the method is effective in dividing traffic sub-areas.
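One common way to parallelise k-means is to let each data partition compute per-centroid partial sums, then merge them into new centroids. The sketch below shows a single such iteration. It is not the PKM implementation from the paper: the thread pool, the function names, and the sample coordinates are all assumptions, and a real deployment would more likely use processes or a cluster framework.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def partial_sums(chunk, centroids):
    """Map step: per-centroid coordinate sums and counts for one data partition."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in chunk:
        j = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def parallel_kmeans_step(chunks, centroids, workers=2):
    """Reduce step: merge the partial sums of all partitions into new centroids."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: partial_sums(c, centroids), chunks))
    new_centroids = []
    for j, old in enumerate(centroids):
        total = sum(counts[j] for _, counts in results)
        if total == 0:
            new_centroids.append(old)  # empty cluster keeps its centroid
            continue
        new_centroids.append(tuple(
            sum(sums[j][d] for sums, _ in results) / total
            for d in range(len(old))))
    return new_centroids

# Hypothetical (longitude, latitude) samples split into two partitions.
chunks = [
    [(116.30, 39.90), (116.32, 39.92)],
    [(116.50, 40.00), (116.52, 40.02), (116.31, 39.91)],
]
start = [(116.30, 39.90), (116.50, 40.00)]
updated = parallel_kmeans_step(chunks, start)
```

Only the small per-partition sums and counts are exchanged, so no worker needs the full trajectory set in memory at once.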

Relevance:

100.00%

Abstract:

Privacy-preserving data mining aims to keep data safe, yet useful. But algorithms providing strong guarantees often end up with low utility. We propose a novel privacy-preserving framework that thwarts an adversary from inferring an unknown data point by ensuring that the estimation error is almost invariant to the inclusion/exclusion of the data point. By focusing directly on the estimation error of the data point, our framework is able to significantly lower the perturbation required. We use this framework to propose a new privacy-aware K-means clustering algorithm. Using both synthetic and real datasets, we demonstrate that the utility of this algorithm is almost equal to that of the unperturbed K-means, and at strict privacy levels, almost twice as good as the differential privacy counterpart.

Relevance:

100.00%

Abstract:

Clustering Web documents efficiently and accurately is of great interest to Web users and is a key component of the search accuracy of a Web search engine. To achieve this, this paper introduces a new approach for the clustering of Web documents, called the maximal frequent itemset (MFI) approach. Iterative clustering algorithms, such as K-means and expectation-maximization (EM), are sensitive to their initial conditions. The MFI approach first locates the center points of high-density clusters precisely. These center points are then used as initial points for the K-means algorithm. Our experimental results on 3 Web document sets show that our MFI approach outperforms the other methods we compared in most cases, particularly when the Web document sets contain a large number of categories.

Relevance:

100.00%

Abstract:

For many clustering algorithms, such as K-Means, EM, and CLOPE, there is usually a requirement to set some parameters. Often, these parameters directly or indirectly control the number of clusters, that is, k, to return. In the presence of different data characteristics and analysis contexts, it is often difficult for the user to estimate the number of clusters in the data set. This is especially true in text collections such as Web documents, images, or biological data. In an effort to improve the effectiveness of clustering, we seek the answer to a fundamental question: How can we effectively estimate the number of clusters in a given data set? We propose an efficient method based on spectra analysis of eigenvalues (not eigenvectors) of the data set as the solution to the above. We first present the relationship between a data set and its underlying spectra with theoretical and experimental results. We then show how our method is capable of suggesting a range of k that is well suited to different analysis contexts. Finally, we conclude with further empirical results to show how the answer to this fundamental question enhances the clustering process for large text collections.
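The idea of reading k off an eigenvalue spectrum can be illustrated with the well-known eigengap heuristic. The sketch below is a stand-in under stated assumptions, not the method of the paper: the abstract does not say which matrix is analysed, so a Gaussian similarity matrix is assumed, and the Jacobi eigenvalue routine, function names, and sample points are all illustrative.

```python
import math

def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Angle that zeroes a[p][q], folded into [-pi/4, pi/4].
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def gaussian_similarity(points, sigma=1.0):
    """Pairwise similarity exp(-d^2 / (2 sigma^2)); sigma is an assumed bandwidth."""
    return [[math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2)) for q in points]
            for p in points]

def estimate_k(eigenvalues):
    """Eigengap heuristic: k is the position of the largest drop in the sorted spectrum."""
    gaps = [eigenvalues[i] - eigenvalues[i + 1] for i in range(len(eigenvalues) - 1)]
    return gaps.index(max(gaps)) + 1

# Three well-separated toy groups of three points each.
points = [(0, 0), (0.1, 0), (0, 0.1),
          (5, 5), (5.1, 5), (5, 5.1),
          (10, 0), (10.1, 0), (10, 0.1)]
spectrum = symmetric_eigenvalues(gaussian_similarity(points))
k_estimate = estimate_k(spectrum)
```

For well-separated groups the similarity matrix is nearly block diagonal, so it has one large eigenvalue per group followed by a sharp drop; real text collections rarely give such a clean gap, which may be why the abstract speaks of suggesting a range of k.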

Relevance:

100.00%

Abstract:

For many clustering algorithms, such as k-means, EM, and CLOPE, there is usually a requirement to set some parameters. Often, these parameters directly or indirectly control the number of clusters to return. In the presence of different data characteristics and analysis contexts, it is often difficult for the user to estimate the number of clusters in the data set. This is especially true in text collections such as Web documents, images or biological data. The fundamental question this paper addresses is: “How can we effectively estimate the natural number of clusters in a given text collection?” We propose to use spectral analysis, which analyzes the eigenvalues (not eigenvectors) of the collection, as the solution to the above. We first present the relationship between a text collection and its underlying spectra. We then show how the answer to this question enhances the clustering process. Finally, we conclude with empirical results and related work.

Relevance:

100.00%

Abstract:

This study examines attitudes of U.S.-based Academy of Marketing Science members toward teaching, research, participation in administration (including service), and academic promotional issues. Individuals were grouped using Ward’s and K-means clustering procedures, which revealed four groups—established academics, research-focused academics, less satisfied midcareer academics, and satisfied teachers. Clusters were further profiled according to the amount of time spent on teaching, research, and administration; research output; and individual demographic and institutional characteristics. Overall, clusters were generally dissatisfied with a range of work-related issues, with workload stress appearing as an issue that needs to be addressed within marketing academia.

Relevance:

100.00%

Abstract:

The Australasian tertiary education sector has undergone significant organizational and cultural changes, which have increased pressures on academics to undertake a range of additional activities while at the same time improving research performance. These pressures impact on individuals in different ways, although there may be some groups or clusters of individuals within institutions with common characteristics. Managers may need to develop different sets of management strategies and policies to assist each group of academics to deal better with these pressures and improve their individual performance. The paper examines Australasian marketing academics’ perceptions of their work environments and whether these perceptions result in differing clusters of individuals who might also vary based on their research performance, time allocated to different academic roles, and their professional and demographic characteristics. Sixty-eight members of the Australian and New Zealand Academy of Marketing responded to a survey using a modified version of an instrument developed by Diamantopoulos et al. (1992). K-means clustering procedure identified four groups of academics – “Traditional Academics,” “Satisfied Professors,” “Newer Academics,” and “Satisfied Researchers.” While only a few significant differences among clusters were identified in relation to time allocated to academic activities and research performance, it appears that clusters differ on several professional and demographic characteristics.

Relevance:

100.00%

Abstract:

Network traffic classification is an essential component for network management and security systems. To address the limitations of traditional port-based and payload-based methods, recent studies have been focusing on alternative approaches. One promising direction is applying machine learning techniques to classify traffic flows based on packet and flow level statistics. In particular, previous papers have illustrated that clustering can achieve high accuracy and discover unknown application classes. In this work, we present a novel semi-supervised learning method using constrained clustering algorithms. The motivation is that in the network domain a lot of background information is available in addition to the data instances themselves. For example, we might know that flows ƒ1 and ƒ2 are using the same application protocol because they are visiting the same host address at the same port simultaneously. In this case, ƒ1 and ƒ2 should ideally be grouped into the same cluster. Therefore, we describe these correlations in the form of pair-wise must-link constraints and incorporate them in the process of clustering. We have applied three constrained variants of the K-Means algorithm, which perform hard or soft constraint satisfaction and metric learning from constraints. A number of real-world traffic traces have been used to show the availability of constraints and to test the proposed approach. The experimental results indicate that by incorporating constraints in the course of clustering, the overall accuracy and cluster purity can be significantly improved.
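The pair-wise must-link constraints described in this abstract can be sketched as follows, assuming each flow is reduced to a (destination host, destination port) key. The "simultaneously" condition is omitted for brevity, and the function names and sample flows are hypothetical. Pairs are merged transitively with union-find, so that if ƒ1 links to ƒ2 and ƒ2 links to ƒ3, all three end up in one group.

```python
def pairs_from_endpoints(flows):
    """Emit a must-link pair for flows that share a (host, port) key."""
    last_seen = {}
    pairs = []
    for idx, key in enumerate(flows):
        if key in last_seen:
            pairs.append((last_seen[key], idx))
        last_seen[key] = idx
    return pairs

def must_link_groups(n_flows, must_link_pairs):
    """Transitive closure of pairwise must-link constraints via union-find."""
    parent = list(range(n_flows))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in must_link_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
    groups = {}
    for i in range(n_flows):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical flows, identified only by destination host and port.
flows = [("10.0.0.5", 443), ("10.0.0.5", 443), ("10.0.0.9", 80),
         ("10.0.0.5", 443), ("10.0.0.9", 22)]
pairs = pairs_from_endpoints(flows)
groups = must_link_groups(len(flows), pairs)
```

Flows without any constraint remain as singleton groups, so a downstream clustering step can treat constrained and unconstrained flows uniformly.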

Relevance:

100.00%

Abstract:

Due to the limitations of the traditional port-based and payload-based traffic classification approaches, the past decade has seen extensive work on utilizing machine learning techniques to classify network traffic based on packet and flow level features. In particular, previous studies have shown that the unsupervised clustering approach is both accurate and capable of discovering previously unknown application classes. In this paper, we explore the utility of side information in the process of traffic clustering. Specifically, we focus on the flow correlation information that can be efficiently extracted from packet headers and expressed as instance-level constraints, which indicate that particular sets of flows are using the same application and thus should be put into the same cluster. To incorporate the constraints, we propose a modified constrained K-Means algorithm. A variety of real-world traffic traces are used to show that the constraints are widely available. The experimental results indicate that the constrained approach not only improves the quality of the resulting clusters, but also speeds up the convergence of the clustering process.
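One way to honour set-level constraints inside k-means is to assign each constrained set of flows to a centroid as a single unit. The sketch below shows one assignment and update step under that assumption. It is not the modified algorithm from the paper, whose details the abstract does not give, and the feature values and function names are hypothetical.

```python
import math

def assign_groups(points, groups, centroids):
    """Assign each must-link set, as a unit, to the centroid with the
    lowest total squared distance to the set's members."""
    labels = [None] * len(points)
    for group in groups:
        best = min(
            range(len(centroids)),
            key=lambda j: sum(math.dist(points[i], centroids[j]) ** 2 for i in group))
        for i in group:
            labels[i] = best
    return labels

def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of its members (empty clusters keep theirs)."""
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        updated.append(
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old)
    return updated

# Hypothetical two-dimensional flow features; flow 4 is tied to flows 0 and 1.
points = [(1.0, 1.0), (1.2, 0.9), (4.8, 5.1), (5.0, 5.0), (3.5, 3.5)]
groups = [[0, 1, 4], [2], [3]]  # every flow appears in exactly one group
centroids = [(1.0, 1.0), (5.0, 5.0)]
labels = assign_groups(points, groups, centroids)
new_centroids = update_centroids(points, labels, centroids)
```

On its own, flow 4 is closer to the second centroid, but the constraint pulls it into the first cluster together with flows 0 and 1.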

Relevance:

100.00%

Abstract:

Background

Imatinib mesylate is currently the drug of choice to treat chronic myeloid leukemia. However, patient resistance and cytotoxicity make secondary lines of treatment, such as omacetaxine mepesuccinate, a necessity. Given that drug cytotoxicity represents a major problem during treatment, it is essential to understand the biological pathways affected to better predict poor drug response and prioritize a treatment regime.

Methods

We conducted cell viability and gene expression assays to determine heritability and gene expression changes associated with imatinib and omacetaxine treatment of 55 non-cancerous lymphoblastoid cell lines, derived from 17 pedigrees. In total, 48,803 transcripts derived from Illumina Human WG-6 BeadChips were analyzed for each sample using SOLAR, whilst correcting for kinship structure.

Results

Cytotoxicity within cell lines was highly heritable following imatinib treatment (h² = 0.60-0.73), but not omacetaxine treatment. Cell lines treated with an IC20 dose of imatinib or omacetaxine showed differential gene expression for 956 (1.96%) and 3,892 transcripts (7.97%), respectively; 395 of these (0.8%) were significantly influenced by both imatinib and omacetaxine treatment. k-means clustering and DAVID functional annotation showed expression changes in genes related to kinase binding and vacuole-related functions following imatinib treatment, whilst expression changes in genes related to cell division and apoptosis were evident following treatment with omacetaxine. The enrichment scores for these ontologies were very high (mostly >10).

Conclusions

Induction of gene expression changes related to different pathways following imatinib and omacetaxine treatment suggests that the cytotoxicity of such drugs may be differentially tolerated by individuals based on their genetic background.

Relevance:

100.00%

Abstract:

The thickness of the retinal nerve fiber layer (RNFL) has become a diagnostic measure for glaucoma assessment. To measure this thickness, accurate segmentation of the RNFL in optical coherence tomography (OCT) images is essential. Identifying a suitable segmentation algorithm will help improve the accuracy of RNFL thickness measurement. This paper investigates the performance of six algorithms in the segmentation of the RNFL in OCT images. The algorithms are: normalised cuts, region growing, k-means clustering, active contour, and two level-set segmentation methods, the Piecewise Gaussian Method (PGM) and the Kernelized Method (KM). The performance of the six algorithms is determined through a set of experiments on OCT retinal images. The measured segmentation precision-recall results of the six algorithms are compared and discussed.
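A precision-recall comparison of segmentations can be sketched as below, assuming each algorithm outputs a binary mask that is compared pixel by pixel with a reference mask. The abstract does not state how precision and recall were computed, so the pixel-wise definition, the function name, and the tiny masks are illustrative assumptions.

```python
def precision_recall(predicted, reference):
    """Pixel-wise precision and recall of a binary mask against a reference mask."""
    tp = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1   # layer pixel correctly segmented
            elif p:
                fp += 1   # pixel segmented but not in the reference
            elif r:
                fn += 1   # reference pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical 2x4 masks: 1 marks pixels labelled as RNFL.
reference = [[0, 1, 1, 0],
             [0, 1, 1, 0]]
predicted = [[0, 1, 1, 1],
             [0, 0, 1, 0]]
precision, recall = precision_recall(predicted, reference)
```

Here three pixels are correctly labelled, one is a false positive and one is missed, so both precision and recall come out at 0.75.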