150 results for Labeling hierarchical clustering


Relevance: 20.00%

Abstract:

This paper addresses the problem of resource scheduling in a grid computing environment. One of the main goals of grid computing is to share system resources among geographically dispersed users and to schedule resource requests efficiently. Grid computing resources are distributed, heterogeneous, dynamic, and autonomous, which makes resource scheduling a complex problem. This paper proposes a new approach to resource scheduling in grid computing environments, the hierarchical stochastic Petri net (HSPN). The HSPN optimizes grid resource sharing by categorizing resource requests into three layers, where each layer has special functions for receiving subtasks from, and delivering data to, the layer above or below. We compare the performance of the HSPN with that of the Min-min and Max-min resource scheduling algorithms. Our results show that the HSPN performs better than Max-min but slightly underperforms Min-min.
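For readers unfamiliar with the two baselines named in this abstract, the following is a minimal sketch of the textbook Min-min and Max-min heuristics. It is not the HSPN and not the authors' code. The function name `schedule`, the `pick_max` flag, and the expected-time-to-compute matrix `etc` are assumptions introduced here for illustration.

```python
def schedule(etc, pick_max=False):
    """Textbook Min-min (default) or Max-min (pick_max=True) heuristic.

    etc[t][m] is the assumed execution time of task t on machine m.
    Returns (assignment dict task -> machine, makespan).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    unassigned = set(range(len(etc)))
    assignment = {}
    while unassigned:
        # For every unassigned task, find its earliest completion time
        # and the machine that achieves it.
        best = {}
        for t in unassigned:
            m = min(range(n_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)
        # Min-min schedules the task with the smallest such time first;
        # Max-min schedules the task with the largest one first.
        chooser = max if pick_max else min
        t = chooser(sorted(best), key=lambda k: best[k][0])
        finish, m = best[t]
        assignment[t] = m
        ready[m] = finish
        unassigned.remove(t)
    return assignment, max(ready)


etc = [[2, 4], [3, 1], [6, 5]]          # 3 tasks, 2 machines (made-up numbers)
print(schedule(etc))                    # Min-min
print(schedule(etc, pick_max=True))     # Max-min
```

On this toy input Max-min yields the shorter makespan, because it places the long task early. Which heuristic wins depends on the workload, so this says nothing about the comparison reported in the abstract.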

Relevance: 20.00%

Abstract:

In this paper, an approach for profiling email-borne phishing activities is proposed. Profiling phishing activities is useful in determining the activity of an individual or a particular group of phishers. By generating profiles, phishing activities can be well understood and observed. Typically, work in the area of phishing is aimed at detecting phishing emails, whereas we concentrate on profiling the phishing email. We formulate the profiling problem as a clustering problem, using the various features in the phishing emails as feature vectors. Further, we generate profiles based on clustering predictions. These predictions are further utilized to generate complete profiles of these emails. The performance of the clustering algorithms at the earlier stage is crucial for the effectiveness of this model. We carried out an experimental evaluation to determine the performance of many classification algorithms by incorporating a clustering approach in our model. Our proposed profiling email-borne phishing algorithm (ProEP) demonstrates promising results with the RatioSize rules for selecting the optimal number of clusters.

Relevance: 20.00%

Abstract:

Cloud computing is experiencing phenomenal growth, and many vendors now offer cloud services. In cloud computing, cloud providers cooperate to offer their computing resources as a utility and software as a service to customers. The demand for and the price of a cloud service should be negotiated between providers and users based on the Service Level Agreement (SLA). To help cloud providers achieve an agreeable price for their services and maximize the benefits of both cloud providers and clients, this paper proposes a cloud pricing system consisting of a hierarchical system, an M/M/c queuing model, and a pricing model. Simulation results verify the efficiency of our proposed system.
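The abstract does not say how the M/M/c model feeds into pricing, so the sketch below only illustrates the standard M/M/c quantities such a model is built on: server utilization, the Erlang C probability that a request must wait, and the mean waiting time in the queue. The function name `mmc_metrics` and the parameter names `lam` (arrival rate), `mu` (per-server service rate), and `c` (number of servers) are assumptions made for this example.

```python
from math import factorial


def mmc_metrics(lam, mu, c):
    """Standard M/M/c results (not the paper's pricing model).

    Returns (utilization, probability of waiting, mean wait in queue).
    """
    a = lam / mu                 # offered load in Erlangs
    rho = a / c                  # utilization per server
    if rho >= 1:
        raise ValueError("queue is unstable: need lam < c * mu")
    # Erlang C formula for the probability that an arriving request queues.
    head = sum(a ** k / factorial(k) for k in range(c))
    tail = a ** c / (factorial(c) * (1 - rho))
    p_wait = tail / (head + tail)
    # Mean time spent waiting in the queue before service starts.
    wq = p_wait / (c * mu - lam)
    return rho, p_wait, wq


print(mmc_metrics(lam=2.0, mu=2.0, c=2))
```

With `c=1` the function reduces to the familiar M/M/1 results, which is a convenient sanity check.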

Relevance: 20.00%

Abstract:

This paper explores effective multi-label classification methods for multi-semantic image and text categorization. We perform an experimental study of clustering-based multi-label classification (CBMLC) for the target problem. The experimental evaluation identifies the impact of different clustering algorithms and base classifiers on the predictive performance and efficiency of CBMLC. In the experimental setting, three widely used clustering algorithms and six popular multi-label classification algorithms are evaluated on multi-label image and text datasets. A multi-label classification evaluation metric, the micro F1-measure, is used to report the predictive performance of the classifiers. The results reveal that clustering-based multi-label learning algorithms are more effective than their non-clustering counterparts.
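As a reminder of what the micro F1-measure computes, here is a minimal sketch under the usual definition: true positives, false positives, and false negatives are pooled over all instances and labels before precision and recall are combined. The function name `micro_f1` and the toy label sets are invented for illustration and do not come from the paper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over a list of true and predicted label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        truth, pred = set(truth), set(pred)
        tp += len(truth & pred)      # labels predicted and correct
        fp += len(pred - truth)      # labels predicted but wrong
        fn += len(truth - pred)      # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [{"sky", "sea"}, {"tree"}]
y_pred = [{"sky"}, {"tree", "car"}]
print(micro_f1(y_true, y_pred))
```

Because counts are pooled, frequent labels dominate the score. Macro averaging, which weights every label equally, would behave differently on imbalanced label sets.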

Relevance: 20.00%

Abstract:

A hierarchical intrusion detection model is proposed to detect both anomaly and misuse attacks. To further speed up training and testing, a PCA-based feature extraction algorithm is used to reduce the dimensionality of the data. A PCA-based algorithm is also used to filter out normal data in the upper level. The experimental results show that PCA can reduce noise in the original data set and that the PCA-based algorithm achieves the desired performance.

Relevance: 20.00%

Abstract:

Recent years have seen extensive work on statistics-based network traffic classification using machine learning (ML) techniques. In the particular scenario of learning from unlabeled traffic data, some classic unsupervised clustering algorithms (e.g. K-Means and EM) have been applied, but the reported results are unsatisfactory because of their low accuracy. This paper presents a novel approach for the task, which performs clustering based on Random Forest (RF) proximities instead of Euclidean distances. The approach consists of two steps. In the first step, we derive a proximity measure for each pair of data points by performing an RF classification on the original data and a set of synthetic data. In the next step, we perform K-Medoids clustering to partition the data points into K groups based on the proximity matrix. Evaluations have been conducted on real-world Internet traffic traces, and the experimental results indicate that the proposed approach is more accurate than the previous methods.
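The second step described in this abstract, K-Medoids on a precomputed matrix, can be sketched as follows. This is a simple alternating (Voronoi-style) variant and not necessarily the variant the authors used. It assumes the RF proximities have already been computed and converted to dissimilarities, for example as `1 - proximity`, which is a common convention but an assumption here. The function name `k_medoids` and the naive "first k points" initialization are also illustrative choices.

```python
def k_medoids(dist, k, max_iter=100):
    """Alternating K-Medoids on a precomputed n x n dissimilarity matrix."""
    n = len(dist)
    medoids = list(range(k))     # naive deterministic start (assumption)

    def assign(meds):
        # Label each point with the index of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:      # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes total dissimilarity to its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Toy dissimilarities: points 0 and 1 are close, as are points 2 and 3.
dist = [
    [0, 1, 9, 10],
    [1, 0, 8, 9],
    [9, 8, 0, 1],
    [10, 9, 1, 0],
]
print(k_medoids(dist, k=2))
```

Working from a matrix rather than from coordinates is what lets the method use RF proximities in place of Euclidean distances: the clustering step never needs the original feature vectors.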

Relevance: 20.00%

Abstract:

The internet age has fuelled an enormous explosion in the amount of information generated by humanity. Much of this information is transient in nature, created to be immediately consumed and built upon (or discarded). The field of data mining has surprisingly few algorithms geared towards unsupervised knowledge extraction from such dynamic data streams. This chapter describes a new neural network algorithm inspired by self-organising maps. The new algorithm is a hybrid of the growing self-organising map (GSOM) and the cellular probabilistic self-organising map (CPSOM). The result is an algorithm which generates a dynamically growing feature map for the purpose of clustering dynamic data streams and tracking clusters as they evolve in the data stream.

Relevance: 20.00%

Abstract:

Hybrid surface micro-patterns composed of topographic structures of polyethylene glycol (PEG)-hydrogels and hierarchical lines of gold nanoparticles (Au NPs) were fabricated on silicon wafers. Micro-sized lines of Au NPs were first obtained on the surface of a silicon wafer via “micro-contact deprinting”, a method recently developed by our group. Topographic micro-patterns of PEG, of both low and high aspect ratio (AR up to 6), were then aligned on the pre-patterned surface via a procedure denoted “adhesive embossing”, which is adapted from the soft lithographic method MIMIC (Micro-Molding in Capillaries). The result is a complex surface pattern consisting of alternating flat Au NP lines and thick PEG bars. Such patterns provide novel model surfaces for elucidating the interplay between (bio)chemical and physical cues on cell behavior.