986 results for PROPOSED APPROACH


Relevance: 60.00%

Abstract:

Statistics-based Internet traffic classification using machine learning techniques has attracted extensive research interest lately, because of the increasing ineffectiveness of traditional port-based and payload-based approaches. In particular, unsupervised learning, that is, traffic clustering, is very important in real-life applications, where labeled training data are difficult to obtain and new patterns keep emerging. Although previous studies have applied some classic clustering algorithms such as K-Means and EM for the task, the quality of resultant traffic clusters was far from satisfactory. In order to improve the accuracy of traffic clustering, we propose a constrained clustering scheme that makes decisions with consideration of some background information in addition to the observed traffic statistics. Specifically, we make use of equivalence set constraints indicating that particular sets of flows are using the same application layer protocols, which can be efficiently inferred from packet headers according to the background knowledge of TCP/IP networking. We model the observed data and constraints using Gaussian mixture density and adapt an approximate algorithm for the maximum likelihood estimation of model parameters. Moreover, we study the effects of unsupervised feature discretization on traffic clustering by using a fundamental binning method. A number of real-world Internet traffic traces have been used in our evaluation, and the results show that the proposed approach not only improves the quality of traffic clusters in terms of overall accuracy and per-class metrics, but also speeds up the convergence.

Relevance: 60.00%

Abstract:

Internet traffic classification is a critical and essential functionality for network management and security systems. Due to the limitations of traditional port-based and payload-based classification approaches, the past several years have seen extensive research on utilizing machine learning techniques to classify Internet traffic based on packet and flow level characteristics. For the purpose of learning from unlabeled traffic data, some classic clustering methods have been applied in previous studies but the reported accuracy results are unsatisfactory. In this paper, we propose a semi-supervised approach for accurate Internet traffic clustering, which is motivated by the observation of widely existing partial equivalence relationships among Internet traffic flows. In particular, we formulate the problem using a Gaussian Mixture Model (GMM) with set-based equivalence constraint and propose a constrained Expectation Maximization (EM) algorithm for clustering. Experiments with real-world packet traces show that the proposed approach can significantly improve the quality of resultant traffic clusters. © 2014 Elsevier Inc.
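The constrained EM idea in this abstract can be illustrated with a minimal sketch. It assumes diagonal covariances, a simple mixing-weight update averaged over sets, and that `sets` partitions all flow indices (unconstrained flows appear as singleton sets). The function name `constrained_em` and all parameter names are hypothetical; this is not the authors' implementation.

```python
import numpy as np

def constrained_em(X, sets, k, n_iter=50, seed=0):
    """EM for a diagonal-covariance GMM where every flow in an
    equivalence set is forced to share one mixture component.

    X:    (n, d) array of flow features
    sets: list of index lists that together partition range(n)
    k:    number of mixture components
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)].astype(float)
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    r = np.empty((n, k))
    for _ in range(n_iter):
        # Log density of every point under every component: shape (n, k).
        diff = X[:, None, :] - means[None, :, :]
        log_pdf = -0.5 * (np.log(2 * np.pi * variances)[None]
                          + diff ** 2 / variances[None]).sum(axis=2)
        # E-step: one responsibility vector per set, obtained by
        # multiplying (summing logs of) the densities of its members.
        log_r = np.stack([np.log(weights) + log_pdf[idx].sum(axis=0)
                          for idx in sets])
        log_r -= log_r.max(axis=1, keepdims=True)
        r_set = np.exp(log_r)
        r_set /= r_set.sum(axis=1, keepdims=True)
        # Every member of a set inherits the set's responsibilities.
        for s, idx in enumerate(sets):
            r[idx] = r_set[s]
        # M-step: standard weighted updates using the shared responsibilities.
        nk = r.sum(axis=0) + 1e-12
        means = (r.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = (r[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + 1e-6
        weights = np.clip(r_set.mean(axis=0), 1e-12, None)
        weights /= weights.sum()
    return r.argmax(axis=1), means
```

Because all members of a set receive identical responsibilities, they always end up with the same cluster label, which is the property the equivalence constraints are meant to enforce.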

Relevance: 60.00%

Abstract:

Smart micro-grids can produce 'renewable' energy and store it in power storage devices. Power loss, however, is a significant problem in power exchange among the micro-grids and between the macro-station and individual micro-grids. To reduce the total power losses in such a power grid system, this paper proposes a greedy coalition formation algorithm, which allows the macro-station to coordinate mutual power exchange among the micro-grids and between each micro-grid and the macro-station. Our algorithm optimizes the total power losses across the entire power grid, including the cost of charging and discharging power storage devices and power losses due to power transfers. The algorithm creates exchange pairs among the micro-grids, giving priority to pairs with higher power loss reduction per exchanged power unit. Through computer-based simulations, we demonstrate that the proposed approach significantly reduces the average power loss compared with the conventional noncooperative method. The simulations also demonstrate that the communications overhead of our proposal (due to negotiations aimed at forming coalitions) does not significantly affect the available communication resource. © 2014 IEEE.
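The pairing rule described above (highest loss reduction per exchanged unit first) can be sketched as a simple greedy loop. The data layout, the function name `greedy_pairs`, and the assumption that per-unit gains are precomputed are all hypothetical; the paper's actual loss model is not reproduced here.

```python
def greedy_pairs(surplus, deficit, gain_per_unit):
    """Greedily match sellers to buyers by loss reduction per unit.

    surplus:       {seller: units available}
    deficit:       {buyer: units needed}
    gain_per_unit: {(seller, buyer): loss reduction per exchanged unit,
                    relative to trading with the macro-station}
    Returns the chosen exchanges plus whatever surplus and deficit
    remain to be settled with the macro-station.
    """
    surplus, deficit = dict(surplus), dict(deficit)
    exchanges = []
    ranked = sorted(gain_per_unit.items(), key=lambda kv: kv[1], reverse=True)
    for (seller, buyer), gain in ranked:
        if gain <= 0:
            break  # remaining pairs would not reduce losses
        amount = min(surplus.get(seller, 0.0), deficit.get(buyer, 0.0))
        if amount > 0:
            exchanges.append((seller, buyer, amount))
            surplus[seller] -= amount
            deficit[buyer] -= amount
    return exchanges, surplus, deficit
```

Any surplus or deficit left after the loop would, under this sketch, be exchanged with the macro-station.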

Relevance: 60.00%

Abstract:

Low-cost pervasive electrocardiogram (ECG) monitors are changing how sinus arrhythmia is diagnosed among patients with mild symptoms. The large amount of data generated from long-term monitoring brings new data science and analytical challenges. Although traditional rule-based detection algorithms still work on relatively short, clinical-quality ECG recordings, they are not optimal for pervasive signals collected from wearable devices: they do not adapt to individual differences, and they assume accurate identification of ECG fiducial points. To overcome these shortcomings of rule-based methods, this paper introduces an arrhythmia detection approach for low-quality pervasive ECG signals. Two techniques were applied to achieve the robustness needed. First, a set of ECG features with minimal reliance on fiducial point identification was selected. Next, the features were normalized using robust statistics to factor out baseline individual differences and the clinically irrelevant temporal drift that is common in pervasive ECG. The proposed method was evaluated using pervasive ECG signals we collected, in combination with clinician-validated ECG signals from Physiobank. Empirical evaluation confirms accuracy improvements of the proposed approach over the traditional clinical rules.
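The abstract does not say which robust statistics are used. One common choice, shown here purely as an assumption, is to center each feature on its median and scale by the median absolute deviation (MAD). The function name `robust_normalize` is hypothetical.

```python
import numpy as np

def robust_normalize(features):
    """Center each column on its median and scale by 1.4826 * MAD.

    The factor 1.4826 makes the MAD comparable to a standard deviation
    for normally distributed data. Columns with zero MAD are left
    unscaled to avoid division by zero.
    """
    features = np.asarray(features, dtype=float)
    median = np.median(features, axis=0)
    mad = np.median(np.abs(features - median), axis=0)
    scale = 1.4826 * mad
    scale = np.where(scale == 0, 1.0, scale)
    return (features - median) / scale
```

Unlike mean and standard deviation, the median and MAD are barely moved by a few extreme samples, which is why this kind of scaling suits noisy wearable recordings.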

Relevance: 60.00%

Abstract:

This paper introduces a novel approach for discrete event simulation output analysis. The approach combines dynamic time warping and clustering to enable the identification of system behaviours contributing to overall system performance, by linking the clustering cases to specific causal events within the system. Simulation model event logs have been analysed to group entity flows based on the path taken and travel time through the system. The proposed approach is investigated for a discrete event simulation of an international airport baggage handling system. Results show that the method is able to automatically identify key factors that influence the overall dwell time of system entities, such as bags that fail primary screening. The novel analysis methodology provides insight into system performance beyond that achievable through traditional analysis techniques. This technique also has potential application to agent-based modelling paradigms and to business event logs traditionally studied using process mining techniques.
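For readers unfamiliar with dynamic time warping, the textbook dynamic-programming form of the distance is sketched below. It uses absolute difference as the local cost and no warping window, which are assumptions; the paper's exact variant is not stated in the abstract.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b holds
                                     cost[i][j - 1],      # b advances, a holds
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A pairwise matrix of such distances between entity traces could then be passed to any clustering method that accepts precomputed distances.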

Relevance: 60.00%

Abstract:

This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods, namely the t-test, entropy, the receiver operating characteristic (ROC) curve, the Wilcoxon test and the signal-to-noise ratio, are employed to rank genes. These ranked genes are then used as inputs to the modified AHP. Additionally, a method that uses a fuzzy standard additive model (FSAM) for cancer classification based on the genes selected by AHP is also proposed. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between the unsupervised and supervised training stages to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate that AHP-based gene selection outperforms the single ranking methods. Furthermore, the AHP-FSAM combination achieves high accuracy in microarray data classification compared with various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice.

Relevance: 60.00%

Abstract:

The nonlinear, noisy and outlier-prone characteristics of electroencephalography (EEG) signals motivate the use of fuzzy logic, which is well suited to handling uncertainty. This paper introduces an approach to classify motor imagery EEG signals using an interval type-2 fuzzy logic system (IT2FLS) in combination with wavelet transformation. Wavelet coefficients are ranked based on the statistics of the receiver operating characteristic curve criterion. The most informative coefficients serve as inputs to the IT2FLS for the classification task. Two benchmark datasets, named Ia and Ib, downloaded from the brain-computer interface (BCI) competition II, are employed for the experiments. Classification performance is evaluated using accuracy, sensitivity, specificity and F-measure. Widely used classifiers, including feedforward neural network, support vector machine, k-nearest neighbours, AdaBoost and adaptive neuro-fuzzy inference system, are also implemented for comparison. The wavelet-IT2FLS method considerably outperforms the comparable classifiers on both datasets, and exceeds the best performance on the Ia and Ib datasets reported in the BCI competition II by 1.40% and 2.27% respectively. The proposed approach yields high accuracy at low computational cost, so it can be applied to a real-time BCI system for motor imagery data analysis.

Relevance: 60.00%

Abstract:

This paper introduces an automated medical data classification method using wavelet transformation (WT) and an interval type-2 fuzzy logic system (IT2FLS). Wavelet coefficients, which serve as inputs to the IT2FLS, are a compact form of the original data, yet they exhibit highly discriminative features. The integration of WT and IT2FLS aims to cope with both the high-dimensional data challenge and uncertainty. IT2FLS utilizes a hybrid learning process comprising unsupervised structure learning by fuzzy c-means (FCM) clustering and supervised parameter tuning by a genetic algorithm. This learning process is computationally expensive, especially when employed with high-dimensional data. The application of WT therefore reduces the computational burden and enhances the performance of IT2FLS. Experiments are implemented with two frequently used medical datasets from the UCI Repository for machine learning: the Wisconsin breast cancer and Cleveland heart disease datasets. A number of important metrics are computed to measure classification performance: accuracy, sensitivity, specificity and area under the receiver operating characteristic curve. Results demonstrate that the wavelet-IT2FLS approach significantly outperforms other machine learning methods, including probabilistic neural network, support vector machine, fuzzy ARTMAP, and adaptive neuro-fuzzy inference system. The proposed approach is thus useful as a decision support system for clinicians and practitioners in medical practice. © 2015 Elsevier B.V. All rights reserved.

Relevance: 60.00%

Abstract:

High-throughput experimental techniques provide a wide variety of heterogeneous proteomic data sources. To exploit the information spread across multiple sources for protein function prediction, these data sources are transformed into kernels and then integrated into a composite kernel. Several methods first optimize the weights on these kernels to produce a composite kernel, and then train a classifier on the composite kernel. As such, these approaches result in an optimal composite kernel, but not necessarily in an optimal classifier. On the other hand, some approaches optimize the loss of binary classifiers and learn weights for the different kernels iteratively. For multi-class or multi-label data, these methods have to solve the problem of optimizing weights on these kernels for each of the labels, which is computationally expensive and ignores the correlation among labels. In this paper, we propose a method called Predicting Protein Function using Multiple Kernels (ProMK). ProMK iteratively optimizes the phases of learning optimal weights and reduces the empirical loss of the multi-label classifier for each of the labels simultaneously. ProMK can integrate kernels selectively and downgrade the weights on noisy kernels. We investigate the performance of ProMK on several publicly available protein function prediction benchmarks and synthetic datasets. We show that the proposed approach performs better than previously proposed protein function prediction approaches that integrate multiple data sources and multi-label multiple kernel learning methods. The code of our proposed method is available at https://sites.google.com/site/guoxian85/promk.
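The composite kernel mentioned above is, in its simplest form, a non-negative weighted sum of the per-source kernel matrices. The sketch below shows only that combination step with fixed weights; how ProMK learns the weights is not reproduced, and the function name `composite_kernel` is hypothetical.

```python
import numpy as np

def composite_kernel(kernels, weights):
    """Combine kernel matrices as a convex (normalized, non-negative) sum.

    kernels: list of (n, n) kernel matrices, one per data source
    weights: list of non-negative weights, one per kernel
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * np.asarray(K, dtype=float) for w, K in zip(weights, kernels))
```

Keeping the weights non-negative preserves positive semi-definiteness of the result, and assigning a weight close to zero is one way a noisy source can be downgraded.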

Relevance: 60.00%

Abstract:

The success of a Wireless Sensor Network (WSN) deployment strongly depends on the quality of service (QoS) it provides regarding issues such as data accuracy, data aggregation delays and network lifetime maximisation. This is especially challenging in data fusion mechanisms, where a small fraction of low-quality data in the fusion input may negatively impact the overall fusion result. In this paper, we present a fuzzy-based data fusion approach for WSNs with the aim of increasing the QoS whilst reducing the energy consumption of the sensor network. The proposed approach is able to distinguish and aggregate only the true values of the collected data, thus reducing the burden of processing all of the data at the base station (BS). It is also able to eliminate redundant data and consequently reduce energy consumption, thus increasing the network lifetime. We studied the effectiveness of the proposed data fusion approach experimentally and compared it with two baseline approaches in terms of data collection, number of transferred data packets and energy consumption. The results of the experiments show that the proposed approach achieves better results than the baseline approaches.

Relevance: 60.00%

Abstract:

Despite significant advancements in wireless sensor networks (WSNs), energy conservation in the networks remains one of the most important research challenges. One approach commonly used to prolong the network lifetime is to aggregate data at the cluster heads (CHs). However, the CHs may fail and function incorrectly for a number of reasons, such as power instability. During a failure, the CHs are unable to collect and transfer data correctly, which affects the performance of the WSN. Early detection of CH failure reduces data loss and keeps the required recovery effort minimal. This paper proposes a self-configurable clustering mechanism to detect the disordered CHs and replace them with other nodes. Simulation results verify the effectiveness of the proposed approach.

Relevance: 60.00%

Abstract:

A new multi-output interval type-2 fuzzy logic system (MOIT2FLS) is introduced for protein secondary structure prediction in this paper. The three outputs of the MOIT2FLS correspond to three structure classes: helix, strand (sheet) and coil. Quantitative properties of amino acids are employed to characterize the twenty amino acids, rather than the widely used, computationally expensive binary encoding scheme. Three clustering tasks are performed using the adaptive vector quantization method to construct an equal number of initial rules for each type of secondary structure. A genetic algorithm is applied to optimally adjust the parameters of the MOIT2FLS. The genetic fitness function is designed based on the Q3 measure. Experimental results demonstrate that the proposed approach outperforms the traditional methods, namely the Chou-Fasman method, the Garnier-Osguthorpe-Robson method, and artificial neural network models.
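The Q3 measure used for the fitness function is the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch follows, assuming the states are encoded as the characters H, E and C; the function name `q3_score` is hypothetical.

```python
def q3_score(true_states, predicted_states):
    """Percentage of residues whose secondary-structure state is correct.

    Both arguments are equal-length sequences over {'H', 'E', 'C'}
    (helix, strand, coil).
    """
    if len(true_states) != len(predicted_states):
        raise ValueError("sequences must have the same length")
    if not true_states:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_states, predicted_states))
    return 100.0 * correct / len(true_states)
```

For example, `q3_score("HHHEECCC", "HHEEECCH")` gives 75.0, since six of the eight residues match.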

Relevance: 60.00%

Abstract:

P2P collusive piracy, where paid P2P clients share content with unpaid clients, has drawn significant concern in recent years. The study of the follow relationship provides an emerging track of research for capturing the followee (e.g., a paid client) in order to block the spread of piracy to all of his followers (e.g., unpaid clients). Unfortunately, existing research efforts on the follow relationship in online social networks have largely overlooked the time constraint and the content feedback in sequential behavior analysis. Hence, how to consider these two characteristics for effective P2P collusive piracy prevention remains an open problem. In this paper, we propose a multi-bloom filter circle to facilitate the time-constrained storage and query of P2P sequential behaviors. We then develop a probabilistic follow with content feedback model to quickly discover and quantify the probabilistic follow relationship, and design the corresponding approach to piracy prevention. Extensive experimental analysis demonstrates the capability of the proposed approach.
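The building block of the multi-bloom filter circle is the ordinary Bloom filter, a compact set representation that never yields false negatives but may yield false positives. The sketch below shows a single plain filter only; how the paper arranges several filters into a time-constrained circle is not described in the abstract and is not attempted here. The class name, sizes and hashing scheme are assumptions.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter using salted SHA-256 hashes."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))
```

A behavior record such as a hypothetical `"peer42:chunk7"` string can be stored with `add` and queried with `in`.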

Relevance: 60.00%

Abstract:

Context: To determine the effectiveness of software testers, a suitable performance appraisal approach is necessary, both for research and practice purposes. However, a review of the relevant literature reveals little information on how software testers are appraised in practice. Objective: (i) To enhance our knowledge of industry practice in the performance appraisal of software testers and (ii) to collect feedback from project managers on a proposed performance appraisal form for software testers. Method: A web-based survey with a questionnaire was used to collect responses. Participants were recruited using cluster and snowball sampling. 18 software development project managers participated. Results: We found two broad trends in the performance appraisal of software testers: the same appraisal process for all employees, and a specialized performance appraisal method for software testers. Detailed opinions were collected and analyzed on how the performance of software testers should be appraised. Our proposed appraisal approach was generally well received. Conclusion: Factors such as the number of bugs found after delivery and the efficiency of executing test cases were considered important in appraising software testers' performance. Our proposed approach was refined based on the feedback received.

Relevance: 60.00%

Abstract:

Neuroscientific studies of in vitro neuron cell cultures have attracted considerable attention as a way to investigate the behaviour of neuronal networks in response to different environmental conditions and external stimuli such as drugs and optical and electrical stimulation. Microelectrode array (MEA) technology has been widely adopted as a tool for this investigation. In this work, we present a new approach to estimate the interconnectivity of neural spikes using multivariate autoregressive (MVAR) analysis and Partial Directed Coherence (PDC). The proposed approach has the potential to discover hidden intra-burst causal connectivity patterns and to help understand the spatiotemporal communication patterns within bursts, pre- and post-stimulation.
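For reference, the standard definitions of the MVAR model and of PDC are given below. The symbols ($p$ for model order, $A_r$ for coefficient matrices, $f$ for normalized frequency) are notation chosen here, not taken from the abstract, and the paper may use a normalized or generalized variant.

```latex
% MVAR model of order p for an N-channel signal x(t)
\mathbf{x}(t) = \sum_{r=1}^{p} A_r \, \mathbf{x}(t-r) + \boldsymbol{\varepsilon}(t)

% Frequency-domain coefficient matrix
\bar{A}(f) = I - \sum_{r=1}^{p} A_r \, e^{-i 2 \pi f r}

% Partial directed coherence from channel j to channel i
\pi_{ij}(f) = \frac{\left| \bar{A}_{ij}(f) \right|}
                   {\sqrt{\sum_{k=1}^{N} \left| \bar{A}_{kj}(f) \right|^{2}}}
```

With this normalization, $\sum_{i} \pi_{ij}(f)^2 = 1$ for every source channel $j$, so each value expresses the share of channel $j$'s outflow that is directed to channel $i$ at frequency $f$.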