121 results for clustering


Relevance:

20.00%

Publisher:

Abstract:

When no prior knowledge is available, clustering is a useful technique for categorizing data into meaningful groups or clusters. In this paper, a modified fuzzy min-max (MFMM) clustering neural network is proposed. Its efficacy for tackling power quality monitoring tasks is demonstrated. A literature review on various clustering techniques is first presented. To evaluate the proposed MFMM model, a performance comparison study using benchmark data sets pertaining to clustering problems is conducted. The results obtained are comparable with those reported in the literature. Then, a real-world case study on power quality monitoring tasks is performed. The results are compared with those from the fuzzy c-means and k-means clustering methods. The experimental outcome positively indicates the potential of MFMM in undertaking data clustering tasks and its applicability to the power systems domain.
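For context on the baselines named above, the following is a minimal sketch of how fuzzy c-means assigns soft memberships where k-means makes a hard assignment. It is not the MFMM model from the paper. The function names, the fuzzifier `m = 2.0`, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update for fixed centers.

    u[i, k] = d[i, k]^(-2/(m-1)) / sum_j d[i, j]^(-2/(m-1)),
    where d[i, k] is the Euclidean distance from point i to center k.
    """
    # Pairwise distances, shape (n_points, n_centers); eps avoids division by zero.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def kmeans_assignments(X, centers):
    """Hard k-means assignment: index of the nearest center for each point."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

# Toy data (hypothetical): two points on the centers and one point halfway between.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.0]])
centers = np.array([[0.0, 0.0], [1.0, 0.0]])
U = fcm_memberships(X, centers)
hard = kmeans_assignments(X, centers)
```

The midpoint receives equal membership in both clusters under fuzzy c-means, while k-means must commit it to a single cluster.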

Relevance:

20.00%

Publisher:

Abstract:

Despite the popularity of Failure Mode and Effect Analysis (FMEA) in a wide range of industries, two well-known shortcomings are the complexity of the FMEA worksheet and its intricacy of use. To the best of our knowledge, the use of computational techniques for addressing these shortcomings is limited. As such, the idea of clustering and visualization pertaining to the failure modes in FMEA is proposed in this paper. A neural network visualization model with an incremental learning feature, i.e., the evolving tree (ETree), is adopted to allow the failure modes in FMEA to be clustered and visualized as a tree structure. In addition, the ideas of risk interval and risk ordering for different groups of failure modes are proposed to allow the failure modes to be ordered, analyzed, and evaluated in groups. The main advantages of the proposed method lie in its ability to transform failure modes in a complex FMEA worksheet into a tree structure for better visualization, while maintaining the risk evaluation and ordering features. It can be applied to the conventional FMEA methodology without requiring additional information or data. A real-world case study in the edible bird nest industry in Sarawak (Borneo Island) is used to evaluate the usefulness of the proposed method. The experiments show that the failure modes in FMEA can be effectively visualized through the tree structure. A discussion with FMEA users engaged in the case study indicates that such visualization is helpful in comprehending and analyzing the respective failure modes, as compared with those in an FMEA table. The resulting tree structure, together with risk interval and risk ordering, provides a quick and easily understandable framework to elucidate important information from complex FMEA forms, thereby facilitating the decision-making tasks of FMEA users.
The significance of this study is twofold, viz., the use of a computational visualization approach to tackle two well-known shortcomings of FMEA, and the use of ETree as an effective neural network learning paradigm to facilitate FMEA implementations. These findings aim to spearhead the potential adoption of FMEA as a useful and usable risk evaluation and management tool by the wider community.

Relevance:

20.00%

Publisher:

Abstract:

This paper introduces a novel approach for discrete event simulation output analysis. The approach combines dynamic time warping and clustering to enable the identification of system behaviours contributing to overall system performance, by linking the clustering cases to specific causal events within the system. Simulation model event logs have been analysed to group entity flows based on the path taken and travel time through the system. The proposed approach is investigated for a discrete event simulation of an international airport baggage handling system. Results show that the method is able to automatically identify key factors that influence the overall dwell time of system entities, such as bags that fail primary screening. The novel analysis methodology provides insight into system performance beyond that achievable through traditional analysis techniques. This technique also has potential application to agent-based modelling paradigms and to business event logs traditionally studied using process mining techniques.
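The combination of dynamic time warping (DTW) and clustering can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it uses the textbook DTW recurrence and a simple distance-threshold grouping, and the travel-time profiles, the threshold value, and all function names are hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of sequences."""
    k = len(series)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    return dist

def threshold_clusters(dist, threshold):
    """Group items whose DTW distance is within `threshold` (connected components)."""
    k = len(dist)
    labels = [-1] * k
    next_label = 0
    for start in range(k):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(k):
                if labels[v] == -1 and dist[u, v] <= threshold:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels

# Hypothetical per-stage travel times for four entities (e.g. bags) of unequal length.
flows = [[1, 2, 3], [1, 2, 2, 3], [10, 12, 30], [10, 12, 12, 30]]
labels = threshold_clusters(pairwise_dtw(flows), threshold=1.0)
```

DTW allows sequences of different lengths to be compared, which is why flows that repeat a stage still group with otherwise identical flows.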

Relevance:

20.00%

Publisher:

Abstract:

BACKGROUND: Current population-based anti-obesity campaigns often target individuals based on either weight or socio-demographic characteristics, and give a 'mass' message about personal responsibility. There is a recognition that attempts to influence attitudes and opinions may be more effective if they resonate with the beliefs that different groups have about the causes of, and solutions for, obesity. Limited research has explored how attitudinal factors may inform the development of both upstream and downstream social marketing initiatives. METHODS: Computer-assisted face-to-face interviews were conducted with 159 parents and 184 of their children (aged 9-18 years old) in two Australian states. A mixed methods approach was used to assess attitudes towards obesity, and elucidate why different groups held various attitudes towards obesity. Participants were quantitatively assessed on eight dimensions relating to the severity and extent, causes and responsibility, possible remedies, and messaging strategies. Cluster analysis was used to determine attitudinal clusters. Participants were also able to qualify each answer. Qualitative responses were analysed both within and across attitudinal clusters using a constant comparative method. RESULTS: Three clusters were identified. Concerned Internalisers (27% of the sample) judged that obesity was a serious health problem, that Australia had among the highest levels of obesity in the world and that prevalence was rapidly increasing. They situated the causes and remedies for the obesity crisis in individual choices. Concerned Externalisers (38% of the sample) held similar views about the severity and extent of the obesity crisis. However, they saw responsibility and remedies as a societal rather than an individual issue. 
The final cluster, the Moderates, which contained significantly more children and males, believed that obesity was not such an important public health issue, and judged the extent of obesity to be less extreme than the other clusters. CONCLUSION: Attitudinal clusters provide new information and insights which may be useful in tailoring anti-obesity social marketing initiatives.

Relevance:

20.00%

Publisher:

Abstract:

Despite significant advancements in wireless sensor networks (WSNs), energy conservation in the networks remains one of the most important research challenges. One approach commonly used to prolong the network lifetime is to aggregate data at the cluster heads (CHs). However, there is a possibility that the CHs may fail or function incorrectly for a number of reasons, such as power instability. During a failure, the CHs are unable to collect and transfer data correctly, which affects the performance of the WSN. Early detection of CH failure reduces data loss and keeps recovery effort to a minimum. This paper proposes a self-configurable clustering mechanism to detect the disordered CHs and replace them with other nodes. Simulation results verify the effectiveness of the proposed approach.

Relevance:

20.00%

Publisher:

Abstract:

Pixel color has proven to be a useful and robust cue for detecting most objects of interest, such as fire. In this paper, a hybrid intelligent algorithm is proposed to detect fire pixels in the background of an image. The proposed algorithm combines a computational search method based on a swarm intelligence technique with the K-medoids clustering method in order to form a Fire-based Color Space (FCS). The new technique converts the RGB color system to FCS through a 3×3 matrix. The algorithm consists of five main stages: (1) extracting fire and non-fire pixels manually from the original image; (2) using K-medoids clustering to find a cost function to minimize the error value; (3) applying Particle Swarm Optimization (PSO) to search for the best W components in order to minimize the fitness function; (4) reporting the best matrix, including feature weights, and using this matrix to convert all original images in the database to the new color space; and (5) using the Otsu threshold technique to binarize the final images. Compared with some state-of-the-art techniques, the experimental results show the ability and efficiency of the new method to detect fire pixels in color images.
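Stages (4) and (5) can be illustrated with a small sketch: a 3×3 matrix is applied to every RGB pixel and one channel of the result is binarized with Otsu's method. The matrix used here is the identity, a placeholder only; in the paper the weights come from the PSO search, which is not reproduced. The function names, the choice of channel, and the toy image are assumptions.

```python
import numpy as np

def to_color_space(image_rgb, weights):
    """Apply a 3x3 linear map to every pixel of an (H, W, 3) RGB image."""
    return image_rgb.astype(float) @ weights.T

def otsu_threshold(channel, bins=256):
    """Otsu's method: pick the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(channel.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weight_bg = np.cumsum(hist)                # pixels at or below each bin
    weight_fg = weight_bg[-1] - weight_bg      # pixels above each bin
    cum_mean = np.cumsum(hist * centers)
    mean_bg = cum_mean / np.maximum(weight_bg, 1)
    mean_fg = (cum_mean[-1] - cum_mean) / np.maximum(weight_fg, 1)
    between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
    return centers[np.argmax(between)]

# Toy image (hypothetical): top half bright orange, bottom half dark grey.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:2] = (200, 80, 20)
image[2:] = (10, 10, 10)

placeholder_weights = np.eye(3)                # stand-in for the learned 3x3 matrix
converted = to_color_space(image, placeholder_weights)
threshold = otsu_threshold(converted[..., 0])
mask = converted[..., 0] > threshold           # True where a pixel is flagged
```

With a learned matrix in place of the identity, the same two calls would convert each image and produce its binary mask.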

Relevance:

20.00%

Publisher:

Abstract:

Many vision problems, such as motion segmentation and face clustering, deal with high-dimensional data. However, these high-dimensional data usually lie in a low-dimensional structure. Sparse representation is a powerful principle for solving a number of clustering problems with high-dimensional data. This principle is motivated by an ideal modeling of data points according to linear algebra theory. However, real data in computer vision are unlikely to follow the ideal model perfectly. In this paper, we exploit mixed norm regularization for sparse subspace clustering. This regularization term is a convex combination of the l1 norm, which promotes sparsity at the individual level, and the block norm l2/1, which promotes group sparsity. Combining these powerful regularization terms provides a more accurate modeling, subsequently leading to a better solution for the affinity matrix used in sparse subspace clustering. This could help us achieve better performance on motion segmentation and face clustering problems. This formulation also caters for different types of data corruption. We derive a provably convergent algorithm based on the alternating direction method of multipliers (ADMM) framework, which is computationally efficient, to solve the formulation. We demonstrate that this formulation outperforms other state-of-the-art methods on both motion segmentation and face clustering.
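The regularization term described above can be made concrete with a short sketch. It only evaluates a convex combination of the l1 norm and the l2/1 block norm for a given coefficient matrix, and it does not implement the ADMM solver. Treating each row as one group and naming the mixing weight `lam` are assumptions, since the abstract does not specify either.

```python
import numpy as np

def mixed_norm(C, lam):
    """Convex combination lam * ||C||_1 + (1 - lam) * ||C||_{2,1}.

    Assumption: each row of C is one group for the l2/1 block norm.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    l1 = np.abs(C).sum()                      # element-wise sparsity
    l21 = np.linalg.norm(C, axis=1).sum()     # sum of row-wise l2 norms (group sparsity)
    return lam * l1 + (1.0 - lam) * l21

# Toy coefficient matrix (hypothetical): one active row, one all-zero row.
C = np.array([[3.0, 4.0], [0.0, 0.0]])
value = mixed_norm(C, lam=0.5)   # 0.5 * 7 + 0.5 * 5
```

Setting `lam` to 1 recovers the plain l1 penalty, and setting it to 0 recovers the pure group penalty.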

Relevance:

20.00%

Publisher:

Abstract:

Side information, or auxiliary information associated with documents or image content, provides hints for clustering. We propose a new model, side information dependent Chinese restaurant process, which exploits side information in a Bayesian nonparametric model to improve data clustering. We introduce side information into the framework of distance dependent Chinese restaurant process using a robust decay function to handle noisy side information. The threshold parameter of the decay function is updated automatically in the Gibbs sampling process. A fast inference algorithm is proposed. We evaluate our approach on four datasets: Cora, 20 Newsgroups, NUS-WIDE and one medical dataset. Types of side information explored in this paper include citations, authors, tags, keywords and auxiliary clinical information. The comparison with the state-of-the-art approaches based on standard performance measures (NMI, F1) clearly shows the superiority of our approach.
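As background for the model above, here is a minimal sketch of drawing a partition from a distance dependent Chinese restaurant process prior, in which each item links to another item with probability proportional to a decay of their distance, or to itself with weight `alpha`. It is not the authors' side-information model or their Gibbs sampler. The window decay with a fixed `threshold` stands in for the paper's robust decay function, and the distances, `alpha`, and function names are assumptions.

```python
import numpy as np

def window_decay(distances, threshold):
    """Assumed decay: weight 1 inside the threshold, 0 outside."""
    return (distances <= threshold).astype(float)

def sample_links(dist, alpha, threshold, rng):
    """Draw one link per item from the distance dependent CRP prior."""
    n = len(dist)
    links = np.empty(n, dtype=int)
    for i in range(n):
        weights = window_decay(dist[i], threshold)
        weights[i] = alpha                     # self-link opens a new table
        links[i] = rng.choice(n, p=weights / weights.sum())
    return links

def links_to_clusters(links):
    """Clusters are the connected components of the link graph."""
    n = len(links)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]      # path compression
            x = parent[x]
        return x

    for i, j in enumerate(links):
        parent[find(i)] = find(int(j))
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]

# Hypothetical distances: items 0-1 are close, items 2-3 are close, the pairs are far apart.
dist = np.array([[0, 1, 9, 9], [1, 0, 9, 9], [9, 9, 0, 1], [9, 9, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
labels = links_to_clusters(sample_links(dist, alpha=1.0, threshold=2.0, rng=rng))
```

Because the window decay assigns zero weight beyond the threshold, items from the two distant pairs can never end up in the same cluster in this toy setting.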

Relevance:

20.00%

Publisher:

Abstract:

Maximum target coverage with minimum number of sensor nodes, known as an MCMS problem, is an important problem in directional sensor networks (DSNs). For guaranteed coverage and event reporting, the underlying mechanism must ensure that all targets are covered by the sensors and the resulting network is connected. Existing solutions allow individual sensor nodes to determine the sensing direction for maximum target coverage which produces sensing coverage redundancy and much overhead. Gathering nodes into clusters might provide a better solution to this problem. In this paper, we have designed distributed clustering and target coverage algorithms to address the problem in an energy-efficient way. To the best of our knowledge, this is the first work that exploits cluster heads to determine the active sensing nodes and their directions for solving target coverage problems in DSNs. Our extensive simulation study shows that our system outperforms a number of state-of-the-art approaches.

Relevance:

20.00%

Publisher:

Abstract:

Privacy-preserving data mining aims to keep data safe, yet useful. However, algorithms providing strong guarantees often end up with low utility. We propose a novel privacy-preserving framework that thwarts an adversary from inferring an unknown data point by ensuring that the estimation error is almost invariant to the inclusion or exclusion of the data point. By focusing directly on the estimation error of the data point, our framework is able to significantly lower the perturbation required. We use this framework to propose a new privacy-aware K-means clustering algorithm. Using both synthetic and real datasets, we demonstrate that the utility of this algorithm is almost equal to that of the unperturbed K-means and, at strict privacy levels, almost twice as good as that of the differential privacy counterpart.

Relevance:

20.00%

Publisher:

Abstract:

This paper investigates the problem of minimizing data transfer between different data centers of the cloud during the neurological diagnostics of cardiac autonomic neuropathy (CAN). This problem has never been considered in the literature before. All classifiers previously considered for the diagnostics of CAN assume complete access to all data, which would lead to an enormous burden of data transfer during training if such classifiers were deployed in the cloud. We introduce a new model of clustering-based multi-layer distributed ensembles (CBMLDE). It is designed to eliminate the need to transfer data between different data centers for training of the classifiers. We conducted experiments utilizing a dataset derived from an extensive DiScRi database. Our comprehensive tests have determined the best combinations of options for setting up CBMLDE classifiers. The results demonstrate that CBMLDE classifiers not only completely eliminate the need for patient data transfer, but also significantly outperform all base classifiers and simpler counterpart models in all cloud frameworks.

Relevance:

20.00%

Publisher:

Abstract:

Because real-world networks contain potentially important information, link prediction has become a focus of interest in different branches of science. Nevertheless, in the "big data" era, link prediction faces significant challenges, such as how to make predictions on massive data efficiently and accurately. In this paper, we propose two novel node-coupling clustering approaches and their extensions for link prediction, which combine the coupling degrees of the common neighbor nodes of a predicted node-pair with cluster geometries of nodes. We then present an experimental evaluation to compare the prediction accuracy and effectiveness of our approaches with those of representative existing methods on two synthetic datasets and six real-world datasets. The experimental results show that our approaches outperform the existing methods.

Relevance:

20.00%

Publisher:

Abstract:

In many real-world computer vision applications, such as multi-camera surveillance, the objects of interest are captured by visual sensors concurrently, resulting in multi-view data. These views usually provide complementary information to each other. One recent and powerful computer vision method for clustering is sparse subspace clustering (SSC); however, it was not designed for multi-view data, which break down its linear separability assumption. To integrate complementary information between views, multi-view clustering algorithms are required to improve the clustering performance. In this paper, we propose a novel multi-view subspace clustering method that searches for a unified latent structure as a global affinity matrix in subspace clustering. Due to the integration of affinity matrices for each view, this global affinity matrix can best represent the relationship between clusters. This could help us achieve better performance on face clustering. We derive a provably convergent algorithm based on the alternating direction method of multipliers (ADMM) framework, which is computationally efficient, to solve the formulation. We demonstrate that this formulation outperforms state-of-the-art alternatives on challenging multi-view face datasets.

Relevance:

20.00%

Publisher:

Abstract:

Traffic subarea division is vital for traffic system management and traffic network analysis in intelligent transportation systems (ITSs). Since existing methods may not be suitable for big traffic data processing, this paper presents a MapReduce-based Parallel Three-Phase K-Means (Par3PKM) algorithm for solving the traffic subarea division problem on the widely adopted Hadoop distributed computing platform. Specifically, we first modify the distance metric and initialization strategy of K-Means and then employ a MapReduce paradigm to redesign the optimized K-Means algorithm for parallel clustering of large-scale taxi trajectories. Moreover, we propose a boundary identifying method to connect the borders of clustering results for each cluster. Finally, we divide Beijing into traffic subareas using the proposed approach, based on real-world trajectory data sets generated by 12,000 taxis over a period of one month. Experimental evaluation results indicate that, compared with K-Means, Par2PK-Means, and ParCLARA, Par3PKM achieves higher efficiency, higher accuracy, and better scalability, and can effectively divide traffic subareas with big taxi trajectory data.
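The general idea of expressing a K-Means iteration as a map step and a reduce step can be sketched in plain Python. This is not Par3PKM: it runs in a single process rather than on Hadoop, and it uses the ordinary Euclidean distance and fixed starting centroids instead of the modified metric and initialization described in the abstract. All names and the toy points are hypothetical.

```python
import math
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))

def map_phase(points, centroids):
    """Map: emit (cluster index, (point, 1)) for every input point."""
    for p in points:
        yield nearest(p, centroids), (p, 1)

def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per cluster, then average into new centroids."""
    sums, counts = {}, defaultdict(int)
    for k, (p, c) in pairs:
        acc = sums.get(k, [0.0] * len(p))
        sums[k] = [a + x for a, x in zip(acc, p)]
        counts[k] += c
    # A cluster that received no points keeps its previous centroid.
    return [
        [s / counts[k] for s in sums[k]] if k in sums else list(centroids[k])
        for k in range(len(centroids))
    ]

def kmeans_mapreduce(points, centroids, iterations=10):
    """Repeat the map and reduce phases for a fixed number of iterations."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids

# Hypothetical 2-D positions forming two well-separated groups.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
result = kmeans_mapreduce(points, [(0, 0), (10, 10)])
```

Because the map step only needs the current centroids and the reduce step only needs per-cluster sums and counts, each phase can be distributed across workers, which is what makes the formulation suitable for a MapReduce platform.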