806 results for Fuzzy Clustering
Abstract:
This research is a step forward in improving the accuracy of anomaly detection in a data graph representing connectivity between people in an online social network. The proposed hybrid methods are based on fuzzy machine learning techniques utilising different types of structural input features. The methods are presented within a multi-layered framework that covers the full set of requirements for finding anomalies in data graphs generated from online social networks, including data modelling and analysis, labelling, and evaluation.
Abstract:
Driver training is one of the interventions aimed at mitigating the number of crashes that involve novice drivers. Our failure to understand what is really important for learners, in terms of risky driving, is one of the many drawbacks preventing us from building better training programs. Currently, there is a need to develop and evaluate Advanced Driving Assistance Systems that could comprehensively assess driving competencies. The aim of this paper is to present a novel Intelligent Driver Training System (IDTS) that analyses crash risks for a given driving situation, providing avenues for improvement and personalisation of driver training programs. The analysis takes into account numerous variables acquired synchronously from the Driver, the Vehicle and the Environment (DVE). The system then segments out the manoeuvres within a drive. The paper further presents the use of fuzzy set theory to develop the safety inference rules for each manoeuvre executed during the drive. Finally, it presents a framework and an associated prototype that can be used to comprehensively view and assess complex driving manoeuvres and to provide a detailed analysis of the drive as feedback to novice drivers.
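A minimal sketch of how one such fuzzy safety rule might be evaluated for a single manoeuvre, assuming triangular membership functions over speed and headway; the variable names, membership parameters and the rule itself are illustrative, not the actual IDTS rule base:

```python
# Minimal fuzzy-rule sketch (hypothetical membership functions, not the IDTS rule base).

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def risk_of_manoeuvre(speed_kmh, headway_s):
    """Evaluate one illustrative rule:
    IF speed is high AND headway is short THEN risk is high (min as fuzzy AND)."""
    speed_high = tri(speed_kmh, 60, 100, 140)      # assumed 'high speed' fuzzy set
    headway_short = tri(headway_s, 0.0, 0.5, 1.5)  # assumed 'short headway' fuzzy set
    return min(speed_high, headway_short)          # rule activation = degree of 'high risk'

if __name__ == "__main__":
    print(risk_of_manoeuvre(speed_kmh=95, headway_s=0.8))  # ~0.7 activation for this situation
```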
Abstract:
The problem of clustering a large document collection is challenged not only by the number of documents and the number of dimensions, but also by the number and sizes of the clusters. Traditional clustering methods fail to scale when they need to generate a large number of clusters. Furthermore, when the cluster sizes in the solution are heterogeneous, i.e. some of the clusters are large, the similarity measures tend to degrade. A ranking-based clustering method is proposed to deal with these issues in the context of the Social Event Detection task. Ranking scores are used to select a small number of the most relevant clusters against which to compare and place a document. Additionally, instead of conventional cluster centroids, cluster patches, which are hub-like sets of documents, are proposed to represent clusters. Text, temporal, spatial and visual content information collected from the social event images is utilized in calculating similarity. Results show that these strategies allow the clustering method to balance the performance and accuracy of the clustering solution.
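A minimal sketch of the ranking-plus-patch idea under simple assumptions (TF-IDF-style document vectors, cosine similarity, max-over-hubs patch similarity); the function names and scoring are illustrative, not the paper's exact procedure:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two dense document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def assign_document(doc_vec, cluster_patches, ranking_scores, top_k=3):
    """Compare the document only against the top_k clusters by ranking score;
    each cluster is represented by a 'patch' (a small set of hub-like documents)
    rather than a single centroid."""
    candidates = np.argsort(ranking_scores)[::-1][:top_k]
    best_cluster, best_sim = None, -1.0
    for c in candidates:
        # similarity to a patch = best similarity to any of its hub documents
        sim = max(cosine(doc_vec, hub) for hub in cluster_patches[c])
        if sim > best_sim:
            best_cluster, best_sim = c, sim
    return best_cluster, best_sim
```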
Abstract:
This project is a step forward in the study of text mining, where enhanced text representation with semantic information plays a significant role. It develops effective methods for entity-oriented retrieval, semantic relation identification and text clustering utilizing semantically annotated data. These methods are based on an enriched text representation generated by introducing semantic information extracted from Wikipedia into the input text data. The proposed methods are evaluated against several state-of-the-art benchmark methods on real-life datasets. In particular, this thesis improves the performance of entity-oriented retrieval, identifies different lexical forms for an entity relation and handles clustering of documents with multiple feature spaces.
Abstract:
High-Order Co-Clustering (HOCC) methods have attracted considerable attention in recent years because of their ability to cluster multiple types of objects simultaneously using all available information. During the clustering process, HOCC methods exploit object co-occurrence information, i.e., inter-type relationships amongst different types of objects, as well as object affinity information, i.e., intra-type relationships amongst objects of the same type. However, it is difficult to learn accurate intra-type relationships in the presence of noise and outliers. Existing HOCC methods consider the p nearest neighbours based on Euclidean distance for the intra-type relationships, which leads to incomplete and inaccurate intra-type relationships. In this paper, we propose a novel HOCC method that incorporates multiple subspace learning with a heterogeneous manifold ensemble to learn complete and accurate intra-type relationships. Multiple subspace learning reconstructs the similarity between any pair of objects that belong to the same subspace. The heterogeneous manifold ensemble is created based on two types of intra-type relationships learnt using the p-nearest-neighbour graph and multiple subspace learning. Moreover, in order to ensure the robustness of the clustering process, we introduce a sparse error matrix into the matrix decomposition and develop a novel iterative algorithm. Empirical experiments show that the proposed method achieves improved results over state-of-the-art HOCC methods in terms of FScore and NMI.
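For context, a minimal sketch of the conventional p-nearest-neighbour affinity graph (Euclidean distances, assumed Gaussian weights) that the paper identifies as the fragile baseline for intra-type relationships:

```python
import numpy as np

def p_nearest_neighbour_graph(X, p=5, sigma=1.0):
    """Symmetric affinity matrix keeping only each object's p nearest
    neighbours (Euclidean); weights use an assumed Gaussian kernel."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
    W = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(d2[i])[1:p + 1]   # skip the object itself at index 0
        W[i, neighbours] = np.exp(-d2[i, neighbours] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                     # symmetrise the graph
```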
Abstract:
Clustering is an important technique in organising and categorising web-scale document collections. The main challenges faced in clustering the billions of documents available on the web are the processing power required and the sheer size of the datasets involved. More importantly, it is nearly impossible to generate ground-truth labels for a general web document collection containing billions of documents and a vast taxonomy of topics, yet document clusters are most commonly evaluated by comparison to such a set of labels. This paper presents a clustering and labeling solution in which Wikipedia is clustered and hundreds of millions of web documents in ClueWeb12 are mapped onto those clusters. This solution is based on the assumption that Wikipedia covers such a wide range of diverse topics that it represents a small-scale web. We found that it was possible to perform the web-scale document clustering and labeling process on a single desktop computer within a couple of days for a Wikipedia clustering solution containing about 1,000 clusters. Solutions with finer-grained clusters, such as 10,000 or 50,000, take longer to execute. These results were evaluated using a set of external data.
Abstract:
This chapter focuses on the implementation of a TS (Takagi-Sugeno) fuzzy controller for the Doubly Fed Induction Generator (DFIG) based wind generator. The conventional PI control loops for maintaining the desired active power and DC capacitor voltage are compared with the TS fuzzy controllers. The DFIG system is represented by a third-order model in which electromagnetic transients of the stator are neglected. The effect of the TS-fuzzy DFIG damping controller on rotor speed oscillations, DC capacitor voltage variations and converter ratings is also investigated. The results from time domain simulations are presented to elucidate the effectiveness of the TS-fuzzy controller over the conventional PI controller in the DFIG system. The proposed TS-fuzzy controller can improve the fault ride-through capability of the DFIG compared to the conventional PI controller.
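A minimal sketch of the zero-order Takagi-Sugeno inference step underlying such a controller: rule firing strengths weight constant consequents and the control output is their normalised weighted average; the membership centres, widths and rule table below are illustrative, not the tuned DFIG controller:

```python
from math import exp

def gauss(x, c, s):
    """Gaussian membership with centre c and width s."""
    return exp(-((x - c) ** 2) / (2 * s ** 2))

def ts_control(error, d_error):
    """Zero-order Takagi-Sugeno step: product-AND firing strengths weight
    constant rule consequents; output is their normalised weighted average."""
    rules = [  # (centre of error set, centre of d_error set, consequent) -- assumed values
        (-1.0, -1.0, -0.8),
        ( 0.0,  0.0,  0.0),
        ( 1.0,  1.0,  0.8),
    ]
    num = den = 0.0
    for ce, cde, u in rules:
        w = gauss(error, ce, 0.5) * gauss(d_error, cde, 0.5)  # rule firing strength
        num += w * u
        den += w
    return num / den if den else 0.0

print(ts_control(error=0.4, d_error=-0.1))  # small corrective control signal
```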
Abstract:
Modern power systems have become more complex due to the growth in load demand, the installation of Flexible AC Transmission Systems (FACTS) devices and the integration of new HVDC links into existing AC grids. On the other hand, the introduction of deregulated and unbundled power market operational mechanisms, together with changes in generation sources, including the connection of large renewable energy generation that is intermittent in nature, has further increased the complexity and uncertainty of power system operation and control. System operators and engineers have to confront a series of technical challenges arising from the operation of today's interconnected power systems. Among the many challenges, how to evaluate the steady-state and dynamic behavior of existing interconnected power systems effectively and accurately, using more powerful computational analysis models and approaches, has become one of the key issues in power engineering. Traditional computing techniques have been widely used in various fields of power system analysis with varying degrees of success. The rapid development of computational intelligence, such as neural networks, fuzzy systems and evolutionary computation, provides tools and opportunities to solve complex technical problems in power system planning, operation and control.
Abstract:
Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations, and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods which have recently been employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study applies several clustering techniques (K-means, PAM, CLARA and SOM) in order to identify and apply the optimum technique for PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data, and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performance of the four clustering techniques was compared using the Dunn index and silhouette width validation values, and the K-means technique was found to perform best, with five clusters being the optimum. Therefore, five clusters were identified within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle-generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source, and the GAM was suitable for parameterising the PNSD data. These two techniques can help researchers immensely in analysing PNSD data for characterisation and source apportionment purposes.
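A minimal sketch of this kind of comparison using scikit-learn: cluster candidate solutions with K-means for several values of k and score each with the silhouette width; the data here is a random placeholder rather than parameterised PNSD spectra:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Placeholder data: rows = measured spectra, columns = size bins (replace with parameterised PNSD data).
X = np.random.rand(500, 30)

best_k, best_score = None, -1.0
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)   # mean silhouette width of this solution
    if score > best_score:
        best_k, best_score = k, score

print(f"best k = {best_k} (silhouette = {best_score:.3f})")
```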
Abstract:
For future planetary robot missions, multi-robot systems can be considered a suitable platform to perform space missions faster and more reliably. In heterogeneous robot teams, each robot can have different abilities and sensor equipment. In this paper we describe a lunar demonstration scenario in which a team of mobile robots explores an unknown area and identifies a set of objects belonging to a lunar infrastructure. Our robot team consists of two exploring scout robots and a mobile manipulator. The mission goal is to locate the objects within a certain area, to identify the objects, and to transport the objects to a base station. The robots have different sensor setups and different capabilities. In order to classify parts of the lunar infrastructure, the robots have to share knowledge about the objects. Based on the different sensing capabilities, several information modalities have to be shared and combined by the robots. In this work we propose an approach using spatial features and fuzzy logic based reasoning for distributed object classification.
Abstract:
Semantic perception and object labeling are key requirements for robots interacting with objects on a higher level. Symbolic annotation of objects allows the use of planning algorithms for object interaction, for instance in a typical fetch-and-carry scenario. In current research, perception is usually based on 3D scene reconstruction and geometric model matching, where trained features are matched with a 3D sample point cloud. In this work we propose a semantic perception method which is based on spatio-semantic features. These features are defined in a natural, symbolic way, such as geometry and spatial relations. In contrast to point-based model matching methods, a spatial ontology is used in which objects are described by what they "look like", similar to how a human would describe unknown objects to another person. A fuzzy-based reasoning approach matches perceivable features with a spatial ontology of the objects. The approach provides a method which is able to deal with sensor noise and occlusions. Another advantage is that no training phase is needed in order to learn object features. The use-case of the proposed method is the detection of soil sample containers in an outdoor environment, which have to be collected by a mobile robot. The approach is verified using real-world experiments.
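A minimal sketch of the fuzzy feature-matching idea: each perceived feature is scored against a symbolic description with a membership function and the scores are combined into an overall match degree; the feature names, membership shapes and min-aggregation are illustrative assumptions, not the paper's ontology:

```python
# Illustrative fuzzy matching of perceived features against a symbolic object description.

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed ontology entry for a "soil sample container": expected height and width in metres.
CONTAINER = {"height_m": (0.10, 0.20, 0.30), "width_m": (0.20, 0.35, 0.50)}

def match_degree(perceived, description=CONTAINER):
    """Overall match = minimum membership over all described features (fuzzy AND)."""
    return min(tri(perceived[f], *abc) for f, abc in description.items())

print(match_degree({"height_m": 0.18, "width_m": 0.33}))  # high degree for a plausible container
```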