150 results for Labeling hierarchical clustering


100.00% 100.00%



Patterns can be found in different modalities of a domain, where the modalities refer to the data sources that constitute different aspects of that domain. In particular, the domain of our discussion is crime, and the modalities are its different data sources, such as offender data and weapon data. In addition, patterns also exist at different levels of granularity for each modality. In order to have a thorough understanding of a domain, it is important to reveal the hidden patterns by exploring the data at different levels of granularity and for each modality. Therefore, this paper presents a new model for identifying patterns that exist at different levels of granularity for different modes of crime data. A hierarchical clustering approach, growing self organising maps (GSOM), has been deployed. Furthermore, the model is enhanced with experiments that exhibit the significance of exploring data at different granularities.


90.00% 90.00%



This paper presents a comparison of different clustering algorithms applied to a point cloud constructed from the depth maps captured by an RGBD camera such as Microsoft Kinect. The depth sensor returns images in which each pixel represents the distance to its corresponding point rather than RGB data. This is considered the real novelty of the RGBD camera in computer vision compared to common video-based and stereo-based products. Depth sensors capture depth data without using markers, 2D-to-3D transition, or feature point determination. The 3D points of the captured depth map are then grouped into clusters to determine the different limbs of the human body. The 3D point clustering is achieved by different clustering techniques. Our experiments show good performance and results in using clustering to determine different human-body limbs.


90.00% 90.00%



This paper proposes a novel application of Visual Assessment of Tendency (VAT)-based hierarchical clustering algorithms (VAT, iVAT, and clusiVAT) for trajectory analysis. We introduce a new clustering-based anomaly detection framework named iVAT+ and clusiVAT+ and use it for trajectory anomaly detection. This approach is based on partitioning the VAT-generated Minimum Spanning Tree using an efficient thresholding scheme. The trajectories are classified as normal or anomalous based on the number of paths in the clusters. On synthetic datasets with fixed and variable numbers of clusters and anomalies, we achieve 98% classification accuracy. Our two-stage clusiVAT method is applied to 26,039 trajectories of vehicles and pedestrians from a parking lot scene from the real-life MIT trajectories dataset. The first stage clusters the trajectories ignoring directionality. The second stage divides the clusters obtained from the first stage by considering trajectory direction. We show that our novel two-stage clusiVAT approach can produce natural and informative trajectory clusters on this real-life dataset while finding representative anomalies.
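The idea of partitioning a minimum spanning tree by thresholding and then judging clusters by their size can be sketched as follows. This is a minimal illustration, not the iVAT+/clusiVAT+ implementation: plain 2D points stand in for trajectories, Euclidean distance stands in for a trajectory dissimilarity, and both the cut rule (mean plus `k` standard deviations of the MST edge weights) and the `min_size` cutoff are assumptions introduced here.

```python
import math
from statistics import mean, pstdev


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known connection of each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cut_and_label(points, k=1.0, min_size=3):
    """Cut long MST edges, then flag members of small components as anomalous."""
    edges = mst_edges(points)
    weights = [w for w, _, _ in edges]
    # Hypothetical threshold rule (an assumption, not the paper's scheme).
    threshold = mean(weights) + k * pstdev(weights)

    # Union-find over the edges that survive the cut.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for w, i, j in edges:
        if w <= threshold:
            root[find(i)] = find(j)

    clusters = {}
    for idx in range(len(points)):
        clusters.setdefault(find(idx), []).append(idx)
    anomalous = {i for members in clusters.values()
                 if len(members) < min_size for i in members}
    return list(clusters.values()), anomalous
```

With two tight groups of points and one distant point, the two long MST edges are cut and the isolated point ends up in a cluster of size one, which the size rule marks as anomalous.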


80.00% 80.00%



Development of polarized immune responses controls resistance and susceptibility to many microorganisms. However, studies of several infectious, allergic, and autoimmune diseases have shown that chronic type-1 and type-2 cytokine responses can also cause significant morbidity and mortality if left unchecked. We used mouse cDNA microarrays to molecularly phenotype the gene expression patterns that characterize two disparate but equally lethal forms of liver pathology that develop in Schistosoma mansoni infected mice polarized for type-1 and type-2 cytokine responses. Hierarchical clustering analysis identified at least three groups of genes associated with a polarized type-2 response and two linked with an extreme type-1 cytokine phenotype. Predictions about liver fibrosis, apoptosis, and granulocyte recruitment and activation generated by the microarray studies were confirmed later by traditional biological assays. The data show that cDNA microarrays are useful not only for determining coordinated gene expression profiles but are also highly effective for molecularly “fingerprinting” diseased tissues. Moreover, they illustrate the potential of genome-wide approaches for generating comprehensive views on the molecular and biochemical mechanisms regulating infectious disease pathogenesis.


80.00% 80.00%



The Information Bottleneck method aims to extract a compact representation which preserves the maximum relevant information. The sub-optimality of the agglomerative Information Bottleneck (aIB) algorithm restricts the applications of the Information Bottleneck method. In this paper, the concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements. The DaIB algorithm is then presented to alleviate the sub-optimality problem in aIB while simultaneously keeping the useful hierarchical clustering tree structure. The experimental results on the benchmark data sets show that the DaIB algorithm can obtain more relevant information and higher precision than the aIB algorithm, and the paired t-test indicates that these improvements are statistically significant.


80.00% 80.00%



Biologically, the human brain processes information in both unimodal and multimodal ways. In fact, information is progressively abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has exponentially produced various sources of data, which could be likened to the state of multimodality in the human brain. Therefore, this is an inspiration to develop a methodology for exploring multimodal data and further identifying multi-view patterns. Specifically, we propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. A structurally adaptive neural network is deployed to implement the proposed model. Furthermore, the acquisition of multi-view patterns with the proposed model is demonstrated and discussed with some experimental results.


80.00% 80.00%



The human brain processes information in both unimodal and multimodal fashion, where information is progressively captured, accumulated, abstracted and seamlessly fused. Subsequently, the fusion of multimodal inputs allows a holistic understanding of a problem. The proliferation of technology has produced various sources of electronic data and continues to do so exponentially. Finding patterns from such multi-source and multimodal data could be compared to the multimodal and multidimensional information processing in the human brain. Therefore, such brain functionality could be taken as an inspiration to develop a methodology for exploring multimodal and multi-source electronic data and further identifying multi-view patterns. In this paper, we first propose a brain-inspired conceptual model that allows exploration and identification of patterns at different levels of granularity, different types of hierarchies and different types of modalities. Secondly, we present a cluster-driven approach for the implementation of the proposed brain-inspired model. Particularly, the Growing Self Organising Maps (GSOM) based cross-clustering approach is discussed. Furthermore, the acquisition of multi-view patterns with the cluster-driven implementation is demonstrated with experimental results.


80.00% 80.00%



Humans perceive entities such as objects, patterns, events, etc. as concepts, which are the basic units in human intelligence and communications. In addition, perceptions of these entities could be abstracted and generalised at multiple levels of granularity. In particular, such granulation allows the formation and usage of concepts in human intelligence. Such natural granularity in human intelligence could inspire and motivate the design and development of pattern identification approaches in Data Mining. In our opinion, a pattern could be perceived at multiple levels of granularity and thus we advocate for the co-existence of hierarchy and granularity. In addition, granular patterns exist across different sources of data (multimodality). In this paper, we present a cognitive model that incorporates the characteristics of Hierarchy, Granularity and Multimodality for multi-view pattern identification in the crime domain. This framework is implemented with Growing Self Organising Maps (GSOM) and some experimental results are presented and discussed.


80.00% 80.00%



Despite extensive research on the benefits of reverse logistics (RL), it has yet to become commonplace in the construction industry. Furthermore, the uptake and number of studies on RL remains very limited within the Australian context and particularly related to the construction industry. This paper is aimed at filling that knowledge gap by employing an exploratory approach to examine the critical barriers faced by South Australian construction organizations in implementing RL practices. Semi-structured interviews and a ranking approach facilitated the treatment of qualitative data through quantitative coding using cloud-based applications. The research identified 12 barriers to RL implementation, four of them very significant according to the responses of the interviewees: the regulatory environment, additional costs involved, lack of recognition in the construction supply chain, and extra effort required. The study also explored their inter-relationships through the Co-occurrence Index. The study proposes some remedial measures for RL implementation in South Australia based on the barriers identified.


40.00% 40.00%



This paper proposes a hyperlink-based web page similarity measurement and two matrix-based hierarchical web page clustering algorithms. The web page similarity measurement incorporates hyperlink transitivity and page importance within the concerned web page space. One clustering algorithm takes cluster overlapping into account; the other does not. These algorithms do not require predefined similarity thresholds for clustering, and are independent of the page order. The primary evaluations show the effectiveness of the proposed algorithms in clustering improvement.


40.00% 40.00%



Biomedical time series clustering that automatically groups a collection of time series according to their internal similarity is of importance for medical record management and inspection such as bio-signals archiving and retrieval. In this paper, a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity is proposed. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. We extend a topic model, i.e., the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA), which was originally developed for visual motion analysis to cluster a set of unlabelled multichannel time series. The H-pLSA models each channel of the multichannel time series using a local pLSA in the first layer. The topics learned in the local pLSA are then fed to a global pLSA in the second layer to discover the categories of multichannel time series. Experiments on a dataset extracted from multichannel Electrocardiography (ECG) signals demonstrate that the proposed method performs better than previous state-of-the-art approaches and is relatively robust to the variations of parameters including length of local segments and dictionary size. Although the experimental evaluation used the multichannel ECG signals in a biometric scenario, the proposed algorithm is a universal framework for multichannel biomedical time series clustering according to their structural similarity, which has many applications in biomedical time series management.


30.00% 30.00%



The rapid increase of web complexity and size makes web search results far from satisfactory in many cases, due to the huge amount of information returned by search engines. How to find intrinsic relationships among web pages at a higher level, in order to implement efficient management and retrieval of web search information, is becoming a challenging problem. In this paper, we propose an approach to measure web page similarity. This approach takes hyperlink transitivity and page importance into consideration. From this new similarity measurement, an effective hierarchical web page clustering algorithm is proposed. The primary evaluations show the effectiveness of the new similarity measurement and the improvement of web page clustering. The proposed page similarity, as well as the matrix-based hyperlink analysis methods, could be applied to other web-based research areas.


30.00% 30.00%



We present results on an extension to our approach for automatic sports video annotation. Sports video is augmented with accelerometer data from wrist bands worn by umpires in the game. We solve the problem of automatic segmentation and robust gesture classification using a hierarchical hidden Markov model in conjunction with a filler model. The hierarchical model allows us to consider gestures at different levels of abstraction and the filler model allows us to handle extraneous umpire movements. Results are presented for labeling video for a game of Cricket.


30.00% 30.00%



An improved evolving model, i.e., Evolving Tree (ETree) with Fuzzy c-Means (FCM), is proposed for undertaking text document visualization problems in this study. ETree forms a hierarchical tree structure in which nodes (i.e., trunks) are allowed to grow and split into child nodes (i.e., leaves), and each node represents a cluster of documents. However, ETree adopts a relatively simple approach to split its nodes. Thus, FCM is adopted as an alternative to perform node splitting in ETree. An experimental study using articles from a flagship conference of Universiti Malaysia Sarawak (UNIMAS), i.e., Engineering Conference (ENCON), is conducted. The experimental results are analyzed and discussed, and the outcome shows that the proposed ETree-FCM model is effective for undertaking text document clustering and visualization problems.
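For readers unfamiliar with Fuzzy c-Means, one iteration of the standard algorithm can be sketched as below. This shows only textbook FCM (soft memberships followed by a weighted center update) with the usual fuzzifier `m`; how ETree-FCM applies it when splitting a node is not described in the abstract, so the function names and the toy data are illustrative assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j]: degree to which points[i] belongs to centers[j]."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # Standard FCM rule: u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
        row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Update each center as the membership-weighted mean of the points."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers
```

Unlike a hard split, every document vector keeps a graded membership in each candidate child, and each row of memberships sums to one.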


30.00% 30.00%



A fundamental task in pervasive computing is reliable acquisition of contexts from sensor data. This is crucial to the operation of smart pervasive systems and services so that they might behave efficiently and appropriately upon a given context. Simple forms of context can often be extracted directly from raw data. Equally important, or more, is the hidden context and pattern buried inside the data, which is more challenging to discover. Most existing approaches borrow methods and techniques from machine learning, predominantly employing parametric unsupervised learning and clustering techniques. Being parametric, a severe drawback of these methods is the requirement to specify the number of latent patterns in advance. In this paper, we explore the use of Bayesian nonparametric methods, a recent data modelling framework in machine learning, to infer latent patterns from sensor data acquired in a pervasive setting. Under this formalism, nonparametric prior distributions are used for the data generative process, and thus, they allow the number of latent patterns to be learned automatically and grow with the data - as more data comes in, the model complexity can grow to explain new and unseen patterns. In particular, we make use of the hierarchical Dirichlet processes (HDP) to infer atomic activities and interaction patterns from honest signals collected from sociometric badges. We show how data from these sensors can be represented and learned with HDP. We illustrate insights into atomic patterns learned by the model and use them to achieve high-performance clustering. We also demonstrate the framework on the popular Reality Mining dataset, illustrating the ability of the model to automatically infer typical social groups in this dataset. Finally, our framework is generic and applicable to a much wider range of problems in pervasive computing where one needs to infer high-level, latent patterns and contexts from sensor data.
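The claim that the number of latent patterns need not be fixed in advance can be illustrated with the Chinese restaurant process, the standard sequential view of a single Dirichlet process prior. This is only a sketch of the prior: it is not the HDP (which additionally shares clusters across groups of data) and it does no inference from sensor data. The function name, the concentration parameter `alpha`, and the seed are assumptions made for the example.

```python
import random


def chinese_restaurant_process(n_items, alpha, seed=0):
    """Sample cluster assignments for n_items under a CRP with concentration alpha."""
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items currently in cluster k
    assignments = []
    for i in range(n_items):
        # Item i joins existing cluster k with probability counts[k] / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            cumulative += c
            if r < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        assignments.append(chosen)
    return assignments, counts
```

Because a new cluster can always be opened, the number of clusters is an outcome of the process rather than an input, and it can only stay the same or grow as more items arrive.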