32 results for Information Mining


Relevance: 30.00%

Abstract:

Video event detection is an effective way to automatically understand the semantic content of a video. However, due to the mismatch between low-level visual features and high-level semantics, research on video event detection faces a number of challenges, such as how to extract suitable information from video, how to represent an event, and how to build a reasoning mechanism that infers the event from video information. In this paper, we propose a novel event detection method. The method detects video events based on the semantic trajectory, which is a high-level semantic description of a moving object’s trajectory in the video. The proposed method consists of three phases that transform low-level visual features into middle-level raw trajectory information and then into high-level semantic trajectory information. Event reasoning is then carried out with the assistance of semantic trajectory information and background knowledge. Additionally, to relieve users of the burden of manual event definition, a method is further proposed to automatically discover event-related semantic trajectory patterns from sample semantic trajectories. Furthermore, to make effective use of the discovered semantic trajectory patterns, an associative classification-based event detection framework is adopted to identify events that may have occurred. Empirical studies show that our methods can detect video events effectively and efficiently.

Relevance: 30.00%

Abstract:

The thesis investigates a set of critical problems in data mining and proposes four advanced pattern mining algorithms that discover the most interesting and useful patterns, highly relevant to the user’s application targets, from data represented in complex structures.

Relevance: 30.00%

Abstract:

Information portals are seen as an appropriate platform for personalised healthcare and wellbeing information provision. Efficient content management is a core capability of a successful smart health information portal (SHIP) and domain expertise is a vital input to content management when it comes to matching user profiles with the appropriate resources. The rate of generation of new health-related content far exceeds the numbers that can be manually examined by domain experts for relevance to a specific topic and audience. In this paper we investigate automated content discovery as a plausible solution to this shortcoming that capitalises on the existing database of expert-endorsed content as an implicit store of knowledge to guide such a solution. We propose a novel content discovery technique based on a text analytics approach that utilises an existing content repository to acquire new and relevant content. We also highlight the contribution of this technique towards realisation of smart content management for SHIPs.

Relevance: 30.00%

Abstract:

Discovering frequent patterns plays an essential role in many data mining applications. The aim of frequent pattern mining is to obtain information about the patterns that most commonly appear together. However, designing an efficient model to mine these patterns is still demanding due to the size of current databases. Therefore, we propose an Efficient Frequent Pattern Mining Model (EFP-M2) to mine frequent patterns in a timely manner. The results show that the algorithm in EFP-M2 outperforms the benchmark FP-Growth by at least two orders of magnitude.
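To make the notion of a frequent pattern concrete, the following is a minimal brute-force sketch of support counting. It is neither EFP-M2 nor FP-Growth, whose internals the abstract does not describe; the function name `frequent_itemsets`, the `max_size` parameter, and the toy transactions are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset of up to max_size items and keep the frequent ones.

    Support here is an absolute count: the number of transactions
    that contain all items of the itemset.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicates, fix item order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy transactions, invented for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

frequent = frequent_itemsets(transactions, min_support=3)
```

Enumerating every candidate itemset grows exponentially with transaction length, which is the cost that specialised algorithms such as FP-Growth are designed to avoid.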

Relevance: 30.00%

Abstract:

Indirect patterns are considered valuable, hidden information in transactional databases. An indirect pattern represents a high dependency between two items that rarely occur together but appear together indirectly via other items. Indirect pattern mining is very important because it can reveal new knowledge in certain application domains. Therefore, we propose an Indirect Pattern Mining Algorithm (IPMA) to mine indirect patterns from a data repository. IPMA embeds a measure called Critical Relative Support (CRS) rather than the common interestingness measures. The results show that IPMA successfully generates indirect patterns across various threshold values.
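The idea of an indirect pattern can be sketched as follows. This is not IPMA: the abstract does not define the CRS measure, so this sketch substitutes plain confidence as the dependency measure. The function names, thresholds, and toy transactions are all hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(wanted <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, item, mediator):
    """Share of the transactions containing `item` that also contain `mediator`."""
    item_support = support(transactions, [item])
    if item_support == 0:
        return 0.0
    return support(transactions, [item, mediator]) / item_support


def is_indirect_pair(transactions, a, b, mediator,
                     max_pair_support, min_confidence):
    """True if a and b rarely co-occur but both depend strongly on the mediator.

    Confidence is a stand-in dependency measure chosen for this sketch,
    not the CRS measure used by IPMA.
    """
    rarely_together = support(transactions, [a, b]) <= max_pair_support
    a_depends = confidence(transactions, a, mediator) >= min_confidence
    b_depends = confidence(transactions, b, mediator) >= min_confidence
    return rarely_together and a_depends and b_depends


# Toy transactions, invented for illustration only.
transactions = [
    ["tea", "sugar"],
    ["tea", "sugar"],
    ["coffee", "sugar"],
    ["coffee", "sugar"],
    ["bread"],
]

tea_coffee = is_indirect_pair(transactions, "tea", "coffee", "sugar",
                              max_pair_support=0.1, min_confidence=0.8)
tea_bread = is_indirect_pair(transactions, "tea", "bread", "sugar",
                             max_pair_support=0.1, min_confidence=0.8)
```

In this toy data, tea and coffee never appear together yet both always appear with sugar, so the pair qualifies; tea and bread do not, because bread never appears with sugar.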

Relevance: 30.00%

Abstract:

We examine a recent proposal for data privatization by testing it against well-known attacks. We show that all of these attacks successfully retrieve a relatively large (and unacceptable) portion of the original data. We then indicate how the examined data-privatization method can be modified to withstand these attacks, and we compare the performance of the two approaches. We also show that the new method offers better privacy and lower information loss than the original method.

Relevance: 30.00%

Abstract:

This paper presents a novel data mining framework for the exploration and extraction of actionable knowledge from data generated by electricity meters. Although a rich source of information for energy consumption analysis, electricity meters produce a voluminous, fast-paced, transient stream of data that conventional approaches are unable to address entirely. To overcome these issues, a data mining framework needs to incorporate functionality for interim summarization and incremental analysis using intelligent techniques. The proposed Incremental Summarization and Pattern Characterization (ISPC) framework demonstrates this capability. Stream data is structured in a data warehouse based on key dimensions, enabling rapid interim summarization. Independently, the IPCL algorithm incrementally characterizes patterns in stream data and correlates these across time. Finally, characterized patterns are consolidated with interim summarization to facilitate an overall analysis and prediction of energy consumption trends. Results of experiments conducted using actual data from electricity meters confirm the applicability of the ISPC framework.
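Interim summarization over a stream can be sketched as a running aggregate that is updated one reading at a time, without storing the stream. This is not the ISPC framework or the IPCL algorithm, neither of which the abstract specifies. The class name, the choice of hour of day as the summary dimension, and the readings are all assumptions for illustration.

```python
from collections import defaultdict


class IncrementalSummary:
    """Running count and mean per key, updated one reading at a time."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def update(self, key, value):
        # Incremental mean: shift the old mean towards the new value
        # by 1/n of their difference, so no past readings are stored.
        self.count[key] += 1
        self.mean[key] += (value - self.mean[key]) / self.count[key]


# Hypothetical stream of (meter id, hour of day, kWh) readings.
readings = [
    ("m1", 18, 2.0),
    ("m2", 18, 4.0),
    ("m1", 3, 0.5),
]

summary = IncrementalSummary()
for _meter, hour, kwh in readings:
    summary.update(hour, kwh)  # summarize along the hour-of-day dimension
```

At any point in the stream, `summary.mean` holds an interim per-hour average that can be read without waiting for the stream to end.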

Relevance: 30.00%

Abstract:

An Android application uses a permission system to regulate access to system resources and users' privacy-relevant information. Existing works have demonstrated several techniques for studying the required permissions declared by developers, but little attention has been paid to used permissions. Moreover, no specific permission combination has been identified as effective for malware detection. To fill these gaps, we propose a novel pattern mining algorithm that identifies a set of contrast permission patterns capturing the difference between clean and malicious applications. We collected a benchmark malware dataset and a dataset of 1227 clean applications to evaluate the performance of the proposed algorithm. Valuable findings are obtained by analyzing the returned contrast permission patterns. © 2013 Elsevier B.V. All rights reserved.
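One simple way to picture a contrast pattern is a permission combination whose support differs sharply between the two classes. The sketch below ranks combinations by support difference. That measure is an assumption, since the abstract does not say how the proposed algorithm scores patterns, and the function names and toy application sets are invented for illustration.

```python
from itertools import combinations


def pattern_support(apps, pattern):
    """Fraction of apps whose permission set contains the whole pattern."""
    wanted = set(pattern)
    return sum(wanted <= app for app in apps) / len(apps)


def contrast_patterns(malicious, clean, min_difference, max_size=2):
    """Patterns that are more common in malicious apps than in clean ones.

    Scoring by support difference is an assumption of this sketch.
    """
    permissions = sorted(set().union(*malicious, *clean))
    results = []
    for size in range(1, max_size + 1):
        for pattern in combinations(permissions, size):
            difference = (pattern_support(malicious, pattern)
                          - pattern_support(clean, pattern))
            if difference >= min_difference:
                results.append((pattern, difference))
    # Largest contrast first; ties keep their enumeration order.
    return sorted(results, key=lambda result: -result[1])


# Toy permission sets, invented for illustration only.
malicious = [
    {"INTERNET", "SEND_SMS"},
    {"INTERNET", "SEND_SMS", "READ_CONTACTS"},
    {"SEND_SMS", "READ_CONTACTS"},
    {"INTERNET"},
]
clean = [
    {"INTERNET"},
    {"INTERNET", "READ_CONTACTS"},
    {"INTERNET"},
    {"INTERNET", "SEND_SMS"},
]

found = contrast_patterns(malicious, clean, min_difference=0.5)
```

In this toy data, `INTERNET` alone is useless for discrimination because clean applications request it even more often, whereas combinations involving `SEND_SMS` stand out.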

Relevance: 30.00%

Abstract:

Autism spectrum disorder (ASD) is increasingly being recognized as a major public health issue which affects approximately 0.5-0.6% of the population. Promoting general awareness of the disorder, increasing engagement with affected individuals and their carers, and understanding how successfully current clinical recommendations have penetrated the target communities are crucial in driving research as well as policy. The aim of the present work is to investigate whether Twitter, as a highly popular platform for information exchange, can be used as a data-mining source that could help address the aforementioned challenges. Specifically, using a large data set of harvested tweets, we present a series of experiments which examine a range of linguistic and semantic aspects of messages posted by individuals interested in ASD. Our findings, the first of their kind in the published scientific literature, strongly motivate additional research on this topic and present a methodological basis for further work.

Relevance: 30.00%

Abstract:

The thesis studies a number of critical problems in data mining for customer behavior analysis and proposes novel techniques for better modeling of customers’ decision-making processes, more efficient analysis of their travel behavior, and more effective identification of their emerging preferences.

Relevance: 30.00%

Abstract:

This research investigated the proliferation of malicious applications on smartphones and proposed a framework that can efficiently detect and classify such applications based on behavioural patterns. Additionally, the causes and impact of unauthorised disclosure of personal information by clean applications were examined, and countermeasures to protect smartphone users’ privacy were proposed.

Relevance: 30.00%

Abstract:

Mobile Health (mHealth) is now emerging alongside the Internet of Things (IoT), the Cloud and big data, along with the prevalence of smart wearable devices and sensors. Smart environments such as smart homes, cars, highways, cities, factories and grids are also emerging. At present, it is difficult to quickly forecast or prevent urgent health situations in real time because health data are analyzed offline by a physician. Sensors are expected to be overloaded by demands to provide health data from IoT networks and smart environments. This paper proposes to resolve these problems by introducing an inference system so that life-threatening situations can be prevented in advance based on short- and long-term health status prediction. This prediction is inferred from personal health information that is built from big data in the Cloud. The inference system can also resolve the problem of data overload in sensor nodes by reducing data volume and frequency, which reduces the workload of the sensor nodes. This paper presents a novel idea for tracking and predicting personal health status, as well as intelligent inference functionality in sensor nodes that interface with IoT networks.

Relevance: 30.00%

Abstract:

Privacy preservation in data mining and data release has attracted increasing research interest over several decades. Differential privacy is an influential privacy notion that offers a rigorous and provable privacy guarantee for data mining and data release. Existing studies on differential privacy assume that the records in a data set are sampled independently. However, in real-world applications, records in a data set are rarely independent. The relationships among records are referred to as correlated information, and such a data set is defined as a correlated data set. A differential privacy technique applied to a correlated data set will disclose more information than expected, which is a serious privacy violation. Although recent research has addressed this new privacy violation, a solid solution for correlated data sets is still needed. Moreover, how to decrease the large amount of noise that differential privacy incurs on correlated data sets is yet to be explored. To fill this gap, this paper proposes an effective correlated differential privacy solution by defining the correlated sensitivity and designing a correlated data releasing mechanism. By taking the correlation levels between records into account, the proposed correlated sensitivity can significantly decrease the noise compared with traditional global sensitivity. The correlated data releasing mechanism, called the correlated iteration mechanism, is based on an iterative method and is designed to answer a large number of queries. Compared with the traditional method, the proposed correlated differential privacy solution enhances the privacy guarantee for a correlated data set at a lower accuracy cost. Experimental results show that the proposed solution outperforms traditional differential privacy in terms of mean square error on large groups of queries. This also suggests that correlated differential privacy can successfully retain utility while preserving privacy.
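The link between sensitivity and noise can be illustrated with the standard Laplace mechanism, which adds noise of scale sensitivity/epsilon to a query answer. This is a sketch under stated assumptions, not the paper's correlated iteration mechanism. The function names are invented for illustration. So are the sensitivity values: a count query over 10 fully correlated records, naively bounded at 10, against a hypothetical correlated sensitivity of 2.5.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Scale b of the Laplace noise in the standard Laplace mechanism."""
    return sensitivity / epsilon


def expected_squared_error(scale):
    """Variance of a Laplace(0, b) variable, which is 2 * b**2."""
    return 2.0 * scale ** 2


def private_answer(true_answer, sensitivity, epsilon, rng):
    """Return the true answer plus Laplace noise.

    The difference of two independent exponential samples with mean b
    follows a Laplace(0, b) distribution.
    """
    scale = laplace_scale(sensitivity, epsilon)
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_answer + noise


epsilon = 1.0
naive_sensitivity = 10.0      # assumed: 10 records treated as one correlated group
correlated_sensitivity = 2.5  # hypothetical value, for illustration only

naive_error = expected_squared_error(laplace_scale(naive_sensitivity, epsilon))
correlated_error = expected_squared_error(
    laplace_scale(correlated_sensitivity, epsilon))

rng = random.Random(0)
noisy_count = private_answer(100.0, correlated_sensitivity, epsilon, rng)
```

Because the expected squared error grows with the square of the sensitivity, any tighter sensitivity bound for correlated records translates directly into a lower mean square error at the same epsilon.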