967 results for 080109 Pattern Recognition and Data Mining


Abstract:

Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However, to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single-camera scenarios, which are not representative of real-world applications. In this paper we present a new, publicly available database for large-scale crowd surveillance. Footage from 12 cameras is provided for a full work day, covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach. It is accompanied by annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.

Abstract:

Multi-document summarization, which addresses the problem of information overload, has been widely used in real-world applications. Most existing approaches adopt a term-based representation for documents, which limits the performance of multi-document summarization systems. In this paper, we propose a novel pattern-based topic model (PBTMSum) for multi-document summarization. By combining pattern mining techniques with LDA topic modelling, PBTMSum can generate discriminative and semantically rich representations for topics and documents, so that the most representative and non-redundant sentences can be selected to form a succinct and informative summary. Extensive experiments are conducted on data from the Document Understanding Conference (DUC) 2007. The results demonstrate the effectiveness and efficiency of the proposed approach.

Abstract:

The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.

Abstract:

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and quality to CLUTO using document collections. K-tree has a low time complexity that makes it suitable for large document collections. Its tree structure allows for efficient disk-based implementations where space requirements exceed main memory.
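
For orientation, the flat k-means procedure that K-tree is described as approximating can be sketched as follows. This is plain Lloyd's k-means, not the K-tree algorithm itself; the function names, the fixed initial centroids, and the iteration cap are assumptions made for illustration.

```python
from math import dist


def assign(points, centers):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, initial_centers, iterations=10):
    """Plain Lloyd's k-means (illustrative; not the K-tree algorithm)."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(iterations):
        clusters = assign(points, centers)
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                # Move the centroid to the mean of its assigned points.
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centers.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centers.append(old)
        if new_centers == centers:
            break  # converged: no centroid moved
        centers = new_centers
    return centers, assign(points, centers)
```

Each iteration compares every point with every centroid, which is the flat, non-hierarchical cost that a tree-structured approximation aims to reduce.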

Abstract:

This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.

Abstract:

Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most current approaches rely on the unrealistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions that is continuously updated. Initially, all video segments are represented by the Bag-of-Words (BOW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied to update the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well-known Weizmann and KTH datasets to show the efficacy of the proposed approach.
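
As a rough illustration of TF-IDF weighting over a bag of words, the sketch below treats each video segment as a list of tokens. The token names are hypothetical stand-ins for visual words, and the specific formula (relative term frequency times the natural log of inverse document frequency) is one common variant, assumed here rather than taken from the paper.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Return one {word: weight} dict per document (a list of tokens)."""
    n_docs = len(documents)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in documents for word in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            word: (count / len(doc)) * log(n_docs / df[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical segments, each already quantised into visual-word tokens.
segments = [["w1", "w1", "w2"], ["w1", "w3"]]
weights = tf_idf(segments)
```

A word that occurs in every segment, such as `w1` here, receives weight zero, so only words that discriminate between segments contribute to the representation.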

Abstract:

Drivers' ability to react to unpredictable events deteriorates when they are exposed to highly predictable and uneventful driving tasks. In particular, highway design reduces the driving task mainly to lane keeping. This contributes to hypovigilance and road crashes, as drivers are often not aware that their driving behaviour is impaired. Monotony increases fatigue; however, the fatigue community has mainly focused on endogenous factors leading to fatigue, such as sleep deprivation. This paper focuses on the exogenous factor of monotony, which contributes to hypovigilance. Objective measurements of the effects of monotonous driving conditions on the driver and the vehicle's dynamics are systematically reviewed, with the aim of justifying the need for a mathematical framework that could predict hypovigilance in real time. Although electroencephalography (EEG) is one of the most reliable measures of vigilance, it is obtrusive. This suggests predicting, from observable variables, the time when the driver is hypovigilant. A vision for future research in the modelling of driver vigilance decrement due to monotonous driving conditions is outlined. A mathematical model for predicting drivers' hypovigilance using information such as lane positioning, steering wheel movements and eye blinks is provided. Such modelling of driver vigilance should enable the future development of an in-vehicle device that detects driver hypovigilance in advance, thus offering the potential to enhance road safety and prevent road crashes.

Abstract:

Intelligent software agents show promise for improving the effectiveness of e-marketplaces for e-commerce. Although a large amount of research has been conducted to develop negotiation protocols and mechanisms for e-marketplaces, existing negotiation mechanisms are weak in dealing with the complex and dynamic negotiation spaces often found in e-commerce. This paper illustrates a novel knowledge discovery method and a probabilistic negotiation decision-making mechanism to improve the performance of negotiation agents. Our preliminary experiments show that the probabilistic negotiation agents empowered by knowledge discovery mechanisms are more effective and efficient than the Pareto optimal negotiation agents in simulated e-marketplaces.

Abstract:

People have long hypothesised that negative feedback should be very useful for improving the performance of information filtering systems; however, effective models to support this hypothesis have been lacking. This paper proposes an effective model that uses negative relevance feedback, based on a pattern mining approach, to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three groups (the positive specific category, the general category and the negative specific category) so that weights can be updated easily. An iterative algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance.
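
The idea of selecting offender documents can be sketched as ranking negative documents by how strongly they match the features extracted from positive documents. The scoring rule (sum of weights of matched features), the `top_k` cut-off, and all names below are assumptions for illustration, not the paper's actual procedure.

```python
def select_offenders(negative_docs, feature_weights, top_k):
    """Return ids of the negative documents that look most positive.

    negative_docs: {doc_id: list of terms}
    feature_weights: {term: weight} learned from positive documents
    """
    scored = []
    for doc_id, terms in negative_docs.items():
        # Score a document by the positive features it contains.
        score = sum(feature_weights.get(term, 0.0) for term in set(terms))
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical data: three negative documents and two positive features.
negatives = {
    "n1": ["stock", "market"],
    "n2": ["football"],
    "n3": ["market", "weather"],
}
weights = {"stock": 0.6, "market": 0.4}
offenders = select_offenders(negatives, weights, top_k=2)
```

Negative documents that share no feature with the positive set, such as `n2` here, are never selected, which is how this kind of filter shrinks the space of negative examples.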