852 resultados para 080109 Pattern Recognition and Data Mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We summarize the various strands of research on peripheral vision and relate them to theories of form perception. After a historical overview, we describe quantifications of the cortical magnification hypothesis, including an extension of Schwartz's cortical mapping function. The merits of this concept are considered across a wide range of psychophysical tasks, followed by a discussion of its limitations and the need for non-spatial scaling. We also review the eccentricity dependence of other low-level functions including reaction time, temporal resolution, and spatial summation, as well as perimetric methods. A central topic is then the recognition of characters in peripheral vision, both at low and high levels of contrast, and the impact of surrounding contours known as crowding. We demonstrate how Bouma's law, specifying the critical distance for the onset of crowding, can be stated in terms of the retinocortical mapping. The recognition of more complex stimuli, like textures, faces, and scenes, reveals a substantial impact of mid-level vision and cognitive factors. We further consider eccentricity-dependent limitations of learning, both at the level of perceptual learning and pattern category learning. Generic limitations of extrafoveal vision are observed for the latter in categorization tasks involving multiple stimulus classes. Finally, models of peripheral form vision are discussed. We report that peripheral vision is limited with regard to pattern categorization by a distinctly lower representational complexity and processing speed. Taken together, the limitations of cognitive processing in peripheral vision appear to be as significant as those imposed on low-level functions and by way of crowding.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We summarize the various strands of research on peripheral vision and relate them to theories of form perception. After a historical overview, we describe quantifications of the cortical magnification hypothesis, including an extension of Schwartz's cortical mapping function. The merits of this concept are considered across a wide range of psychophysical tasks, followed by a discussion of its limitations and the need for non-spatial scaling. We also review the eccentricity dependence of other low-level functions including reaction time, temporal resolution, and spatial summation, as well as perimetric methods. A central topic is then the recognition of characters in peripheral vision, both at low and high levels of contrast, and the impact of surrounding contours known as crowding. We demonstrate how Bouma's law, specifying the critical distance for the onset of crowding, can be stated in terms of the retinocortical mapping. The recognition of more complex stimuli, like textures, faces, and scenes, reveals a substantial impact of mid-level vision and cognitive factors. We further consider eccentricity-dependent limitations of learning, both at the level of perceptual learning and pattern category learning. Generic limitations of extrafoveal vision are observed for the latter in categorization tasks involving multiple stimulus classes. Finally, models of peripheral form vision are discussed. We report that peripheral vision is limited with regard to pattern categorization by a distinctly lower representational complexity and processing speed. Taken together, the limitations of cognitive processing in peripheral vision appear to be as significant as those imposed on low-level functions and by way of crowding.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a modification for the high-order neural network (HONN) is presented. Third order networks are considered for achieving translation, rotation and scale invariant pattern recognition. They require however much storage and computation power for the task. The proposed modified HONN takes into account a priori knowledge of the binary patterns that have to be learned, achieving significant gain in computation time and memory requirements. This modification enables the efficient computation of HONNs for image fields of greater that 100 × 100 pixels without any loss of pattern information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With the explosive growth of the volume and complexity of document data (e.g., news, blogs, web pages), it has become a necessity to semantically understand documents and deliver meaningful information to users. Areas dealing with these problems are crossing data mining, information retrieval, and machine learning. For example, document clustering and summarization are two fundamental techniques for understanding document data and have attracted much attention in recent years. Given a collection of documents, document clustering aims to partition them into different groups to provide efficient document browsing and navigation mechanisms. One unrevealed area in document clustering is that how to generate meaningful interpretation for the each document cluster resulted from the clustering process. Document summarization is another effective technique for document understanding, which generates a summary by selecting sentences that deliver the major or topic-relevant information in the original documents. How to improve the automatic summarization performance and apply it to newly emerging problems are two valuable research directions. To assist people to capture the semantics of documents effectively and efficiently, the dissertation focuses on developing effective data mining and machine learning algorithms and systems for (1) integrating document clustering and summarization to obtain meaningful document clusters with summarized interpretation, (2) improving document summarization performance and building document understanding systems to solve real-world applications, and (3) summarizing the differences and evolution of multiple document sources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated everyday in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be effective, it is important to include the visualization techniques in the mining process and to generate the discovered patterns for a more comprehensive visual view. In this dissertation, four related problems: dimensionality reduction for visualizing high dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration are studied to explore the integration of data mining and data visualization. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Quantitative methods can help us understand how underlying attributes contribute to movement patterns. Applying principal components analysis (PCA) to whole-body motion data may provide an objective data-driven method to identify unique and statistically important movement patterns. Therefore, the primary purpose of this study was to determine if athletes’ movement patterns can be differentiated based on skill level or sport played using PCA. Motion capture data from 542 athletes performing three sport-screening movements (i.e. bird-dog, drop jump, T-balance) were analyzed. A PCA-based pattern recognition technique was used to analyze the data. Prior to analyzing the effects of skill level or sport on movement patterns, methodological considerations related to motion analysis reference coordinate system were assessed. All analyses were addressed as case-studies. For the first case study, referencing motion data to a global (lab-based) coordinate system compared to a local (segment-based) coordinate system affected the ability to interpret important movement features. Furthermore, for the second case study, where the interpretability of PCs was assessed when data were referenced to a stationary versus a moving segment-based coordinate system, PCs were more interpretable when data were referenced to a stationary coordinate system for both the bird-dog and T-balance task. As a result of the findings from case study 1 and 2, only stationary segment-based coordinate systems were used in cases 3 and 4. During the bird-dog task, elite athletes had significantly lower scores compared to recreational athletes for principal component (PC) 1. For the T-balance movement, elite athletes had significantly lower scores compared to recreational athletes for PC 2. In both analyses the lower scores in elite athletes represented a greater range of motion. Finally, case study 4 reported differences in athletes’ movement patterns who competed in different sports, and significant differences in technique were detected during the bird-dog task. Through these case studies, this thesis highlights the feasibility of applying PCA as a movement pattern recognition technique in athletes. Future research can build on this proof-of-principle work to develop robust quantitative methods to help us better understand how underlying attributes (e.g. height, sex, ability, injury history, training type) contribute to performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents the novel theory for performing multi-agent activity recognition without requiring large training corpora. The reduced need for data means that robust probabilistic recognition can be performed within domains where annotated datasets are traditionally unavailable. Complex human activities are composed from sequences of underlying primitive activities. We do not assume that the exact temporal ordering of primitives is necessary, so can represent complex activity using an unordered bag. Our three-tier architecture comprises low-level video tracking, event analysis and high-level inference. High-level inference is performed using a new, cascading extension of the Rao–Blackwellised Particle Filter. Simulated annealing is used to identify pairs of agents involved in multi-agent activity. We validate our framework using the benchmarked PETS 2006 video surveillance dataset and our own sequences, and achieve a mean recognition F-Score of 0.82. Our approach achieves a mean improvement of 17% over a Hidden Markov Model baseline.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data mining can be defined as the extraction of implicit, previously un-known, and potentially useful information from data. Numerous re-searchers have been developing security technology and exploring new methods to detect cyber-attacks with the DARPA 1998 dataset for Intrusion Detection and the modified versions of this dataset KDDCup99 and NSL-KDD, but until now no one have examined the performance of the Top 10 data mining algorithms selected by experts in data mining. The compared classification learning algorithms in this thesis are: C4.5, CART, k-NN and Naïve Bayes. The performance of these algorithms are compared with accuracy, error rate and average cost on modified versions of NSL-KDD train and test dataset where the instances are classified into normal and four cyber-attack categories: DoS, Probing, R2L and U2R. Additionally the most important features to detect cyber-attacks in all categories and in each category are evaluated with Weka’s Attribute Evaluator and ranked according to Information Gain. The results show that the classification algorithm with best performance on the dataset is the k-NN algorithm. The most important features to detect cyber-attacks are basic features such as the number of seconds of a network connection, the protocol used for the connection, the network service used, normal or error status of the connection and the number of data bytes sent. The most important features to detect DoS, Probing and R2L attacks are basic features and the least important features are content features. Unlike U2R attacks, where the content features are the most important features to detect attacks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Following the workshop on new developments in daily licensing practice in November 2011, we brought together fourteen representatives from national consortia (from Denmark, Germany, Netherlands and the UK) and publishers (Elsevier, SAGE and Springer) met in Copenhagen on 9 March 2012 to discuss provisions in licences to accommodate new developments. The one day workshop aimed to: present background and ideas regarding the provisions KE Licensing Expert Group developed; introduce and explain the provisions the invited publishers currently use;ascertain agreement on the wording for long term preservation, continuous access and course packs; give insight and more clarity about the use of open access provisions in licences; discuss a roadmap for inclusion of the provisions in the publishers’ licences; result in report to disseminate the outcome of the meeting. Participants of the workshop were: United Kingdom: Lorraine Estelle (Jisc Collections) Denmark: Lotte Eivor Jørgensen (DEFF), Lone Madsen (Southern University of Denmark), Anne Sandfær (DEFF/Knowledge Exchange) Germany: Hildegard Schaeffler (Bavarian State Library), Markus Brammer (TIB) The Netherlands: Wilma Mossink (SURF), Nol Verhagen (University of Amsterdam), Marc Dupuis (SURF/Knowledge Exchange) Publishers: Alicia Wise (Elsevier), Yvonne Campfens (Springer), Bettina Goerner (Springer), Leo Walford (Sage) Knowledge Exchange: Keith Russell The main outcome of the workshop was that it would be valuable to have a standard set of clauses which could used in negotiations, this would make concluding licences a lot easier and more efficient. The comments on the model provisions the Licensing Expert group had drafted will be taken into account and the provisions will be reformulated. Data and text mining is a new development and demand for access to allow for this is growing. It would be easier if there was a simpler way to access materials so they could be more easily mined. However there are still outstanding questions on how authors of articles that have been mined can be properly attributed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays, new computers generation provides a high performance that enables to build computationally expensive computer vision applications applied to mobile robotics. Building a map of the environment is a common task of a robot and is an essential part to allow the robots to move through these environments. Traditionally, mobile robots used a combination of several sensors from different technologies. Lasers, sonars and contact sensors have been typically used in any mobile robotic architecture, however color cameras are an important sensor due to we want the robots to use the same information that humans to sense and move through the different environments. Color cameras are cheap and flexible but a lot of work need to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computational complex problems but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors like Microsoft Kinect which provide 3D colored point clouds at high frame rates made the computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows the systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need of scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of the scenes is useful in many areas as video games scene modeling where well-known places are reconstructed and added to game systems or advertising where once you get the 3D model of one room the system can add furniture pieces using augmented reality techniques. In this thesis we perform an experimental study of the state-of-the-art registration methods to find which one fits better to our scene mapping purposes. Different methods are tested and analyzed on different scene distributions of visual and geometry appearance. In addition, this thesis proposes two methods for 3d data compression and representation of 3D maps. Our 3D representation proposal is based on the use of Growing Neural Gas (GNG) method. This Self-Organizing Maps (SOMs) has been successfully used for clustering, pattern recognition and topology representation of various kind of data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models without considering time constraints. Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time consuming, specially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This thesis proposes a hardware implementation leveraging the computing power of modern GPUs which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometrical 3D compression method seeks to reduce the 3D information using plane detection as basic structure to compress the data. This is due to our target environments are man-made and therefore there are a lot of points that belong to a plane surface. Our proposed method is able to get good compression results in those man-made scenarios. The detected and compressed planes can be also used in other applications as surface reconstruction or plane-based registration algorithms. Finally, we have also demonstrated the goodness of the GPU technologies getting a high performance implementation of a CAD/CAM common technique called Virtual Digitizing.