997 results for scene localization


Relevance: 60.00%

Abstract:

In this paper, we present a system for pedestrian detection in scenes captured by mobile bus surveillance cameras in busy city streets. Our approach integrates scene localization, foreground and background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data. In the first stage, SIFT homography is applied to cluster frames by structural similarity; the second stage further clusters these aligned frames by lighting. This produces clusters of images that differ in viewpoint and lighting. A kernel density estimation (KDE) method for colour and gradient foreground-background separation is then used to construct a background model for each image cluster, which is subsequently used to detect all foreground pixels. Finally, pedestrians are identified using a hierarchical template matching approach. We have tested our system on a set of real bus video datasets, and the experimental results verify that our system works well in practice.
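
The KDE background test described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it models a single pixel's grey-level intensity only (the abstract uses colour and gradient features), and the Gaussian kernel, the `bandwidth` and `threshold` values, and the function names are all assumptions made for illustration.

```python
import math


def kde_density(value, samples, bandwidth):
    """Gaussian kernel density estimate of `value` given past background samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((value - s) / bandwidth) ** 2) for s in samples
    )


def is_foreground(value, samples, bandwidth=5.0, threshold=1e-3):
    """Flag a pixel as foreground when its value is unlikely under the background model.

    `bandwidth` and `threshold` are hypothetical settings, not values from the paper.
    """
    return kde_density(value, samples, bandwidth) < threshold


# Hypothetical intensity history of one pixel within one image cluster.
background_samples = [100, 102, 98, 101, 99]
```

In this sketch, `is_foreground(100, background_samples)` is false because the value lies within the observed background range, while `is_foreground(200, background_samples)` is true.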

Relevance: 40.00%

Abstract:

This paper describes a new method of color text localization from generic scene images containing text of different scripts and with arbitrary orientations. A representative set of colors is first identified using the edge information to initiate an unsupervised clustering algorithm. Text components are identified from each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from the geometric, boundary, stroke and gradient information. Experiments on camera-captured images that contain variable fonts, size, color, irregular layout, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields precision and recall of 0.8 and 0.86 respectively on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.

Relevance: 30.00%

Abstract:

At the highest level of competitive sport, nearly all performances of athletes (both training and competitive) are chronicled using video. The video is then often viewed by expert coaches and analysts, who manually label important performance indicators to gauge performance. Stroke rate and pacing are important performance measures in swimming, and these have previously been digitised manually by a human. This is problematic, as annotating large volumes of video can be costly and time-consuming. Further, since it is difficult to accurately estimate the position of the swimmer at each frame, measures such as stroke rate are generally aggregated over an entire swimming lap. Vision-based techniques that can automatically, objectively and reliably track the swimmer and their location can potentially solve these issues and allow for large-scale analysis of a swimmer across many videos. However, the aquatic environment is challenging because of fluctuations in the scene from splashes and reflections, and because swimmers are frequently submerged at different points in a race. In this paper, we temporally segment races into distinct and sequential states, and propose a multimodal approach which employs individual detectors tuned to each race state. Our approach allows the swimmer to be located and tracked smoothly in each frame despite a diverse range of constraints. We test our approach on a video dataset compiled at the 2012 Australian Short Course Swimming Championships.

Relevance: 30.00%

Abstract:

We propose a topological localization method based on optical flow information. We analyse the statistical characteristics of the optical flow signal and demonstrate that the flow vectors can be used to identify and describe key locations in the environment. The key locations (nodes) correspond to significant scene changes and depth discontinuities. Since optical flow vectors contain position, magnitude and angle information, for each node, we extract low and high order statistical moments of the vectors and use them as descriptors for that node. Once a database of nodes and their corresponding optical flow features is created, the robot can perform topological localization by using the Mahalanobis distance between the current frame and the database. This is supported by field trials, which illustrate the repeatability of the proposed method for detecting and describing key locations in indoor and outdoor environments in challenging and diverse lighting conditions.
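
As a rough illustration of the matching step in this abstract, the sketch below retrieves the database node closest to the current frame under the Mahalanobis distance. It simplifies the general case by assuming a diagonal covariance (independent features); the node names, descriptor values, and function names are hypothetical and do not come from the paper.

```python
import math


def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under a diagonal covariance (per-feature variances)."""
    return math.sqrt(
        sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    )


def localize(frame_features, node_db):
    """Return the name of the node whose descriptor is nearest to the frame."""
    return min(
        node_db,
        key=lambda name: mahalanobis_diag(frame_features, *node_db[name]),
    )


# Hypothetical database: node name -> (mean descriptor, per-feature variance).
NODE_DB = {
    "doorway": ([2.0, 0.5, 1.2], [0.25, 0.04, 0.09]),
    "corridor_end": ([6.0, 1.5, 0.3], [1.00, 0.25, 0.04]),
}

best_node = localize([2.3, 0.6, 1.0], NODE_DB)
```

With a full covariance matrix, the per-feature division would be replaced by multiplication with the inverse covariance.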

Relevance: 30.00%

Abstract:

We propose the use of optical flow information as a method for detecting and describing changes in the environment, from the perspective of a mobile camera. We analyze the characteristics of the optical flow signal and demonstrate how robust flow vectors can be generated and used for the detection of depth discontinuities and appearance changes at key locations. To successfully achieve this task, a full discussion on camera positioning, distortion compensation, noise filtering, and parameter estimation is presented. We then extract statistical attributes from the flow signal to describe the location of the scene changes. We also employ clustering and dominant shape of vectors to increase the descriptiveness. Once a database of nodes (where a node is a detected scene change) and their corresponding flow features is created, matching can be performed whenever nodes are encountered, such that topological localization can be achieved. We retrieve the most likely node according to the Mahalanobis and Chi-square distances between the current frame and the database. The results illustrate the applicability of the technique for detecting and describing scene changes in diverse lighting conditions, considering indoor and outdoor environments and different robot platforms.
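
The Chi-square distance mentioned in this abstract is commonly applied to histograms. The sketch below uses one widely used symmetric form, 0.5 * sum((a - b)^2 / (a + b)); whether the authors use this exact variant is an assumption, and the function name and example histograms are hypothetical.

```python
def chi_square_distance(hist_a, hist_b):
    """Symmetric chi-square distance between two histograms of equal length.

    Bins that are empty in both histograms contribute nothing.
    """
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


# Hypothetical normalized histograms of flow-vector angles.
current_frame = [0.5, 0.3, 0.2, 0.0]
stored_node = [0.4, 0.4, 0.2, 0.0]
distance = chi_square_distance(current_frame, stored_node)
```

A smaller distance indicates that the current frame is more similar to the stored node.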

Relevance: 30.00%

Abstract:

A method for localization and positioning in an indoor environment is presented. The method is based on representing the scene as a set of 2D views and predicting the appearances of novel views by linear combinations of the model views. The method is accurate under weak perspective projection. Analysis of this projection as well as experimental results demonstrate that in many cases it is sufficient to accurately describe the scene. When weak perspective approximation is invalid, an iterative solution to account for the perspective distortions can be employed. A simple algorithm for repositioning, the task of returning to a previously visited position defined by a single view, is derived from this method.

Relevance: 30.00%

Abstract:

Multiple sound sources often contain harmonics that overlap and may be degraded by environmental noise. The auditory system is capable of teasing apart these sources into distinct mental objects, or streams. Such an "auditory scene analysis" enables the brain to solve the cocktail party problem. A neural network model of auditory scene analysis, called the ARTSTREAM model, is presented to propose how the brain accomplishes this feat. The model clarifies how the frequency components that correspond to a given acoustic source may be coherently grouped together into distinct streams based on pitch and spatial cues. The model also clarifies how multiple streams may be distinguished and separated by the brain. Streams are formed as spectral-pitch resonances that emerge through feedback interactions between frequency-specific spectral representations of a sound source and its pitch. First, the model transforms a sound into a spatial pattern of frequency-specific activation across a spectral stream layer. The sound has multiple parallel representations at this layer. A sound's spectral representation activates a bottom-up filter that is sensitive to harmonics of the sound's pitch. The filter activates a pitch category which, in turn, activates a top-down expectation that allows one voice or instrument to be tracked through a noisy multiple-source environment. Spectral components are suppressed if they do not match harmonics of the top-down expectation that is read out by the selected pitch, thereby allowing another stream to capture these components, as in the "old-plus-new heuristic" of Bregman. Multiple simultaneously occurring spectral-pitch resonances can thereby emerge. These resonance and matching mechanisms are specialized versions of Adaptive Resonance Theory, or ART, which clarifies how pitch representations can self-organize during learning of harmonic bottom-up filters and top-down expectations.
The model also clarifies how spatial location cues can help to disambiguate two sources with similar spectral cues. Data are simulated from psychophysical grouping experiments, such as how a tone sweeping upwards in frequency creates a bounce percept by grouping with a downward sweeping tone due to proximity in frequency, even if noise replaces the tones at their intersection point. Illusory auditory percepts are also simulated, such as the auditory continuity illusion of a tone continuing through a noise burst even if the tone is not present during the noise, and the scale illusion of Deutsch whereby downward and upward scales presented alternately to the two ears are regrouped based on frequency proximity, leading to a bounce percept. Since related sorts of resonances have been used to quantitatively simulate psychophysical data about speech perception, the model strengthens the hypothesis that ART-like mechanisms are used at multiple levels of the auditory system. Proposals for developing the model to explain more complex streaming data are also provided.

Relevance: 30.00%

Abstract:

This paper describes the improvements achieved in our mosaicking system to assist unmanned underwater vehicle navigation. A major advance has been attained in the processing of images of the ocean floor when light absorption effects are evident. Due to the absorption of natural light, underwater vehicles often require artificial light sources attached to them to provide adequate illumination for processing underwater images. Unfortunately, these lights tend to illuminate the scene in a non-uniform fashion. In this paper, a technique to correct non-uniform lighting is proposed. The acquired frames are compensated through a point-by-point division of the image by an estimate of the illumination field. Then, the gray levels of the resulting image are remapped to enhance image contrast. Experiments with real images are presented.
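
The two steps named in this abstract (division by an illumination estimate, then gray-level remapping) can be sketched as follows. The abstract does not say how the illumination field is estimated or how gray levels are remapped, so the local-mean estimate (`box_blur`), the linear stretch to 0..255, and all names and parameters here are assumptions for illustration only.

```python
def box_blur(img, radius):
    """Assumed illumination estimate: local mean over a (2*radius+1)^2 window."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def correct_illumination(img, radius=1, eps=1e-6):
    """Divide each pixel by the illumination estimate, then stretch to 0..255."""
    field = box_blur(img, radius)
    ratio = [
        [p / (f + eps) for p, f in zip(row_p, row_f)]
        for row_p, row_f in zip(img, field)
    ]
    lo = min(min(row) for row in ratio)
    hi = max(max(row) for row in ratio)
    # A perfectly flat ratio image has no contrast to stretch; map it to 0.
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [[round((v - lo) * scale) for v in row] for row in ratio]


# Hypothetical 3x3 frame, brighter towards the bottom-right corner.
frame = [[10, 20, 30], [20, 40, 60], [30, 60, 90]]
corrected = correct_illumination(frame)
```

A larger `radius` gives a smoother illumination estimate at the cost of more computation per pixel.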

Relevance: 30.00%

Abstract:

This thesis investigates interactive scene reconstruction and understanding using RGB-D data only. We believe that, in the near future, depth cameras will remain a cheap, low-power 3D sensing alternative that is also suitable for mobile devices. Therefore, our contributions build on state-of-the-art approaches to achieve advances in three challenging scenarios, namely mobile mapping, large-scale surface reconstruction and semantic modeling. First, we describe an effective approach to Simultaneous Localization And Mapping (SLAM) on platforms with limited resources, such as a tablet device. Unlike previous methods, dense reconstruction is achieved by reprojection of RGB-D frames, while local consistency is maintained by deploying relative bundle adjustment principles. We show quantitative results comparing our technique to the state of the art, as well as detailed reconstructions of various environments ranging from rooms to small apartments. Then, we address large-scale surface modeling from depth maps, exploiting parallel GPU computing. We develop a real-time camera tracking method based on the popular KinectFusion system and an online surface alignment technique capable of counteracting drift errors and closing small loops. We show very high quality meshes that outperform existing methods on publicly available datasets as well as on data recorded with our RGB-D camera, even in complete darkness. Finally, we move to our Semantic Bundle Adjustment framework, which effectively combines object detection and SLAM in a unified system. Although the mathematical framework we describe is not restricted to a particular sensing technology, the experimental section again refers only to RGB-D sensing. We discuss successful implementations of our algorithm, showing the benefit of joint object detection, camera tracking and environment mapping.

Relevance: 30.00%

Abstract:

We propose a new Bayesian framework for automatically determining the position (location and orientation) of an uncalibrated camera using the observations of moving objects and a schematic map of the passable areas of the environment. Our approach takes advantage of static and dynamic information on the scene structures through prior probability distributions for object dynamics. The proposed approach restricts the plausible positions where the sensor can be located while taking into account the inherent ambiguity of the given setting. The proposed framework samples from the posterior probability distribution for the camera position via data-driven MCMC, guided by an initial geometric analysis that restricts the search space. A Kullback-Leibler divergence analysis then yields the final camera position estimate, while explicitly isolating ambiguous settings. The proposed approach is evaluated in synthetic and real environments, showing satisfactory performance in both ambiguous and unambiguous settings.

Relevance: 30.00%

Abstract:

In this article we describe a semantic localization dataset for indoor environments named ViDRILO. The dataset provides five sequences of frames acquired with a mobile robot in two similar office buildings under different lighting conditions. Each frame consists of a point cloud representation of the scene and a perspective image. The frames in the dataset are annotated with the semantic category of the scene, but also with the presence or absence of a list of predefined objects appearing in the scene. In addition to the frames and annotations, the dataset is distributed with a set of tools for its use in both place classification and object recognition tasks. The large number of labeled frames in conjunction with the annotation scheme make this dataset different from existing ones. The ViDRILO dataset is released for use as a benchmark for different problems such as multimodal place classification and object recognition, 3D reconstruction or point cloud data compression.

Relevance: 20.00%

Abstract:

This essay explores the political significance of Balinese death/thrash fandom. In the early 1990s, the emergence of a death/thrash scene in Bali paralleled growing criticism of accelerated tourism development on the island. Specifically, locals protested the increasing ubiquity of Jakarta, 'the centre', cast as threatening to an authentically 'low', peripheral Balinese culture. Similarly, death/thrash enthusiasts also gravitated toward certain fringes, although they rejected dominant notions of Balinese-ness by gesturing elsewhere, toward a global scene. The essay explores the ways in which death/thrash enthusiasts engaged with local discourses by coveting their marginality, and aims to demonstrate how their articulations of 'alien-ness' contributed in important ways to a broader regionalism.

Relevance: 20.00%

Abstract:

The promotion of alternative music by deregulated television and recording industries, together with the increasingly felt presence of the metropolis, converged on Balinese cultural and physical landscapes in the 1990s. Mirroring developments in broader society, a regionalist discourse, which polarized notions of ‘centre’ and ‘periphery’, emerged among Balinese youth in the context of the local band scene. For certain musicians, musical authenticity was firmly rooted in a cultural and geographical locale, and was articulated by their abhorrence for socializing at shopping malls. In contrast, these Balinese alternative (including punk) musicians sought authenticity in a metropolitan elsewhere. This article is a case study of the indigenization of a ‘global’ code in a non-western periphery. It contests arguments for the ‘post-imperial’ nature of globalization, and demonstrates the continued salience of centre–periphery dialectics in local discourses. At the same time, the study attests to the progressive role a metropolitan superculture can play in cultural renewal in the periphery.