985 results for Object vision


Relevance:

40.00%

Publisher:

Abstract:

The integration of CMOS cameras with embedded processors and wireless communication devices has enabled the development of distributed wireless vision systems. Wireless Vision Sensor Networks (WVSNs), which consist of wirelessly connected embedded systems with vision and sensing capabilities, open up a wide variety of application areas that could not be realized with wall-powered, wired vision systems or with scalar-data-based wireless sensor networks. In this paper, the design of a middleware for a wireless vision sensor node is presented for the realization of WVSNs. The implemented wireless vision sensor node is tested on a simple vision application to study and analyze its capabilities and to determine the challenges in distributed vision applications over a wireless network of low-power embedded devices. The results of this paper highlight the practical concerns in developing efficient image processing and communication solutions for WVSNs and emphasize the need for cross-layer solutions that unify these two so-far-independent research areas.

Relevance:

40.00%

Publisher:

Abstract:

Thesis in English. Blank pages removed from the PDF.

Relevance:

40.00%

Publisher:

Abstract:

In recent years, Deep Learning techniques have been shown to perform well on a large variety of problems in both Computer Vision and Natural Language Processing, reaching and often surpassing the state of the art on many tasks. The rise of deep learning is also revolutionizing the entire field of Machine Learning and Pattern Recognition, pushing forward the concepts of automatic feature extraction and unsupervised learning in general. However, despite its strong success in both science and business, deep learning has its own limitations. It is often questioned whether such techniques are merely brute-force statistical approaches that can only work in the context of High Performance Computing with huge amounts of data. Another important question is whether they are really biologically inspired, as claimed in certain cases, and whether they can scale well in terms of "intelligence". This dissertation focuses on answering these key questions in the context of Computer Vision and, in particular, Object Recognition, a task that has been transformed by recent advances in the field. Practically speaking, these answers are based on an exhaustive comparison between two very different deep learning techniques on this task: Convolutional Neural Networks (CNNs) and Hierarchical Temporal Memory (HTM). They represent two different approaches and points of view under the broad umbrella of deep learning and are well suited for understanding and pointing out the strengths and weaknesses of each. The CNN is considered one of the most classic and powerful supervised methods used today in machine learning and pattern recognition, especially in object recognition. CNNs are well received and accepted by the scientific community and are already deployed at large corporations such as Google and Facebook to solve face recognition and image auto-tagging problems. HTM, on the other hand, is a newly emerging, mainly unsupervised paradigm that is more biologically inspired. It draws on insights from computational neuroscience to incorporate concepts typical of the human brain, such as time, context, and attention, into the learning process. In the end, the thesis aims to show that in certain cases, with a smaller quantity of data, HTM can outperform CNN.

Relevance:

40.00%

Publisher:

Abstract:

In sports games, it is often necessary to perceive a large number of moving objects (e.g., the ball and players). In this context, the role of peripheral vision in processing motion information in the periphery is often discussed, especially when motor responses are required. To test the basic functionality of peripheral vision in such sports-game situations, a Multiple Object Tracking (MOT) task, which requires tracking a certain number of targets amidst distractors, was chosen. Participants' primary task was to recall four targets (out of 10 rectangular stimuli) after six seconds of quasi-random motion. As a secondary task, a button had to be pressed whenever a target change occurred (Exp 1: stop vs. form change to a diamond for 0.5 s; Exp 2: stop vs. slowdown for 0.5 s). The eccentricity of the changes (5-10° vs. 15-20°) was manipulated, and decision accuracy (recall and button press correct), motor response time, and saccadic reaction time were calculated as dependent variables. Results show that participants indeed used peripheral vision to detect changes, because in correct trials either no saccades or only very late saccades to the changed target were executed. Moreover, a saccade was more often executed when eccentricities were small. Response accuracies were higher and response times lower in the stop conditions of both experiments, while larger eccentricities led to higher response times in all conditions. In sum, monitoring targets and detecting changes can be accomplished by peripheral vision alone, and a monitoring strategy based on peripheral vision may be the optimal one, as saccades may incur certain costs. Further research is planned to address whether this functionality is also evident in sports tasks.

Relevance:

40.00%

Publisher:

Abstract:

In sports games, it is often necessary to perceive a large number of moving objects (e.g., the ball and players). In this context, the role of peripheral vision in processing motion information in the periphery is often discussed, especially when motor responses are required. To test the capability of using peripheral vision in such sports-game situations, a Multiple Object Tracking task, which requires tracking a certain number of targets amidst distractors, was chosen to determine the sensitivity of detecting target changes with peripheral vision only. Participants' primary task was to recall four targets (out of 10 rectangular stimuli) after six seconds of quasi-random motion. As a secondary task, a button had to be pressed whenever a target change occurred (Exp 1: stop vs. form change to a diamond for 0.5 s; Exp 2: stop vs. slowdown for 0.5 s). The eccentricity of the changes (5-10° vs. 15-20°) was manipulated; decision accuracy (recall and button press correct), motor response time, and saccadic reaction time (change onset to saccade onset) were calculated, and eye movements were recorded. Results show that participants indeed used peripheral vision to detect changes, because in correct trials either no saccades or only very late saccades to the changed target were executed. Moreover, a saccade was more often executed when eccentricities were small. Response accuracies were higher and response times lower in the stop conditions of both experiments, while larger eccentricities led to higher response times in all conditions. In sum, monitoring targets and detecting changes can be accomplished by peripheral vision alone, and a monitoring strategy based on peripheral vision may be the optimal one, as saccades may incur certain costs. Further research is planned to address whether this functionality is also evident in sports tasks.

Relevance:

40.00%

Publisher:

Abstract:

The current study investigates whether peripheral vision can be used to monitor multiple moving objects and to detect single-target changes. For this purpose, in Experiment 1, a modified MOT setup with a large projection and a constant-position centroid phase was first validated. Classical findings regarding the use of a virtual centroid to track multiple objects and the dependency of tracking accuracy on target speed were successfully replicated. The main experimental variations concerning the manipulation of to-be-detected target changes were then introduced in Experiment 2. In addition to a button press used for the detection task, gaze behavior was assessed using an integrated eye-tracking system. The analysis of saccadic reaction times in relation to the motor response shows that peripheral vision is naturally used to detect motion and form changes in MOT, because the saccade to the target occurred after target-change offset. Furthermore, for changes of comparable task difficulty, motion changes are detected better by peripheral vision than form changes. Findings indicate that capabilities of the visual system (e.g., visual acuity) affect change-detection rates and that covert-attention processes may be affected by vision-related aspects like spatial uncertainty. Moreover, it is argued that a centroid-MOT strategy might reduce saccade-related costs and that eye tracking is generally valuable for testing predictions derived from theories of MOT. Finally, implications for testing covert attention in applied settings are proposed.
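The key inference shared by these MOT studies, that a change was detected peripherally whenever the saccade to the changed target started only after the change had ended, can be made concrete as a small trial-level computation. The sketch below is only an illustration with hypothetical field names (the studies' actual data format is not given in the abstracts); it computes the saccadic reaction time (change onset to saccade onset) and flags trials consistent with purely peripheral detection.

```python
# A minimal sketch of the trial classification; field names are assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Trial:
    change_onset: float             # s, start of the 0.5 s target change
    change_offset: float            # s, end of the target change
    response_time: float            # s, button press relative to trial start
    saccade_onset: Optional[float]  # s, first saccade to the changed target (None if absent)
    correct: bool                   # recall and button press both correct

def peripheral_detection(trial: Trial) -> bool:
    """A change counts as detected peripherally if no saccade was made to the
    changed target, or the saccade started only after the change was over."""
    if not trial.correct:
        return False
    return trial.saccade_onset is None or trial.saccade_onset > trial.change_offset

def saccadic_reaction_time(trial: Trial) -> Optional[float]:
    """Change onset to saccade onset, as defined in the abstract above."""
    if trial.saccade_onset is None:
        return None
    return trial.saccade_onset - trial.change_onset

# Example: a correct trial whose saccade lands after change offset, so the
# change itself must have been picked up by peripheral vision.
t = Trial(change_onset=2.0, change_offset=2.5, response_time=2.9,
          saccade_onset=2.8, correct=True)
print(peripheral_detection(t), saccadic_reaction_time(t))  # True 0.8
```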

Relevance:

40.00%

Publisher:

Abstract:

The perception of an object as a single entity within a visual scene requires that its features are bound together and segregated from the background and/or other objects. Here, we used magnetoencephalography (MEG) to assess the hypothesis that coherent percepts may arise from the synchronized high-frequency (gamma) activity between neurons that code features of the same object. We also assessed the role of low-frequency (alpha, beta) activity in object processing. The target stimulus (i.e. object) was a small patch of a concentric grating of 3 c/°, viewed eccentrically. The background stimulus was either a blank field or a concentric grating of 3 c/° periodicity, viewed centrally. With patterned backgrounds, the target stimulus emerged, through rotation about its own centre, as a circular subsection of the background. Data were acquired using a 275-channel whole-head MEG system and analyzed using Synthetic Aperture Magnetometry (SAM), which allows one to generate images of task-related cortical oscillatory power changes within specific frequency bands. Significant oscillatory activity across a broad range of frequencies was evident at the V1/V2 border, and subsequent analyses were based on a virtual electrode at this location. When the target was presented in isolation, we observed that: (i) contralateral stimulation yielded a sustained power increase in gamma activity; and (ii) both contra- and ipsilateral stimulation yielded near identical transient power changes in alpha (and beta) activity. When the target was presented against a patterned background, we observed that: (i) contralateral stimulation yielded an increase in high-gamma (>55 Hz) power together with a decrease in low-gamma (40-55 Hz) power; and (ii) both contra- and ipsilateral stimulation yielded a transient decrease in alpha (and beta) activity, though the reduction tended to be greatest for contralateral stimulation. The opposing power changes across different regions of the gamma spectrum with 'figure/ground' stimulation suggest a possible dual role for gamma rhythms in visual object coding, and provide general support for the binding-by-synchronization hypothesis. As the power changes in alpha and beta activity were largely independent of the spatial location of the target, however, we conclude that their role in object processing may relate principally to changes in visual attention.
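SAM beamforming itself is beyond a short example, but the final step described above, quantifying task-related power change within a specific frequency band at a virtual electrode, can be sketched. The snippet below is an illustrative computation, not the study's pipeline: the band edges, window boundaries, and the bandpass-plus-Hilbert-envelope approach are all assumptions made for the sketch.

```python
# Illustrative only: percent power change in a band (here high gamma,
# 55-90 Hz) at a virtual electrode, active window vs. baseline window.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_power_change(data, fs, band, baseline, active):
    """Percent power change in `band` (Hz) between the `active` and
    `baseline` windows (each a (start, stop) tuple in seconds)."""
    b, a = butter(4, band, btype="bandpass", fs=fs)
    envelope = np.abs(hilbert(filtfilt(b, a, data))) ** 2  # instantaneous power
    t = np.arange(data.size) / fs
    p_base = envelope[(t >= baseline[0]) & (t < baseline[1])].mean()
    p_act = envelope[(t >= active[0]) & (t < active[1])].mean()
    return 100.0 * (p_act - p_base) / p_base

# Synthetic check: a 70 Hz component that doubles in amplitude after t = 1 s,
# so band power should roughly quadruple (about a +300% change).
fs = 600.0
t = np.arange(0, 2.0, 1 / fs)
sig = np.sin(2 * np.pi * 70 * t) * np.where(t < 1.0, 1.0, 2.0)
print(band_power_change(sig, fs, (55, 90), baseline=(0.0, 1.0), active=(1.0, 2.0)))
```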

Relevance:

30.00%

Publisher:

Abstract:

To navigate successfully in a previously unexplored environment, a mobile robot must be able to estimate the spatial relationships of the objects of interest accurately. A Simultaneous Localization and Mapping (SLAM) system employs its sensors to incrementally build a map of its surroundings and to localize itself in the map simultaneously. The aim of this research project is to develop a SLAM system suitable for self-propelled household lawnmowers. The proposed bearing-only SLAM system requires only an omnidirectional camera and some inexpensive landmarks. The main advantage of an omnidirectional camera is the panoramic view of all the landmarks in the scene. Placing landmarks in a lawn field to define the working domain is much easier and more flexible than installing the perimeter wire required by existing autonomous lawnmowers. The common approach of existing bearing-only SLAM methods relies on a motion model for predicting the robot's pose and a sensor model for updating the pose. In the motion model, the error on the estimates of object positions accumulates, due mainly to wheel slippage. Quantifying the uncertainty of object positions accurately is a fundamental requirement. In bearing-only SLAM, the Probability Density Function (PDF) of landmark position should be uniform along the observed bearing. Existing methods that approximate the PDF with a Gaussian estimate do not satisfy this uniformity requirement. This thesis introduces both geometric and probabilistic methods to address the above problems. The main novel contributions of this thesis are: 1. A bearing-only SLAM method that does not require odometry. The proposed method relies solely on the sensor model (landmark bearings only) without relying on the motion model (odometry). The uncertainty of the estimated landmark positions depends on the vision error only, instead of the combination of both odometry and vision errors. 2. The transformation of the spatial uncertainty of objects. This thesis introduces a novel method for translating the spatial uncertainty of objects estimated in a moving frame attached to the robot into the global frame attached to the static landmarks in the environment. 3. The characterization of an improved PDF for representing landmark position in bearing-only SLAM. The proposed PDF is expressed in polar coordinates, and the marginal probability on range is constrained to be uniform. Compared to a PDF estimated from a mixture of Gaussians, the PDF developed here has far fewer parameters and can easily be adopted in a probabilistic framework, such as a particle filtering system. The main advantages of the proposed bearing-only SLAM system are its lower production cost and flexibility of use. The proposed system can also be adopted in other domestic robots, such as vacuum cleaners or robotic toys, when the terrain is essentially 2D.
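Contribution 3 describes a landmark-position PDF in polar coordinates whose marginal over range is uniform, with the remaining uncertainty along the observed bearing reflecting vision error alone. A minimal sketch of drawing particles from such a PDF, with range limits and bearing noise that are assumptions for illustration, might look like this:

```python
# Sketch of the polar landmark PDF: uniform marginal over range,
# Gaussian over bearing (vision error only, no odometry).
import numpy as np

rng = np.random.default_rng(0)

def sample_landmark(bearing_obs, n=1000, r_min=0.5, r_max=10.0, sigma_bearing=0.02):
    """Draw particles for a landmark observed at `bearing_obs` (radians)
    in the robot frame; parameter values are illustrative assumptions."""
    r = rng.uniform(r_min, r_max, n)                  # uniform range marginal
    theta = rng.normal(bearing_obs, sigma_bearing, n) # Gaussian bearing error
    # Convert to Cartesian coordinates in the robot frame
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

particles = sample_landmark(np.deg2rad(30.0))
print(particles.mean(axis=0))  # particles spread along the 30-degree bearing ray
```

Such a particle set plugs directly into the particle-filtering framework the abstract mentions, which is the stated advantage over a mixture-of-Gaussians approximation.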

Relevance:

30.00%

Publisher:

Abstract:

Abandoned object detection (AOD) systems are required to run in high-traffic situations with high levels of occlusion. Such systems rely on background segmentation techniques to locate abandoned objects by detecting areas of motion that have stopped. This is often achieved by using a medium-term motion detection routine to detect long-term changes in the background. When AOD systems are integrated into a person tracking system, this often results in two separate motion detectors being used to handle the different requirements. We propose a motion detection system that is capable of detecting medium-term motion as well as regular motion. Multiple layers of medium-term (static) motion can be detected and segmented. We demonstrate the performance of this motion detection system on its own and as part of an abandoned object detection system.
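The abstract does not give the detector's internals, but one common way to obtain both regular and medium-term (static) motion from a single detector is to maintain two running-average backgrounds with different learning rates: the fast model absorbs stopped objects while the slow model remembers the true background. A sketch under those assumptions (rates and threshold are illustrative, not the paper's parameters):

```python
# Dual-rate background model: "moving" = differs from the fast model;
# "static" = absorbed by the fast model but still differs from the slow one.
import numpy as np

class DualRateMotionDetector:
    def __init__(self, fast=0.05, slow=0.001, thresh=25.0):
        self.fast, self.slow, self.thresh = fast, slow, thresh
        self.bg_fast = None   # adapts quickly: absorbs stopped objects
        self.bg_slow = None   # adapts slowly: remembers the true background

    def update(self, gray):
        """gray: (H, W) uint8 frame; returns (moving, static) bool masks."""
        f = gray.astype(np.float64)
        if self.bg_fast is None:
            self.bg_fast = f.copy()
            self.bg_slow = f.copy()
        self.bg_fast += self.fast * (f - self.bg_fast)
        self.bg_slow += self.slow * (f - self.bg_slow)
        moving = np.abs(f - self.bg_fast) > self.thresh   # regular motion
        static = (~moving) & (np.abs(f - self.bg_slow) > self.thresh)
        return moving, static
```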

Relevance:

30.00%

Publisher:

Abstract:

This paper presents an object tracking system that utilises a hybrid multi-layer motion segmentation and optical flow algorithm. While many tracking systems seek to combine multiple modalities, such as motion and depth, or multiple inputs within a fusion system to improve tracking robustness, current systems have avoided the combination of motion segmentation and optical flow. This combination allows the use of multiple modes within the object detection stage, so that different categories of objects, whether in motion or stationary, can be effectively detected using optical flow, static foreground, or active foreground information. The proposed system is evaluated using the ETISEO database and evaluation metrics and compared to a baseline system utilising a single-mode foreground segmentation technique. Results demonstrate that a significant improvement in tracking can be achieved through the incorporation of the additional motion information.
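As an illustration of the multi-mode detection idea, here is a hypothetical sketch (not the paper's algorithm) that labels each foreground region as moving or stationary from the mean optical-flow magnitude inside it; the region extraction method and the speed threshold are assumptions:

```python
# Label connected foreground regions using the flow field within them.
import numpy as np
from scipy import ndimage

def classify_regions(foreground, flow, min_speed=0.5):
    """foreground: (H, W) bool mask; flow: (H, W, 2) float array.
    Returns one (slice_y, slice_x, label) tuple per connected region."""
    labels, n = ndimage.label(foreground)
    speed = np.linalg.norm(flow, axis=2)     # per-pixel flow magnitude
    regions = []
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        region_mask = labels[sl] == i
        mean_speed = speed[sl][region_mask].mean()
        kind = "moving" if mean_speed > min_speed else "stationary"
        regions.append((sl[0], sl[1], kind))
    return regions
```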

Relevance:

30.00%

Publisher:

Abstract:

Performance evaluation of object tracking systems is typically performed after the data has been processed, by comparing tracking results to ground truth. While this approach is fine for offline testing, it does not allow real-time analysis of the system's performance, which may be of use for live systems, either to automatically tune the system or to report its reliability. In this paper, we propose three metrics that can be used to dynamically assess the performance of an object tracking system. Outputs and results from various stages in the tracking system are used to obtain measures that indicate the performance of motion segmentation, object detection, and object matching. The proposed dynamic metrics are shown to accurately indicate tracking errors when metric results are visually compared to tracking output, and they display trends similar to the ETISEO metrics when comparing different tracking configurations.
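The abstract does not define its three metrics, so the following is only an illustration of the general idea: per-frame indicators computed from intermediate tracker outputs, with no ground truth required. Both functions, and the box format they assume, are invented for the sketch.

```python
# Two hypothetical online indicators in the spirit of dynamic metrics.
import numpy as np

def segmentation_coverage(foreground, object_boxes):
    """Fraction of foreground pixels that fall inside some detected object
    box (boxes as (y0, y1, x0, x1)); a persistent drop suggests the
    detector is missing motion."""
    covered = np.zeros_like(foreground, dtype=bool)
    for (y0, y1, x0, x1) in object_boxes:
        covered[y0:y1, x0:x1] = True
    fg = foreground.sum()
    return float((foreground & covered).sum() / fg) if fg else 1.0

def matching_ratio(n_detections, n_matched_tracks):
    """Share of detections matched to an existing track; sudden drops can
    indicate matching failures or abrupt scene changes."""
    return n_matched_tracks / n_detections if n_detections else 1.0
```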

Relevance:

30.00%

Publisher:

Abstract:

Object tracking systems require accurate segmentation of objects from the background for effective tracking. Motion segmentation or optical flow can be used to segment incoming images. While optical flow allows multiple moving targets to be separated based on their individual velocities, optical flow techniques are prone to errors caused by changing lighting and occlusions, both common in a surveillance environment. Motion segmentation techniques are more robust to fluctuating lighting and occlusions, but do not provide information on the direction of the motion. In this paper we propose a combined motion segmentation/optical flow algorithm for use in object tracking. The proposed algorithm uses the motion segmentation results to inform the optical flow calculations, ensuring that optical flow is only calculated in regions of motion and improving the performance of the optical flow around the edges of moving objects. Optical flow is calculated at pixel resolution, and tracking of flow vectors is employed to improve performance and to detect discontinuities, which can indicate the location of overlaps between objects. The algorithm is evaluated by attempting to extract a moving target within the flow images, given expected horizontal and vertical movement (i.e., the algorithm's intended use for object tracking). Results show that the proposed algorithm outperforms other widely used optical flow techniques for this surveillance application.
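A hedged sketch of the motion-gating idea follows, using OpenCV's MOG2 background subtractor and Farneback flow as stand-ins for the paper's own motion segmentation and pixel-resolution flow: flow vectors are kept only inside the motion mask, so flow is never reported for static background.

```python
# Motion-gated dense optical flow (illustrative, not the paper's algorithm).
import cv2
import numpy as np

bg_sub = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
prev_gray = None

def motion_gated_flow(frame_bgr):
    """Return (flow, motion_mask) for one BGR frame in a video stream."""
    global prev_gray
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    mask = bg_sub.apply(frame_bgr) > 0          # motion segmentation
    if prev_gray is None:
        prev_gray = gray
        return np.zeros((*gray.shape, 2), np.float32), mask
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    flow[~mask] = 0.0                           # keep flow only where there is motion
    prev_gray = gray
    return flow, mask
```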

Relevance:

30.00%

Publisher:

Abstract:

This paper describes the real-time global vision system for the robot soccer team the RoboRoos. It has a highly optimised pipeline that includes thresholding, segmenting, colour normalising, object recognition, and perspective and lens correction. It has a fast 'paint' colour calibration system that can calibrate on any face of the YUV or HSI cube. It also autonomously selects an appropriate camera gain and colour gains from robot regions across the field to achieve colour uniformity. Camera geometry calibration is performed automatically from the selection of keypoints on the field. The system achieves a position accuracy of better than 15 mm over a 4 m × 5.5 m field and an orientation accuracy within 1°. It processes 614 × 480 pixels at 60 Hz on a 2.0 GHz Pentium 4 microprocessor.
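The thresholding stage of such pipelines is classically implemented with per-channel lookup tables, where a pixel belongs to a colour class when its class bit is set in all three channel tables. The sketch below illustrates that technique with invented classes and thresholds; it is not the RoboRoos calibration itself.

```python
# Lookup-table colour thresholding in YUV with per-class bitmasks.
import numpy as np

# One bitmask per channel value; a pixel is in a class when the class bit
# survives the AND of the three channel lookups.
y_lut = np.zeros(256, np.uint8)
u_lut = np.zeros(256, np.uint8)
v_lut = np.zeros(256, np.uint8)

def add_class(bit, y_range, u_range, v_range):
    """Register a colour class as a box in YUV space (ranges are assumed)."""
    y_lut[y_range[0]:y_range[1]] |= 1 << bit
    u_lut[u_range[0]:u_range[1]] |= 1 << bit
    v_lut[v_range[0]:v_range[1]] |= 1 << bit

add_class(0, (60, 220), (40, 120), (160, 240))   # bit 0: "ball" (high V, red-ish)
add_class(1, (40, 200), (60, 140), (40, 110))    # bit 1: "field" (low V, green-ish)

def classify(yuv):
    """yuv: (H, W, 3) uint8 image; returns a per-pixel class bitmask array."""
    return y_lut[yuv[..., 0]] & u_lut[yuv[..., 1]] & v_lut[yuv[..., 2]]
```

Three one-dimensional lookups and two ANDs per pixel is what makes this stage fast enough for full-frame colour classification at 60 Hz on modest hardware.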