Biblioteca Digital

164 resultados para Human visual processing

Mid-level concept learning with visual contextual ontologies and probabilistic inference for image annotation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To date, automatic recognition of semantic information such as salient objects and mid-level concepts from images is a challenging task. Since real-world objects tend to exist in a context within their environment, the computer vision researchers have increasingly incorporated contextual information for improving object recognition. In this paper, we present a method to build a visual contextual ontology from salient objects descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm is also proposed based on ontology relations and probabilistic inference. Different from most of the existing work, we specially exploit how to combine representation of ontology, contextual knowledge and probabilistic inference. The experiments show that image annotation results are improved in the LabelMe dataset.

Adaptive unsupervised learning of human actions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform the dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most of the current approaches rely on a non-realistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to deal with the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions which are continuously updated. Initially, all video segments are represented by Bags-Of-Words (BOW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied for updating the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well known Weizmann and KTH datasets to show the efficacy of the proposed approach.

Is there something quantum-like about the human mental lexicon?

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Following an early claim by Nelson & McEvoy suggesting that word associations can display `spooky action at a distance behaviour', a serious investigation of the potentially quantum nature of such associations is currently underway. In this paper quantum theory is proposed as a framework suitable for modelling the mental lexicon, specifically the results obtained from both intralist and extralist word association experiments. Some initial models exploring this hypothesis are discussed, and they appear to be capable of substantial agreement with pre-existing experimental data. The paper concludes with a discussion of some experiments that will be performed in order to test these models.

Structural health monitoring of bridges using acoustic emission technology and signal processing techniques

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Structural health monitoring (SHM) is the term applied to the procedure of monitoring a structure’s performance, assessing its condition and carrying out appropriate retrofitting so that it performs reliably, safely and efficiently. Bridges form an important part of a nation’s infrastructure. They deteriorate due to age and changing load patterns and hence early detection of damage helps in prolonging the lives and preventing catastrophic failures. Monitoring of bridges has been traditionally done by means of visual inspection. With recent developments in sensor technology and availability of advanced computing resources, newer techniques have emerged for SHM. Acoustic emission (AE) is one such technology that is attracting attention of engineers and researchers all around the world. This paper discusses the use of AE technology in health monitoring of bridge structures, with a special focus on analysis of recorded data. AE waves are stress waves generated by mechanical deformation of material and can be recorded by means of sensors attached to the surface of the structure. Analysis of the AE signals provides vital information regarding the nature of the source of emission. Signal processing of the AE waveform data can be carried out in several ways and is predominantly based on time and frequency domains. Short time Fourier transform and wavelet analysis have proved to be superior alternatives to traditional frequency based analysis in extracting information from recorded waveform. Some of the preliminary results of the application of these analysis tools in signal processing of recorded AE data will be presented in this paper.

Human brain regions required for the dividing and switching of attention between two features of a single object

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This combined PET and ERP study was designed to identify the brain regions activated in switching and divided attention between different features of a single object using matched sensory stimuli and motor response. The ERP data have previously been reported in this journal [64]. We now present the corresponding PET data. We identified partially overlapping neural networks with paradigms requiring the switching or dividing of attention between the elements of complex visual stimuli. Regions of activation were found in the prefrontal and temporal cortices and cerebellum. Each task resulted in different prefrontal cortical regions of activation lending support to the functional subspecialisation of the prefrontal and temporal cortices being based on the cognitive operations required rather than the stimuli themselves.

Improved detection and tracking of objects in surveillance video

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Surveillance networks are typically monitored by a few people, viewing several monitors displaying the camera feeds. It is then very difficult for a human operator to effectively detect events as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data, to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as a means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies. To recognize an event people and objects must be tracked. Tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or due to the detection routines being unable to split occluded regions into their component objects. Particle filters can be used as a self contained tracking system, and make it unnecessary for the task of detection to be carried out separately except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point each frame. Such systems however do not easily allow for multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter. A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow in surveillance video frames is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Evaluation du Traitement et de lInterpretation de Sequences vidEO - Evaluation for video understanding) database, and significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and maintenance of identity is handled by the underlying tracking system; and the tracking system is able to benefit from the improved performance in uncertain conditions arising from occlusion and noise provided by a particle filter. The system is evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either mode individually. Findings from the thesis contribute to improve the performance of semi-automated video processing and therefore improve security in areas under surveillance.

Measuring Visual Consistency in 3D Rendering Systems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the major challenges facing a present day game development company is the removal of bugs from such complex virtual environments. This work presents an approach for measuring the correctness of synthetic scenes generated by a rendering system of a 3D application, such as a computer game. Our approach builds a database of labelled point clouds representing the spatiotemporal colour distribution for the objects present in a sequence of bug-free frames. This is done by converting the position that the pixels take over time into the 3D equivalent points with associated colours. Once the space of labelled points is built, each new image produced from the same game by any rendering system can be analysed by measuring its visual inconsistency in terms of distance from the database. Objects within the scene can be relocated (manually or by the application engine); yet the algorithm is able to perform the image analysis in terms of the 3D structure and colour distribution of samples on the surface of the object. We applied our framework to the publicly available game RacingGame developed for Microsoft(R) Xna(R). Preliminary results show how this approach can be used to detect a variety of visual artifacts generated by the rendering system in a professional quality game engine.

Is there something quantum-like about the human mental lexicon?

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This talk proceeds from the premise that IR should engage in a more substantial dialogue with cognitive science. After all, how users decide relevance, or how they chose terms to modify a query are processes rooted in human cognition. Recently, there has been a growing literature applying quantum theory (QT) to model cognitive phenomena. This talk will survey recent research, in particular, modelling interference effects in human decision making. One aspect of QT will be illustrated - how quantum entanglement can be used to model word associations in human memory. The implications of this will be briefly discussed in terms of a new approach for modelling concept combinations. Tentative links to human adductive reasoning will also be drawn. The basic theme behind this talk is QT can potentially provide a new genre of information processing models (including search) more aligned with human cognition.

Fused HMM Adaptation of Synchronous HMMs for Audio-Visual Speaker Verification

Relevância:

30.00% 30.00%

Publicador:

Visual speech recognition across multiple views

Relevância:

30.00% 30.00%

Publicador:

Computer vision onboard UAVs for civilian tasks

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Computer vision is much more than a technique to sense and recover environmental information from an UAV. It should play a main role regarding UAVs’ functionality because of the big amount of information that can be extracted, its possible uses and applications, and its natural connection to human driven tasks, taking into account that vision is our main interface to world understanding. Our current research’s focus lays on the development of techniques that allow UAVs to maneuver in spaces using visual information as their main input source. This task involves the creation of techniques that allow an UAV to maneuver towards features of interest whenever a GPS signal is not reliable or sufficient, e.g. when signal dropouts occur (which usually happens in urban areas, when flying through terrestrial urban canyons or when operating on remote planetary bodies), or when tracking or inspecting visual targets—including moving ones—without knowing their exact UMT coordinates. This paper also investigates visual serving control techniques that use velocity and position of suitable image features to compute the references for flight control. This paper aims to give a global view of the main aspects related to the research field of computer vision for UAVs, clustered in four main active research lines: visual serving and control, stereo-based visual navigation, image processing algorithms for detection and tracking, and visual SLAM. Finally, the results of applying these techniques in several applications are presented and discussed: this study will encompass power line inspection, mobile target tracking, stereo distance estimation, mapping and positioning.

Image synthesis based on a model of human vision

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision.

Unmanned Aerial Vehicles UAVs attitude, height, motion estimation and control using visual systems

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents an implementation of an aircraft pose and motion estimator using visual systems as the principal sensor for controlling an Unmanned Aerial Vehicle (UAV) or as a redundant system for an Inertial Measure Unit (IMU) and gyros sensors. First, we explore the applications of the unified theory for central catadioptric cameras for attitude and heading estimation, explaining how the skyline is projected on the catadioptric image and how it is segmented and used to calculate the UAV’s attitude. Then we use appearance images to obtain a visual compass, and we calculate the relative rotation and heading of the aerial vehicle. Additionally, we show the use of a stereo system to calculate the aircraft height and to measure the UAV’s motion. Finally, we present a visual tracking system based on Fuzzy controllers working in both a UAV and a camera pan and tilt platform. Every part is tested using the UAV COLIBRI platform to validate the different approaches, which include comparison of the estimated data with the inertial values measured onboard the helicopter platform and the validation of the tracking schemes on real flights.

Labelled silhouettes for human pose estimation

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a new method of using foreground silhouette images for human pose estimation. Labels are introduced to the silhouette images, providing an extra layer of information that can be used in the model fitting process. The pixels in the silhouettes are labelled according to the corresponding body part in the model of the current fit, with the labels propagated into the silhouette of the next frame to be used in the fitting for the next frame. Both single and multi-view implementations are detailed, with results showing performance improvements over only using standard unlabelled silhouettes.

Communication of business process models via virtual environment simulations

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Games and related virtual environments have been a much-hyped area of the entertainment industry. The classic quote is that games are now approaching the size of Hollywood box office sales [1]. Books are now appearing that talk up the influence of games on business [2], and it is one of the key drivers of present hardware development. Some of this 3D technology is now embedded right down at the operating system level via the Windows Presentation Foundations – hit Windows/Tab on your Vista box to find out... In addition to this continued growth in the area of games, there are a number of factors that impact its development in the business community. Firstly, the average age of gamers is approaching the mid thirties. Therefore, a number of people who are in management positions in large enterprises are experienced in using 3D entertainment environments. Secondly, due to the pressure of demand for more computational power in both CPU and Graphical Processing Units (GPUs), your average desktop, any decent laptop, can run a game or virtual environment. In fact, the demonstrations at the end of this paper were developed at the Queensland University of Technology (QUT) on a standard Software Operating Environment, with an Intel Dual Core CPU and basic Intel graphics option. What this means is that the potential exists for the easy uptake of such technology due to 1. a broad range of workers being regularly exposed to 3D virtual environment software via games; 2. present desktop computing power now strong enough to potentially roll out a virtual environment solution across an entire enterprise. We believe such visual simulation environments can have a great impact in the area of business process modeling. Accordingly, in this article we will outline the communication capabilities of such environments, giving fantastic possibilities for business process modeling applications, where enterprises need to create, manage, and improve their business processes, and then communicate their processes to stakeholders, both process and non-process cognizant. The article then concludes with a demonstration of the work we are doing in this area at QUT.

«
1
2
3
4
5
6
7
8
9
10
11
»