944 resultados para Visual Speech Recognition, Multiple Views, Frontal View, Profile View


Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents an investigation into event detection in crowded scenes, where the event of interest co-occurs with other activities and only binary labels at the clip level are available. The proposed approach incorporates a fast feature descriptor from the MPEG domain, and a novel multiple instance learning (MIL) algorithm using sparse approximation and random sensing. MPEG motion vectors are used to build particle trajectories that represent the motion of objects in uniform video clips, and the MPEG DCT coefficients are used to compute a foreground map to remove background particles. Trajectories are transformed into the Fourier domain, and the Fourier representations are quantized into visual words using the K-Means algorithm. The proposed MIL algorithm models the scene as a linear combination of independent events, where each event is a distribution of visual words. Experimental results show that the proposed approaches achieve promising results for event detection compared to the state-of-the-art.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a long-term experiment where a mobile robot uses adaptive spherical views to localize itself and navigate inside a non-stationary office environment. The office contains seven members of staff and experiences a continuous change in its appearance over time due to their daily activities. The experiment runs as an episodic navigation task in the office over a period of eight weeks. The spherical views are stored in the nodes of a pose graph and they are updated in response to the changes in the environment. The updating mechanism is inspired by the concepts of long- and short-term memories. The experimental evaluation is done using three performance metrics which evaluate the quality of both the adaptive spherical views and the navigation over time.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a new multi-scale place recognition system inspired by the recent discovery of overlapping, multi-scale spatial maps stored in the rodent brain. By training a set of Support Vector Machines to recognize places at varying levels of spatial specificity, we are able to validate spatially specific place recognition hypotheses against broader place recognition hypotheses without sacrificing localization accuracy. We evaluate the system in a range of experiments using cameras mounted on a motorbike and a human in two different environments. At 100% precision, the multiscale approach results in a 56% average improvement in recall rate across both datasets. We analyse the results and then discuss future work that may lead to improvements in both robotic mapping and our understanding of sensory processing and encoding in the mammalian brain.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Significant investments in developing technological innovations have been made in the Australian beef industry but with low adoption rates. By modelling the key variables and their interactions in the innovation adoption process, this research seeks to demonstrate the complexity and dynamics of the process. This research uses causal loop modelling and develops a holistic model of the current innovation adoption system in the Australian beef industry to show the complexity of dynamic interactions among multiple variables. It is suggested that innovation adoption is such an extremely complex issue, and we need to shift our views on this issue from a paradigm of linear thinking to systems thinking. Innovation adoption is more likely to be enhanced based on a full understanding of the complexity and dynamics of the system as a whole. The paper demonstrates to practitioners and developers of innovation the multiple variables and interactions impacting innovation adoption.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Despite significant changes in mainstream journalism in recent decades, journalistic fields beyond the news have been little explored. In an attempt to contribute to a deeper understanding of such fields, this article examines the role perceptions of 85 Australian travel journalists. By viewing travel journalism as a distinct field of practice that is affected by a unique mix of influences, this study identifies five dimensions of practitioners’ role perceptions. These relate to travel journalists’ views of themselves as Cultural Mediators, Critics, Entertainers, Information Providers and Travellers. In addition, the study examines in some depth the ethical standards of travel journalists. Determinants of these views and standards are explored. The study argues that, in light of travel journalists’ increasingly important role in reporting about foreign places, more remains to be done to promote travel stories that show a deeper understanding of other cultures and which contain a more critical appraisal of destinations.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This research investigated the prevalence of vision disorders in Queensland Indigenous primary school children, creating the first comprehensive visual profile of Indigenous children. Findings showed reduced convergence ability and reduced visual information processing skills were more common in Indigenous compared to non-Indigenous children. Reduced visual information processing skills were also associated with reduced reading outcomes in both groups of children. As early detection of visual disorders is important, the research also reviewed the delivery of screening programs across Queensland and proposed a model for improved coordination and service delivery of vision screening to Queensland school children.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper introduces a minimalistic approach to produce a visual hybrid map of a mobile robot’s working environment. The proposed system uses omnidirectional images along with odometry information to build an initial dense posegraph map. Then a two level hybrid map is extracted from the dense graph. The hybrid map consists of global and local levels. The global level contains a sparse topological map extracted from the initial graph using a dual clustering approach. The local level contains a spherical view stored at each node of the global level. The spherical views provide both an appearance signature for the nodes, which the robot uses to localize itself in the environment, and heading information when the robot uses the map for visual navigation. In order to show the usefulness of the map, an experiment was conducted where the map was used for multiple visual navigation tasks inside an office workplace.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper investigates how neuronal activation for naming photographs of objects is influenced by the addition of appropriate colour or sound. Behaviourally, both colour and sound are known to facilitate object recognition from visual form. However, previous functional imaging studies have shown inconsistent effects. For example, the addition of appropriate colour has been shown to reduce antero-medial temporal activation whereas the addition of sound has been shown to increase posterior superior temporal activation. Here we compared the effect of adding colour or sound cues in the same experiment. We found that the addition of either the appropriate colour or sound increased activation for naming photographs of objects in bilateral occipital regions and the right anterior fusiform. Moreover, the addition of colour reduced left antero-medial temporal activation but this effect was not observed for the addition of object sound. We propose that activation in bilateral occipital and right fusiform areas precedes the integration of visual form with either its colour or associated sound. In contrast, left antero-medial temporal activation is reduced because object recognition is facilitated after colour and form have been integrated.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The value of information technology (IT) is often realized when continuously being used after users’ initial acceptance. However, previous research on continuing IT usage is limited for dismissing the importance of mental goals in directing users’ behaviors and for inadequately accommodating the group context of users. This in-progress paper offers a synthesis of several literature to conceptualize continuing IT usage as multilevel constructs and to view IT usage behavior as directed and energized by a set of mental goals. Drawing from the self-regulation theory in the social psychology, this paper proposes a process model, positioning continuing IT usage as multiple-goal pursuit. An agent-based modeling approach is suggested to further explore causal and analytical implications of the proposed process model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In previous work we introduced a method to update the reference views in a topological map so that a mobile robot could continue to localize itself in a changing environment using omni-directional vision. In this work we extend this longterm updating mechanism to incorporate a spherical metric representation of the observed visual features for each node in the topological map. Using multi-view geometry we are then able to estimate the heading of the robot, in order to enable navigation between the nodes of the map, and to simultaneously adapt the spherical view representation in response to environmental changes. The results demonstrate the persistent performance of the proposed system in a long-term experiment.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In this work, we introduce a method to update the reference views in a hybrid metrictopological map so that a mobile robot can continue to localize itself in a changing environment. The updating mechanism, based on the multi-store model of human memory, incorporates a spherical metric representation of the observed visual features for each node in the map, which enables the robot to estimate its heading and navigate using multi-view geometry, as well as representing the local 3D geometry of the environment. A series of experiments demonstrate the persistence performance of the proposed system in real changing environments, including analysis of the long-term stability.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The solutions proposed in this thesis contribute to improve gait recognition performance in practical scenarios that further enable the adoption of gait recognition into real world security and forensic applications that require identifying humans at a distance. Pioneering work has been conducted on frontal gait recognition using depth images to allow gait to be integrated with biometric walkthrough portals. The effects of gait challenging conditions including clothing, carrying goods, and viewpoint have been explored. Enhanced approaches are proposed on segmentation, feature extraction, feature optimisation and classification elements, and state-of-the-art recognition performance has been achieved. A frontal depth gait database has been developed and made available to the research community for further investigation. Solutions are explored in 2D and 3D domains using multiple images sources, and both domain-specific and independent modality gait features are proposed.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Purpose To quantify the effects of driver age on night-time pedestrian conspicuity, and to determine whether individual differences in visual performance can predict drivers' ability to recognise pedestrians at night. Methods Participants were 32 visually normal drivers (20 younger: M = 24.4 years ± 6.4 years; 12 older: M = 72.0 years ± 5.0 years). Visual performance was measured in a laboratory-based testing session including visual acuity, contrast sensitivity, motion sensitivity and the useful field of view. Night-time pedestrian recognition distances were recorded while participants drove an instrumented vehicle along a closed road course at night; to increase the workload of drivers, auditory and visual distracter tasks were presented for some of the laps. Pedestrians walked in place, sideways to the oncoming vehicles, and wore either a standard high visibility reflective vest or reflective tape positioned on the movable joints (biological motion). Results Driver age and pedestrian clothing significantly (p < 0.05) affected the distance at which the drivers first responded to the pedestrians. Older drivers recognised pedestrians at approximately half the distance of the younger drivers and pedestrians were recognised more often and at longer distances when they wore a biological motion reflective clothing configuration than when they wore a reflective vest. Motion sensitivity was an independent predictor of pedestrian recognition distance, even when controlling for driver age. Conclusions The night-time pedestrian recognition capacity of older drivers was significantly worse than that of younger drivers. The distance at which drivers first recognised pedestrians at night was best predicted by a test of motion sensitivity.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Empirical evidence suggests impaired facial emotion recognition in schizophrenia. However, the nature of this deficit is the subject of ongoing research. The current study tested the hypothesis that a generalized deficit at an early stage of face-specific processing (i.e. putatively subserved by the fusiform gyrus) accounts for impaired facial emotion recognition in schizophrenia as opposed to the Negative Emotion-specific Deficit Model, which suggests impaired facial information processing at subsequent stages. Event-related potentials (ERPs) were recorded from 11 schizophrenia patients and 15 matched controls while performing a gender discrimination and a facial emotion recognition task. Significant reduction of the face-specific vertex positive potential (VPP) at a peak latency of 165 ms was confirmed in schizophrenia subjects whereas their early visual processing, as indexed by P1, was found to be intact. Attenuated VPP was found to correlate with subsequent P3 amplitude reduction and to predict accuracy when performing a facial emotion discrimination task. A subset of ten schizophrenia patients and ten matched healthy control subjects also performed similar tasks in the magnetic resonance imaging scanner. Patients showed reduced blood oxygenation level-dependent (BOLD) activation in the fusiform, inferior frontal, middle temporal and middle occipital gyrus as well as in the amygdala. Correlation analyses revealed that VPP and the subsequent P3a ERP components predict fusiform gyrus BOLD activation. These results suggest that problems in facial affect recognition in schizophrenia may represent flow-on effects of a generalized deficit in early visual processing.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Previous neuroimaging research has attempted to demonstrate a preferential involvement of the human mirror neuron system (MNS) in the comprehension of effector-related action word (verb) meanings. These studies have assumed that Broca's area (or Brodmann's area 44) is the homologue of a monkey premotor area (F5) containing mouth and hand mirror neurons, and that action word meanings are shared with the mirror system due to a proposed link between speech and gestural communication. In an fMRI experiment, we investigated whether Broca's area shows mirror activity solely for effectors implicated in the MNS. Next, we examined the responses of empirically determined mirror areas during a language perception task comprising effector-specific action words, unrelated words and nonwords. We found overlapping activity for observation and execution of actions with all effectors studied, i.e., including the foot, despite there being no evidence of foot mirror neurons in the monkey or human brain. These "mirror" areas showed equivalent responses for action words, unrelated words and nonwords, with all of these stimuli showing increased responses relative to visual character strings. Our results support alternative explanations attributing mirror activity in Broca's area to covert verbalisation or hierarchical linearisation, and provide no evidence that the MNS makes a preferential contribution to comprehending action word meanings.