870 results for audio-visual information


Relevance: 30.00%

Abstract:

Previous behavioral studies reported a robust effect of increased naming latencies when objects to be named were blocked within a semantic category, compared to items blocked between categories. This semantic context effect has been attributed to various mechanisms including inhibition or excitation of lexico-semantic representations and incremental learning of associations between semantic features and names, and is hypothesized to increase demands on verbal self-monitoring during speech production. Objects within categories also share many visual structural features, introducing a potential confound when interpreting the level at which the context effect might occur. Consistent with previous findings, we report a significant increase in response latencies when naming categorically related objects within blocks, an effect associated with increased perfusion fMRI signal bilaterally in the hippocampus and in the left middle to posterior superior temporal cortex. No perfusion changes were observed in the middle section of the left middle temporal cortex, a region associated with retrieval of lexical-semantic information in previous object naming studies. Although a manipulation of visual feature similarity did not influence naming latencies, we observed perfusion increases in the perirhinal cortex for naming objects with similar visual features that interacted with the semantic context in which objects were named. These results provide support for the view that the semantic context effect in object naming occurs due to an incremental learning mechanism, and involves increased demands on verbal self-monitoring.

Relevance: 30.00%

Abstract:

We have developed a Hierarchical Look-Ahead Trajectory Model (HiLAM) that incorporates the firing pattern of medial entorhinal grid cells in a planning circuit that includes interactions with hippocampus and prefrontal cortex. We show the model’s flexibility in representing large real-world environments using odometry information obtained from challenging video sequences. We acquire the visual data from a camera mounted on a small tele-operated vehicle. The camera has a panoramic field of view with its focal point approximately 5 cm above the ground level, similar to what would be expected from a rat’s point of view. Using established algorithms for calculating perceptual speed from the apparent rate of visual change over time, we generate raw dead reckoning information which loses spatial fidelity over time due to error accumulation. We rectify the loss of fidelity by exploiting the loop-closure detection ability of a biologically inspired robot navigation model termed RatSLAM. The rectified motion information serves as a velocity input to HiLAM to encode the environment in the form of grid cell and place cell maps. Finally, we show goal-directed path planning results of HiLAM in two different environments: an indoor square maze used in rodent experiments and an outdoor arena more than two orders of magnitude larger than the indoor maze. Together these results bridge, for the first time, the gap between higher fidelity bio-inspired navigation models (HiLAM) and more abstracted but highly functional bio-inspired robotic mapping systems (RatSLAM), and move from simulated environments into real-world studies in rodent-sized arenas and beyond.
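
As an illustration of the dead-reckoning and loop-closure ideas described above, the following minimal Python sketch (not the authors' code; all names are hypothetical) integrates per-frame speed and turn-rate estimates into a path and then spreads the accumulated drift along a detected loop.

```python
import math

def integrate_dead_reckoning(speeds, turn_rates, dt=0.1):
    """Integrate per-frame speed and turn-rate estimates (e.g. from visual
    odometry) into a 2-D path. Small per-frame errors accumulate as drift."""
    x, y, heading = 0.0, 0.0, 0.0
    path = [(x, y)]
    for v, w in zip(speeds, turn_rates):
        heading += w * dt
        x += v * dt * math.cos(heading)
        y += v * dt * math.sin(heading)
        path.append((x, y))
    return path

def apply_loop_closure(path, index_a, index_b):
    """Crude illustration of a loop-closure correction: if frames index_a and
    index_b are recognised as the same place, spread the accumulated error
    linearly along the intervening segment."""
    ax, ay = path[index_a]
    bx, by = path[index_b]
    ex, ey = bx - ax, by - ay          # accumulated drift over the loop
    corrected = list(path)
    n = max(index_b - index_a, 1)
    for i in range(index_a, index_b + 1):
        f = (i - index_a) / n
        px, py = path[i]
        corrected[i] = (px - f * ex, py - f * ey)
    return corrected
```

RatSLAM-style systems apply the correction through an experience map and graph relaxation rather than this simple linear interpolation, so the sketch only conveys the general principle.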

Relevance: 30.00%

Abstract:

The aim of spoken term detection (STD) is to find all occurrences of a specified query term in a large audio database. This process is usually divided into two steps: indexing and search. In a previous study, it was shown that knowing the topic of an audio document helps to improve the accuracy of the indexing step, which results in better performance for the STD system. In this paper, we propose the use of topic information not only in the indexing step but also in the search step. Results of our experiments show that topic information can also be used in the search step to improve STD accuracy.
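
One simple way topic information could enter the search step, sketched below under assumed data structures (the paper's actual method is not reproduced here), is to rescore candidate detections by combining the recogniser's confidence with a topic-conditioned prior on the query term.

```python
def rescore_detections(detections, doc_topics, topic_term_prob, alpha=0.7):
    """Combine the recogniser's detection score with a topic-based prior.

    detections      : list of (doc_id, term, acoustic_score in [0, 1])
    doc_topics      : doc_id -> topic label assigned at indexing time
    topic_term_prob : (topic, term) -> probability of the term under the topic
    alpha           : weight on the acoustic score (assumed value)
    """
    rescored = []
    for doc_id, term, score in detections:
        topic = doc_topics.get(doc_id)
        prior = topic_term_prob.get((topic, term), 1e-6)
        combined = alpha * score + (1.0 - alpha) * prior
        rescored.append((doc_id, term, combined))
    # Rank detections by the combined score
    return sorted(rescored, key=lambda d: d[2], reverse=True)
```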

Relevance: 30.00%

Abstract:

'We need to talk (Performance Space)' is a three-channel audio work with round table and custom cushions, examining the discursive framework of LEVEL as a feminist art collective. It was included in the exhibition 'Sexes', curated by Bec Dean, Jeff Khan and Deborah Kelly, at Performance Space. The audio works feature recontextualised excerpts from a series of dinner party conversations, which focused on the role of women and feminism in the 21st century. Placed in a specially constructed ‘lazy susan’, this audio installation speaks of the experience of sharing information, ideas and experiences ‘around the table’. The fabric patterns on the floor cushions have been designed from banners created in collective workshops with women in Brisbane and Melbourne, Australia, as a way of translating personal statements and political ideas into the everyday.

Relevance: 30.00%

Abstract:

As a social species in a constantly changing environment, humans rely heavily on the informational richness and communicative capacity of the face. Thus, understanding how the brain processes information about faces in real-time is of paramount importance. The N170 is a high temporal resolution electrophysiological index of the brain's early response to visual stimuli that is reliably elicited in carefully controlled laboratory-based studies. Although the N170 has often been reported to be of greatest amplitude to faces, there has been debate regarding whether this effect might be an artifact of certain aspects of the controlled experimental stimulation schedules and materials. To investigate whether the N170 can be identified in more realistic conditions with highly variable and cluttered visual images and accompanying auditory stimuli, we recorded EEG 'in the wild' while participants watched pop videos. Scene-cuts to faces generated a clear N170 response, and this was larger than the N170 to transitions where the videos cut to non-face stimuli. Within participants, wild-type face N170 amplitudes were moderately correlated with those observed in a typical laboratory experiment. Thus, we demonstrate that the face N170 is a robust and ecologically valid phenomenon and not an artifact arising as an unintended consequence of some property of the more typical laboratory paradigm.
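
A minimal sketch of the analysis idea, assuming a single EEG channel and known scene-cut times (not the authors' pipeline): epochs are cut around each scene-cut onset, baseline-corrected, and averaged, so that a face-related N170 would appear as a negativity around 170 ms after the cut.

```python
import numpy as np

def event_related_average(eeg, cut_samples, fs, pre=0.1, post=0.4):
    """Average EEG segments time-locked to scene-cut onsets.

    eeg         : 1-D array, one channel (e.g. an occipito-temporal electrode)
    cut_samples : sample indices of scene cuts to faces (or non-faces)
    fs          : sampling rate in Hz
    Returns the baseline-corrected event-related potential.
    """
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = []
    for s in cut_samples:
        if s - n_pre < 0 or s + n_post > len(eeg):
            continue                                # skip cuts near the edges
        epoch = eeg[s - n_pre : s + n_post].astype(float)
        epoch -= epoch[:n_pre].mean()               # baseline correction
        epochs.append(epoch)
    return np.mean(epochs, axis=0)
```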

Relevance: 30.00%

Abstract:

This paper outlines the approach taken by the Speech, Audio, Image and Video Technologies laboratory and the Applied Data Mining Research Group (SAIVT-ADMRG) in the 2014 MediaEval Social Event Detection (SED) task. We participated in the event-based clustering subtask (subtask 1), and focused on investigating the incorporation of image features as another source of data to aid clustering. In particular, we developed a descriptor based on super-pixel segmentation, which allows a low-dimensional feature incorporating both colour and texture information to be extracted and used within the popular bag-of-visual-words (BoVW) approach.
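
A rough sketch of such a super-pixel descriptor and BoVW encoding is given below; it uses SLIC segmentation and k-means purely for illustration, and the specific features and parameters are assumptions rather than those used in the submission.

```python
import numpy as np
from skimage.segmentation import slic
from skimage.color import rgb2gray
from sklearn.cluster import KMeans

def superpixel_descriptors(image, n_segments=200):
    """One low-dimensional descriptor per super-pixel: mean RGB colour plus a
    simple texture statistic (grey-level standard deviation)."""
    labels = slic(image, n_segments=n_segments, compactness=10)
    gray = rgb2gray(image)
    feats = []
    for lab in np.unique(labels):
        mask = labels == lab
        colour = image[mask].mean(axis=0)       # mean colour of the region
        texture = gray[mask].std()              # crude texture measure
        feats.append(np.concatenate([colour, [texture]]))
    return np.array(feats)

def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return the
    normalised bag-of-visual-words histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

# Codebook built offline on descriptors pooled from training images, e.g.:
# codebook = KMeans(n_clusters=100).fit(all_training_descriptors)
```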

Relevance: 30.00%

Abstract:

This thesis demonstrates that robots can learn about how the world changes, and can use this information to recognise where they are, even when the appearance of the environment has changed a great deal. The ability to localise in highly dynamic environments using vision only is a key tool for achieving long-term, autonomous navigation in unstructured outdoor environments. The proposed learning algorithms are designed to be unsupervised, and can be generated by the robot online in response to its observations of the world, without requiring information from a human operator or other external source.

Relevance: 30.00%

Abstract:

Purpose: We term the visual field position from which the pupil appears most nearly circular the pupillary circular axis (PCAx). The aim was to determine and compare the horizontal and vertical co-ordinates of the PCAx and the optical axis from pupil shape and refraction information for only the horizontal meridian of the visual field. Method: The PCAx was determined from the changes with visual field angle in the ellipticity and orientation of pupil images out to ±90° from fixation along the horizontal meridian for the right eyes of 30 people. This axis was compared with the optical axis determined from the changes in the astigmatic components of the refractions for field angles out to ±35° in the same meridian. Results: The mean estimated horizontal and vertical field co-ordinates of the PCAx were (‒5.3±1.9°, ‒3.2±1.5°) compared with (‒4.8±5.1°, ‒1.5±3.4°) for the optical axis. The vertical co-ordinates of the two axes were just significantly different (p = 0.03), but there was no significant correlation between them. Only the horizontal co-ordinate of the PCAx was significantly related to the refraction in the group. Conclusion: On average, the PCAx is displaced from the line-of-sight by about the same angle as the optical axis, but there is more inter-subject variation in the position of the optical axis. When modelling the optical performance of the eye, it appears reasonable to assume that the pupil is circular when viewed along the line-of-sight.
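
The analysis relies on the pupil's apparent aspect ratio falling off roughly as a cosine of the field angle measured from the PCAx. The sketch below uses an assumed two-parameter cosine model, not the study's actual fitting procedure, to show how the horizontal PCAx co-ordinate could be estimated from ellipticity measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def cosine_model(theta_deg, theta0_deg, k):
    """Simplified model: pupil aspect ratio (minor/major axis) falls off
    roughly as the cosine of the field angle measured from the PCAx offset
    theta0, with a scale factor k absorbing refraction effects."""
    return np.cos(np.radians((theta_deg - theta0_deg) / k))

def estimate_pcax(field_angles_deg, aspect_ratios):
    """Fit the model to measured ellipticities along the horizontal meridian;
    theta0 estimates the horizontal co-ordinate of the PCAx."""
    (theta0, k), _ = curve_fit(cosine_model, field_angles_deg, aspect_ratios,
                               p0=(0.0, 1.0))
    return theta0, k
```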

Relevance: 30.00%

Abstract:

The lateral amygdala (LA) receives information from auditory and visual sensory modalities, and uses this information to encode lasting memories that predict threat. One unresolved question about the amygdala is how multiple memories, derived from different sensory modalities, are organized at the level of neuronal ensembles. We previously showed that fear conditioning using an auditory conditioned stimulus (CS) was spatially allocated to a stable topography of neurons within the dorsolateral amygdala (LAd) (Bergstrom et al., 2011). Here, we asked how fear conditioning using a visual CS is topographically organized within the amygdala. To induce a lasting fear memory trace we paired either an auditory (2 kHz, 55 dB, 20 s) or visual (1 Hz, 0.5 s on/0.5 s off, 35 lux, 20 s) CS with a mild foot shock unconditioned stimulus (0.6 mA, 0.5 s). To detect learning-induced plasticity in amygdala neurons, we used immunohistochemistry with an antibody for phosphorylated mitogen-activated protein kinase (pMAPK). Using a principal components analysis-based approach to extract and visualize spatial patterns, we uncovered two unique spatial patterns of activated neurons in the LA that were associated with auditory and visual fear conditioning. The first spatial pattern was specific to auditory cued fear conditioning and consisted of activated neurons topographically organized throughout the LAd and ventrolateral nuclei (LAvl) of the LA. The second spatial pattern overlapped for auditory and visual fear conditioning and was comprised of activated neurons located mainly within the LAvl. Overall, the density of pMAPK-labeled cells throughout the LA was greatest in the auditory CS group, even though freezing in response to the visual and auditory CS was equivalent. There were no differences detected in the number of pMAPK-activated neurons within the basal amygdala nuclei. Together, these results provide the first basic knowledge about the organizational structure of two different fear engrams within the amygdala and suggest they are dissociable at the level of neuronal ensembles within the LA.
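
To illustrate the kind of principal components analysis used to extract spatial patterns (a generic sketch, not the study's code; the map binning and array shapes are assumptions), per-animal maps of labelled-cell density can be flattened and decomposed as follows.

```python
import numpy as np
from sklearn.decomposition import PCA

def spatial_activation_patterns(density_maps, n_components=2):
    """Extract dominant spatial patterns from per-animal maps of labelled-cell
    density (each map a 2-D grid of bins covering the LA).

    density_maps : array of shape (n_animals, n_y_bins, n_x_bins)
    Returns per-animal component scores and the component maps themselves.
    """
    n_animals, ny, nx = density_maps.shape
    flat = density_maps.reshape(n_animals, ny * nx)
    pca = PCA(n_components=n_components)
    scores = pca.fit_transform(flat)     # how strongly each animal expresses each pattern
    patterns = pca.components_.reshape(n_components, ny, nx)
    return scores, patterns
```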

Relevance: 30.00%

Abstract:

We present a method for calculating odometry in three dimensions for car-like ground vehicles with an Ackerman-like steering model. In our approach we use the information from a single camera to derive the odometry in the plane and fuse it with roll and pitch information derived from an on-board IMU to extend to three dimensions, thus providing odometric altitude as well as traditional x and y translation. We have mounted the odometry module on a standard Toyota Prado SUV and present results from a car-park environment as well as from an off-road track.
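
A simplified sketch of the planar-to-3-D lifting step is shown below; it uses only the IMU pitch angle (roll handling and any filtering used in the actual system are omitted), and all names are illustrative.

```python
import math

def planar_to_3d(planar_steps, pitches):
    """Lift planar (dx, dy, dyaw) visual-odometry increments into 3-D using
    pitch from an on-board IMU: project each step's travelled distance
    through the pitch angle to accumulate an odometric altitude."""
    x, y, z, yaw = 0.0, 0.0, 0.0, 0.0
    poses = [(x, y, z)]
    for (dx, dy, dyaw), pitch in zip(planar_steps, pitches):
        ds = math.hypot(dx, dy)            # planar distance travelled this step
        yaw += dyaw
        x += ds * math.cos(yaw) * math.cos(pitch)
        y += ds * math.sin(yaw) * math.cos(pitch)
        z += ds * math.sin(pitch)          # height gained or lost
        poses.append((x, y, z))
    return poses
```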

Relevance: 30.00%

Abstract:

Introduction. Social media is becoming a vital source of information in disaster or emergency situations. While a growing number of studies have explored the use of social media in natural disasters by emergency staff, military personnel, medical and other professionals, very few studies have investigated the use of social media by members of the public. The purpose of this paper is to explore citizens’ information experiences in social media during times of natural disaster. Method. A qualitative research approach was applied. Data was collected via in-depth interviews. Twenty-five people who used social media during a natural disaster in Australia participated in the study. Analysis. Audio recordings of interviews and interview transcripts provided the empirical material for data analysis. Data was analysed using structural and focussed coding methods. Results. Eight key themes depicting various aspects of participants’ information experience during a natural disaster were uncovered by the study: connected; wellbeing; coping; help; brokerage; journalism; supplementary; and characteristics. Conclusion. This study contributes insights into social media’s potential for developing community disaster resilience and promotes discussion about the value of civic participation in social media when such circumstances occur. These findings also contribute to our understanding of information experiences as a new informational research object.

Relevance: 30.00%

Abstract:

The research reported here addresses the problem of detecting and tracking independently moving objects from a moving observer in real-time, using corners as object tokens. Corners are detected using the Harris corner detector, and local image-plane constraints are employed to solve the correspondence problem. The approach relaxes the restrictive static-world assumption conventionally made, and is therefore capable of tracking independently moving and deformable objects. Tracking is performed without the use of any 3-dimensional motion model. The technique is novel in that, unlike traditional feature-tracking algorithms where feature detection and tracking is carried out over the entire image-plane, here it is restricted to those areas most likely to contain meaningful image structure. Two distinct types of instantiation regions are identified, these being the “focus-of-expansion” region and “border” regions of the image-plane. The size and location of these regions are defined from a combination of odometry information and a limited knowledge of the operating scenario. The algorithms developed have been tested on real image sequences taken from typical driving scenarios. Implementation of the algorithm using T800 Transputers has shown that near-linear speedups are achievable, and that real-time operation is possible (half-video rate has been achieved using 30 processing elements).
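
For illustration, a modern OpenCV equivalent of restricting corner detection to the instantiation regions is sketched below; it is not the original Transputer implementation, and the region format and parameter values are assumptions.

```python
import numpy as np
import cv2

def corners_in_regions(gray, regions, max_corners=200):
    """Detect Harris corners only inside the instantiation regions (e.g. the
    focus-of-expansion region and border regions of the image-plane), rather
    than over the whole image.

    gray    : single-channel uint8 image
    regions : list of (x, y, w, h) rectangles defining where to look
    """
    mask = np.zeros_like(gray)
    for x, y, w, h in regions:
        mask[y:y + h, x:x + w] = 255            # mark region as searchable
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=5,
                                      mask=mask, useHarrisDetector=True,
                                      k=0.04)
    return [] if corners is None else corners.reshape(-1, 2)
```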

Relevance: 30.00%

Abstract:

The literacy demands of mathematics are very different to those in other subjects (Gough, 2007; O'Halloran, 2005; Quinnell, 2011; Rubenstein, 2007) and much has been written on the challenges that literacy in mathematics poses to learners (Abedi and Lord, 2001; Lowrie and Diezmann, 2007, 2009; Rubenstein, 2007). In particular, a diverse selection of visuals typifies the field of mathematics (Carter, Hipwell and Quinnell, 2012), placing unique literacy demands on learners. Such visuals include varied tables, graphs, diagrams and other representations, all of which are used to communicate information.