952 resultados para Eye-Tracking
Resumo:
This paper investigates the effects of experience on the intuitiveness of physical and visual interactions performed by airport security screeners. Using portable eye tracking glasses, 40 security screeners were observed in the field as they performed search, examination and interface interactions during airport security x-ray screening. Data from semi structured interviews was used to further explore the nature of visual and physical interactions. Results show there are positive relationships between experience and the intuitiveness of visual and physical interactions performed by security screeners. As experience is gained, security screeners are found to perform search, examination and interface interactions more intuitively. In addition to experience, results suggest that intuitiveness is affected by the nature and modality of activities performed. This inference was made based on the dominant processing styles associated with search and examination activities. The paper concludes by discussing the implications that this research has for the design of visual and physical interfaces. We recommend designing interfaces that build on users’ already established intuitive processes, and that reduce the cognitive load incurred during transitions between visual and physical interactions.
Resumo:
At present, the most reliable method to obtain end-user perceived quality is through subjective tests. In this paper, the impact of automatic region-of-interest (ROI) coding on perceived quality of mobile video is investigated. The evidence, which is based on perceptual comparison analysis, shows that the coding strategy improves perceptual quality. This is particularly true in low bit rate situations. The ROI detection method used in this paper is based on two approaches: - (1) automatic ROI by analyzing the visual contents automatically, and; - (2) eye-tracking based ROI by aggregating eye-tracking data across many users, used to both evaluate the accuracy of automatic ROI detection and the subjective quality of automatic ROI encoded video. The perceptual comparison analysis is based on subjective assessments with 54 participants, across different content types, screen resolutions, and target bit rates while comparing the two ROI detection methods. The results from the user study demonstrate that ROI-based video encoding has higher perceived quality compared to normal video encoded at a similar bit rate, particularly in the lower bit rate range.
Resumo:
This thesis examined passengers' intuitive navigation in airports. It aims to ensure that passengers can navigate fast and efficiently through these complex environments. Field research was conducted at two Australian international airports. Participants wore eye-tracking glasses while finding their way through the terminal. Insight was gained into the intuitive use of navigation elements in the airport environment. With a detailed understanding of how passengers' navigate, the findings from this research can be used to improve airport design and planning. This will assist passengers who don't regularly fly as well as those who are frequent flyers.
Resumo:
A visual world eye-tracking study investigated the activation and persistence of implicit causality information in spoken language comprehension. We showed that people infer the implicit causality of verbs as soon as they encounter such verbs in discourse, as is predicted by proponents of the immediate focusing account (Greene & McKoon, 1995; Koornneef & Van Berkum, 2006; Van Berkum, Koornneef, Otten, & Nieuwland, 2007). Interestingly, we observed activation of implicit causality information even before people encountered the causal conjunction. However, while implicit causality information was persistent as the discourse unfolded, it did not have a privileged role as a focusing cue immediately at the ambiguous pronoun when people were resolving its antecedent. Instead, our study indicated that implicit causality does not affect all referents to the same extent, rather it interacts with other cues in the discourse, especially when one of the referents is already prominently in focus.
Resumo:
Recent evidence from adult pronoun comprehension suggests that semantic factors such as verb transitivity affect referent salience and thereby anap- hora resolution. We tested whether the same semantic factors influence pronoun comprehension in young children. In a visual world study, 3-year- olds heard stories that began with a sentence containing either a high or a low transitivity verb. Looking behaviour to pictures depicting the subject and object of this sentence was recorded as children listened to a subsequent sentence containing a pronoun. Children showed a stronger preference to look to the subject as opposed to the object antecedent in the low transitivity condition. In addition there were general preferences (1) to look to the subject in both conditions and (2) to look more at both potential antecedents in the high transitivity condition. This suggests that children, like adults, are affected by semantic factors, specifically semantic prominence, when interpreting anaphoric pronouns.
Resumo:
Advertising is ubiquitous in the online community and more so in the ever-growing and popular online video delivery websites (e. g., YouTube). Video advertising is becoming increasingly popular on these websites. In addition to the existing pre-roll/post-roll advertising and contextual advertising, this paper proposes an in-stream video advertising strategy-Computational Affective Video-in-Video Advertising (CAVVA). Humans being emotional creatures are driven by emotions as well as rational thought. We believe that emotions play a major role in influencing the buying behavior of users and hence propose a video advertising strategy which takes into account the emotional impact of the videos as well as advertisements. Given a video and a set of advertisements, we identify candidate advertisement insertion points (step 1) and also identify the suitable advertisements (step 2) according to theories from marketing and consumer psychology. We formulate this two part problem as a single optimization function in a non-linear 0-1 integer programming framework and provide a genetic algorithm based solution. We evaluate CAVVA using a subjective user-study and eye-tracking experiment. Through these experiments, we demonstrate that CAVVA achieves a good balance between the following seemingly conflicting goals of (a) minimizing the user disturbance because of advertisement insertion while (b) enhancing the user engagement with the advertising content. We compare our method with existing advertising strategies and show that CAVVA can enhance the user's experience and also help increase the monetization potential of the advertising content.
Resumo:
Regions in video streams attracting human interest contribute significantly to human understanding of the video. Being able to predict salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions and smart insertion of dialog (closed-caption text) into the video stream can significantly be improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency into yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical-and algorithm-based method gaze buffering is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.
Resumo:
My thesis studies how people pay attention to other people and the environment. How does the brain figure out what is important and what are the neural mechanisms underlying attention? What is special about salient social cues compared to salient non-social cues? In Chapter I, I review social cues that attract attention, with an emphasis on the neurobiology of these social cues. I also review neurological and psychiatric links: the relationship between saliency, the amygdala and autism. The first empirical chapter then begins by noting that people constantly move in the environment. In Chapter II, I study the spatial cues that attract attention during locomotion using a cued speeded discrimination task. I found that when the motion was expansive, attention was attracted towards the singular point of the optic flow (the focus of expansion, FOE) in a sustained fashion. The more ecologically valid the motion features became (e.g., temporal expansion of each object, spatial depth structure implied by distribution of the size of the objects), the stronger the attentional effects. However, compared to inanimate objects and cues, people preferentially attend to animals and faces, a process in which the amygdala is thought to play an important role. To directly compare social cues and non-social cues in the same experiment and investigate the neural structures processing social cues, in Chapter III, I employ a change detection task and test four rare patients with bilateral amygdala lesions. All four amygdala patients showed a normal pattern of reliably faster and more accurate detection of animate stimuli, suggesting that advantageous processing of social cues can be preserved even without the amygdala, a key structure of the “social brain”. People not only attend to faces, but also pay attention to others’ facial emotions and analyze faces in great detail. Humans have a dedicated system for processing faces and the amygdala has long been associated with a key role in recognizing facial emotions. In Chapter IV, I study the neural mechanisms of emotion perception and find that single neurons in the human amygdala are selective for subjective judgment of others’ emotions. Lastly, people typically pay special attention to faces and people, but people with autism spectrum disorders (ASD) might not. To further study social attention and explore possible deficits of social attention in autism, in Chapter V, I employ a visual search task and show that people with ASD have reduced attention, especially social attention, to target-congruent objects in the search array. This deficit cannot be explained by low-level visual properties of the stimuli and is independent of the amygdala, but it is dependent on task demands. Overall, through visual psychophysics with concurrent eye-tracking, my thesis found and analyzed socially salient cues and compared social vs. non-social cues and healthy vs. clinical populations. Neural mechanisms underlying social saliency were elucidated through electrophysiology and lesion studies. I finally propose further research questions based on the findings in my thesis and introduce my follow-up studies and preliminary results beyond the scope of this thesis in the very last section, Future Directions.
Resumo:
One of the great puzzles in the psychology of visual perception is that the visual world appears to be a coherent whole despite our viewing it through temporally discontinuous series of eye fixations. The investigators attempted to explain this puzzle from the perspective of sequential visual information integration. In recent years, investigators hypothesized that information maintained in the visual short-term memory (VSTM) could become visual mental images gradually during time delay in visual buffer and integrated with information perceived currently. Some elementary studies had been carried out to investigate the integration between VSTM and visual percepts, but further research is required to account for several questions on the spatial-temporal characteristics, information representation and mechanism of integrating sequential visual information. Based on the theory of similarity between visual mental image and visual perception, this research (including three studies) employed the temporal integration paradigm and empty cell localization task to further explore the spatial-temporal characteristics, information representation and mechanism of integrating sequential visual information (sequential arrays). The purpose of study 1 was to further explore the temporal characteristics of sequential visual information integration by examining the effects of encoding time of sequential stimuli on the integration of sequential visual information. The purpose of study 2 was to further explore the spatial characteristics of sequential visual information integration by investigating the effects of spatial characteristics change on the integration of sequential visual information. The purpose of study 3 was to explore the information representation of information maintained in the VSTM and integration mechanism in the process of integrating sequential visual information by employing the behavioral experiments and eye tracking technology. The results indicated that: (1) Sequential arrays could be integrated without strategic instruction. Increasing the duration of the first array could cause improvement in performance and increasing the duration of the second array could not improve the performance. Temporal correlation model was not fit to explain the sequential array integration under long-ISI conditions. (2) Stimuli complexity influenced not only the overall performance of sequential arrays but also the values of ISI at asymptotic level of performance. Sequential arrays still could be integrated when the spatial characteristics of sequential arrays changed. During ISI, constructing and manipulating of visual mental image of array 1 were two separate processing phases. (3) During integrating sequential arrays, people represented the pattern constituted by the objects' image maintained in the VSTM and the topological characteristics of the objects' image had some impact on fixation location. The image-perception integration hypothesis was supported when the number of dots in array 1 was less than empty cells, and the convert-and-compare hypothesis was supported when the number of the dot in array 1 was equal to or more than empty cells. These findings not only contribute to make people understand the process of sequential visual information integration better, but also have significant practical application in the design of visual interface.
Resumo:
As Levelt and Meyer (2000) noted, because studies of lexical access during multiword utterances production such as phrases and sentences, they raise two novel questions which studies of single word production do not. Firstly, does the access of different words in a sentence occur in a parallel or a serial fashion? Secondly, does the access of the different words in a sentence occur in an interactive or a discrete fashion? The latter question concerns the horizontal information flow (Smith & Wheeldon, 2004), which is a very important aspect of continuous speech production. A variant of the picture–word interference paradigm combining with eye-tracking technique and a dual task paradigm was used in 7 experiments to investigate the horizontal information flow of semantic and phonological information between nouns in spoken Mandarin Chinese sentences. The results suggested that: 1. Before speech onset, semantic information of different words accross the whole sentence has been activated, while phonological activation has been limited within the first phrase of the sentence. 2. Before speech onset, speaker will look ahead and check the semantic information of latter words as the first noun is beening processed, such looking ahead for phonological information can just occur within the first phrase of the sentence. 3. After speech onset, speaker will concentrate on the content words beyond the first one and will check the semantic information of other words with the same sentence. 4. The result suggested that the lexical accesses of multiple words during spoken sentence production are processed in a partly serial and partly parallel manner and stands for the Unit-by-Unit and Incremental view proposed by Levelt (2000). 5. The horizontal information flow during spoken sentence production is not an automatic process and is constrained by cognitive resource.
Resumo:
Mobile devices offer a common platform for both leisure and work-related tasks but this has resulted in a blurred boundary between home and work. In this paper we explore the security implications of this blurred boundary, both for the worker and the employer. Mobile workers may not always make optimum security-related choices when ‘on the go’ and more impulsive individuals may be particularly affected as they are considered more vulnerable to distraction. In this study we used a task scenario, in which 104 users were asked to choose a wireless network when responding to work demands while out of the office. Eye-tracking data was obtained from a subsample of 40 of these participants in order to explore the effects of impulsivity on attention. Our results suggest that impulsive people are more frequent users of public devices and networks in their day-to-day interactions and are more likely to access their social networks on a regular basis. However they are also likely to make risky decisions when working on-the-go, processing fewer features before making those decisions. These results suggest that those with high impulsivity may make more use of the mobile Internet options for both work and private purposes but they also show attentional behavior patterns that suggest they make less considered security-sensitive decisions. The findings are discussed in terms of designs that might support enhanced deliberation, both in the moment and also in relation to longer term behaviors that would contribute to a better work-life balance.
Resumo:
The advent of modern wireless technologies has seen a shift in focus towards the design and development of educational systems for deployment through mobile devices. The use of mobile phones, tablets and Personal Digital Assistants (PDAs) is steadily growing across the educational sector as a whole. Mobile learning (mLearning) systems developed for deployment on such devices hold great significance for the future of education. However, mLearning systems must be built around the particular learner’s needs based on both their motivation to learn and subsequent learning outcomes. This thesis investigates how biometric technologies, in particular accelerometer and eye-tracking technologies, could effectively be employed within the development of mobile learning systems to facilitate the needs of individual learners. The creation of personalised learning environments must enable the achievement of improved learning outcomes for users, particularly at an individual level. Therefore consideration is given to individual learning-style differences within the electronic learning (eLearning) space. The overall area of eLearning is considered and areas such as biometric technology and educational psychology are explored for the development of personalised educational systems. This thesis explains the basis of the author’s hypotheses and presents the results of several studies carried out throughout the PhD research period. These results show that both accelerometer and eye-tracking technologies can be employed as an Human Computer Interaction (HCI) method in the detection of student learning-styles to facilitate the provision of automatically adapted eLearning spaces. Finally the author provides recommendations for developers in the creation of adaptive mobile learning systems through the employment of biometric technology as a user interaction tool within mLearning applications. Further research paths are identified and a roadmap for future of research in this area is defined.
Resumo:
Existing work in Computer Science and Electronic Engineering demonstrates that Digital Signal Processing techniques can effectively identify the presence of stress in the speech signal. These techniques use datasets containing real or actual stress samples i.e. real-life stress such as 911 calls and so on. Studies that use simulated or laboratory-induced stress have been less successful and inconsistent. Pervasive, ubiquitous computing is increasingly moving towards voice-activated and voice-controlled systems and devices. Speech recognition and speaker identification algorithms will have to improve and take emotional speech into account. Modelling the influence of stress on speech and voice is of interest to researchers from many different disciplines including security, telecommunications, psychology, speech science, forensics and Human Computer Interaction (HCI). The aim of this work is to assess the impact of moderate stress on the speech signal. In order to do this, a dataset of laboratory-induced stress is required. While attempting to build this dataset it became apparent that reliably inducing measurable stress in a controlled environment, when speech is a requirement, is a challenging task. This work focuses on the use of a variety of stressors to elicit a stress response during tasks that involve speech content. Biosignal analysis (commercial Brain Computer Interfaces, eye tracking and skin resistance) is used to verify and quantify the stress response, if any. This thesis explains the basis of the author’s hypotheses on the elicitation of affectively-toned speech and presents the results of several studies carried out throughout the PhD research period. These results show that the elicitation of stress, particularly the induction of affectively-toned speech, is not a simple matter and that many modulating factors influence the stress response process. A model is proposed to reflect the author’s hypothesis on the emotional response pathways relating to the elicitation of stress with a required speech content. Finally the author provides guidelines and recommendations for future research on speech under stress. Further research paths are identified and a roadmap for future research in this area is defined.
Resumo:
The research project takes place within the technology acceptability framework which tries to understand the use made of new technologies, and concentrates more specifically on the factors that influence multi-touch devices’ (MTD) acceptance and intention to use. Why be interested in MTD? Nowadays, this technology is used in all kinds of human activities, e.g. leisure, study or work activities (Rogowski and Saeed, 2012). However, the handling or the data entry by means of gestures on multi-touch-sensitive screen imposes a number of constraints and consequences which remain mostly unknown (Park and Han, 2013). Currently, few researches in ergonomic psychology wonder about the implications of these new human-computer interactions on task fulfillment.This research project aims to investigate the cognitive, sensori-motor and motivational processes taking place during the use of those devices. The project will analyze the influences of the use of gestures and the type of gesture used: simple or complex gestures (Lao, Heng, Zhang, Ling, and Wang, 2009), as well as the personal self-efficacy feeling in the use of MTD on task engagement, attention mechanisms and perceived disorientation (Chen, Linen, Yen, and Linn, 2011) when confronted to the use of MTD. For that purpose, the various above-mentioned concepts will be measured within a usability laboratory (U-Lab) with self-reported methods (questionnaires) and objective indicators (physiological indicators, eye tracking). Globally, the whole research aims to understand the processes at stakes, as well as advantages and inconveniences of this new technology, to favor a better compatibility and adequacy between gestures, executed tasks and MTD. The conclusions will allow some recommendations for the use of the DMT in specific contexts (e.g. learning context).
Resumo:
Nowadays multi-touch devices (MTD) can be found in all kind of contexts. In the learning context, MTD availability leads many teachers to use them in their class room, to support the use of the devices by students, or to assume that it will enhance the learning processes. Despite the raising interest for MTD, few researches studying the impact in term of performance or the suitability of the technology for the learning context exist. However, even if the use of touch-sensitive screens rather than a mouse and keyboard seems to be the easiest and fastest way to realize common learning tasks (as for instance web surfing behaviour), we notice that the use of MTD may lead to a less favourable outcome. The complexity to generate an accurate fingers gesture and the split attention it requires (multi-tasking effect) make the use of gestures to interact with a touch-sensitive screen more difficult compared to the traditional laptop use. More precisely, it is hypothesized that efficacy and efficiency decreases, as well as the available cognitive resources making the users’ task engagement more difficult. Furthermore, the presented study takes into account the moderator effect of previous experiences with MTD. Two key factors of technology adoption theories were included in the study: familiarity and self-efficacy with the technology.Sixty university students, invited to a usability lab, are asked to perform information search tasks on an online encyclopaedia. The different tasks were created in order to execute the most commonly used mouse actions (e.g. right click, left click, scrolling, zooming, key words encoding…). Two different conditions were created: (1) MTD use and (2) laptop use (with keyboard and mouse). The cognitive load, self-efficacy, familiarity and task engagement scales were adapted to the MTD context. Furthermore, the eye-tracking measurement would offer additional information about user behaviours and their cognitive load.Our study aims to clarify some important aspects towards the usage of MTD and the added value compared to a laptop in a student learning context. More precisely, the outcomes will enhance the suitability of MTD with the processes at stakes, the role of previous knowledge in the adoption process, as well as some interesting insights into the user experience with such devices.