944 results for Visual Speech Recognition, Multiple Views, Frontal View, Profile View


Relevance:

100.00% 100.00%

Publisher:

Abstract:

These three manuscripts are presented as a PhD dissertation on the use of GeoVis applications to evaluate telehealth programs. The primary aim of this research was to understand how GeoVis applications can be designed and developed by combining the Human Centered (HC) approach with Cognitive Fit Theory (CFT), and in turn used to evaluate a telehealth program in Brazil. First manuscript: The first manuscript presents background on the use of GeoVisualization to facilitate visual exploration of public health data. It covers the challenges associated with the adoption of existing GeoVis applications. It combines the principles of the HC approach and CFT into a framework that lays the foundation of this research. The framework is then used to propose the design, development and evaluation of “the SanaViz” to evaluate telehealth data in Brazil, as a proof of concept. Second manuscript: The second manuscript is a methods paper that describes the approaches that can be employed to design and develop “the SanaViz” based on the proposed framework. After the various elements of the HC approach and CFT are defined, a mixed-methods approach is used for the card sorting and sketching techniques. A representative sample of 20 study participants currently involved in the telehealth program at the NUTES telehealth center at UFPE, Recife, Brazil was enrolled. The findings of this manuscript helped us understand the needs of the diverse group of telehealth users and the tasks that they perform, and helped us determine the essential features that might need to be included in the proposed GeoVis application “the SanaViz”.
Third manuscript: The third manuscript used a mixed-methods approach to compare the effectiveness and usefulness of the HC GeoVis application “the SanaViz” against a conventional GeoVis application, “Instant Atlas”. The same group of 20 study participants who had participated in Aim 2 was enrolled, and a combination of quantitative and qualitative assessments was carried out. Effectiveness was gauged by the time that the participants took to complete the tasks using both GeoVis applications, the ease with which they completed the tasks and the number of attempts taken to complete each task. Usefulness was assessed with the System Usability Scale (SUS), a validated questionnaire tested in prior studies. In-depth interviews were conducted to gather opinions about both GeoVis applications. This manuscript helped us demonstrate the usefulness and effectiveness of HC GeoVis applications in facilitating visual exploration of telehealth data, as a proof of concept. Together, these three manuscripts represent the challenges of combining the principles of the Human Centered approach and Cognitive Fit Theory to design and develop GeoVis applications as a method to evaluate telehealth data. To our knowledge, this is the first study to explore the usefulness and effectiveness of GeoVis in facilitating visual exploration of telehealth data. The results of the research enabled us to develop a framework for the design and development of GeoVis applications related to public health, and especially telehealth. The results of our study showed the variety of users involved with the telehealth program and the tasks that they performed. Further, they enabled us to identify the components that might be essential to include in these GeoVis applications.
The results of our research yielded the following findings: (a) Telehealth users vary in their level of understanding of GeoVis. (b) Interaction features such as zooming, sorting, linking and multiple views, and representation features such as bar charts and choropleth maps, were considered the most essential features of the GeoVis applications. (c) Comparing and sorting were two important tasks that the telehealth users would perform for exploratory data analysis. (d) An HC GeoVis prototype application is more effective and useful for exploration of telehealth data than a conventional GeoVis application. Future studies should incorporate the proposed HC GeoVis framework to enable comprehensive assessment of the users and the tasks they perform, in order to identify the features that might need to be part of GeoVis applications. The results of this study demonstrate a novel approach to comprehensively and systematically enhancing the evaluation of telehealth programs using the proposed GeoVis framework.
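
The abstract above names the System Usability Scale (SUS) but does not say how the questionnaire was scored. As a rough illustration only, the sketch below applies the conventional SUS scoring rule (ten items rated 1 to 5, odd-numbered items positively worded, even-numbered items negatively worded). The function name `sus_score` is hypothetical, and nothing here is taken from the dissertation itself.

```python
def sus_score(responses):
    """Return a SUS score in the range 0-100 for ten responses on a 1-5 scale.

    Assumes the conventional SUS layout: odd-numbered items are positively
    worded and even-numbered items are negatively worded.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd item: higher rating is better
        else:
            total += 5 - rating   # even item: lower rating is better
    # Each item contributes 0-4, so the sum is 0-40; scale it to 0-100.
    return total * 2.5


if __name__ == "__main__":
    # One hypothetical participant's answers for one application.
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
```

Under this convention, a comparison such as the one described would score each participant's questionnaire once per application and then compare the two sets of scores.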

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Amnesic patients with early and seemingly isolated hippocampal injury show relatively normal recognition memory scores. The cognitive profile of these patients raises the possibility that this recognition performance is maintained mainly by stimulus familiarity in the absence of recollection of contextual information. Here we report electrophysiological data on the status of recognition memory in one of the patients, Jon. Jon's recognition of studied words lacks the event-related potential (ERP) index of recollection, viz., an increase in the late positive component (500–700 ms), under conditions that elicit it reliably in normal subjects. On the other hand, a decrease of the ERP amplitude between 300 and 500 ms, also reliably found in normal subjects, is well preserved. This so-called N400 effect has been linked to stimulus familiarity in previous ERP studies of recognition memory. In Jon, this link is supported by the finding that his recognized and unrecognized studied words evoked topographically distinct ERP effects in the N400 time window. These data suggest that recollection is more dependent on the hippocampal formation than is familiarity, consistent with the view that the hippocampal formation plays a special role in episodic memory, for which recollection is so critical.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

"COO-2118-0035."

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a corpus-based descriptive analysis of the most prevalent transfer effects and connected speech processes observed in a comparison of 11 Vietnamese English speakers (6 females, 5 males) and 12 Australian English speakers (6 males, 6 females) over 24 grammatical paraphrase items. The phonetic processes are segmentally labelled in terms of IPA diacritic features using the EMU speech database system, with the aim of labelling departures from native-speaker pronunciation. Prosodic features were analysed using the ToBI framework. The results show many phonetic and prosodic processes that make non-native speakers’ speech distinct from that of native speakers. The corpus-based methodology of analysing foreign accent may have implications for the evaluation of non-native accent, accented speech recognition and computer-assisted pronunciation learning.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this thesis, we introduce DeReEs-4V, an algorithm for unsupervised and automatic registration of two video frames captured by depth-sensing cameras. DeReEs-4V receives two RGBD video streams from two depth-sensing cameras arbitrarily located in an indoor space whose captured scenes overlap by at least 25%. The motivation of this research is to employ multiple depth-sensing cameras to enlarge the field of view and acquire more complete and accurate 3D information about the environment. A typical way to combine multiple views from different cameras is through manual calibration. However, this process is time-consuming and may require some technical knowledge. Moreover, calibration has to be repeated when the location or position of the cameras changes. In this research, we demonstrate how DeReEs-4V registration can be used to find the transformation of the view of one camera with respect to the other at interactive rates. Our algorithm automatically finds the 3D transformation to match the views from two cameras, requires no human intervention, and is robust to camera movements while capturing. To validate this approach, a thorough examination of the system performance under different scenarios is presented. The system presented here supports any application that might benefit from the wider field of view provided by the combined scene from both cameras, including applications in 3D telepresence, gaming, people tracking, videoconferencing and computer vision.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Here we use two filtered speech tasks to investigate children’s processing of slow (<4 Hz) versus faster (∼33 Hz) temporal modulations in speech. We compare groups of children with either developmental dyslexia (Experiment 1) or speech and language impairments (SLIs, Experiment 2) to groups of typically-developing (TD) children age-matched to each disorder group. Ten nursery rhymes were filtered so that their modulation frequencies were either low-pass filtered (<4 Hz) or band-pass filtered (22–40 Hz). Recognition of the filtered nursery rhymes was tested in a picture recognition multiple-choice paradigm. Children with dyslexia aged 10 years showed equivalent recognition overall to TD controls for both the low-pass and band-pass filtered stimuli, but showed significantly impaired acoustic learning during the experiment from low-pass filtered targets. Children with oral SLIs aged 9 years showed significantly poorer recognition of band-pass filtered targets compared to their TD controls, and showed comparable acoustic learning effects to TD children during the experiment. The SLI samples were also divided into children with and without phonological difficulties. The children with both SLI and phonological difficulties were impaired in recognizing both kinds of filtered speech. These data are suggestive of impaired temporal sampling of the speech signal at different modulation rates by children with different kinds of developmental language disorder. Both SLI and dyslexic samples showed impaired discrimination of amplitude rise times. Implications of these findings for a temporal sampling framework for understanding developmental language disorders are discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Purpose: a) multiply handicapped children have a high incidence of disorders affecting the visual system; b) assessment and management of visual disorders in this group of children presents a complex challenge; c) this study describes the results of visual function assessment in two children with neurological disability over a one-year period.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose a study of the mathematical properties of voice as an audio signal. This work includes signals in which the channel conditions are not ideal for emotion recognition. Multiresolution analysis (discrete wavelet transform) was performed using the Daubechies wavelet family (Db1-Haar, Db6, Db8, Db10), allowing the initial audio signal to be decomposed into sets of coefficients from which a set of features was extracted and analyzed statistically in order to differentiate emotional states. ANNs proved to be a system that allows an appropriate classification of such states. This study shows that the features extracted using wavelet decomposition are sufficient to analyze and extract emotional content in audio signals, giving a high accuracy rate in the classification of emotional states without the need for other kinds of classical frequency-time features. Accordingly, this paper seeks to characterize mathematically the six basic emotions in humans (boredom, disgust, happiness, anxiety, anger and sadness), plus neutrality, for a total of seven states to identify.
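
The decomposition step described in this abstract can be sketched with the simplest member of the family it lists, the Haar (Db1) wavelet. The sketch below is illustrative and makes several assumptions that are not in the abstract: the function names `haar_dwt` and `wavelet_features` are hypothetical, the sign convention for the detail coefficients is one of two in common use, and the per-level energy is only a stand-in for the statistical features, which the abstract does not specify.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar (Db1) discrete wavelet transform.

    Returns (approximation, detail) coefficient lists, each half the
    length of the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_features(signal, levels):
    """Energy of the detail coefficients at each level, then the energy of
    the final approximation (an assumed, illustrative feature set)."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


if __name__ == "__main__":
    samples = [4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0]  # toy "audio" frame
    print(wavelet_features(samples, levels=3))
```

Because the transform is orthonormal, the energies sum to the energy of the original frame, which gives a quick sanity check. A feature vector of this kind, computed per utterance, is the sort of input a classifier such as an ANN would receive.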

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Manual calibration of large and dynamic networks of cameras is labour intensive and time consuming. This is a strong motivator for the development of automatic calibration methods. Automatic calibration relies on the ability to find correspondences between multiple views of the same scene. If the cameras are sparsely placed, this can be a very difficult task. This PhD project focuses on the further development of uncalibrated wide baseline matching techniques.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, cognitive load analysis via acoustic- and CAN-Bus-based driver performance metrics is employed to assess two different commercial speech dialog systems (SDS) during in-vehicle use. Several metrics are proposed to measure increases in stress, distraction and cognitive load and we compare these measures with statistical analysis of the speech recognition component of each SDS. It is found that care must be taken when designing an SDS as it may increase cognitive load which can be observed through increased speech response delay (SRD), changes in speech production due to negative emotion towards the SDS, and decreased driving performance on lateral control tasks. From this study, guidelines are presented for designing systems which are to be used in vehicular environments.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Purpose: To investigate whether wearing different presbyopic vision corrections alters the pattern of eye and head movements when viewing and responding to driving-related traffic scenes. Methods: Participants included 20 presbyopes (mean age: 56.1 ± 5.7 years) who had no experience of wearing presbyopic vision corrections, apart from single vision (SV) reading spectacles. Each participant wore five different vision corrections: distance SV lenses, progressive addition spectacle lenses (PAL), bifocal spectacle lenses (BIF), monovision (MV) and multifocal contact lenses (MTF CL). For each visual condition, participants were required to view videotape recordings of traffic scenes, track a reference vehicle, and identify a series of peripherally presented targets. Digital numerical display panels were also included as near visual stimuli (simulating the visual displays of a vehicle speedometer and radio). Eye and head movements were measured, and the accuracy of target recognition was also recorded. Results: The path length of eye movements while viewing and responding to driving-related traffic scenes was significantly longer when wearing BIF and PAL than MV and MTF CL (both p ≤ 0.013). The path length of head movements was greater with SV, BIF, and PAL than MV and MTF CL (all p < 0.001). Target recognition and brake response times were not significantly affected by vision correction, whereas target recognition was less accurate when the near stimulus was located at eccentricities inferiorly and to the left, rather than directly below the primary position of gaze (p = 0.008), regardless of vision correction. Conclusions: Different presbyopic vision corrections alter eye and head movement patterns. The longer path length of eye and head movements and greater number of saccades associated with the spectacle presbyopic corrections may affect some aspects of driving performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Presbyopia affects individuals from the age of 45 years onwards, resulting in difficulty in accurately focusing on near objects. There are many optical corrections available, including spectacles and contact lenses, that are designed to enable presbyopes to see clearly at both far and near distances. However, presbyopic vision corrections also disturb aspects of visual function under certain circumstances. The impact of these changes on activities of daily living such as driving is, however, poorly understood. Therefore, the aim of this study was to determine which aspects of driving performance might be affected by wearing different types of presbyopic vision corrections. In order to achieve this aim, three experiments were undertaken. The first experiment involved administration of a questionnaire to compare the subjective driving difficulties experienced when wearing a range of common presbyopic contact lens and spectacle corrections. The questionnaire was developed and piloted, and included a series of items regarding difficulties experienced while driving under day and night-time conditions. Two hundred and fifty-five presbyopic patients responded to the questionnaire and were categorised into five groups, including those wearing no vision correction for driving (n = 50), bifocal spectacles (BIF, n = 54), progressive addition lens spectacles (PAL, n = 50), monovision (MV, n = 53) and multifocal contact lenses (MTF CL, n = 48). Overall, ratings of satisfaction during daytime driving were relatively high for all correction types. However, MV and MTF CL wearers were significantly less satisfied with aspects of their vision during night-time than daytime driving, particularly with regard to disturbances from glare and haloes.
Progressive addition lens wearers noticed more distortion of peripheral vision, while BIF wearers reported more difficulties with tasks requiring changes in focus and those who wore no vision correction for driving reported problems with intermediate and near tasks. Overall, the mean level of satisfaction for daytime driving was quite high for all of the groups (over 80%), with the BIF wearers being the least satisfied with their vision for driving. Conversely, at night, MTF CL wearers expressed the least satisfaction. Research into eye and head movements has become increasingly of interest in driving research as it provides a means of understanding how the driver responds to visual stimuli in traffic. Previous studies have found that wearing PAL can affect eye and head movement performance resulting in slower eye movement velocities and longer times to stabilize the gaze for fixation. These changes in eye and head movement patterns may have implications for driving safety, given that the visual tasks for driving include a range of dynamic search tasks. Therefore, the second study was designed to investigate the influence of different presbyopic corrections on driving-related eye and head movements under standardized laboratory-based conditions. Twenty presbyopes (mean age: 56.1 ± 5.7 years) who had no experience of wearing presbyopic vision corrections, apart from single vision reading spectacles, were recruited. Each participant wore five different types of vision correction: single vision distance lenses (SV), PAL, BIF, MV and MTF CL. For each visual condition, participants were required to view videotape recordings of traffic scenes, track a reference vehicle and identify a series of peripherally presented targets while their eye and head movements were recorded using the faceLAB® eye and head tracking system. Digital numerical display panels were also included as near visual stimuli (simulating the visual displays of a vehicle speedometer and radio). 
The results demonstrated that the path length of eye movements while viewing and responding to driving-related traffic scenes was significantly longer when wearing BIF and PAL than MV and MTF CL. The path length of head movements was greater with SV, BIF and PAL than MV and MTF CL. Target recognition was less accurate when the near stimulus was located at eccentricities inferiorly and to the left, rather than directly below the primary position of gaze, regardless of vision correction type. The third experiment aimed to investigate the real-world driving performance of presbyopes while wearing different vision corrections, measured on a closed-road circuit at night-time. Eye movements were recorded using the ASL Mobile Eye eye-tracking system (as the faceLAB® system proved to be impractical for use outside of the laboratory). Eleven participants (mean age: 57.25 ± 5.78 years) were fitted with four types of prescribed vision corrections (SV, PAL, MV and MTF CL). The measures of driving performance on the closed-road circuit included distance to sign recognition, near target recognition, peripheral light-emitting-diode (LED) recognition, low-contrast road hazard recognition and avoidance, recognition of all the road signs, time to complete the course, and driving behaviours such as braking, accelerating, and cornering. The results demonstrated that driving performance at night was most affected by MTF CL compared to PAL, resulting in shorter distances to read signs, slower driving speeds, and longer times spent fixating road signs. Monovision resulted in worse performance in the task of distance to read signs compared to SV and PAL. The SV condition resulted in significantly more errors in interpreting information from in-vehicle devices, despite longer time spent fixating on these devices. Progressive addition lenses were ranked as the most preferred vision correction, while MTF CL were the least preferred vision correction for night-time driving.
This thesis addressed the research question of how presbyopic vision corrections affect driving performance, and the results of the three experiments demonstrated that the different types of presbyopic vision corrections (e.g. BIF, PAL, MV and MTF CL) can affect driving performance in different ways. Distance-related driving tasks showed reduced performance with MV and MTF CL, while tasks which involved viewing in-vehicle devices were significantly hampered by wearing SV corrections. Wearing spectacles such as SV, BIF and PAL induced greater eye and head movements in the simulated driving condition; however, this did not directly translate to impaired performance on the closed-road circuit tasks. These findings are important for understanding the influence of presbyopic vision corrections on vision under real-world driving conditions. They will also assist the eye care practitioner to understand and convey to patients the potential driving difficulties associated with wearing certain types of presbyopic vision corrections and accordingly to support them in the process of matching patients to optical corrections which meet their visual needs.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), and analyse each stage of the cascade and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.
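
The first two stages of the cascade named in this abstract (2D-DCT followed by feature mean normalisation) can be illustrated with a small pure-Python sketch. It rests on assumptions not stated in the abstract: the function names are hypothetical, the orthonormal DCT-II is used, low-frequency coefficients are taken from the top-left square of the coefficient matrix (other selection patterns are common), and the LDA stage is omitted entirely.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            values[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        ))
    return out


def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def low_freq_features(block, size):
    """Keep the top-left size x size coefficients (assumed selection rule)."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(size) for v in range(size)]


def mean_normalise(frames):
    """Subtract the per-dimension mean, taken over all frames of one
    utterance, from every frame's feature vector."""
    count = len(frames)
    mean = [sum(column) / count for column in zip(*frames)]
    return [[value - m for value, m in zip(frame, mean)] for frame in frames]


if __name__ == "__main__":
    # Two toy 4x4 grayscale "mouth region" frames.
    frame_a = [[2.0] * 4 for _ in range(4)]
    frame_b = [[float(r + c) for c in range(4)] for r in range(4)]
    feats = [low_freq_features(f, size=2) for f in (frame_a, frame_b)]
    print(mean_normalise(feats))
```

Mean normalisation removes the static, speaker-dependent part of each feature, so what remains emphasises how the mouth region changes from frame to frame.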