Digital Library

56 results for Face recognition from video

Image, Video and 3D Data Registration: Medical, Satellite and Video Processing Applications with Quality Metrics

Relevance: 50.00%

Abstract:

Data registration refers to a series of techniques for matching or bringing similar objects or datasets together into alignment. These techniques enjoy widespread use in a diverse variety of applications, such as video coding, tracking, object and face detection and recognition, surveillance and satellite imaging, medical image analysis and structure from motion. Registration methods are as numerous as their manifold uses, from pixel level and block or feature based methods to Fourier domain methods.

This book is focused on providing algorithms and image and video techniques for registration and quality performance metrics. The authors provide various assessment metrics for measuring registration quality alongside analyses of registration techniques, introducing and explaining both familiar and state-of-the-art registration methodologies used in a variety of targeted applications.

Key features:
- Provides a state-of-the-art review of image and video registration techniques, allowing readers to develop an understanding of how well the techniques perform by using specific quality assessment criteria
- Addresses a range of applications from familiar image and video processing domains to satellite and medical imaging among others, enabling readers to discover novel methodologies with utility in their own research
- Discusses quality evaluation metrics for each application domain with an interdisciplinary approach from different research perspectives

Video Event Recognition by Dempster-Shafer Theory

Relevance: 50.00%

Abstract:

This paper presents an event recognition framework, based on Dempster-Shafer theory, that combines evidence of events from low-level computer vision analytics. The proposed method, which employs evidential network modelling of composite events, can represent the uncertainty of event output from low-level video analysis and infer high-level events with semantic meaning along with degrees of belief. The method has been evaluated on videos taken of subjects entering and leaving a seated area. This has relevance to a number of transport scenarios, such as onboard buses and trains, and also in train stations and airports. Recognition results of 78% and 100% for four composite events are encouraging.
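
As an illustration of the kind of evidence combination that Dempster-Shafer theory provides (not the paper's evidential network itself), here is a minimal Python sketch of Dempster's rule of combination. The two detectors, their mass values, and the names `combine`, `belief`, `ENTER`, `LEAVE`, and `THETA` are hypothetical and do not come from the paper.

```python
from itertools import product

# Hypothetical frame of discernment for a seated-area scenario.
ENTER = frozenset({"enter"})
LEAVE = frozenset({"leave"})
THETA = ENTER | LEAVE  # full frame: "either event, cannot tell which"


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


def belief(m, hypothesis):
    """Belief in a hypothesis: total mass of focal sets contained in it."""
    return sum(mass for focal, mass in m.items() if focal <= hypothesis)


# Assumed outputs of two low-level detectors (illustrative values only).
detector_a = {ENTER: 0.6, THETA: 0.4}
detector_b = {ENTER: 0.5, LEAVE: 0.2, THETA: 0.3}

fused = combine(detector_a, detector_b)
print({tuple(sorted(h)): round(m, 4) for h, m in fused.items()})
print("belief in 'enter':", round(belief(fused, ENTER), 4))
```

Mass left on the full frame `THETA` represents ignorance rather than evidence against either event, which is what lets this formalism express uncertainty in low-level outputs alongside degrees of belief.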

Emotion Recognition in Human-Computer Interaction

Relevance: 40.00%

Abstract:

The authors are concerned with the development of computer systems that are capable of using information from faces and voices to recognise people's emotions in real-life situations. The paper addresses the nature of the challenges that lie ahead, and provides an assessment of the progress that has been made in the areas of signal processing and analysis techniques (with regard to speech and face), and the psychological and linguistic analyses of emotion. Ongoing developmental work by the authors in each of these areas is described.

Programmable SoC processor for video object recognition and tracking applications

Relevance: 40.00%

Hidden Conditional Random Fields for Visual Speech Recognition

Relevance: 40.00%

Abstract:

In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs may be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, as the model is able to take more of a contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated-digit visual speech recognition task using comparisons with a baseline HMM system. First, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Second, we compare model performance using various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.

Robust Extraction of Text from Camera Images using Colour and Spatial Information Simultaneously

Relevance: 40.00%

Abstract:

The importance and use of text extraction from camera-based coloured scene images is increasing rapidly. Text within a camera-grabbed image can contain a large amount of metadata about the scene. Such metadata can be useful for identification, indexing and retrieval. While the segmentation and recognition of text from document images is quite successful, detection of coloured scene text is a new challenge for all camera-based images. Common problems for text extraction from camera-based images are the lack of prior knowledge of text features such as colour, font, size and orientation, as well as of the location of the probable text regions. In this paper, we document the development of a fully automatic and extremely robust text segmentation technique that can be used for any type of camera-grabbed frame, be it a single image or video. A new algorithm is proposed which can overcome the current problems of text segmentation. The algorithm exploits text appearance in terms of colour and spatial distribution. When the new text extraction technique was tested on a variety of camera-based images, it was found to outperform existing techniques. The proposed technique also overcomes problems that can arise from an unconstrained complex background. The novelty of the work arises from the fact that this is the first time that colour and spatial information are used simultaneously for the purpose of text extraction.

Reconstructing the dead man’s face - a violent death from Medieval Armagh

Relevance: 40.00%

Automatic recognition of emotion from voice: a rough benchmark

Relevance: 40.00%

Diagnostic accuracy and clinical management by realtime teledermatology. Results from the Northern Ireland arms of the UK Multicentre Teledermatology Trial

Relevance: 40.00%

Abstract:

Diagnostic accuracy and management recommendations of realtime teledermatology consultations using low-cost telemedicine equipment were evaluated. Patients were seen by a dermatologist over a video-link and a diagnosis and treatment plan were recorded. This was followed by a face-to-face consultation on the same day to confirm the earlier diagnosis and management plan. A total of 351 patients with 427 diagnoses participated. Sixty-seven per cent of the diagnoses made over the video-link agreed with the face-to-face diagnosis. Clinical management plans were recorded for 214 patients with 252 diagnoses. For this cohort, 44% of the patients were seen by the same dermatologist at both consultations, while 56% were seen by a different dermatologist. In 64% of cases the same management plan was recommended at both consultations; a sub-optimum treatment plan was recommended in 8% of cases; and in 9% of cases the video-link management plans were judged to be inappropriate. In 20% of cases the dermatologist was unable to recommend a suitable management plan by video-link. There were significant differences in the ability to recommend an optimum management plan by video-link when a different dermatologist made the reference management plan. The results indicate that a high proportion of dermatological conditions can be successfully managed by realtime teledermatology.

Preliminary results from the Northern Ireland arms of the UK Multicentre Teledermatology Trial: is clinical management by realtime teledermatology possible?

Relevance: 40.00%

Abstract:

Results from phase 1 of the UK Multicentre Teledermatology Trial demonstrated the diagnostic accuracy of realtime teledermatology using low-cost equipment. Phase 2 of the trial aimed to assess its effectiveness as a management tool for dermatological disease. Teledermatology consultations were organized between two health centres and two hospitals in Northern Ireland using low-cost videoconferencing equipment. For 205 patients seen by a dermatologist over the video-link a diagnosis and management plan were recorded. A subsequent face-to-face consultation was arranged on the same day to confirm the diagnosis and treatment regime. A comparison of these management plans revealed that the same plan was recommended in 64% of cases; the teledermatologist was unable to advocate a suitable management plan in 19% of cases; a suboptimal treatment plan was suggested by the teledermatologist in 6% of cases; and in 11% of cases, the teledermatologist suggested an inappropriate treatment plan. These findings indicate that appropriate clinical management was possible in approximately two-thirds of dermatology consultations via the video-link.

A connectionist perspective of the transition from Face-to-Face to online teaching in Higher Education

Relevance: 40.00%

Abstract:

Existing research shows a slow transition to online education by many university teaching staff. A mixed methods approach is used to survey teacher educators in three jurisdictions in the UK who have made the transition to online teaching, followed by focus group and individual interviews to triangulate the data. The eight tenets of connectivism are used as a lens for analysis. Findings reveal sound pedagogical reasons for the limited choice of online tools, and tutors highlight two elements, namely self-fulfilment and their desire to continually develop as an educator, as the rationale for adopting informal professional development in the 21st century.

Robust Audio-Visual Speech Recognition under Noisy Audio-Video Conditions

Relevance: 40.00%

Abstract:

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video stream, the audio stream, or both, using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison with another well-known dynamic stream weighting approach, and also compared with any fixed-weighted integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities, even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.

Auditory gist: Recognition of very short sounds from timbre cues

Relevance: 40.00%

Abstract:

Sounds such as the voice or musical instruments can be recognized on the basis of timbre alone. Here, sound recognition was investigated with severely reduced timbre cues. Short snippets of naturally recorded sounds were extracted from a large corpus. Listeners were asked to report a target category (e.g., sung voices) among other sounds (e.g., musical instruments). All sound categories covered the same pitch range, so the task had to be solved on timbre cues alone. The minimum duration for which performance was above chance was found to be short, on the order of a few milliseconds, with the best performance for voice targets. Performance was independent of pitch and was maintained when stimuli contained less than a full waveform cycle. Recognition was not generally better when the sound snippets were time-aligned with the sound onset compared to when they were extracted with a random starting time. Finally, performance did not depend on feedback or training, suggesting that the cues used by listeners in the artificial gating task were similar to those relevant for longer, more familiar sounds. The results show that timbre cues for sound recognition are available at a variety of time scales, including very short ones.

Angels with dirty faces? European identity, politics of representation and recognition of Romani interests

Relevance: 40.00%

Abstract:

The contradiction between acknowledgement of cultural differences and their accommodation in public has been a constant theme in studies of diverse societies. This review essay discusses five volumes that grapple with questions of Romani inclusion and the problems Roma face across Europe. The volumes under review point to problems faced by Romani communities and analyse the various legal, political and social challenges that the situation of the Roma poses to institutions of contemporary societies. The essay reviews the challenging nature of the status of Roma as we move away from one-sided towards more reciprocal engagement of the state with society in general, and with multiply excluded groups in particular. The essay finds that the role Roma play in these relationships is either over- or under-estimated by the literature, largely as a result of limited opportunities to acknowledge and, in effect, accommodate Roma, who are rarely understood as actors in their own right.

Perception and Automatic Recognition of Laughter from Whole-Body Motion: Continuous and Categorical Perspectives

Relevance: 40.00%

Abstract:

Despite its importance in social interactions, laughter remains little studied in affective computing. Intelligent virtual agents are often blind to users’ laughter and unable to produce convincing laughter themselves. Respiratory, auditory, and facial laughter signals have been investigated, but laughter-related body movements have received less attention. The aim of this study is threefold. First, to probe human laughter perception by analyzing patterns of categorisations of natural laughter animated on a minimal avatar. Results reveal that a low-dimensional space can describe perception of laughter “types”. Second, to investigate observers’ perception of laughter (hilarious, social, awkward, fake, and non-laughter) based on animated avatars generated from natural and acted motion-capture data. Significant differences in torso and limb movements are found between animations perceived as laughter and those perceived as non-laughter. Hilarious laughter also differs from social laughter. Different body movement features were indicative of laughter in sitting and standing avatar postures. Third, to investigate automatic recognition of laughter to the same level of certainty as observers’ perceptions. Results show that the recognition rates of the Random Forest model approach human rating levels. Classification comparisons and feature importance analyses indicate an improvement in recognition of social laughter when localized features and nonlinear models are used.
