153 results for documentary video
Abstract:
Sound design for documentary is an under-researched field. The specific context of representation of emotional or mental states is particularly open to clichéd treatment. Such treatment in the media often ‘perpetuates inaccurate or negative assumptions about mental health issues in the wider community’ (Francis et al 2005: 11) by employing, for example, either jarring sound/music combinations to signify ‘madness’ or overtly saccharine music to educe sympathy. This project adopted a practice-based approach to discovering a considered aesthetic treatment designed to elicit a more empathetic audience response. A more discriminating engagement with the intentions of the film was cultivated by abandoning both the ‘representational naturalism’ and the ‘distilled, evocative realism’ of documentary sound design (Davies 2007: 18) in favour of a more lyrical or musical approach. To achieve this we manipulated perspective, tonal character and perceptions of space in the final mixing stage. The project was funded by the Film Australia National Interest Program, ABC TV and the Pacific Film and Television Commission. As a crucial contributor to the aesthetic of the project I was nominated in the funding application, and ultimately received an AFI Award for Best Sound in a Documentary in 2008. The film was honoured by The Film Critics Circle of Australia, The Slamdance Film Festival in Utah and The Sydney Film Festival. It has been favourably reviewed in national and international print media (The Age, Sydney Morning Herald, among others) as well as online film/culture zines and blogs.
Abstract:
'Delivery' (2005) was an installation work at MetroArts, Brisbane that incorporated drawings, paintings, video projections and temporary architectural structures. The work made central use of a mock public event, staged in a Gold Coast park by the artist. Documentary footage of the ambiguous event comprised one of the video projections and formed the basic iconographic palette upon which the rest of the works were based. Using 3D animation as well as conventional drawing and painting approaches, the works conveyed a palpable sense of fragmentation and social dislocation - a quality that was heightened by the reflective panels that bisected the exhibition space. The work was part of the MetroArts Artistic Program in 2005 and its video elements were included in the 2008 exhibition Video Ground, curated by Rachel O'Reilly for Multimedia Art Asia Pacific (MAAP)/Bangkok Experimental Film Festival (Touring show). The work was the subject of a feature article by Mark Pennings in Eyeline magazine, and also appeared on the front cover of that issue.
Abstract:
Prevailing video adaptation solutions change the quality of the video uniformly throughout the whole frame in the bitrate adjustment process, while region-of-interest (ROI)-based solutions selectively retain quality in the areas of the frame to which viewers are more likely to pay attention. ROI-based coding can improve perceptual quality and viewer satisfaction while trading off some bandwidth. However, there has been no comprehensive study to measure the bitrate vs. perceptual quality trade-off so far. The paper proposes an ROI detection scheme for videos, which is characterized by low computational complexity and robustness, and measures the bitrate vs. quality trade-off for ROI-based encoding using a state-of-the-art H.264/AVC encoder to justify the viability of this type of encoding method. The results from the subjective quality test reveal that ROI-based encoding achieves a significant perceptual quality improvement over encoding with uniform quality at the cost of slightly more bits. Based on the bitrate measurements and subjective quality assessments, bitrate and perceptual quality estimation models for non-scalable ROI-based video coding (AVC) are developed, which are found to be similar to the models for scalable video coding (SVC).
Abstract:
Video is commonly used as a method for recording embodied interaction for purposes of analysis and design and has been proposed as a useful ‘material’ for interaction designers to engage with. But video is not a straightforward reproduction of embodied activity – in themselves video recordings ‘flatten’ the space of embodied interaction, impose a perspective on unfolding action, and remove the embodied spatial and social context within which embodied interaction unfolds. This does not mean that video is not a useful medium with which to engage as part of a process of investigating and designing for embodied interaction – but crucially, it requires that designers, as people attempting to engage with video, engage and bring into play their own bodies and bodily understandings. This paper describes and reflects upon our experiences of engaging with video in two different activities as part of a larger research project investigating the design of gestural interfaces for a dental surgery context.
Abstract:
We address the problem of face recognition on video by employing the recently proposed probabilistic linear discriminant analysis (PLDA). The PLDA has been shown to be robust against pose and expression in image-based face recognition. In this research, the method is extended and applied to video, where image-set-to-image-set matching is performed. We investigate two approaches to computing similarities between image sets using the PLDA: the closest pair approach and the holistic sets approach. To better model face appearances in video, we also propose a heteroscedastic version of the PLDA which learns the within-class covariance of each individual separately. Our experiments on the VidTIMIT and Honda datasets show that the combination of the heteroscedastic PLDA and the closest pair approach achieves the best performance.
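The closest pair idea mentioned above can be illustrated with a minimal sketch. It assumes one plausible reading of the term: the similarity of two image sets is the best score found over all cross-set image pairs. The abstract does not define the approach in detail, so this reading is an assumption. Cosine similarity is used here only as a stand-in for a PLDA score, and the function names `cosine_similarity` and `closest_pair_score` are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Stand-in pairwise score (NOT a PLDA score; assumption for illustration)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-length feature vector")
    return dot / (norm_u * norm_v)


def closest_pair_score(
    set_a: Sequence[Vector],
    set_b: Sequence[Vector],
    score: Callable[[Vector, Vector], float] = cosine_similarity,
) -> float:
    """Set-to-set similarity taken as the best score over all cross-set pairs."""
    if not set_a or not set_b:
        raise ValueError("both image sets must be non-empty")
    return max(score(a, b) for a in set_a for b in set_b)
```

Passing a different `score` callable would let the same set-level logic wrap whatever pairwise model is actually in use.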
Abstract:
Introduction The suitability of video conferencing (VC) technology for clinical purposes relevant to geriatric medicine is still being established. This project aimed to determine the validity of the diagnosis of dementia via VC. Methods This was a multisite, noninferiority, prospective cohort study. Patients, aged 50 years and older, referred by their primary care physician for cognitive assessment, were assessed at 4 memory disorder clinics. All patients were assessed independently by 2 specialist physicians. They were allocated one face-to-face (FTF) assessment (Reference standard – usual clinical practice) and an additional assessment (either usual FTF assessment or a VC assessment) on the same day. Each specialist physician had access to the patient chart and the results of a battery of standardized cognitive assessments administered FTF by the clinic nurse. Percentage agreement (P0) and the weighted kappa statistic with linear weight (Kw) were used to assess inter-rater reliability across the 2 study groups on the diagnosis of dementia (cognition normal, impaired, or demented). Results The 205 patients were allocated to group: Videoconference (n = 100) or Standard practice (n = 105); 106 were men. The average age was 76 (SD 9, 51–95) and the average Standardized Mini-Mental State Examination Score was 23.9 (SD 4.7, 9–30). Agreement for the Videoconference group (P0= 0.71; Kw = 0.52; P < .0001) and agreement for the Standard Practice group (P0= 0.70; Kw = 0.50; P < .0001) were both statistically significant (P < .05). The summary kappa statistic of 0.51 (P = .84) indicated that VC was not inferior to FTF assessment. Conclusions Previous studies have shown that preliminary standardized assessment tools can be reliably administered and scored via VC. 
This study focused on the geriatric assessment component of the interview (interpretation of standardized assessments, taking a history and formulating a diagnosis by medical specialist) and identified high levels of agreement for diagnosing dementia. A model of service incorporating either local or remote administered standardized assessments, and remote specialist assessment, is a reliable process for enabling the diagnosis of dementia for isolated older adults.
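For readers unfamiliar with the two agreement statistics reported above, the following is a minimal sketch of how percentage agreement (P0) and a linearly weighted kappa (Kw) can be computed for two raters on an ordinal scale. The function name, the integer coding of categories (for example 0 = cognition normal, 1 = impaired, 2 = demented) and any input data are assumptions for illustration; this is not the study's analysis code.

```python
from typing import Sequence, Tuple


def agreement_and_linear_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], n_categories: int
) -> Tuple[float, float]:
    """Return (P0, Kw) for two raters using ordinal codes 0..n_categories-1."""
    n = len(rater_a)
    k = n_categories
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    if k < 2:
        raise ValueError("need at least two categories")
    if any(not 0 <= r < k for r in list(rater_a) + list(rater_b)):
        raise ValueError("ratings must lie in 0..n_categories-1")

    # Joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Percentage agreement: share of patients given the same category.
    p0 = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n

    # Linear disagreement weight |i - j| / (k - 1): 0 on the diagonal,
    # 1 for the most distant pair of categories.
    d_obs = sum(
        abs(i - j) / (k - 1) * observed[i][j] for i in range(k) for j in range(k)
    )
    d_exp = sum(
        abs(i - j) / (k - 1) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if d_exp == 0.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return p0, 1.0 - d_obs / d_exp
```

With linear weights, a disagreement between adjacent categories is penalised half as much as one between the two extreme categories on a three-level scale.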
Abstract:
Reliable automatic detection of anomalous human behaviour is one of the goals of smart surveillance research. Automatic detection addresses several human factor issues underlying existing surveillance systems. To create such a detection system, contextual information needs to be considered, because context is required in order to understand human behaviour correctly. Unfortunately, the use of contextual information is still limited in automatic anomalous human behaviour detection approaches. This paper proposes a context space model which has two benefits: (a) it provides guidelines for system designers to select information which can be used to describe context; (b) it enables a system to distinguish between different contexts. A comparative analysis is conducted between a context-based system which employs the proposed context space model and a system which is implemented based on one of the existing approaches. The comparison is applied to a scenario constructed using video clips from the CAVIAR dataset. The results show that the context-based system outperforms the other system. This is because the context space model allows the system to consider only knowledge learned from the relevant context.
Abstract:
A century ago, as the Western world embarked on a period of traumatic change, the visual realism of photography and documentary film brought print and radio news to life. The vision that these new mediums threw into stark relief was one of intense social and political upheaval: the birth of modernity fired and tempered in the crucible of the Great War. As millions died in this fiery chamber and the influenza pandemic that followed, lines of empires staggered to their fall, and new geo-political boundaries were scored in the raw, red flesh of Europe. The decade of 1910 to 1919 also heralded a prolific period of artistic experimentation. It marked the beginning of the social and artistic age of modernity and, with it, the nascent beginnings of a new art form: film. We still live in the shadow of this violent, traumatic and fertile age; haunted by the ghosts of Flanders and Gallipoli and its ripples of innovation and creativity. Something happened here, but to understand how and why is not easy; for the documentary images we carry with us in our collective cultural memory have become what Baudrillard refers to as simulacra. Detached from their referents, they have become referents themselves, to underscore other, grand narratives in television and Hollywood films. The personal histories of the individuals they represent so graphically–and their hope, love and loss–are folded into a national story that serves, like war memorials and national holidays, to buttress social myths and values. And, as filmic images cross-pollinate, with each iteration offering a new catharsis, events that must have been terrifying or wondrous are abstracted. In this paper we first discuss this transformation through reference to theories of documentary and memory–this will form a conceptual framework for a subsequent discussion of the short film Anmer. Produced by the first author in 2010, Anmer is a visual essay on documentary, simulacra and the symbolic narratives of history. 
Its form, structure and aesthetic speak of the confluence of documentary, history, memory and dream. Located in the first decade of the twentieth century, its non-linear narratives of personal tragedy and poetic dreamscapes are an evocative reminder of the distance between intimate experience, grand narratives, and the mythologies of popular films. This transformation of documentary sources not only played out in the processes of the film’s production, but also came to form its theme.
Abstract:
The time-consuming and labour-intensive task of identifying individuals in surveillance video is often challenged by poor resolution and the sheer volume of stored video. Faces or identifying marks such as tattoos are often too coarse for direct matching by machine or human vision. Object tracking and super-resolution can then be combined to facilitate the automated detection and enhancement of areas of interest. The object tracking process enables the automatic detection of people of interest, greatly reducing the amount of data for super-resolution. Smaller regions such as faces can also be tracked. A number of instances of such regions can then be utilized to obtain a super-resolved version for matching. Performance improvement from super-resolution is demonstrated using a face verification task. It is shown that there is a consistent improvement of approximately 7% in verification accuracy, using both Eigenface and Elastic Bunch Graph Matching approaches for automatic face verification, starting from faces with an eye-to-eye distance of 14 pixels. Visual improvement in image fidelity from super-resolved images over low-resolution and interpolated images is demonstrated on a small database. Current research and future directions in this area are also summarized.
Abstract:
Effective streaming of video can be achieved by providing more bits to the most important region in the frame at the cost of reduced bits in the less important regions. This strategy can be beneficial for delivering high quality videos to mobile devices, especially where available bandwidth is low and limited. While state-of-the-art video codecs such as H.264 may have been optimised for perceived quality, it is hypothesised that users will give more attention to interesting regions or objects when watching videos. Therefore, giving higher quality to the region of interest (ROI) while reducing the quality of other areas may improve the overall perceived quality without necessarily increasing the bitrate. In this paper, the impact of ROI-based encoded video on perceived quality is investigated by conducting a user study for various target bitrates. The results from the user study demonstrate that ROI-based video coding has superior perceived quality compared to normally encoded video at the same bitrate in the lower bitrate range.
Abstract:
Topographic structural complexity of a reef is highly correlated to coral growth rates, coral cover and overall levels of biodiversity, and is therefore integral in determining ecological processes. Modeling these processes commonly includes measures of rugosity obtained from a wide range of different survey techniques that often fail to capture rugosity at different spatial scales. Here we show that accurate estimates of rugosity can be obtained from video footage captured using underwater video cameras (i.e., monocular video). To demonstrate the accuracy of our method, we compared the results to in situ measurements of a 2 m x 20 m area of forereef from Glovers Reef atoll in Belize. Sequential pairs of images were used to compute fine scale bathymetric reconstructions of the reef substrate from which precise measurements of rugosity and reef topographic structural complexity can be derived across multiple spatial scales. To achieve accurate bathymetric reconstructions from uncalibrated monocular video, the position of the camera for each image in the video sequence and the intrinsic parameters (e.g., focal length) must be computed simultaneously. We show that these parameters can often be determined when the data exhibits parallax-type motion, and that rugosity and reef complexity can be accurately computed from existing video sequences taken from any type of underwater camera from any reef habitat or location. This technique provides an infinite array of possibilities for future coral reef research by providing a cost-effective and automated method of determining structural complexity and rugosity in both new and historical video surveys of coral reefs.
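As a rough illustration of deriving rugosity from a reconstructed surface at more than one spatial scale, the sketch below uses the common linear rugosity index: contour length along a height profile divided by the straight-line horizontal distance. The abstract does not state which rugosity measure the authors compute, so this definition, the 1-D height profile input and the function names are assumptions for illustration.

```python
import math
from typing import Dict, Sequence


def rugosity_index(heights: Sequence[float], spacing: float) -> float:
    """Contour length of a height profile divided by its horizontal length.

    `heights` are evenly spaced samples along a transect and `spacing` is the
    horizontal distance between samples. A flat profile gives 1.0.
    """
    if len(heights) < 2:
        raise ValueError("need at least two height samples")
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    contour = sum(
        math.hypot(spacing, h1 - h0) for h0, h1 in zip(heights, heights[1:])
    )
    planar = spacing * (len(heights) - 1)
    return contour / planar


def rugosity_at_scales(
    heights: Sequence[float], spacing: float, steps: Sequence[int]
) -> Dict[int, float]:
    """Rugosity after subsampling the profile every `step` samples.

    Larger steps ignore fine relief, giving a coarser-scale estimate.
    """
    return {
        step: rugosity_index(heights[::step], spacing * step) for step in steps
    }
```

Comparing the values returned for small and large steps gives a simple picture of how much of the measured complexity comes from fine-scale relief.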
Abstract:
Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high-performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion can exert a profound influence on robustness, while a suitable feature descriptor largely determines performance. Most attention in FER research to date has been paid to addressing variations in pose and illumination. No approach has been reported on handling face localization errors and relatively few on overcoming facial occlusions, although the significant impact of these two variations on performance has been proved and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arising from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid to recognizing naturalistic facial expressions on real-world data. This thesis investigates techniques for building robust and high-performance FER systems based on a number of established feature sets. It comprises contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. 
An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features and template matching is then performed to convert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance between different features and different feature selection algorithms, and examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied to discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry have different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing robustness and achieving high performance of FER systems, and putting them into real-world applications.
Abstract:
From a law enforcement standpoint, the ability to search for a person matching a semantic description (e.g. 1.8 m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.
Abstract:
In this paper a real-time vision based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segment fitting algorithm then follows, considering global and local information together with multiple collinear measurements. A GPU-boosted implementation of the algorithm is also investigated in the experiment. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.