946 results for: multimedia


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor including temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set can achieve 0.71 in terms of retrieval success rate, which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
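The abstract above names temporal entropy and frequency bin entropy as parts of the local descriptor but does not define them. The following is a minimal sketch under an assumed definition: each entropy is the normalised Shannon entropy of a sub-region's energy, summed per time frame or per frequency bin. The function names and the list-of-rows layout are hypothetical, and the paper's actual formulation may differ.

```python
import math


def shannon_entropy(values):
    """Normalised Shannon entropy (range 0..1) of a non-negative sequence."""
    total = sum(values)
    n = len(values)
    if total <= 0 or n < 2:
        return 0.0
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(n)


def region_entropies(region):
    """Return (temporal entropy, frequency bin entropy) for one sub-region.

    `region` is assumed to be a list of rows, one row per frequency bin,
    each row holding the spectrogram energy for every time frame.
    """
    energy_per_frame = [sum(column) for column in zip(*region)]
    energy_per_bin = [sum(row) for row in region]
    return shannon_entropy(energy_per_frame), shannon_entropy(energy_per_bin)
```

Under this definition, energy spread evenly over a sub-region gives entropies near 1, while energy concentrated in a single frame and bin gives entropies of 0.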

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Current mobile devices and streaming video services support high definition (HD) video, increasing expectations for more content. HD video streaming generally requires large bandwidth, exerting pressure on existing networks. A new generation of video compression codecs, such as VP9 and H.265/HEVC, is expected to be more effective at reducing bandwidth. Existing studies measuring the impact of their compression on users’ perceived quality have not focused on mobile devices. Here we propose new Quality of Experience (QoE) models that consider both subjective and objective assessments of mobile video quality. We introduce novel predictors, such as the correlations between video resolution and size of coding unit, and achieve a high goodness-of-fit to the collected subjective assessment data (adjusted R-square >83%). The performance analysis shows that H.265 can potentially achieve 44% to 59% bit rate saving compared to H.264/AVC, slightly better than VP9 at 33% to 53%, depending on video content and resolution.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Large Display Arrays (LDAs) use Light Emitting Diodes (LEDs) to inform a viewing audience. A matrix of individually driven LEDs allows the area represented to display text, images and video. LDAs have undergone rapid development over the past 10 years in both the modular and semi-flexible formats. This thesis critically analyses the communication architecture and processor functionality of current LDAs and presents an alternative method, that is, Scalable Flexible Large Display Arrays (SFLDAs). SFLDAs are more adaptable to a variety of applications because of enhancements in scalability and flexibility. Scalability is the ability to configure SFLDAs from 0.8 m² to 200 m². Flexibility is increased functionality within the processors to handle changes in configuration and the use of a communication architecture that standardises two-way communication throughout the SFLDA. While common video platforms such as Digital Video Interface (DVI), Serial Digital Interface (SDI), and High Definition Multimedia Interface (HDMI) are considered as solutions for the communication architecture of SFLDAs, so too are modulation, fibre optic, capacitive coupling and Ethernet. From an analysis of these architectures, Ethernet was identified as the best solution. The use of Ethernet as the communication architecture in SFLDAs means that both hardware and software modules are capable of interfacing to the SFLDAs. The Video to Ethernet Processor Unit (VEPU), Scoreboard, Image and Control Software (SICS) and Ethernet to LED Processor Unit (ELPU) have been developed to form the key components in designing and implementing the first SFLDA. Data throughput rate and spectrophotometer tests were used to measure the effectiveness of Ethernet within the SFLDA constructs. The results of testing and analysis of these architectures showed that Ethernet satisfactorily met the requirements of SFLDAs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Rationale, aims and objectives: Patients with both cardiac disease and diabetes have poorer health outcomes than patients with only one chronic condition. While evidence indicates that internet-based interventions may improve health outcomes for patients with a chronic disease, there is no literature on internet programs specific to cardiac patients with comorbid diabetes. Therefore this study aimed to develop a specific web-based program, then to explore patients’ perspectives on the usefulness of the new program. Methods: An interpretive approach was used, with semi-structured interviews of a purposive sample of eligible patients with type 2 diabetes and a cardiac condition in a metropolitan hospital in Brisbane, Australia. Thematic analysis was undertaken to describe the perceived usefulness of a newly developed Heart2heart webpage. Results: Themes identified included confidence in hospital health professionals and reliance on doctors to manage conditions. Patients found the webpage useful for managing their conditions at home. Conclusions: The new Heart2heart webpage provided a positive and useful resource. Further research to determine the potential influence of this resource on patients’ self-management behaviours is paramount. Implications for practice include using multimedia strategies for providing information to patients with comorbidities of cardiac disease and type 2 diabetes, and further development and enhancement of such strategies.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A new digital story exploring female agency and violence against women. Launched this month, ‘Provocare’ is a multimedia verse thriller created by Meg Vann, writer; Mez Breeze, interaction designer; and Donna Hancox, research lead for Creative Industries at Queensland University of Technology (QUT). It is the first work to be commissioned and produced for ‘Queensland Writers on the International Stage’, an Arts Queensland funded programme created by QUT and The Writing Platform.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Australian Media Law details and explains the complex case law, legislation and regulations governing media practice in areas as diverse as journalism, advertising, multimedia and broadcasting. It examines the issues affecting traditional forms of media such as television, radio, film and newspapers as well as more recent forms such as the internet, online forums and digital technology, in a clear and accessible format. New additions to the fifth edition include: - the implications of new anti-terrorism legislation for journalists; - developments in privacy law, including Law Reform recommendations for a statutory cause of action to protect personal privacy in Australia and the expanding privacy jurisprudence in the United Kingdom and New Zealand; - liability for defamation of internet search engines and service providers; - the High Court decision in Roadshow v iiNet and the position of internet service providers in relation to copyright infringement via their services; - new suppression order regimes; - statutory reforms providing journalists with a rebuttable presumption of non-disclosure when called upon to reveal their sources in a court of law; - recent developments regarding whether journalists can use electronic devices to collect and disseminate information about court proceedings; - contempt committed by jurors via social media; and - an examination of recent decisions on defamation, confidentiality, vilification, copyright and contempt.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Frog species have been declining worldwide at unprecedented rates in the past decades. There are many reasons for this decline including pollution, habitat loss, and invasive species [1]. To preserve, protect, and restore frog biodiversity, it is important to monitor and assess frog species. In this paper, a novel method using image processing techniques for analyzing Australian frog vocalisations is proposed. An FFT is applied to audio data to produce a spectrogram. Then, acoustic events are detected and isolated into corresponding segments through image processing techniques applied to the spectrogram. For each segment, spectral peak tracks are extracted with selected seeds and a region growing technique is utilised to obtain the contour of each frog vocalisation. Based on spectral peak tracks and the contour of each frog vocalisation, six feature sets are extracted. Principal component analysis reduces each feature set down to six principal components which are tested for classification performance with a k-nearest neighbor classifier. This experiment tests the proposed method of classification on fourteen frog species which are geographically well distributed throughout Queensland, Australia. The experimental results show that the best average classification accuracy for the fourteen frog species can be up to 87%.
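The final classification stage described above uses a k-nearest neighbor classifier on reduced feature vectors. The following is a minimal sketch of that step alone, assuming the feature vectors have already been computed and reduced. The function name, the Euclidean distance, the default `k`, and the majority vote are illustrative assumptions, since the abstract does not state them.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest
    training vectors, using Euclidean distance (an assumed metric)."""
    neighbours = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```

For example, with three training vectors near the origin labelled `"species_a"` and three near (5, 5) labelled `"species_b"` (hypothetical labels), a query at (0.2, 0.1) is assigned `"species_a"`.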

Relevance:

10.00% 10.00%

Publisher:

Abstract:

There is a perceived tension in the relationship between the roles of art teacher and artist that led to the question: can an art teacher use their professional training and experience to establish an authentic artistic identity? This self-study tracked and analysed how the process of making her own art enabled an art teacher to also identify as an artist. Drawing on Lamina, the public exhibition of her multimedia artworks, the final exegesis proposes five conditions for art teachers in developing their own art practice: developing an identity as artist, using time and space mindfully, tolerating uncertainty, mentoring, and privileging the process.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

With the availability of a huge amount of video data from various sources, efficient video retrieval tools are increasingly in demand. Because video is multi-modal data, the perceptions of "relevance" between the user-provided query video (in the case of Query-By-Example video search) and the retrieved video clips are subjective in nature. We present an efficient video retrieval method that takes the user's feedback on the relevance of retrieved videos and iteratively reformulates the input query feature vectors (QFV) for improved video retrieval. The QFV reformulation is done by a simple but powerful feature weight optimization method based on the Simultaneous Perturbation Stochastic Approximation (SPSA) technique. A video retrieval system with video indexing, searching and relevance feedback (RF) phases is built to demonstrate the performance of the proposed method. The query and database videos are indexed using conventional video features such as color and texture. However, we use comprehensive and novel methods of feature representation, and a spatio-temporal distance measure, to retrieve the top M videos that are similar to the query. In the feedback phase, the user-activated iterative feedback on the previously retrieved videos is used to automatically reformulate the QFV weights (a measure of importance) so that they reflect the user's preference. It is our observation that a few iterations of such feedback are generally sufficient for retrieving the desired video clips. The novel application of SPSA-based RF for user-oriented feature weight optimization makes the proposed method distinct from existing ones. The experimental results show that the proposed RF-based video retrieval exhibits good performance.
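To illustrate the weight optimization step, the following is a minimal sketch of a generic SPSA loop, not the authors' implementation. SPSA perturbs all weights at once with a random ±1 vector and estimates the gradient from just two loss evaluations per iteration. The gain-sequence constants, the function names, and the quadratic `toy_loss` are assumptions made for illustration. In the system described above, the loss would instead be derived from the user's relevance feedback.

```python
import random


def spsa_minimise(loss, weights, iterations=200, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimise `loss` over a weight vector with a basic SPSA loop."""
    rng = random.Random(seed)
    w = list(weights)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb every weight simultaneously with a random +/-1 direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in w]
        plus = [wi + c_k * di for wi, di in zip(w, delta)]
        minus = [wi - c_k * di for wi, di in zip(w, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations yield a gradient estimate for all weights.
        w = [wi - a_k * diff / (2.0 * c_k * di) for wi, di in zip(w, delta)]
    return w


def toy_loss(w, target=(0.6, 0.3, 0.1)):
    """Hypothetical stand-in for feedback: squared distance to preferred weights."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))
```

Calling `spsa_minimise(toy_loss, [1/3, 1/3, 1/3])` moves the equal starting weights toward the hypothetical preferred weights. The sketch does not clip or renormalise the weights, which a real retrieval system might need to do.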

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. The use of visual information in the form of lip movements of the speaker in addition to audio, the use of the topic of the speech segments, and the expected frequency of words in the target speech domain are proposed. By using this complementary information, an improvement in the performance of STD has been achieved, which enables efficient search of key words in large collections of multimedia documents.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This creative work is an original soundtrack for the multimedia performance adaptation of Oscar Wilde’s De Profundis, led by David Fenton and Brian Lucas and produced by Metro Arts. Intermediality offers unique challenges to the composer creating towards live performance. Given the text-based nature of the piece, and the prevalence of screen content, music had a distinct role to play in supporting the intermedial performance environment. Drawing from Oscar Wilde’s own writings in the initial stages [“...richer cadences…more curious effects” “…the cry of Marsyas” and the “deferred resolution of Chopin”], the deliberately risky compositional process experimented with improvised location recordings and found sounds, random and fragmented assemblages of vintage recordings, rough methods and obsolete recording technology, and the sonic kinship of the hissing sibilances of the sea, theatrical applause and the crackle of antique recording devices (which had just been invented in Wilde’s time) worked into wefts of sound. As the soundtrack emerged, it was clearly resistant to ‘concepts’ imposed from the outside, and as the field of possibilities expanded and engaged in dialogue with the other elements of the performance (live and projected) certain pieces were selected by the director and curated into the emerging work. Thus leitmotifs emerged, rather than being imposed from the outset, with a particular through line holding: if it was too obviously like ‘music’ (which is usually used in theatre as emotional lubrication and narrative signpost) it didn’t work, and if it sounded like avant-garde sound-art, it was too grating and detracted from the primacy of the text.
As a composer I worked this sweet spot in between these two poles as well as serving David Fenton’s curation: he determined which compositions to incorporate, reiterate and omit as part of the process of writing text, action and image, and the compositional process responded with organic elaborations and variations on these selections. Musical resolution was mostly deferred until the closing stages of the performance. The soundtrack was present for the duration of the show, and Artshub reviewed the musical component thus: “...the score by David Megarrity is a refined, understated ambient scaffolding.” It premiered at the Visy Theatre, Brisbane Powerhouse, on 22 April 2015.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

At present, the most reliable method to obtain end-user perceived quality is through subjective tests. In this paper, the impact of automatic region-of-interest (ROI) coding on perceived quality of mobile video is investigated. The evidence, which is based on perceptual comparison analysis, shows that the coding strategy improves perceptual quality. This is particularly true in low bit rate situations. The ROI detection method used in this paper is based on two approaches: (1) automatic ROI, obtained by analyzing the visual contents automatically; and (2) eye-tracking based ROI, obtained by aggregating eye-tracking data across many users and used to evaluate both the accuracy of automatic ROI detection and the subjective quality of automatic ROI encoded video. The perceptual comparison analysis is based on subjective assessments with 54 participants, across different content types, screen resolutions, and target bit rates, while comparing the two ROI detection methods. The results from the user study demonstrate that ROI-based video encoding has higher perceived quality compared to normal video encoded at a similar bit rate, particularly in the lower bit rate range.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

It is now widely acknowledged that student mental well-being is a critical factor in the tertiary student learning experience and is important to student learning success. The issue of student mental well-being also has implications for effective student transition out of university and into the world of work. It is therefore vital that intentional strategies are adopted by universities both within the formal curriculum, and outside it, to promote student well-being and to work proactively and preventatively to avoid a decline in student psychological well-being. This paper describes how the Queensland University of Technology Law School is using animation to teach students about the importance for their learning success of the protection of their mental well-being. Mayer and Moreno (2002) define an animation as an external representation with three main characteristics: (1) it is a pictorial representation, (2) it depicts apparent movement, and (3) it consists of objects that are artificially created through drawing or some other modelling technique. Research into the effectiveness of animation as a tool for tertiary student learning engagement is a relatively new and growing field of enquiry. Nash argues, for example, that animations provide a “rich, immersive environment [that] encourages action and interactivity, which overcome an often dehumanizing learning management system approach” (Nash, 2009, 25). Nicholas states that contemporary millennial students in universities today have been immersed in animated multimedia since their birth and in fact need multimedia to learn and communicate effectively (2008). However, it has also been established, for example through the work of Lowe (2003, 2004, 2008), that animations can place additional perceptual, attentional, and cognitive demands on students that they are not always equipped to cope with. There are many different genres of animation.
The dominant style of animation used in the university learning environment is expository animation. This approach is a useful tool for visualising dynamic processes and is used to support student understanding of subjects and themes that might otherwise be perceived as theoretically difficult and disengaging. It is also a form of animation that can be constructed to avoid any potential negative impact on cognitive load that the animated genre might have. However, the nature of expository animation has limitations for engaging students, and can present as clinical and static. For this reason, the project applied Kombartzky, Ploetzner, Schlag, and Metz’s (2010) cognitive strategy for effective student learning from expository animation, and developed a hybrid form of animation that takes advantage of the best elements of expository animation techniques along with more engaging short narrative techniques. First, the paper examines the existing literature on the use of animation in tertiary educational contexts. Second, the paper describes how animation was used at QUT Law School to teach students about the issue of mental well-being and its importance to their learning success. Finally, the paper analyses the potential of the use of animation, and of the cognitive strategy and animation approach trialled in the project, as a teaching tool for the promotion of student learning about the importance of mental well-being.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The use of head-mounted displays (HMDs) can produce both positive and negative experiences. In an effort to increase positive experiences and avoid negative ones, researchers have identified a number of variables that may cause sickness and eyestrain, although the exact nature of the relationship to HMDs may vary, depending on the tasks and the environments. Other non-sickness-related aspects of HMDs, such as users' opinions and future decisions associated with task enjoyment and interest, have attracted little attention in the research community. In this thesis, user experiences associated with the use of monocular and bi-ocular HMDs were studied. These include eyestrain and sickness caused by current HMDs, the advantages and disadvantages of adjustable HMDs, HMDs as accessories for small multimedia devices, and the impact of individual characteristics and evaluated experiences on reported outcomes and opinions. The results indicate that today's commercial HMDs do not induce serious sickness or eyestrain. Reported adverse symptoms have some influence on HMD-related opinions, but the nature of the impact depends on the tasks and the devices used. As an accessory to handheld devices and as a personal viewing device, HMDs may increase use duration and enable users to perform tasks not suitable for small screens. Well-designed and functional, adjustable HMDs, especially monocular HMDs, increase viewing comfort and usability, which in turn may have a positive effect on product-related satisfaction. The role of individual characteristics in understanding HMD-related experiences has not changed significantly. Explaining other HMD-related experiences, especially forward-looking interests, also requires understanding more stable individual traits and motivations.