904 results for audio-visual automatic speech recognition
Abstract:
The starting point for these outputs is a large-scale research project in collaboration with the Zurich University for the Arts and the Kunstmuseum Thun, looking at a redefinition of Social Sculpture (Joseph Beuys / Bazon Brock, 1970) as a functional device redeployed to expand the art discourse into a societal discourse. Although Beuys’s version of a social sculpture involved notions of abstruse mysticism and reformulations of a national identity, these were nevertheless part of a social transformation that shifted and rearranged power relations. Following Laclau and Mouffe in their contention that democracy is a fundamentally antagonistic process, and contesting Grant Kester’s understanding of an ethically based relational practice, this work aligns itself with Hirschhorn’s claim to an aesthetic practice within communities. It pursues the possibility of viewing a socially based practice from both ends of the ethics debate, whereby ethical aspects fuel the aesthetic to “create situations that are beautiful because they are ethical and shocking because they are ethical, thus in turn aesthetic because they are ethical” (O’Donnell). This project sets out to engage in activities which interact with surrounding communities and evoke new imaginations of site, thereby understanding site as a catalyst for subjective emergences. Performance is tested as a site for social practice. Archival research into local audio-visual resources, such as the Swiss Radio Archive, the Swiss Military Film Archives and the zoological film archives of the Basel Zoo, was instrumental to the navigation of this work, under the themes of crisis, catastrophe, landscape and fallout, in order to create a visual language for an active performance site. 
Commissioned by the Kunstmuseum Thun in collaboration with the University for the Arts in Zurich as part of a year-long exhibition programme (other artists are Jeanne Van Heeswijk (NL) and San Keller (CH)), this project brings together a series of different works in a new performance installation. The performance process includes a performance workshop with 30 school children from local Swiss schools and their teachers, which was conducted publicly in the museum spaces. It enabled the children to engage with an unexpected set of tribal and animalistic behaviours, looking at situations of flight and rescue, resulting in a large performance choreography orchestrated without an apparent conductor. It includes a collaboration with renowned Swiss zoologist Prof Klaus Zuberbühler (University of St Andrews) and Colonel General Haldimann, commander of the military base in Thun. The installation included 2 static video images, shot in and around a spectacular local cave site (Beatus Caves), including 3 children. The project will culminate in an edited edition of the Oncurating Journal (issue no. tbc, in 2012) including interviews and essays from project collaborators (Army Commander General, Thun, Jörg Hess, performance script, Timothy Long, and others).
Abstract:
Duras’s theatre work has been profoundly neglected by UK theatre academics and practitioners, and Eden Cinema has almost no performance history in Britain. My project asked three interconnected research questions: how developing the performance contributes to understanding Duras’s theatre and specifically Eden Cinema’s problems of performability; how multimedia performance emphasising mediated sound and the live body reconfigures memory, autobiography, storytelling, gender and racial identity; how to locate a performance style appropriate for Durasian narratives of displacement and death which reflect the discontinuous and mutable form of Duras’s ‘texte/film/théâtre’. Drawing on my research interests in gender, post-colonial hybridity and performed deconstruction, I focused my staging decisions on the discontinuities and ambivalences of the text. I addressed performability by avoiding the temptation to resolve the strange ellipses in the text and instead evoked the text’s imperfect and fragmented memories, and its uncertain spatial and temporal locations, by means of a fluid theatrical form. The mise-en-scène represented imagined and remembered spaces simultaneously, and co-existing historical moments. The performance style counterpointed live and mediated action and audio-visual forms. A complex through-composed soundscape, comprising voice-over, sound and music, became a key means for evoking overlapping temporalities, interconnected narratives and fragmented memories that were dispersed across the performance. The disempowerment of the mother figure and the silent indigenous servant in the text was demonstrated through their spatial centrality but physical stillness. The servant’s colonial subaltern identity was paralleled and linked with the mother’s disenfranchisement through their proxemic relationships. 
I elicited a performance style which evoked ‘characters’, whose being was deferred across different regimes of reality and who ‘haunted’ the stage rather than inhabited it. I developed the project further in the additional written outcomes and presentations, and the subsequent performance of Savannah Bay where problems of performability intensify until embodiment is almost erased except via voice.
Abstract:
Backtracks aimed to investigate critical relationships between audio-visual technologies and live performance, emphasising technologies producing sound, contrasted with non-amplified bodily sound. Drawing on methodologies for studying avant-garde theatre, live performance and the performing body, it was informed by work in critical and cultural theory by, for example, Steven Connor and Jonathan Rée, on the body's experience and interpretation of sound. The performance explored how shifting national boundaries, mobile workforces, complex family relationships, cultural pluralities and possibilities for bodily transformation have compelled a re-evaluation of what it means to feel 'at home' in modernity. Using montages of live and mediated images, disrupted narratives and sound, it evoked destabilised identities which characterise contemporary lived experience, and enacted the displacement of certainties provided by family and nation, community and locality, body and selfhood. Homer's Odyssey framed the performance: elements could be traced in the mise-en-scène; in the physical presence of Athene, the narrator and Penelope weaving mementoes from the past into her loom; and in voice-overs from Homer's work. The performance drew on personal experiences and improvisations, structured around notions of journey. It presented incomplete narratives, memories, repressed anxieties and dreams through different combinations of sounds, music, mediated images, movement, voice and bodily sound. The theme of travel was intensified by performers carrying suitcases and umbrellas, by soundtracks incorporating travel effects, and by the distorted video images of forms of transport playing across 'screens' which proliferated across the space (sails, umbrellas, the loom, actors' bodies). The performance experimented with giving sound and silence performative dimensions, including presenting sound in visual and imagistic ways, for example by using signs from deaf sign language. 
Through-composed soundtracks of live and recorded song, music, voice-over, and noise exploited the viscerality of sound and disrupted cognitive interpretation by phenomenological, somatic experience, thereby displacing the impulse for closure/destination/home.
Abstract:
Digital imaging technologies enable a mastery of the visual that in recent mainstream cinema frequently manifests as certain kinds of spatial reach, orientation and motion. In such a context Michael Bay’s Transformers franchise can be framed as a digital re-tooling of a familiar fantasy of vehicular propulsion, US car culture writ large in digitally crafted spectacles of diegetic speed, the vehicular chase film ‘2.0’. Movement is central to these films, calling up Scott Bukatman’s observation that in spectacular visual media ‘movement has become more than a tool of bodily knowledge; it has become an end in itself’ (2003: 125). Not all movements and not all instances of vehicular propulsion are the same however. How might we evaluate what is at stake in a film’s assertion of movement as an end in itself, and the form that assertion takes, its articulations of diegetic velocity, corporeality, and spatial penetration? Deploying an attentiveness towards the specificity of aesthetic detail and affective impact in Bay’s delineation of movement, this essay suggests that the franchise poses questions about the relationship of human movement to machine movement that exceed their narrative basis. Identifying a persistent rotational trope in the franchise that in its audio-visual articulation combines oddly anachronistic elements (evoking the mechanical rather than the digital), the article argues that the films prioritise certain fantasies of transformation and spatial penetration, and certain modes of corporeality, as one response to contemporary debates about digital technologisation, sustainable energy, and cinematic spectacle. 
In this way the franchise also represents a particular moment in a more widely discernible preoccupation in contemporary cinema with what we might call a ‘rotational aesthetics’ of action, a machine movement made possible by the digital, but which invokes earlier histories and fantasies of animation, propulsion and mechanization to particular ends.
Abstract:
This Capstone Project attempts to determine the ability of normal-hearing children to resolve spectral information, and the relationship between spectral resolution ability and speech recognition ability in noise. This study also examines how these abilities develop with age.
Abstract:
Dynamic Time Warping (DTW), a pattern matching technique traditionally used for restricted vocabulary speech recognition, is based on a temporal alignment of the input signal with the template models. The principal drawback of DTW is its high computational cost as the lengths of the signals increase. This paper presents extended results over our previously published conference paper, which introduced an optimized version of DTW that is based on the Discrete Wavelet Transform (DWT). (C) 2008 Elsevier B.V. All rights reserved.
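The cost issue mentioned in this abstract can be illustrated with a minimal sketch of textbook DTW, whose running time grows with the product of the two signal lengths. The function names `dtw_distance` and `haar_approx` are hypothetical, and the Haar-style averaging step is only an assumption about how a wavelet approximation might shorten the signals before alignment; it is not the optimized method of the paper, whose details are not given here.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences; cost is O(len(a) * len(b))."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between samples
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def haar_approx(x):
    """Pairwise averages: Haar approximation coefficients up to a scale factor.

    Assumption for illustration only: halving each signal's length this way
    cuts the DTW table to roughly a quarter of its original size.
    """
    pairs = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    if len(x) % 2:  # carry an unpaired final sample through unchanged
        pairs.append(float(x[-1]))
    return pairs


if __name__ == "__main__":
    template = [1, 2, 3, 4, 3, 2, 1, 0]
    signal = [1, 1, 2, 3, 4, 4, 3, 2, 1, 0]
    print(dtw_distance(template, signal))
    print(dtw_distance(haar_approx(template), haar_approx(signal)))
```

Aligning the shortened sequences gives only an approximate distance, so a real system would have to trade that loss of resolution against the saving in computation.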
Abstract:
In this thesis, a new algorithm is proposed to segment the foreground of a fingerprint from the image under consideration. The algorithm uses three features: mean, variance and coherence. Based on these features, a rule system is built to help the algorithm segment the image efficiently. In addition, the proposed algorithm combines split-and-merge with a modified Otsu method. Enhancement techniques such as a Gaussian filter and histogram equalization are applied to improve the quality of the image. Finally, a post-processing technique is implemented to counter undesirable effects in the segmented image. Fingerprint recognition is one of the oldest recognition systems among biometric techniques. Everyone has a unique and unchangeable fingerprint. Based on this uniqueness and distinctness, fingerprint identification has been used in many applications for a long period. A fingerprint image is a pattern which consists of two regions, foreground and background. The foreground contains all the important information needed by automatic fingerprint recognition systems. The background, however, is a noisy region that contributes to the extraction of false minutiae in the system. To avoid the extraction of false minutiae, several steps should be followed, such as preprocessing and enhancement. One of these steps is the transformation of the fingerprint image from a gray-scale image to a black-and-white image. This transformation is called segmentation or binarization. The aim of fingerprint segmentation is to separate the foreground from the background. Due to the nature of fingerprint images, segmentation is an important and challenging task. The proposed algorithm is applied to the FVC2000 database. Manual examination by human experts shows that the proposed algorithm provides efficient segmentation results. These improved results are demonstrated in diverse experiments.
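The binarization step described in this abstract can be illustrated with a minimal sketch of the standard Otsu threshold, which picks the gray level that maximizes the between-class variance of the two resulting pixel groups. This is the plain textbook method, not the thesis's modified Otsu or its split-and-merge and rule-system stages, which the abstract does not specify; the function names `otsu_threshold` and `binarize` and the toy pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t maximizing between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is a flat iterable of integers in the range [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of gray levels in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty; no further split is possible
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (above threshold) or 0 (at or below it)."""
    return [1 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    # Toy data: a dark group and a bright group of gray levels.
    image = [10] * 50 + [200] * 50
    t = otsu_threshold(image)
    print(t, sum(binarize(image, t)))
```

Whether the bright or the dark class corresponds to the fingerprint foreground depends on the image; block-wise features such as the mean, variance and coherence named in the abstract would be needed to make that decision robustly.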
Abstract:
This paper examines a popular music song (Heartbeats by Jose Gonzalez) as a sign system in television advertising. The study was conducted through qualitative questionnaires in connection with an audio-visual method of analysis called Masking. The method facilitates the analysis of isolated parts of the audio-visual spectrum by masking/hiding parts of the audio-visual totality. The survey had seven respondents, and a hermeneutic epistemological approach was used. For the analysis, Cooper's theory of brand identity (Practical and Symbolic Attitudes to Buying Brands) was used together with an interaction model for music in audio-visual advertising called “Modes of music-image interaction”. The results showed that the music was associated with values such as genuineness, honesty, responsibility, purity, independence and innovation. The music's symbolic values helped to position the brand in a lifestyle context. The music also helped to express the target group’s identity and attitudes by being innovative and independent. It also enhanced the perception of the visual colour rendition in the film. In general, the television advertisement was perceived as more positive and entertaining when the music was present. In other words, the music's social and cultural position helped to raise the film's credibility. A deeper social and cultural value was created in the film through resonance between the symbolic values of the music and the symbolic values of the film.
Abstract:
Literacy is an invaluable asset, and has allowed for communication, documentation and the spreading of ideas since the beginning of written language. With technological advancements and new possibilities to communicate, it is important to question the degree to which people’s abilities to utilise these new methods have developed in relation to these emerging technologies. The purpose of this bachelor’s thesis is to analyse the state of multimodal literacy among students at Dalarna University, as well as their experience of multimodality in their education. This has led to the two main research questions: What is the state of multimodal literacy among the students at Dalarna University? And: How have the students at Dalarna University experienced multimodality in education? The paper is based on a mixed-method study that incorporates both quantitative and qualitative aspects. The main thrust of the research paper is, however, based on a quantitative study that was conducted online and emailed to students via their program coordinators. The scope of the research is audio-visual modes, i.e. audio, video and images, while textual literacy is presumed and serves as an inspiration to the study. The purpose of the study is to analyse the state of the students’ multimodal literacy and their experience of multimodality in education. The study revealed that the students at Dalarna University are most skilled in image editing, while not being very literate in audio or video editing. The students seem to have had mediocre experience of creating meaning through multimodality, both in private use and in their respective educational institutions. The study also reveals that students prefer learning by means of video (rather than text or audio), yet are not able to create meaning (communicate) through it.
Abstract:
Parkinson’s disease (PD) is an increasingly prevalent neurological disorder in an aging society. The motor and non-motor symptoms of PD advance with disease progression and occur with varying frequency and duration. In order to ascertain the full extent of a patient’s condition, repeated assessments are necessary to adjust medical prescription. In clinical studies, symptoms are assessed using the unified Parkinson’s disease rating scale (UPDRS). On one hand, the subjective rating using UPDRS relies on clinical expertise. On the other hand, it requires the physical presence of patients in clinics, which implies high logistical costs. Another limitation of clinical assessment is that observation in hospital may not accurately represent a patient’s situation at home. For such reasons, the practical frequency of tracking PD symptoms may under-represent the true time scale of PD fluctuations and may result in an overall inaccurate assessment. Current technologies for at-home PD treatment are based on data-driven approaches for which the interpretation and reproduction of results are problematic. The overall objective of this thesis is to develop and evaluate unobtrusive computer methods for enabling remote monitoring of patients with PD. It investigates novel signal and image processing techniques, based on first-principle and data-driven models, for the extraction of clinically useful information from audio recordings of speech (in texts read aloud) and video recordings of gait and finger-tapping motor examinations. The aim is to map between PD symptom severities estimated using novel computer methods and the clinical ratings based on UPDRS part-III (motor examination). A web-based test battery system consisting of self-assessment of symptoms and motor function tests was previously constructed for a touch screen mobile device. 
A comprehensive speech framework has been developed for this device to analyze text-dependent running speech by: (1) extracting novel signal features that are able to represent PD deficits in each individual component of the speech system, (2) mapping between clinical ratings and feature estimates of speech symptom severity, and (3) classifying between UPDRS part-III severity levels using speech features and statistical machine learning tools. A novel speech processing method called cepstral separation difference showed a stronger ability to classify between speech symptom severities than existing features of PD speech. In the case of finger tapping, the recorded videos of rapid finger-tapping examinations were processed using a novel computer-vision (CV) algorithm that extracts symptom information from video-based tapping signals using motion analysis of the index finger, and which incorporates a face detection module for signal calibration. This algorithm was able to discriminate between UPDRS part-III severity levels of finger tapping with high classification rates. Further analysis was performed on novel CV-based gait features constructed using a standard human model to discriminate between a healthy gait and a Parkinsonian gait. The findings of this study suggest that symptom severity levels in PD can be discriminated with high accuracy by combining first-principle (features) and data-driven (classification) approaches. On one hand, the processing of audio and video recordings allows remote monitoring of speech, gait and finger-tapping examinations by the clinical staff. On the other hand, the first-principles approach makes symptom estimates easier for clinicians to understand. We have demonstrated that the selected features of speech, gait and finger tapping were able to discriminate between symptom severity levels, as well as between healthy controls and PD patients, with high classification rates. 
The findings support the suitability of these methods as decision support tools in the context of PD assessment.
Abstract:
As applications and systems continue to develop, the way we interact with them also changes. Until now, navigating and using applications and systems has mostly been done by hand, with mouse and keyboard. More recently, navigation via touch screens and voice has become increasingly common. When an application is to be controlled by voice, it is important that anyone can control it, regardless of dialect. To see how accurately a speech recognition API (Application Programming Interface) perceives Swedish dialects, this study began with document studies of the characteristics and sound combinations of dialects. These characteristics and sound combinations formed the basis for the words we selected to test the API with. Each dialect was thus assigned a word constructed to be especially difficult for the API to recognise when pronounced in that particular dialect. A prototype was then developed, specifically an Android application that served as a tool for data collection. Since the work comprises a prototype and an investigation, Design and Creation Research was chosen as the research strategy, with document studies and observations as the data collection methods, in order to obtain the desired results. Data were collected through observations, with the prototype as an aid, and through document studies. The empirical data recorded through the observations and the application showed that certain dialects were easier for the API to recognise correctly. In some cases the results were expected, since certain words built from sound combinations would, according to theory, be pronounced very distinctively in a given dialect. Sometimes the results for precisely these words were very low, but in other cases they were surprisingly high. Our conclusion was that the words we had selected in the expectation that they would score low for the particular dialect did so on only two occasions. 
Instead, it was the word containing the sje and tje sounds, which according to theory are characteristics common to all dialects, that received the lowest results overall.
Abstract:
Loop technique in solo settings is today a well-established form of music-making; however, the possibilities of using the technique in ensemble form are a more or less untested field. The project “Audiovisuella loopar”, carried out at Dalarna University, presents a system for collective live looping in which up to four people make loop-based music together and in which the looping can simultaneously be recorded and played back as video clips. The technique has proven to have strong creative potential. With only a brief instruction, a collective process begins in which the immediate feedback produces a “flow” that brings out the musicians’ joy of creating. These experiences point to exciting possibilities for using the technique in music education and music therapy.
Abstract:
This paper addresses strategy in business management. It aimed to identify and outline the interests and commitment of stakeholders in strategic resource management concerning the production and implementation of wind turbine equipment of a Brazilian wind power company, and to verify whether the internal and external results deriving from such activities were sustainable, taking as main references seminal publications and periodicals relevant to the research topic that discuss Resource Theory, Stakeholders and Sustainability. An analysis was carried out to assess how stakeholders, beyond the temporal context, intermediated the composition, development and management of the organization’s resources, as well as the social, environmental and economic results obtained from resource management in the production and supply of wind turbines to a Wind Power Plant located in the State of Ceara, in order to show that in Brazil sustainability can be an important source of competitive advantage that creates value for shareholders and the community (Hart & Milstein, 2003). The strategy applied was a qualitative investigation using a single case study, which allowed for the thorough examination of an active organization operating in the Brazilian wind power industry and of the resources used in the production and implementation of wind turbines supplied to a Wind Power Plant in Ceara. Considering content analysis and the triangulation principle, three qualitative data collection methods were applied to identify and characterize stakeholders’ interest and commitment in the resource management of the organization operating in the Brazilian wind power industry: semi-structured in-depth interviews with managers at the tactical-strategic level and analysts from the nine activities of the organization’s value chain; analysis of public internal and external documents; and analysis of audio-visual material. 
In addition, to identify the internal and external economic, social and environmental results of the implementation and supply of wind turbines to the Wind Power Plant in Ceara, semi-structured interviews were also carried out with residents of the region. Results showed that the BNDES (Brazilian Development Bank) and the organization’s head office were the stakeholders who exerted the strongest influence on resources related to the production and implementation of the aerogenerator product at the Trairi Wind Plant in Ceara. Concerning the organization’s resources, at the current stage of the Brazilian wind industry, although the brand, reliability and reputation of the organization under study were valuable, rare and hard-to-imitate resources exploited by the organization, it was noted that, contrary to the RBV, they did not actually represent a source of competitive advantage. For the local community, the social, economic and environmental results related to the wind turbine implementation were more positive than negative, despite the fact that the production process caused negative environmental impacts such as the high CO2 emissions from transporting wind turbine components to the Trairi Wind Power Plant.
Abstract:
The creative industries are today a subject of intense debate in the international academic literature and in public and governmental organizations. These industries emerged as a concept reconciling the traditional cultural industries, the creative arts and the new information technologies. The aim of this research was to carry out a literature review on the topic and a mapping of a core of these industries in the country and in the State of São Paulo. To carry out this mapping, information from secondary sources was used, such as reports from research institutes, telephone directories and professional associations. The results point to a more pronounced development of creative industries focused on the production of mass cultural goods, such as television and radio, as well as, though less markedly, audiovisual. In the State of São Paulo, only 1.0% of GDP is associated with the activities of the creative industries, with the expected concentration in the capital and metropolitan region. This report also points out some lines of future research on the topic.