12 results for Audio visual speech recognition
in CentAUR: Central Archive University of Reading - UK
Abstract:
This workshop paper reports recent developments to a vision system for traffic interpretation which relies extensively on the use of geometrical and scene context. Firstly, a new approach to pose refinement is reported, based on forces derived from prominent image derivatives found close to an initial hypothesis. Secondly, a parameterised vehicle model is reported, able to represent different vehicle classes. This general vehicle model has been fitted to sample data, and subjected to a Principal Component Analysis to create a deformable model of common car types having 6 parameters. We show that the new pose recovery technique is also able to operate on the PCA model, to allow the structure of an initial vehicle hypothesis to be adapted to fit the prevailing context. We report initial experiments with the model, which demonstrate significant improvements to pose recovery.
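As a rough illustration of the PCA step described in the abstract above, the sketch below builds a linear deformable model (mean shape plus weighted principal modes) from sample vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the three toy dimensions (length, width, height in metres) and the sample values are all hypothetical, and only one mode is kept here, whereas the abstract reports a model with 6 parameters.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def covariance(samples, mean):
    """Sample covariance matrix (divides by n - 1)."""
    d, n = len(mean), len(samples)
    cov = [[0.0] * d for _ in range(d)]
    for s in samples:
        c = [x - m for x, m in zip(s, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov

def top_eigenvector(mat, iters=100):
    """Dominant eigenpair of a symmetric matrix by power iteration."""
    d = len(mat)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # matrix is numerically zero, stop iterating
            break
        v = [x / norm for x in w]
    val = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return val, v

def fit_modes(samples, k):
    """Return the mean shape and the k leading principal modes."""
    mean = mean_vector(samples)
    cov = covariance(samples, mean)
    modes = []
    for _ in range(k):
        val, vec = top_eigenvector(cov)
        modes.append(vec)
        # Deflate: remove the mode just found before searching for the next.
        for i in range(len(vec)):
            for j in range(len(vec)):
                cov[i][j] -= val * vec[i] * vec[j]
    return mean, modes

def encode(shape, mean, modes):
    """Project a shape onto the modes to obtain its model parameters."""
    c = [x - m for x, m in zip(shape, mean)]
    return [sum(ci * pi for ci, pi in zip(c, mode)) for mode in modes]

def decode(params, mean, modes):
    """Rebuild a shape as mean + sum(param * mode)."""
    shape = list(mean)
    for b, mode in zip(params, modes):
        for i, p in enumerate(mode):
            shape[i] += b * p
    return shape

# Hypothetical toy data: vehicles that vary along a single direction.
base = [4.0, 1.7, 1.4]
direction = [1.0, 0.2, 0.1]
samples = [[b + t * d for b, d in zip(base, direction)]
           for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

mean, modes = fit_modes(samples, k=1)
params = encode(samples[-1], mean, modes)
rebuilt = decode(params, mean, modes)
```

Because the toy samples vary along one direction only, a single mode reconstructs them; real vehicle data would need more modes, and a library eigen-solver would normally replace the hand-written power iteration.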
Abstract:
The encoding of goal-oriented motion events varies across different languages. Speakers of languages without grammatical aspect (e.g., Swedish) tend to mention motion endpoints when describing events, e.g., “two nuns walk to a house”, and attach importance to event endpoints when matching scenes from memory. Speakers of aspect languages (e.g., English), on the other hand, are more prone to direct attention to the ongoingness of motion events, which is reflected both in their event descriptions, e.g., “two nuns are walking”, and in their non-verbal similarity judgements. This study examines to what extent native speakers of Swedish (n = 82) with English as a foreign language (FL) restructure their categorisation of goal-oriented motion as a function of their English proficiency and experience with the English language (e.g., exposure, learning). Seventeen monolingual native English speakers from the United Kingdom (UK) were engaged for comparison purposes. Data on motion event cognition were collected through a memory-based triads matching task, in which a target scene with an intermediate degree of endpoint orientation was matched with two alternative scenes with low and high degrees of endpoint orientation, respectively. Results showed that the preference among the Swedish speakers of L2 English to base their similarity judgements on ongoingness rather than event endpoints was correlated with their use of English in their everyday lives, such that those who often watched television in English approximated the ongoingness preference of the English native speakers. These findings suggest that event cognition patterns may be restructured through the exposure to FL audio-visual media. The results thus add to the emerging picture that learning a new language entails learning new ways of observing and reasoning about reality.
Abstract:
Synesthesia entails a special kind of sensory perception, where stimulation in one sensory modality leads to an internally generated perceptual experience of another, non-stimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process as here the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task in combination with visually or auditory-visually presented animate and inanimate objects in an audio-visual congruent and incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences, with best performance for audio-visually congruent stimulation indicating the well-known multimodal facilitation effect. We found enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.
Abstract:
Previous functional imaging studies have shown that facilitated processing of a visual object on repeated, relative to initial, presentation (i.e., repetition priming) is associated with reductions in neural activity in multiple regions, including fusiform/lateral occipital cortex. Moreover, activity reductions have been found, at diminished levels, when a different exemplar of an object is presented on repetition. In one previous study, the magnitude of diminished priming across exemplars was greater in the right relative to the left fusiform, suggesting greater exemplar specificity in the right. Another previous study, however, observed fusiform lateralization modulated by object viewpoint, but not object exemplar. The present fMRI study sought to determine whether the result of differential fusiform responses for perceptually different exemplars could be replicated. Furthermore, the role of the left fusiform cortex in object recognition was investigated via the inclusion of a lexical/semantic manipulation. Right fusiform cortex showed a significantly greater effect of exemplar change than left fusiform, replicating the previous result of exemplar-specific fusiform lateralization. Right fusiform and lateral occipital cortex were not differentially engaged by the lexical/semantic manipulation, suggesting that their role in visual object recognition is predominantly in the visual discrimination of specific objects. Activation in left fusiform cortex, but not left lateral occipital cortex, was modulated by both exemplar change and lexical/semantic manipulation, with further analysis suggesting a posterior-to-anterior progression between regions involved in processing visuoperceptual and lexical/semantic information about objects. The results are consistent with the view that the right fusiform plays a greater role in processing specific visual form information about objects, whereas the left fusiform is also involved in lexical/semantic processing.
(C) 2003 Elsevier Science (USA). All rights reserved.
Abstract:
It has been shown through a number of experiments that neural networks can be used for a phonetic typewriter. Algorithms can be looked on as producing self-organizing feature maps which correspond to phonemes. In the Chinese language the utterance of a Chinese character consists of a very simple string of Chinese phonemes. With this as a starting point, a neural network feature map for Chinese phonemes can be built up. In this paper, feature map structures for Chinese phonemes are discussed and tested. This research on a Chinese phonetic feature map is important both for Chinese speech recognition and for building a Chinese phonetic typewriter.
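The self-organizing feature map mentioned in the abstract above can be sketched as follows. This is a minimal illustration of a generic one-dimensional Kohonen map, not the network from the paper: the function names, learning-rate and neighbourhood schedules, and the two toy feature vectors (standing in for acoustic features of two phoneme classes) are all assumptions made for the example.

```python
import math

def best_matching_unit(weights, x):
    """Index of the map node whose weight vector is closest to x."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(node, x)) for node in weights]
    return dists.index(min(dists))

def train_som(weights, data, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map and return the updated weights.

    The learning rate and neighbourhood width both shrink linearly over
    the epochs (an assumed schedule, chosen for simplicity).
    """
    weights = [list(node) for node in weights]  # work on a copy
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = sigma0 * (1.0 - frac) + 0.1
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, node in enumerate(weights):
                # Gaussian neighbourhood: nodes near the winner move more.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(node)):
                    node[k] += lr * h * (x[k] - node[k])
    return weights

# Hypothetical toy features for two phoneme classes.
data = [(0.0, 0.0), (1.0, 1.0)]
initial = [(0.4, 0.4), (0.6, 0.6)]

trained = train_som(initial, data)
label_a = best_matching_unit(trained, (0.0, 0.0))
label_b = best_matching_unit(trained, (1.0, 1.0))
```

After training, each class is claimed by a different node, which is the sense in which map regions come to correspond to phonemes; a practical system would use a two-dimensional map and real spectral features.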
Abstract:
The academic discipline of television studies has been constituted by the claim that television is worth studying because it is popular. Yet this claim has also entailed a need to defend the subject against the triviality that is associated with the television medium because of its very popularity. This article analyses the many attempts in the later twentieth and twenty-first centuries to constitute critical discourses about television as a popular medium. It focuses on how the theoretical currents of Television Studies emerged and changed in the UK, where a disciplinary identity for the subject was founded by borrowing from related disciplines, yet argued for the specificity of the medium as an object of criticism. Eschewing technological determinism, moral pathologization and sterile debates about television's supposed effects, UK writers such as Raymond Williams addressed television as an aspect of culture. Television theory in Britain has been part of, and also separate from, the disciplinary fields of media theory, literary theory and film theory. It has focused its attention on institutions, audio-visual texts, genres, authors and viewers according to the ways that research problems and theoretical inadequacies have emerged over time. But a consistent feature has been the problem of moving from a descriptive discourse to an analytical and evaluative one, and from studies of specific texts, moments and locations of television to larger theories. By discussing some historically significant critical work about television, the article considers how academic work has constructed relationships between the different kinds of objects of study. The article argues that a fundamental tension between descriptive and politically activist discourses has confused academic writing about ›the popular‹. 
Television study in Britain arose not to supply graduate professionals to the television industry, nor to perfect the instrumental techniques of allied sectors such as advertising and marketing, but to analyse and critique the medium's aesthetic forms and to evaluate its role in culture. Since television cannot be made by ›the people‹, the empowerment that discourses of television theory and analysis aimed for was focused on disseminating the tools for critique. Recent developments in factual entertainment television (in Britain and elsewhere) have greatly increased the visibility of ›the people‹ in programmes, notably in docusoaps, game shows and other participative formats. This has led to renewed debates about whether such ›popular‹ programmes appropriately represent ›the people‹ and how factual entertainment that is often despised relates to genres hitherto considered to be of high quality, such as scripted drama and socially-engaged documentary television. A further aspect of this problem of evaluation is how television globalisation has been addressed, and the example that the issue has crystallised around most is the reality TV contest Big Brother. Television theory has been largely based on studying the texts, institutions and audiences of television in the Anglophone world, and thus in specific geographical contexts. The transnational contexts of popular television have been addressed as spaces of contestation, for example between Americanisation and national or regional identities. Commentators have been ambivalent about whether the discipline's role is to celebrate or critique television, and whether to do so within a national, regional or global context. 
In the discourses of the television industry, ›popular television‹ is a quantitative and comparative measure, and because of the overlap between the programming with the largest audiences and the scheduling of established programme types at the times of day when the largest audiences are available, it has a strong relationship with genre. The measurement of audiences and the design of schedules are carried out in predominantly national contexts, but the article refers to programmes like Big Brother that have been broadcast transnationally, and programmes that have been extensively exported, to consider in what ways they too might be called popular. Strands of work in television studies have at different times attempted to diagnose what is at stake in the most popular programme types, such as reality TV, situation comedy and drama series. This has centred on questions of how aesthetic quality might be discriminated in television programmes, and how quality relates to popularity. The interaction of the designations ›popular‹ and ›quality‹ is exemplified in the ways that critical discourse has addressed US drama series that have been widely exported around the world, and the article shows how the two critical terms are both distinct and interrelated. In this context and in the article as a whole, the aim is not to arrive at a definitive meaning for ›the popular‹ inasmuch as it designates programmes or indeed the medium of television itself. Instead the aim is to show how, in historically and geographically contingent ways, these terms and ideas have been dynamically adopted and contested in order to address a multiple and changing object of analysis.
Abstract:
The starting point for these outputs is a large-scale research project in collaboration with the Zurich University for the Arts and the Kunstmuseum Thun, looking at a redefinition of Social Sculpture (Joseph Beuys/Bazon Brock, 1970) as a functional device re-deployed to expand the art discourse into a societal discourse. Although Beuys’ version of a social sculpture involved notions of abstruse mysticism and reformulations of a national identity, these were nevertheless part of a social transformation that shifted and re-arranged power relations. Following Laclau and Mouffe in their contention that democracy is a fundamentally antagonistic process, and contesting Grant Kester’s understanding of an ethically based relational practice, this work aligns itself with Hirschhorn’s claim to an aesthetic practice within communities, following the possibility to view a socially based practice from both ends of the ethics debate, whereby ethical aspects fuel the aesthetic to “create situations that are beautiful because they are ethical and shocking because they are ethical, thus in turn aesthetic because they are ethical” (O’Donnell). This project sets out to engage in activities which interact with surrounding communities and evoke new imaginations of site, thereby understanding site as a catalyst for subjective emergences. Performance is tested as a site for social practice. Archival research into local audio/visual resources, such as the Swiss Radio Archive, the Swiss Military Film Archives and zoological film archives of the Basel Zoo, was instrumental to the navigation of this work, under the themes of crisis, catastrophe, landscape and fallout, in order to create a visual language for an active performance site.
Commissioned by the Kunstmuseum Thun in collaboration with the University for the Arts in Zurich as part of a year-long exhibition programme (other artists are Jeanne Van Heeswijk (NL) and San Keller (CH)), this project brings together a series of different works in a new performance installation. The performance process includes a performance workshop with 30 school children from local Swiss schools and their teachers, which was conducted publicly in the museum spaces. It enabled the children to engage with an unexpected set of tribal and animalistic behaviours, looking at situations of flight and rescue, resulting in a large performance choreography, an orchestration without an apparent conductor. It includes a collaboration with renowned Swiss zoologist Prof Klaus Zuberbühler (University of St Andrews) and Colonel General Haldimann, commander of the military base in Thun. The installation included 2 static video images, shot in and around a spectacular local cave site (Beatus Caves) and including 3 children. The project will culminate in an edited edition of the Oncurating Journal (issue no. tbc, in 2012), including interviews and essays from project collaborators (Army Commander General, Thun, Jörg Hess, performance script, Timothy Long, and others).
Abstract:
Duras’s theatre work has been profoundly neglected by UK theatre academics and practitioners, and Eden Cinema has almost no performance history in Britain. My project asked three interconnected research questions: how developing the performance contributes to understanding Duras’s theatre and specifically Eden Cinema’s problems of performability; how multimedia performance emphasising mediated sound and the live body reconfigures memory, autobiography, storytelling, gender and racial identity; how to locate a performance style appropriate for Durasian narratives of displacement and death which reflect the discontinuous and mutable form of Duras’s ‘texte/film/théâtre’. Drawing on my research interests in gender, post-colonial hybridity and performed deconstruction, I focused my staging decisions on the discontinuities and ambivalences of the text. I addressed performability by avoiding the temptation to resolve the strange ellipses in the text and instead evoked the text’s imperfect and fragmented memories, and its uncertain spatial and temporal locations, by means of a fluid theatrical form. The mise-en-scène represented imagined and remembered spaces simultaneously, and co-existing historical moments. The performance style counterpointed live and mediated action and audio-visual forms. A complex through-composed soundscape, comprising voice-over, sound and music, became a key means for evoking overlapping temporalities, interconnected narratives and fragmented memories that were dispersed across the performance. The disempowerment of the mother figure and the silent indigenous servant in the text was demonstrated through their spatial centrality but physical stillness. The servant’s colonial subaltern identity was paralleled and linked with the mother’s disenfranchisement through their proxemic relationships. 
I elicited a performance style which evoked ‘characters’, whose being was deferred across different regimes of reality and who ‘haunted’ the stage rather than inhabited it. I developed the project further in the additional written outcomes and presentations, and the subsequent performance of Savannah Bay where problems of performability intensify until embodiment is almost erased except via voice.
Abstract:
Backtracks aimed to investigate critical relationships between audio-visual technologies and live performance, emphasising technologies producing sound, contrasted with non-amplified bodily sound. Drawing on methodologies for studying avant garde theatre, live performance and the performing body, it was informed by work in critical and cultural theory by, for example, Steven Connor and Jonathan Rée, on the body's experience and interpretation of sound. The performance explored how shifting national boundaries, mobile workforces, complex family relationships, cultural pluralities and possibilities for bodily transformation have compelled a re-evaluation of what it means to feel 'at home' in modernity. Using montages of live and mediated images, disrupted narratives and sound, it evoked destabilised identities which characterise contemporary lived experience, and enacted the displacement of certainties provided by family and nation, community and locality, body and selfhood. Homer's Odyssey framed the performance: elements could be traced in the mise-en-scène; in the physical presence of Athene, the narrator and Penelope weaving mementoes from the past into her loom; and in voice-overs from Homer's work. The performance drew on personal experiences and improvisations, structured around notions of journey. It presented incomplete narratives, memories, repressed anxieties and dreams through different combinations of sounds, music, mediated images, movement, voice and bodily sound. The theme of travel was intensified by performers carrying suitcases and umbrellas, by soundtracks incorporating travel effects, and by the distorted video images of forms of transport playing across 'screens' which proliferated across the space (sails, umbrellas, the loom, actors' bodies). The performance experimented with giving sound and silence performative dimensions, including presenting sound in visual and imagistic ways, for example by using signs from deaf sign language.
Through-composed soundtracks of live and recorded song, music, voice-over, and noise exploited the viscerality of sound and disrupted cognitive interpretation by phenomenological, somatic experience, thereby displacing the impulse for closure/destination/home.
Abstract:
Background: Few studies have investigated how individuals diagnosed with post-stroke Broca’s aphasia decompose words into their constituent morphemes in real-time processing. Previous research has focused on morphologically complex words in non-time-constrained settings or in syntactic frames, but not in the lexicon. Aims: We examined real-time processing of morphologically complex words in a group of five Greek-speaking individuals with Broca’s aphasia to determine: (1) whether their morphological decomposition mechanisms are sensitive to lexical (orthography and frequency) vs. morphological (stem-suffix combinatory features) factors during visual word recognition, (2) whether these mechanisms are different in inflected vs. derived forms during lexical access, and (3) whether there is a preferred unit of lexical access (syllables vs. morphemes) for inflected vs. derived forms. Methods & Procedures: The study included two real-time experiments. The first was a semantic judgment task necessitating participants’ categorical judgments for high- and low-frequency inflected real words and pseudohomophones of the real words created by either an orthographic error at the stem or a homophonous (but incorrect) inflectional suffix. The second experiment was a letter-priming task at the syllabic or morphemic boundary of morphologically transparent inflected and derived words whose stems and suffixes were matched for length, lemma and surface frequency. Outcomes & Results: The majority of the individuals with Broca’s aphasia were sensitive to lexical frequency and stem orthography, while ignoring the morphological combinatory information encoded in the inflectional suffix that control participants were sensitive to. 
The letter-priming task, on the other hand, showed that individuals with aphasia—in contrast to controls—showed preferences with regard to the unit of lexical access, i.e., they were overall faster on syllabically than morphemically parsed words and their morphological decomposition mechanisms for inflected and derived forms were modulated by the unit of lexical access. Conclusions: Our results show that in morphological processing, Greek-speaking persons with aphasia rely mainly on stem access and thus are only sensitive to orthographic violations of the stem morphemes, but not to illegal morphological combinations of stems and suffixes. This possibly indicates an intact orthographic lexicon but deficient morphological decomposition mechanisms, possibly stemming from an underspecification of inflectional suffixes in the participants’ grammar. Syllabic information, however, appears to facilitate lexical access and elicits repair mechanisms that compensate for deviant morphological parsing procedures.
Abstract:
Digital imaging technologies enable a mastery of the visual that in recent mainstream cinema frequently manifests as certain kinds of spatial reach, orientation and motion. In such a context Michael Bay’s Transformers franchise can be framed as a digital re-tooling of a familiar fantasy of vehicular propulsion, US car culture writ large in digitally crafted spectacles of diegetic speed, the vehicular chase film ‘2.0’. Movement is central to these films, calling up Scott Bukatman’s observation that in spectacular visual media ‘movement has become more than a tool of bodily knowledge; it has become an end in itself’ (2003: 125). Not all movements and not all instances of vehicular propulsion are the same however. How might we evaluate what is at stake in a film’s assertion of movement as an end in itself, and the form that assertion takes, its articulations of diegetic velocity, corporeality, and spatial penetration? Deploying an attentiveness towards the specificity of aesthetic detail and affective impact in Bay’s delineation of movement, this essay suggests that the franchise poses questions about the relationship of human movement to machine movement that exceed their narrative basis. Identifying a persistent rotational trope in the franchise that in its audio-visual articulation combines oddly anachronistic elements (evoking the mechanical rather than the digital), the article argues that the films prioritise certain fantasies of transformation and spatial penetration, and certain modes of corporeality, as one response to contemporary debates about digital technologisation, sustainable energy, and cinematic spectacle. 
In this way the franchise also represents a particular moment in a more widely discernible preoccupation in contemporary cinema with what we might call a ‘rotational aesthetics’ of action, a machine movement made possible by the digital, but which invokes earlier histories and fantasies of animation, propulsion and mechanization to particular ends.