873 results for Audio-Visual Automatic Speech Recognition
Abstract:
Pinterest, the online community in which I situate my object of study, lets its members create audio-visual collections from images (still or moving), videos, audio, graphics and even texts found across the internet. The study of the …
Abstract:
Based on close examinations of instant message (IM) interactions, this chapter argues that an interactional sociolinguistic approach to computer-mediated language use could provide explanations for phenomena that previously could not be accounted for in computer-mediated discourse analysis (CMDA). Drawing on the theoretical framework of relational work (Locher, 2006), the analysis focuses on non-task-oriented talk and its function in forming and establishing communication norms in the team, as well as on micro-level phenomena such as hesitation, backchannel signals and emoticons. The conclusions of this preliminary research suggest that the linguistic strategies that substitute for audio-visual signals are deployed strategically in discursive functions and play an important role in relational work.
Abstract:
When referring to cinema and its emancipatory potential, realism, like Plato’s pharmakon, has signified both illness and cure, poison and medicine. On the one hand, realism is regarded as the main feature of so-called classical cinema, inherently conservative and thoroughly ideological, its main raison d’être being to reify and make a particular version of the status quo believable and to pass it off as ‘reality’ (Burch, 1990; MacCabe, 1974). On the other, realism has also been interpreted as a quest for truth and social justice, as in the positivist ethos that informs documentary (Zavattini, 1953). Even in the latter sense, however, the extent to which realism has served colonizing ends when used to investigate the ‘truth’ of the Other has also been noted, rendering the form profoundly suspicious (Chow, 2007, p. 150). For realism has been a Western form of representation, one that can be traced back to the invention of perspective in painting and that peaked with the secular worldview brought about by the Enlightenment. And like realism, the nation state too is a product of the Enlightenment, nationalism being, as it were, a secular replacement for the religious - that is, enchanted or fantastic - worldview. In this way, realism, cinema and nation are inextricably linked, and equally strained under the current decline of the Enlightenment paradigm. This chapter looks at Y tu Mamá También by Alfonso Cuarón (2001), a highly successful road movie with documentary features, to explore the ways in which realism, cinema and nation interact with each other in the present conditions of ‘globalization’ as experienced in Mexico. The chapter compares and contrasts various interpretations of the role of realism in this film put forward by critics and scholars, and other discourses about it circulating in the media, with actual ways of audience engagement with it.
Abstract:
This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.
In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.
In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.
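To make the polynomial-phase idea concrete, here is a minimal sketch, assuming a synthetic complex linear chirp in place of a real whale call. It is not the Weyl-transform feature set described in the abstract; it swaps in a simpler second-order phase-difference estimator, and the names `make_chirp` and `estimate_chirp_rate` as well as all parameter values are hypothetical.

```python
import cmath
import math


def make_chirp(f0, chirp_rate, fs, n_samples):
    """Complex linear chirp with phase 2*pi*(f0*t + 0.5*chirp_rate*t**2)."""
    samples = []
    for n in range(n_samples):
        t = n / fs
        phase = 2.0 * math.pi * (f0 * t + 0.5 * chirp_rate * t * t)
        samples.append(cmath.exp(1j * phase))
    return samples


def estimate_chirp_rate(signal, fs):
    """Estimate the chirp rate in Hz/s from the second-order phase difference.

    For a quadratic phase, phi[n+1] - 2*phi[n] + phi[n-1] equals
    2*pi*chirp_rate / fs**2 at every sample, so averaging it and
    rescaling recovers the second-order phase coefficient.
    """
    total = 0.0
    count = 0
    for n in range(1, len(signal) - 1):
        step_next = signal[n + 1] * signal[n].conjugate()
        step_prev = signal[n] * signal[n - 1].conjugate()
        total += cmath.phase(step_next * step_prev.conjugate())
        count += 1
    return (total / count) * fs * fs / (2.0 * math.pi)


# Assumed toy values: 100 Hz start frequency, 50 Hz/s sweep, 1 kHz sampling.
call = make_chirp(f0=100.0, chirp_rate=50.0, fs=1000.0, n_samples=1000)
estimated_rate = estimate_chirp_rate(call, fs=1000.0)
```

The estimator only works while the second-order phase difference stays inside (-pi, pi) and the signal is noise-free, which is one reason a more robust transform would be preferred on recorded data.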
Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.
We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.
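One plausible reading of "a layer built as convolutions with a DCT filter bank" can be sketched as follows. This is an illustrative assumption, not the dissertation's DCTNet: the filters are unnormalized DCT-II basis vectors, each is slid over the signal without flipping (so strictly a correlation), and the helper names `dct_filter_bank` and `dct_layer` are hypothetical.

```python
import math


def dct_filter_bank(size):
    """Return `size` unnormalized DCT-II basis vectors, each of length `size`."""
    return [
        [math.cos(math.pi * (n + 0.5) * k / size) for n in range(size)]
        for k in range(size)
    ]


def dct_layer(signal, size):
    """Slide every DCT filter over the signal and keep the response magnitude.

    The result has one output channel per filter; each channel has
    len(signal) - size + 1 frames (no padding).
    """
    bank = dct_filter_bank(size)
    n_frames = len(signal) - size + 1
    return [
        [
            abs(sum(w * signal[start + i] for i, w in enumerate(filt)))
            for start in range(n_frames)
        ]
        for filt in bank
    ]


# A constant input excites only the k = 0 (DC) filter.
flat = [2.0] * 16
channels = dct_layer(flat, size=4)
```

Stacking a second such layer on each output channel, with a different filter length, is one way to obtain a different time-frequency scale per layer.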
Abstract:
Sexual risk behavior among young adults is a serious public health concern; 50% will contract a sexually transmitted infection (STI) before the age of 25. The current study collected self-report personality and sexual history data, as well as neuroimaging, experimental behavioral (e.g., real-time hypothetical sexual decision-making) and self-report sexual arousal data, from 120 heterosexual young adults ages 18-26. In addition, longitudinal changes in self-reported sexual behavior were collected from a subset (n = 70) of the participants. The primary aims of the study were (1) to predict differences in self-report sexual behavior and hypothetical sexual decision-making (in response to sexually explicit audio-visual cues) as a function of ventral striatum (VS) and amygdala activity, (2) to test whether the association between sexual behavior/decision-making and brain function is moderated by gender, self-reported sexual arousal, and/or trait-level personality factors (i.e., self-control, impulsivity, and sensation seeking), and (3) to examine how the main effects of neural function and interaction effects predict sexual risk behavior over time. Our hypotheses were mostly supported across the sexual behavior and decision-making outcome variables, such that neural risk phenotypes (heightened reward-related ventral striatum activity coupled with decreased threat-related amygdala activity) were associated with a greater number of lifetime sexual partners at baseline and over time (longitudinal analyses). Impulsivity moderated the relationship between neural function and self-reported number of sexual partners at baseline and follow-up, as well as experimental condom use decision-making. Sexual arousal and sensation seeking moderated the relationship between neural function and baseline and follow-up self-reports of number of sexual partners.
Finally, unique gender differences were observed in the relationship between threat and reward-related neural reactivity and self-reported sexual risk behavior. The results of this study provide initial evidence for the potential role for neurobiological approaches to understanding sexual decision-making and risk behavior. With continued research, establishing biomarkers for sexual risk behavior could help inform the development of novel and more effective individually tailored sexual health prevention and intervention efforts.
Abstract:
This article has a twofold aim: on the one hand, to explore the European Union's ability to pursue an audiovisual policy directed at Mercosur and to promote the norms of the Convention on the diversity of cultural expressions; on the other, to analyse the impact of the EU's audiovisual policy model on the development of audiovisual cooperation with Mercosur, focusing on the main vectors that shape Mercosur's audiovisual landscape. The text seeks to highlight how and why the EU pursues an audiovisual policy with that region, and what the aims and limits of its action are. In this respect, it is concerned with understanding how the EU's audiovisual diplomacy interacts with other actors, such as the governmental actions carried out by the EU itself and by Mercosur, as well as the practices of the private sector (Hollywood and the large media conglomerates).
Abstract:
ARAUJO, Márcio V.; ALSINA, Pablo J.; MEDEIROS, Adelardo A. D.; PEREIRA, Jonathan P. P.; DOMINGOS, Elber C.; ARAÚJO, Fábio M. U.; SILVA, Jáder S. Development of an Active Orthosis Prototype for Lower Limbs. In: INTERNATIONAL CONGRESS OF MECHANICAL ENGINEERING, 20., 2009, Gramado, RS. Proceedings… Gramado, RS: [s. n.], 2009
Abstract:
For those who are not new to the world of Japanese animation, known mainly as anime, the debate of "dub vs. sub" is by no means out of the ordinary, but rather a very heated argument amongst fans. The study focuses on the differences, in the US English version, between the two approaches to translating audio-visual media, namely subtitling (official subtitles and fanmade subtitles) and dubbing, in a qualitative context. More precisely, it asks which of the two approaches can retain the most information from the same audiovisual segment in order to satisfy the needs of the anime audience. To draw substantial conclusions, the analysis is conducted on a corpus of one episode from the first season of the popular mid-nineties animated TV series Sailor Moon. The main objective of this research is to analyze the three versions and compare the findings to what anime fans expect each of them to provide, in terms of how culture-specific terms are handled, how accurate the translation is, localization, censorship, and omission. As for the fans’ opinions, the study includes a survey of fans' personal preferences when choosing between the official subtitled version, the fanmade subtitles and the dubbed version.
Abstract:
Parents around the world sing and speak to their babies. These two types of vocalization directed at preverbal infants share several similarities as well as differences, but their consequences for infants remain poorly understood. The aim of this thesis was to document the relative efficacy of singing and speech in capturing infants' attention over short periods (Study 1) and in regulating infants' affect by maintaining a state of contentment over a prolonged period (Study 2). The first study explored the attentional responses of infants exposed to unfamiliar audio recordings of singing and speech. In Experiment 1, infants aged 4 to 13 months were exposed to happy infant-directed speech (syllable sequences) and to lullabies hummed by the same woman. They listened significantly longer to the speech, which contained far more acoustic variability and expressiveness than the lullabies. In Experiment 2, infants of comparable ages showed no differential listening to a spoken or a sung version of a Turkish children's song, both versions being rendered in a joyful/happy manner. The infants in Experiment 3, who heard the sung version of the Turkish song as well as a spoken version that was either affectively neutral or adult-directed, listened significantly longer to the sung version. Overall, the happy vocal quality, rather than the vocal mode (sung versus spoken), was the main determinant of infant attention, regardless of age. In the second study, infants' affect regulation was explored as a function of exposure to unfamiliar audio recordings of singing or speech. Infants were exposed to singing or speech until they met a criterion of dissatisfaction expressed in the face.
In Experiment 1, infants aged 7 to 10 months listened to recordings of infant-directed speech, adult-directed speech or singing in an unfamiliar language (Turkish). The infants listened to the singing nearly twice as long as to the speech before showing dissatisfaction. In Experiment 2, infants were exposed to recordings of speech or singing drawn from natural interactions between mother and infant, in a familiar language. As in Experiment 1, infant-directed singing was considerably more effective than speech at delaying the onset of discontent. The temporal structure of singing, notably its regular rhythm, stable tempo and repetitions, could play an important role in affect regulation by sustaining attention, enhancing familiarity or promoting predictive listening and entrainment. In sum, the studies presented in this thesis reveal, for the first time, that singing is a powerful parenting tool, just as effective as speech at capturing attention and more effective than speech at keeping infants in a calm state. These findings underline the usefulness of singing in everyday life and its potential usefulness in a variety of therapeutic contexts involving infants.
Abstract:
Participation in a group exhibition themed around the 25th anniversary of the Elba Benitez Gallery in Madrid. My work comprised a series of performances in which I translated reviews from the magazine Art Forum from 1990. The performances took place in various locations in London throughout the run of the exhibition and were streamed live to an iPad in the gallery in Madrid. I made audio-visual recordings of the performances via the streaming media, which located me as the performer alongside the viewers in a single split image. These recordings were then archived in a folder shared between the gallery and me, which visitors to the exhibition could access when a performance was not taking place. The work extends my concerns with translation and performance, and with a consideration of how the mechanism of the gallery and the exhibition might be used to generate innovative viewing engagements facilitated by technology. The work also attempts to develop thinking and practice around the relationship between art works and their documentation - in this case the documentation, and even its potential for distribution, is generated as the work comes into being. The exhibition included works by Ignasi Aballí, Armando Andrade Tudela, Lothar Baumgarten, Carlos Bunga, Cabello/Carceller, Juan Cruz, Gintaras Didžiapetris, Fernanda Fragateiro, Hreinn Fridfinnsson, Carlos Garaicoa, Mario García Torres, David Goldblatt, Cristina Iglesias, Ana Mendieta, Vik Muniz, Ernesto Neto, Francisco Ruiz de Infante, Alexander Sokurov, Francesc Torres and Valentín Vallhonrat.
Abstract:
While humans can easily segregate and track a speaker's voice in a loud noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electro-encephalography experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms utilize some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods develop models of the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker is still a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. While using no prior information about the target speaker, this method can gracefully incorporate knowledge about the target speaker to further enhance the segregation. Through a series of EEG experiments we collect neurological evidence to support the principle behind the model. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of the remarkable perceptual ability of humans to segregate acoustic sources, and of its psychophysical manifestations in navigating complex sensory environments.
Results from EEG experiments provide further insights into the assumptions behind the model and provide motivation for future single unit studies that can provide more direct evidence for the principle of temporal coherence.
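The temporal-coherence principle mentioned above can be illustrated with a toy sketch: feature channels whose envelopes rise and fall together are assumed to belong to the same source. This is not the dissertation's model; the envelopes are synthetic, the grouping rule (Pearson correlation against an anchor channel, with a 0.5 threshold) is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def coherent_channels(envelopes, anchor, threshold=0.5):
    """Indices of channels whose envelope co-varies with the anchor channel."""
    ref = envelopes[anchor]
    return [i for i, env in enumerate(envelopes) if pearson(ref, env) >= threshold]


# Two synthetic sources with different modulation rates (4 and 7 cycles).
n = 200
slow = [1.0 + math.sin(2 * math.pi * 4 * k / n) for k in range(n)]
fast = [1.0 + math.sin(2 * math.pi * 7 * k / n) for k in range(n)]

# Channels 0-1 follow the slow source, channels 2-3 follow the fast one.
envelopes = [slow, [0.5 * v for v in slow], fast, [2.0 * v for v in fast]]
stream_a = coherent_channels(envelopes, anchor=0)
stream_b = coherent_channels(envelopes, anchor=2)
```

In this toy setup no prior knowledge of either source is used; knowledge of a target speaker could enter only through the choice of anchor channel.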
Abstract:
Accompanies: Epidemias na escola? Só em filmes: possibilidades de contaminação na aprendizagem significativa (Epidemics at school? Only in films: possibilities of contamination in meaningful learning)
Abstract:
The main aim of this report is to develop intercultural competence in the area of non-verbal language among Portuguese pupils in the 3rd cycle of basic education who attend the subject of Spanish. Non-verbal language makes a preponderant contribution to communication and, when one seeks to get to know another culture, it is essential to interpret the different systems in which each individual is embedded and interacts, because communicating effectively with the other implies knowledge of the symbolic structures and cultural codes intrinsic not only to the culture of a specific other but also to one's own sociocultural, historical-cultural and economic-cultural context. This work advocates intercultural teaching that promotes dialogue between cultures, in the knowledge that there are representations that must be deconstructed, as well as a specific non-verbal language that can interfere with the pragmatics of interculturality. It is an action-research project marked by two distinct stages: a first study aimed at making pupils aware that non-verbal communication is a competence that is taught and learned, and a second study devoted to culturally differentiating aspects of non-verbal language between Spain and Portugal, with a focus on cultural gestures and the treatment of time. The data to be analysed are the transcript of a recorded lesson, in which various audiovisual and written resources consistent with the syllabus units were used, and the responses to a questionnaire addressed to the intervention class and to a class of Spanish nationality that collaborated with it. Implementing these teaching activities/strategies led to the conclusion that, on the one hand, pupils interpret the different non-verbal codes from a universal perspective and, on the other, there is a strong influence of inherited and filtered stereotypes stemming from different historical and temporal milestones.
This study of the non-verbal also proved to be a very effective foundation for motivating learning in general and for enriching knowledge of the other's culture and of one's own, through the acquisition of communicative-functional non-verbal codes.
Abstract:
Though the trend rarely receives attention, since the 1970s many American filmmakers have been taking sound and music tropes from children’s films, television shows, and other forms of media and incorporating those sounds into films intended for adult audiences. Initially, these references might seem like regressive attempts at targeting some nostalgic desire to relive childhood. However, this dissertation asserts that these children’s sounds are instead designed to reconnect audience members with the multi-faceted fantasies and coping mechanisms that once, through children’s media, helped these audience members manage life’s anxieties. Because sound is the sense that Western audiences most associate with emotion and memory, it offers audiences immediate connection with these barely conscious longings. The first chapter turns to children’s media itself and analyzes Disney’s 1950s forays into television. The chapter argues that by selectively repurposing the gentlest sonic devices from the studio’s films, television shows like Disneyland created the studio’s signature sentimental “Disney sound.” As a result, a generation of baby boomers like Steven Spielberg comes of age and longs to recreate that comforting sound world. The second chapter thus focuses on Spielberg, who incorporates Disney music in films like Close Encounters of the Third Kind (1977). Rather than recreate Disney’s sound world, Spielberg uses this music as a springboard into a new realm I refer to as “sublime refuge” - an acoustic haven that combines overpowering sublimity and soothing comfort into one fantastical experience. The second half of the dissertation pivots into more experimental children’s cartoons like Gerald McBoing-Boing (1951) - cartoons that embrace audio-visual dissonance in ways that soothe even as they create tension through a phenomenon I call “comfortable discord.” In the final chapter, director Wes Anderson reveals that these sonic tensions have just as much appeal to adults. 
In films like The Royal Tenenbaums (2001), Anderson demonstrates that comfortable discord can simultaneously provide a balm for anxiety and create an open-ended space that makes empathetic connections between characters possible. The dissertation closes with a call to rethink nostalgia, not as a romanticization of the past, but rather as a reconnection with forgotten affective channels.