873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI
Abstract:
Our thesis undertakes to reconceptualize our new audio-visual environment and our experience of it. In the digital age, with the widespread dissemination of moving images, we delimit a category of images that we consider the most likely to have an impact on human development. We call them synchrono-photo-temporalized sound-images (images-sons synchrono-photo-temporalisées). More specifically, we seek to shed light on their power of affection and control by demonstrating that they have a definite influence on the process of individuation, an influence greatly facilitated by the structural isotopy that exists between the flow of consciousness and the flow of these images. Drawing on the research of Bernard Stiegler, we also note the important role that attention and memory play in the process of individuation. Our reflection as a whole leads us to realize the extent to which Quebec's current education system is failing in its task of civic education by not providing adequate teaching about moving images.
Abstract:
Signifying road-related events with warnings can be highly beneficial, especially when immediate attention is needed. This thesis describes how modality, urgency and situation can influence driver responses to multimodal displays used as warnings. These displays utilise all combinations of audio, visual and tactile modalities, reflecting different urgency levels. In this way, a new rich set of cues is designed, conveying information multimodally, to enhance reactions during driving, which is a highly visual task. The importance of the signified events to driving is reflected in the warnings, and safety-critical or non-critical situations are communicated through the cues. Novel warning designs are considered, using both abstract displays, with no semantic association to the signified event, and language-based ones, using speech. These two cue designs are compared, to discover their strengths and weaknesses as car alerts. The situations in which the new cues are delivered are varied, by simulating both critical and non-critical events and both manual and autonomous car scenarios. A novel set of guidelines for using multimodal driver displays is finally provided, considering the modalities utilised, the urgency signified, and the situation simulated.
Abstract:
In political debates, the media[tisation] can determine the use of language with the aim of increasing their spectacularisation and polarisation, possibly by means of criticism and humour, respectively. These linguistic strategies are often used in order to shape what was defined by Goffman as one’s face. Politicians, in particular, can resort to facework in a double sense: shaping their own face positively and/or that of their opponents negatively. Starting from the sociological theory of face by Goffman and Levinson, with the help of corpus analysis tools, this research investigated the ways in which various forms of criticism and humour were conducted in 3 electoral debates on a national scale (Germany, Ireland, and New Zealand) and 1 debate for the municipal election in Rome. The transcripts were revised from automatic transcriptions that were extracted or found online; the corresponding audio-visual content is available on the Internet. The CADS research aimed to investigate the role that criticism and humour played within each participant’s discourse, and to identify differences and similarities among the strategies used by political leaders and moderators in different countries, and in different cultural, political, and media contexts.
Abstract:
The recording and processing of voice data raises increasing privacy concerns for users and service providers. One way to address these issues is to move processing onto the edge device, closer to the recording, so that potentially identifiable information is not transmitted over the internet. However, this is often not possible due to hardware limitations. An interesting alternative is the development of voice anonymization techniques that remove individual speaker characteristics while preserving linguistic and acoustic information in the data. In this work, a state-of-the-art approach to sequence-to-sequence speech conversion, initially based on x-vectors and bottleneck features for automatic speech recognition, is explored to disentangle the two kinds of acoustic information using different pre-trained speech and speaker representations. Furthermore, different strategies for selecting target speech representations are analyzed. Results on public datasets in terms of equal error rate and word error rate show that good privacy is achieved with limited impact on converted speech quality relative to the original method.
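The two metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `word_error_rate` and `equal_error_rate` are hypothetical, the word error rate is taken as word-level edit distance divided by reference length, and the equal error rate is approximated by sweeping the observed scores as thresholds, assuming that higher scores mean "same speaker".

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def equal_error_rate(target_scores, nontarget_scores) -> float:
    """Approximate EER: the point where false acceptance ~= false rejection.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        false_accept = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        false_reject = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(false_accept - false_reject)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (false_accept + false_reject) / 2
    return best_eer
```

In an anonymization setting, a higher equal error rate for a speaker verifier run on the converted speech suggests better privacy, while a low word error rate suggests the linguistic content survived the conversion.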
Abstract:
Audiometry is the main method for evaluating hearing because it is a universal, standardized test. Speech tests are difficult to standardize because of the variables involved, yet performing them in the presence of competitive noise is of great importance. Aim: To characterize speech intelligibility in silence and in competitive noise in individuals exposed to electronically amplified music. Material and Method: The study was carried out with 20 university students who presented normal hearing thresholds. The speech recognition rate (SRR) was measured after fourteen hours of sound rest after the exposure to electronically amplified music and once again after sound rest, and was studied in three stages: without competitive noise, and in the presence of babble-type competitive noise, in monotic listening, at a signal/noise ratio of +5 dB and at a signal/noise ratio of 5 dB. Results: The SRR was more impaired after exposure to the music and with competitive noise, and as the signal/noise ratio decreased, the performance of individuals in the test also decreased. Conclusion: Including competitive noise in speech tests in the audiological routine is important, because it represents the real disadvantage experienced by individuals in daily listening.
Abstract:
This research investigates the relationship between the repertoires of collective action adopted by social movement organizations and the effectiveness of the participatory institutions (IPs) that deal with communications policy in Brazil, namely the Conselho de Comunicação Social do Congresso Nacional (CCS) and the 1ª Conferência Nacional de Comunicação (ConfeCom). The discussion centers on the actions carried out by Coletivo Intervozes, a civil society organization active in the social movements for the right to communication and its democratization. In this context, emphasis is placed on the actions for a new legal and regulatory framework for communications, regarded as a result of the effectiveness problems observed in the CCS and ConfeCom. The work is divided into four chapters. The first focuses on Coletivo Intervozes: its history, form of organization, and main lines of activity and actions. The second, essentially theoretical, emphasizes the conceptual definitions surrounding social movements and institutional change. Chapter 3 is devoted to analyzing the effectiveness problems in the IPs concerned with communications and their relationship with the repertoires of collective action. The analysis variables are civil society access/representation and the functions assigned to the IPs. The last chapter analyzes the characteristics of the social movement that calls for a new legal and regulatory framework for communications and that emerged as an alternative course of action to the IPs in advocating institutional change for the sector. As this is a qualitative study, the analyses were based on semi-structured interviews with members of Coletivo Intervozes and specialists in the field; on access to public documents produced by the organization; and on bibliographic, audiovisual, and audio data relating to the CCS and ConfeCom.
Abstract:
Dissertation submitted to the Escola Superior de Comunicação Social in partial fulfilment of the requirements for the degree of Master in Audiovisual and Multimedia.
Abstract:
One of the goals in the field of Music Information Retrieval is to obtain a measure of similarity between two musical recordings. Such a measure is at the core of automatic classification, query, and retrieval systems, which have become a necessity due to the ever increasing availability and size of musical databases. This paper proposes a method for calculating a similarity distance between two music signals. The method extracts a set of features from the audio recordings, models the features, and determines the distance between models. While further work is needed, preliminary results show that the proposed method has the potential to be used as a similarity measure for musical signals.
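The extract, model, compare pipeline described in this abstract can be sketched as follows. The abstract does not say which features, models, or distance the paper uses, so everything here is an assumption made for illustration: the feature frames are taken as already extracted, each recording is modeled as a diagonal-covariance Gaussian, and the distance is the symmetric Kullback-Leibler divergence between the two models. The names `fit_diag_gaussian`, `kl_diag`, and `symmetric_kl` are hypothetical.

```python
import math


def fit_diag_gaussian(frames, eps=1e-9):
    """Model a recording's feature frames as a diagonal Gaussian.

    `frames` is a list of equal-length feature vectors (assumed to be
    extracted already). `eps` keeps variances strictly positive.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    variances = [
        sum((frame[d] - means[d]) ** 2 for frame in frames) / n + eps
        for d in range(dim)
    ]
    return means, variances


def kl_diag(p, q):
    """KL divergence KL(p || q) between two diagonal Gaussians."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        vp / vq + (mq - mp) ** 2 / vq - 1.0 + math.log(vq / vp)
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Symmetrized KL divergence, used here as the similarity distance."""
    return kl_diag(p, q) + kl_diag(q, p)


# Usage: two toy "recordings" with two-dimensional feature frames.
model_a = fit_diag_gaussian([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
model_b = fit_diag_gaussian([[10.0, 1.0], [12.0, 3.0], [14.0, 5.0]])
distance = symmetric_kl(model_a, model_b)
```

A smaller distance indicates more similar feature distributions. A single Gaussian ignores the temporal order of the frames, which is one reason a real system might prefer a richer model.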
Abstract:
Doctoral thesis, Interdisciplinary Landscape Management, 11 February 2016, Universidade dos Açores.
Abstract:
A classical application of biosignal analysis has been the psychophysiological detection of deception, also known as the polygraph test, which is currently a part of standard practices of law enforcement agencies and several other institutions worldwide. Although its validity is far from gathering consensus, the underlying psychophysiological principles are still an interesting add-on for more informal applications. In this paper we present an experimental off-the-person hardware setup, propose a set of feature extraction criteria and provide a comparison of two classification approaches, targeting the detection of deception in the context of a role-playing interactive multimedia environment. Our work is primarily targeted at recreational use in the context of a science exhibition, where the main goal is to present basic concepts related to knowledge discovery, biosignal analysis and psychophysiology in an educational way, using techniques that are simple enough to be understood by children of different ages. Nonetheless, this setting will also allow us to build a significant data corpus, annotated with ground-truth information, and collected with non-intrusive sensors, enabling more advanced research on the topic. Experimental results have shown interesting findings and provided useful guidelines for future work. Pattern Recognition
Abstract:
Modern teaching methods require recreating the environment of the language being studied and getting pupils to speak in different situations. In Georgia, foreign language teaching begins at age 6, at the same time as teaching of the mother tongue. Pupils learn to write in French after learning to write in Georgian. By the age of 7 to 10, they already know three different alphabets: Georgian, Latin, and Cyrillic. The aim of this article is to propose a method that can make learning French easier for non-French speakers through audiovisual aids, which are very effective, especially at the stage when the child can neither read nor write in the foreign language. However, audiovisual aids must be used in moderation, without hindering the pupil's own activity.
Abstract:
Project work submitted for the degree of Master in Electronics and Telecommunications Engineering.
Abstract:
Final internship report submitted to the Escola Superior de Dança for the degree of Master in Dance Teaching.