818 results for speaker diarization
Abstract:
This dissertation aims to reveal the potential for subjective evaluation, explicit or implicit, carried by predicatives in an important argumentative genre in society: the judicial ruling. To this end, it analyzes the genre from the perspective of Systemic Functional Linguistics, a theory introduced by the English linguist Michael Halliday, which treats language as a system of grammatical and extralinguistic strata in which the concepts of context of situation (register) and context of culture (genre) are developed. On the linguistic side, analyzing the clause from the standpoint of one of the metafunctions inherent to language, the interpersonal one, reveals the interactions that unfold as a text is realized. After presenting this linguistic theory and framing the judicial-ruling genre within it, which demystifies the neutrality of its author, the judge, the dissertation describes how the judge uses argumentative resources to structure the reasoning of the rulings he or she must issue. Then, in the study of the selected corpus, rulings issued in 2011 by the Federal Court of the State of Rio de Janeiro (Justiça Federal do Estado do Rio de Janeiro), occurrences of the predicative structure were analyzed as an argumentative strategy for convincing the parties and society that the decision taken to resolve the dispute was the most carefully weighed and the most consistent with the plausibility of the facts presented during the proceedings.
Abstract:
Technological and pharmacological advances and modernization across the various fields of health care have ensured the survival of many children, especially those who are born with or have complex functional disorders. However, while these technological advances have allowed children with different physiological disorders to survive, they have also produced children with special health care needs, such as children with ostomies. The emergence of the child with an ostomy, and the growth of this population nationwide, frames the problem addressed by the study: meeting this child's care demands requires going beyond routine care, because a child with an ostomy requires specific care. Object of study: family care of the child with an intestinal ostomy in the home context. Objectives: to describe the places where, and the people from whom, family caregivers learned to care for the child with an intestinal ostomy; to identify the care practices carried out by the caregivers; and to discuss the challenges family caregivers faced in meeting the care demands of children with intestinal ostomies at home. On the premise that family members acquire the knowledge to care for children with ostomies through interaction with other people in their social environment, the theoretical foundations of this study are Vygotsky's social learning and Family-Centered Care. Methodological description: the qualitative research was conducted according to the creative and sensitive method (método criativo sensível), with the creativity and sensitivity dynamics Mapa Falante and Corpo Saber implemented in the homes of six groups of family caregivers. French discourse analysis was applied to interpret and explain the empirical material that emerged from the fieldwork.
Results: The hospital, the outpatient clinic, the home, and the societal context emerged as the places where family caregivers learned, with the learning mediated by health professionals and by caregiving mothers. The family members' daily experience of caring for a child with a colostomy or ileostomy led them to create new possibilities for care through trial and error, in pursuit of a better quality of life for their children. Challenges related to the subjective dimension, the practice of care, and the social and economic dimensions were among the difficult situations the family members faced. The study points to the need to re-standardize care, with new health technology devices created and made available to this population. The lack of such devices forces the family caregiver to adapt the child's care in order to meet the demands of each stage of the child's development. In addition, new public health policies should be devised to meet comprehensively the multiple demands of children with a colostomy or ileostomy.
Abstract:
This research focuses on the professional image of the physical education teacher, given the prominence this profession has gained on television in recent years. The aim is to make explicit some of the meanings related to the treatment of physical education in two programs broadcast by the Rede Globo television network. The first was taken from the eighteenth season of the telenovela Malhação, and the second from the segment MEDIDA CERTA/ 90 DIAS PARA REPROGRAMAR O CORPO, shown on the program Fantástico in 2011. Data were collected through the broadcaster's website, which makes the telenovela's episodes and the editions of the segment available for a limited time. The analysis of the process of meaning production was guided by the theoretical framework of Discourse Analysis from Orlandi's perspective. This method allows us to follow paths capable of bringing out the explicit and implicit meanings that coexist in discourse. Specific paths were therefore taken to understand each phenomenon: for the data drawn from Malhação, we used transcription of dialogues, presentation of scenes, and consideration of the camera shots used in its production. For the data from the MEDIDA CERTA segment, we transcribed speech and built categories, which were analyzed using an existing grid that considers the title, who speaks, what is said, the intermediary, and the strategies used to publicize the discourse. As for the results, it was possible to show that the meanings related to physical education in the eighteenth season of Malhação were inspired essentially by the competitive paradigm of the field. Outdated stereotypes about the professional and the profession were also preserved, since the meanings related to the elderly physical education teacher were reduced to an emphasis on traits associated with being out of date.
As for the results of the analysis of the second phenomenon, we show that physical education was approached through scientific discourse, using strategies based on the example tested by the journalists and offered to viewers. The proposal in the MEDIDA CERTA segment contributed to an understanding of physical education based on a biological perspective, one that establishes itself by imposing a risk on the subject, presents itself as a source of salvation, and is detached from social issues. The analyses of both phenomena reveal meanings that permeate physical education as represented in the media at the present historical moment, and they also prompt reflection on some of the principles governing the development of bodily practices within physical education.
Abstract:
This paper describes results obtained using the modified Kanerva model to perform word recognition in continuous speech after being trained on the multi-speaker Alvey 'Hotel' speech corpus. Theoretical discoveries have recently enabled us to increase the speed of execution of part of the model by two orders of magnitude over that previously reported by Prager & Fallside. The memory required for the operation of the model has been similarly reduced. The recognition accuracy reaches 95% without syntactic constraints when tested on different data from seven trained speakers. Real-time simulation of a model with 9,734 active units is now possible in both training and recognition modes using the Alvey PARSIFAL transputer array. The modified Kanerva model is a static network consisting of a fixed nonlinear mapping (location matching) followed by a single layer of conventional adaptive links. A section of preprocessed speech is transformed by the nonlinear mapping to a high-dimensional representation. From this intermediate representation a simple linear mapping is able to perform complex pattern discrimination to form the output, indicating the nature of the speech features present in the input window.
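The two-stage structure described in this abstract (a fixed nonlinear mapping followed by one layer of adaptive links) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: binary inputs, random binary locations matched by Hamming distance, a hypothetical `radius`, and a normalized LMS (delta-rule) update for the adaptive layer.

```python
import random


class KanervaSketch:
    """Toy Kanerva-style network: fixed location matching + one adaptive linear layer."""

    def __init__(self, n_inputs, n_locations, n_outputs, radius, seed=0):
        rng = random.Random(seed)
        # Fixed (never trained) random binary locations. Assumption for illustration.
        self.locations = [
            [rng.randint(0, 1) for _ in range(n_inputs)] for _ in range(n_locations)
        ]
        self.radius = radius
        # Adaptive links: one weight per (output, location) pair.
        self.weights = [[0.0] * n_locations for _ in range(n_outputs)]

    def hidden(self, x):
        # Location matching: a unit is active if its location lies within
        # `radius` bits (Hamming distance) of the input.
        return [
            1.0 if sum(a != b for a, b in zip(loc, x)) <= self.radius else 0.0
            for loc in self.locations
        ]

    def predict(self, x):
        h = self.hidden(x)
        return [sum(w * hj for w, hj in zip(row, h)) for row in self.weights]

    def train_step(self, x, target, lr=0.5):
        # Normalized LMS update applied to the adaptive layer only.
        h = self.hidden(x)
        active = sum(h)
        if active == 0:
            return
        out = [sum(w * hj for w, hj in zip(row, h)) for row in self.weights]
        for k, row in enumerate(self.weights):
            step = lr * (target[k] - out[k]) / active
            for j, hj in enumerate(h):
                if hj:
                    row[j] += step
```

Because only the last layer is trained, each update is a cheap linear correction; the fixed mapping is what lifts the input into the high-dimensional space where a linear readout can separate patterns.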
Abstract:
Four types of neural networks which have previously been established for speech recognition and tested on a small, seven-speaker, 100-sentence database are applied to the TIMIT database. The networks are a recurrent network phoneme recognizer, a modified Kanerva model morph recognizer, a compositional representation phoneme-to-word recognizer, and a modified Kanerva model morph-to-word recognizer. The major result is for the recurrent net, giving a phoneme recognition accuracy of 57% from the si and sx sentences. The Kanerva morph recognizer achieves 66.2% accuracy for a small subset of the sa and sx sentences. The results for the word recognizers are incomplete.
Abstract:
As the use of found data increases, more systems are being built using adaptive training. Here, transforms are used to represent unwanted acoustic variability, e.g. speaker and acoustic environment changes, allowing a canonical model that models only the "pure" variability of speech to be trained. Adaptive training may be described within a Bayesian framework. By using complexity control approaches to ensure robust parameter estimates, the standard point-estimate adaptive training can be justified within this Bayesian framework. However, during recognition there is usually no control over the amount of data available. It is therefore preferable to use a full Bayesian approach to applying transforms during recognition rather than the standard point estimates. This paper discusses various approximations to Bayesian approaches, including a new variational Bayes approximation. The application of these approaches to state-of-the-art adaptively trained systems using both CAT and MLLR transforms is then described and evaluated on a large vocabulary speech recognition task. © 2005 IEEE.
Abstract:
In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results regarding the quality of the resulting system. As the initial corpus is a crucial factor in the quality delivered by a Text-to-Speech system, special effort has been devoted to designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system, and hence of the corpus, is news reports; although this domain is restricted, it is characterized by an unlimited vocabulary. The paper focuses on issues regarding the design of an optimal corpus for such a framework and on the ideas on which our approach is based. A novel multi-stage approach is presented, with special attention given to language- and speaker-dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.
Abstract:
Over the past 50 years, economic and technological developments have dramatically increased the human contribution to ambient noise in the ocean. The dominant frequencies of most human-made noise in the ocean are in the low-frequency range (defined as sound energy below 1000 Hz), and low-frequency sound (LFS) may travel great distances in the ocean due to the unique propagation characteristics of the deep ocean (Munk et al. 1989). For example, in the Northern Hemisphere oceans low-frequency ambient noise levels increased by as much as 10 dB during the period from 1950 to 1975 (Urick 1986; review by NRC 1994). Shipping is the overwhelmingly dominant source of low-frequency manmade noise in the ocean, but other sources of manmade LFS include sounds from oil and gas industrial development and production activities (seismic exploration, construction work, drilling, production platforms) and scientific research (e.g., acoustic tomography and thermography, underwater communication). The SURTASS LFA system is an additional source of human-produced LFS in the ocean, contributing sound energy in the 100-500 Hz band. When considering a document that addresses the potential effects of a low-frequency sound source on the marine environment, it is important to focus upon those species that are the most likely to be affected. Important criteria are: 1) the physics of sound as it relates to biological organisms; 2) the nature of the exposure (i.e. duration, frequency, and intensity); and 3) the geographic region in which the sound source will be operated (which, when considered with the distribution of the organisms, will determine which species will be exposed). The goal in this section of the LFA/EIS is to examine the status, distribution, abundance, reproduction, foraging behavior, vocal behavior, and known impacts of human activity of those species that may be impacted by LFA operations.
To focus our efforts, we have examined species that may be physically affected and are found in the region where the LFA source will be operated. The large-scale geographic location of species in relation to the sound source can be determined from the distribution of each species. However, the capacity of an organism to be physically impacted depends upon the nature of the sound source (i.e. explosive, impulsive, or non-impulsive) and the acoustic properties of the medium (i.e. seawater) and the organism. Non-impulsive sound consists of the movement of particles in a medium. Motion is imparted by a vibrating object (diaphragm of a speaker, vocal cords, etc.). Due to the proximity of the particles in the medium, this motion is transmitted from particle to particle in waves away from the sound source. Because the particle motion is along the same axis as the propagating wave, the waves are longitudinal. Particles move away from and then back towards the vibrating source, creating areas of compression (high pressure) and areas of rarefaction (low pressure). As the motion is transferred from one particle to the next, the sound propagates away from the sound source. Wavelength is the distance from one pressure peak to the next. Frequency is the number of waves passing per unit time (Hz). Sound velocity (not to be confused with particle velocity) is the speed at which the wave propagates through the medium. Acoustic impedance is loosely equivalent to the resistance of a medium to the passage of sound waves (technically it is the ratio of acoustic pressure to particle velocity). A high impedance means that acoustic particle velocity is small for a given pressure (a low impedance means the opposite). When a sound strikes a boundary between media of different impedances, reflection, refraction, and a transfer of energy can all occur. The intensity of the reflection is a function of the intensity of the sound wave and the impedances of the two media.
Two key factors in determining the potential for damage due to a sound source are the intensity of the sound wave and the impedance difference between the two media (impedance mismatch). The bodies of the vast majority of organisms in the ocean (particularly phytoplankton and zooplankton) have sound impedance values similar to that of seawater. As a result, the potential for sound damage is low; organisms are effectively transparent to the sound, which passes through them without transferring damage-causing energy. Due to the considerations above, we have undertaken a detailed analysis of species which met the following criteria: 1) Is the species capable of being physically affected by LFS? Are acoustic impedance mismatches large enough to enable LFS to have a physical effect or allow the species to sense LFS? 2) Does the proposed SURTASS LFA geographical sphere of acoustic influence overlap the distribution of the species? Species that did not meet the above criteria were excluded from consideration. For example, phytoplankton and zooplankton species lack acoustic impedance mismatches large enough at low frequencies for them to be expected to be physically affected by SURTASS LFA. Vertebrates are the organisms that fit these criteria, and we have accordingly focused our analysis of the affected environment on these vertebrate groups in the world's oceans: fishes, reptiles, seabirds, pinnipeds, cetaceans, mustelids, sirenians (Table 1).
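The relations among wavelength, frequency, sound velocity, and impedance mismatch discussed above can be made concrete with a small sketch. The text only states that reflection depends on the two impedances; the specific formula used here, the plane-wave intensity reflection coefficient at normal incidence, is a standard textbook simplification chosen for illustration. The density and sound-speed values are approximate, illustrative figures, not data from the document.

```python
def wavelength(sound_speed_m_s, frequency_hz):
    """Wavelength in metres: distance between successive pressure peaks."""
    return sound_speed_m_s / frequency_hz


def characteristic_impedance(density_kg_m3, sound_speed_m_s):
    """Characteristic acoustic impedance of a medium, Z = rho * c."""
    return density_kg_m3 * sound_speed_m_s


def intensity_reflection(z1, z2):
    """Fraction of incident intensity reflected at a boundary between media
    with impedances z1 and z2 (plane wave, normal incidence: an assumption)."""
    return ((z2 - z1) / (z2 + z1)) ** 2


# Approximate, illustrative values (not from the source).
Z_SEAWATER = characteristic_impedance(1025.0, 1500.0)
Z_AIR = characteristic_impedance(1.2, 343.0)

# A 100 Hz tone in seawater has a wavelength of about 15 m.
print(wavelength(1500.0, 100.0))

# Matched impedances reflect nothing; a water/air boundary reflects almost everything.
print(intensity_reflection(Z_SEAWATER, Z_SEAWATER))
print(intensity_reflection(Z_SEAWATER, Z_AIR))
```

Under these assumptions, an organism whose tissue impedance is close to that of seawater behaves like the matched case, while air-filled structures behave more like the water/air boundary.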
Abstract:
Recently there has been interest in combined generative/discriminative classifiers. In these classifiers, features for the discriminative models are derived from generative kernels. One advantage of using generative kernels is that systematic approaches exist for introducing complex dependencies beyond conditional independence assumptions. Furthermore, by using generative kernels, model-based compensation/adaptation techniques can be applied to make discriminative models robust to noise/speaker conditions. This paper extends previous work with combined generative/discriminative classifiers in several directions. First, it introduces derivative kernels based on context-dependent generative models. Second, it describes how derivative kernels can be incorporated in continuous discriminative models. Third, it addresses the issues associated with the large number of classes and parameters when context-dependent models and high-dimensional features of derivative kernels are used. The approach is evaluated on two noise-corrupted tasks: the small vocabulary AURORA 2 task and the medium-to-large vocabulary AURORA 4 task.
Abstract:
Recently there has been interest in combining generative and discriminative classifiers. In these classifiers, features for the discriminative models are derived from generative kernels. One advantage of using generative kernels is that systematic approaches exist to introduce complex dependencies into the feature-space. Furthermore, as the features are based on generative models, standard model-based compensation and adaptation techniques can be applied to make discriminative models robust to noise and speaker conditions. This paper extends previous work in this framework in several directions. First, it introduces derivative kernels based on context-dependent generative models. Second, it describes how derivative kernels can be incorporated in structured discriminative models. Third, it addresses the issues associated with the large number of classes and parameters when context-dependent models and high-dimensional feature-spaces of derivative kernels are used. The approach is evaluated on two noise-corrupted tasks: the small vocabulary AURORA 2 task and the medium-to-large vocabulary AURORA 4 task. © 2011 IEEE.
Abstract:
Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the ’same’ vowel would differ enormously. The differences are mainly due to the differences in vocal tract length (VTL) and glottal pulse rate (GPR) which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.
Abstract:
Adaptation to speaker and environment changes is an essential part of current automatic speech recognition (ASR) systems. In recent years the use of multi-layer perceptrons (MLPs) has become increasingly common in ASR systems. A standard approach to handling speaker differences when using MLPs is to apply a global speaker-specific constrained MLLR (CMLLR) transform to the features prior to training or using the MLP. This paper considers the situation when there are both speaker and channel (communication link) differences in the data. A more powerful transform, front-end CMLLR (FE-CMLLR), is applied to the inputs to the MLP to represent the channel differences. Though global, these FE-CMLLR transforms vary from time-instance to time-instance. Experiments on a channel-distorted dialect Arabic conversational speech recognition task indicate the usefulness of adapting MLP features using both CMLLR and FE-CMLLR transforms. © 2013 IEEE.
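A global CMLLR transform acts on the features as an affine map, x' = A x + b, applied identically to every frame of a speaker. The sketch below shows only that application step, assuming the matrix `A` and bias `b` have already been estimated; the estimation procedure, and the way FE-CMLLR selects a different transform at each time instance, are not described in the abstract and are not modelled here. The function name and the numeric values are hypothetical.

```python
def apply_affine(frames, A, b):
    """Apply x' = A x + b to each feature vector in `frames`.

    `A` is a square matrix given as a list of rows and `b` is a bias vector;
    both stand in for an already-estimated speaker transform (assumption).
    """
    return [
        [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i for row, b_i in zip(A, b)]
        for x in frames
    ]


# Hypothetical two-dimensional features and transform, for illustration only.
frames = [[1.0, 2.0], [3.0, 4.0]]
A = [[2.0, 0.0], [0.0, 0.5]]
b = [1.0, -1.0]
print(apply_affine(frames, A, b))  # [[3.0, 0.0], [7.0, 1.0]]
```

The transformed frames would then be passed to the MLP in place of the raw features.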
Abstract:
This paper presents an overview of the Text-to-Speech synthesis system developed at the Institute for Language and Speech Processing (ILSP). It focuses on the key issues regarding the design of the system components. The system currently fully supports three languages (Greek, English, Bulgarian) and is designed to be as language- and speaker-independent as possible. Experimental results are also presented which show that the system produces high-quality synthetic speech in terms of naturalness and intelligibility. The system was recently ranked among the first three systems worldwide in terms of achieved quality for the English language, at the international Blizzard Challenge 2013 workshop. © 2014 Springer International Publishing.
Abstract:
Based on biomimetic pattern recognition theory, we propose a novel speaker-independent keyword-spotting algorithm for continuous speech. Without endpoint detection or segmentation, we obtain the minimum-distance curve between the continuous speech samples and every keyword-training net through a dynamic search over the feature-extracted continuous speech. We then count the keywords by examining the values and the number of the valleys in the curve. Experiments on small-vocabulary continuous speech at various speaking rates gave good recognition results and demonstrated the validity of the algorithm.
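The final counting step, reading keyword occurrences off the valleys of a minimum-distance curve, might look like the sketch below. The abstract does not say how a valley is defined, so this assumes a simple rule for illustration: each contiguous run of curve values below a hypothetical `threshold` counts as one valley, and its lowest value is kept as the valley value.

```python
def find_valleys(curve, threshold):
    """Return the minimum of each contiguous run of values below `threshold`.

    The number of returned values is the assumed keyword count; the values
    themselves indicate how close each match was. The thresholding rule is
    an illustrative assumption, not the paper's stated criterion.
    """
    valleys = []
    current = None  # lowest value seen in the run currently being tracked
    for distance in curve:
        if distance < threshold:
            current = distance if current is None else min(current, distance)
        elif current is not None:
            valleys.append(current)
            current = None
    if current is not None:  # the curve ended while still inside a valley
        valleys.append(current)
    return valleys


# Hypothetical distance curve with two dips below the threshold.
curve = [5, 4, 1, 4, 5, 0.5, 0.8, 5, 3]
print(find_valleys(curve, threshold=2))  # [1, 0.5]
```

Here the two dips would be read as two occurrences of the keyword whose training net produced the curve.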
Abstract:
In this paper, we present the HyperSausage Neuron, based on the High-Dimension Space (HDS), and propose a new algorithm for speaker-independent continuous digit speech recognition. Finally, in a comparison with an HMM-based method, the HyperSausage Neuron method achieves the higher recognition rate.