993 results for emotional speech
Abstract:
This study explored pre-service secondary science teachers’ perceptions of classroom emotional climate in the context of the Bhutanese macro-social policy of Gross National Happiness. Drawing upon sociological perspectives of human emotions and using Interaction Ritual Theory this study investigated how pre-service science teachers may be supported in their professional development. It was a multi-method study involving video and audio recordings of teaching episodes supported by interviews and the researcher’s diary. Students also registered their perceptions of the emotional climate of their classroom at 3-minute intervals using audience response technology. In this way, emotional events were identified for video analysis. The findings of this study highlighted that the activities pre-service teachers engaged in matter to them. Positive emotional climate was identified in activities involving students’ presentations using video clips and models, coteaching, and interactive whole class discussions. Decreases in emotional climate were identified during formal lectures and when unprepared presenters led presentations. Emotions such as frustration and disappointment characterized classes with negative emotional climate. The enabling conditions to sustain a positive emotional climate are identified. Implications for sustaining macro-social policy about Gross National Happiness are considered in light of the climate that develops in science teacher education classes.
Abstract:
The literature demonstrates that understanding relating to the use of materials in product design has been investigated from both engineering and design perspectives. However, none of these studies have explored the consumers’ concepts of the materials; rather they have focused on participants’ discussions of material samples. Consumers’ emotional reactions to the materials themselves or the consumers’ reaction to the durability of the materials have not been previously explored in depth. This research has investigated these issues and has found that consumers have very specific concepts about materials. Furthermore, the combinations of consumer concepts that are likely to elicit an emotional judgement by the consumer have also been identified. It was found that consumers are conscious of the durability of their products and the materials that they are made from. This knowledge contributes to the support of environmentally conscious design, as well as user-centered design knowledge and practice. An understanding of the emotion consumers attribute to the effect wear and aging had on the materials’ physical appearance has been achieved. This understanding of consumers’ emotional reactions to materials can contribute not only to design considerations but to knowledge regarding the promotion of prolonged product-user relationships.
Abstract:
We propose a novel technique for robust voiced/unvoiced segment detection in noisy speech, based on local polynomial regression. The local polynomial model is well suited to voiced segments in speech, whereas unvoiced segments are noise-like and do not exhibit any smooth structure. This smoothness property is used to devise a new metric called the variance ratio metric, which, after thresholding, indicates the voiced/unvoiced boundaries with 75% accuracy at 0 dB global signal-to-noise ratio (SNR). A novelty of our algorithm is that it processes the signal continuously, sample-by-sample rather than frame-by-frame. Simulation results on the TIMIT speech database (downsampled to 8 kHz) for various SNRs are presented to illustrate the performance of the new algorithm. Results indicate that the algorithm is robust even at high noise levels.
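The abstract does not define the variance ratio metric, so the following is only a minimal sketch of the general idea, under stated assumptions: a degree-1 local polynomial (a straight line) is fitted around every sample, and the ratio of residual energy to total energy in the window is taken as a smoothness score. The function name `variance_ratio`, the window half-width, and the choice of a line fit are all hypothetical and are not taken from the paper.

```python
import math
import random


def variance_ratio(signal, half_width=5):
    """Per-sample ratio of line-fit residual energy to local signal energy.

    Assumption: a degree-1 local polynomial and this particular ratio stand in
    for the paper's metric, which the abstract does not specify.
    Values near 0 mean locally smooth; values near 1 mean noise-like.
    """
    n = len(signal)
    ratios = []
    for center in range(n):
        lo = max(0, center - half_width)
        hi = min(n, center + half_width + 1)
        window = signal[lo:hi]
        m = len(window)
        t_mean = (m - 1) / 2.0
        y_mean = sum(window) / m
        # Closed-form least-squares slope for a straight-line fit.
        s_tt = sum((t - t_mean) ** 2 for t in range(m))
        s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
        slope = s_ty / s_tt if s_tt > 0 else 0.0
        residual = sum(
            (y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(window)
        )
        total = sum((y - y_mean) ** 2 for y in window)
        ratios.append(residual / total if total > 0 else 0.0)
    return ratios


# Toy comparison: a slowly varying sinusoid versus white Gaussian noise.
rng = random.Random(0)
smooth = [math.sin(2 * math.pi * i / 200) for i in range(400)]
noisy = [rng.gauss(0.0, 1.0) for _ in range(400)]
smooth_mean = sum(variance_ratio(smooth)) / len(smooth)
noisy_mean = sum(variance_ratio(noisy)) / len(noisy)
```

In this toy setting the score is computed sample-by-sample, and thresholding it would separate the smooth signal from the noise-like one. Real voiced speech would likely need a higher polynomial order and a tuned window.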
Abstract:
We investigate the use of a two-stage transform vector quantizer (TSTVQ) for coding line spectral frequency (LSF) parameters in wideband speech coding. The first-stage quantizer of the TSTVQ provides better matching of the source distribution, and the second-stage quantizer provides additional coding gain by using an individual cluster-specific decorrelating transform and variance normalization. Further coding gain is achieved by exploiting the slowly time-varying nature of speech spectra, and thus the inter-frame cluster continuity (ICC) property, in the first stage of the TSTVQ method. Compared to the traditional split vector quantizer (SVQ), the proposed method saves 3-4 bits and reduces computational complexity by 58-66%, at the expense of 1.5-2.5 times the memory.
Abstract:
The emergence of new technologies has revolutionized the way companies interact and build relationships with customers. The channel–customer relationship has traditionally been managed via a push approach in communication (“What can we sell customers?”) with the hope of cultivating customer loyalty. However, emotional understandings of customers and how they feel about a product, service, or business can drastically alter consumers’ engagement, behavior, and purchasing preferences. This rapidly evolving landscape has left managers at a loss, and what they are experiencing is likely the beginning of a tectonic shift in the way digital channels are designed, monitored, and managed. In this article, digital channel relationships are examined, and useful concepts for clarifying and refining the emotional meaning behind company strategy and their relationship to corresponding digital channels are detailed. Using three case study examples, we discuss the process and impact of such emotionally aware digital channel designs. Recommendations are made regarding how companies can select, design, and maintain digital engagements based on their strategy and industry needs.
Abstract:
We address the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for noisy speech recognition. We extend the concept to HMM training in which some patterns may be noisy or distorted. Using the concept of a "virtual pattern" developed for joint evaluation, we propose selective iterative training of HMMs. When evaluated on isolated word recognition with burst/transient noisy speech, the new algorithms yield a significant improvement in recognition accuracy over algorithms that do not use the joint evaluation strategy.
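For readers unfamiliar with DTW, the sketch below shows the classical two-sequence form of the algorithm on scalar features. It is not the joint multi-pattern evaluation proposed in the paper. The function name `dtw_distance` and the absolute-difference local cost are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Classical DTW cost between two scalar sequences.

    Assumption: absolute difference as the local cost, with unit steps
    (insertion, deletion, match). Real systems would compare feature vectors.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A time-stretched copy of a pattern aligns at zero cost, for example `dtw_distance([1, 2, 3], [1, 2, 2, 3])`, which is the property that makes DTW useful for comparing utterances spoken at different rates.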
Abstract:
Speech has both auditory and visual components (heard speech sounds and seen articulatory gestures). During all perception, selective attention facilitates efficient information processing and enables concentration on high-priority stimuli. Auditory and visual sensory systems interact at multiple processing levels during speech perception and, further, the classical motor speech regions seem also to participate in speech perception. Auditory, visual, and motor-articulatory processes may thus work in parallel during speech perception, their use possibly depending on the information available and the individual characteristics of the observer. Because of their subtle speech perception difficulties possibly stemming from disturbances at elemental levels of sensory processing, dyslexic readers may rely more on motor-articulatory speech perception strategies than do fluent readers. This thesis aimed to investigate the neural mechanisms of speech perception and selective attention in fluent and dyslexic readers. We conducted four functional magnetic resonance imaging experiments, during which subjects perceived articulatory gestures, speech sounds, and other auditory and visual stimuli. Gradient echo-planar images depicting blood oxygenation level-dependent contrast were acquired during stimulus presentation to indirectly measure brain hemodynamic activation. Lip-reading activated the primary auditory cortex, and selective attention to visual speech gestures enhanced activity within the left secondary auditory cortex. Attention to non-speech sounds enhanced auditory cortex activity bilaterally; this effect showed modulation by sound presentation rate. 
A comparison between fluent and dyslexic readers' brain hemodynamic activity during audiovisual speech perception revealed stronger activation of predominantly motor speech areas in dyslexic readers during a contrast test that allowed exploration of the processing of phonetic features extracted from auditory and visual speech. The results show that visual speech perception modulates hemodynamic activity within auditory cortex areas once considered unimodal, and suggest that the left secondary auditory cortex specifically participates in extracting the linguistic content of seen articulatory gestures. They are strong evidence for the importance of attention as a modulator of auditory cortex function during both sound processing and visual speech perception, and point out the nature of attention as an interactive process (influenced by stimulus-driven effects). Further, they suggest heightened reliance on motor-articulatory and visual speech perception strategies among dyslexic readers, possibly compensating for their auditory speech perception difficulties.
Abstract:
We address a new problem of improving automatic speech recognition performance, given multiple utterances of patterns from the same class. We formulate the problem of jointly decoding K multiple patterns given a single hidden Markov model. We show that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm, followed by the Constrained Multi Pattern Viterbi Algorithm. The new formulation is tested in the context of speaker-independent isolated word recognition for both clean and noisy patterns. When 10 percent of the speech is affected by burst noise at a local signal-to-noise ratio of -5 dB, joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent, when compared to single-pattern decoding using the Viterbi Algorithm. In contrast, a simple maximization of individual pattern likelihoods provides only about a 7 percent reduction in error rate.
Abstract:
Purpose: The purpose of this study is to identify and understand the emotions behind a passenger’s airport experience and how this can inform digital channel engagements. Design/methodology/approach: This study investigates the emotional experience of the journeys of 200 passengers at an Australian domestic airport. A survey was conducted using Emocards and a laddering interview approach. The responses were then analysed into attributes, consequences and values. Findings: The results indicate that across key stages of the airport (parking, retail, gates and arrivals) passengers had different emotional experiences (positive, negative and neutral). The attributes, consequences and values behind these emotions were then used to propose the content and purpose of various future digital channel engagements. Research limitations/implications: By gaining emotional insights, airports are able to generate digital channel engagements that align with passengers’ needs and values rather than internal operational motivations. Theoretical contributions include the development of the Technology Acceptance Model to include emotional drivers as influences in the use of digital channels. Originality/value: This research provides a unique method to understand the passengers’ emotional journey across the airport infrastructure and suggests how to better design digital channel engagements to address passengers’ latent needs.
Abstract:
Considering a general linear model of signal degradation, by modeling the probability density function (PDF) of the clean signal using a Gaussian mixture model (GMM) and additive noise by a Gaussian PDF, we derive the minimum mean square error (MMSE) estimator. The derived MMSE estimator is non-linear and the linear MMSE estimator is shown to be a special case. For speech signal corrupted by independent additive noise, by modeling the joint PDF of time-domain speech samples of a speech frame using a GMM, we propose a speech enhancement method based on the derived MMSE estimator. We also show that the same estimator can be used for transform-domain speech enhancement.
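The structure of such an estimator can be illustrated in the simplest setting. The sketch below assumes a scalar observation y = x + n with a scalar GMM prior on x and zero-mean Gaussian noise, which is a special case of the general linear model in the abstract. The function names and parameters are hypothetical. The estimate is a posterior-weighted sum of per-component linear (Wiener-style) estimates, so it is non-linear in y in general and collapses to the linear MMSE estimator when the mixture has a single component.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_mmse_estimate(y, weights, means, variances, noise_var):
    """MMSE estimate of x from y = x + n (scalar illustration only).

    Assumptions: x follows a scalar GMM (weights, means, variances) and
    n is zero-mean Gaussian with variance noise_var, independent of x.
    """
    # Unnormalised posterior of each component: y given component k is
    # Gaussian with mean means[k] and variance variances[k] + noise_var.
    posts = [
        w * gaussian_pdf(y, m, v + noise_var)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(posts)  # may underflow to 0 for extreme y; not handled here
    estimate = 0.0
    for p, m, v in zip(posts, means, variances):
        gain = v / (v + noise_var)  # per-component linear MMSE gain
        estimate += (p / total) * (m + gain * (y - m))
    return estimate
```

With one component, `gmm_mmse_estimate(2.0, [1.0], [0.0], [1.0], 1.0)` applies a gain of 0.5 to the observation, which is the linear MMSE special case mentioned in the abstract.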
Abstract:
Life of children exposed to alcohol or drugs in utero
This study focused on the growth environment, physical development and socio-emotional development of children, aged 16 and under, who had been exposed to alcohol (n=78) or drugs (n=15) in utero. The aim of the study was to obtain a comprehensive picture of the living conditions of these children and to examine the role of the growth environment in their development. The study was carried out using questionnaires, written life stories and interviews. Attachment theory was used as a background theory in the study. Over half of the children exposed to alcohol were diagnosed with foetal alcohol syndrome (FAS), one quarter was diagnosed with foetal alcohol effects (FAE), and one fifth had no diagnosis. Most of the children exposed to drugs had been exposed to either amphetamines or cannabis, and a smaller number to heroin. Some of the children exposed to alcohol were mentally handicapped or intellectually impaired. The children exposed to drugs did not exhibit any serious learning difficulties but a considerable number of them had socio-emotional development problems. Language and speech problems and attention, concentration and social interaction problems were typical among both the children exposed to alcohol and those exposed to drugs. Only one child had been placed into long-term foster care in a family immediately after leaving the maternity hospital. In biological families there had been neglect, violence, mental health problems, crime and unemployment, and many parents were already dead. Two of the children had been sexually abused and four were suspected of having been abused. From the point of view of the children's development, the three most critical issues were 1) the range of illnesses and handicaps that had impaired their functional capacity as a result of their prenatal exposure to alcohol, 2) the child's age at the time of long-term placement, and 3) the number of their traumatic experiences. 
The relationship with their biological parents after placement also played a role. Children with symptoms were found in all diagnosis categories and types of exposure. Children with the smallest number of symptoms were found among those who had never lived with their biological parents. Almost all children were exhibiting strong symptoms at the time of placement in foster care. In most cases, they were behaving in a disorderly manner towards others, but some children were withdrawn. The most conspicuous feature among those with the most severe symptoms was their disorganized behaviour. Placement in a foster family enhanced the children's development, but did not solve the problems. The foster parents who brought these children up did not receive as much therapy for the children and support for the upbringing as they appear to have needed. In Finland, transfer to long-term custody is based on strict criteria. The rights of children prescribed in the child protection law are not fulfilled in practice. Key words: FASD, FAS, FAE, alcohol exposure, drugs exposure, illegal drugs, early interaction, child development, attachment
Abstract:
The dissertation examines how emotional experiences are oriented to in the details of psychotherapeutic interaction. The data (57 audio-recorded sessions) come from one therapist-patient dyad in cognitive psychotherapy. Conversation analysis is used as the method. The dissertation consists of 4 original articles and a summary. The analyses explicate the therapist's practices of responding to the patient's affective expressions. Different types of affiliating responses are identified. It is shown that the affiliating responses are combined with, or build grounds for, more interpretive and challenging actions. The study also includes a case study of a session with strong misalignment between the therapist's and patient's orientations, showing how this misalignment is managed by the therapist. Moreover, through a longitudinal analysis of the transformation of a sequence type, the study suggests that therapeutic change processes can be located in the sequential relations of actions. The practices found in this study are compared to earlier research on everyday talk and on medical encounters. It is suggested that in psychotherapeutic interaction, the generic norms of interaction concerning affiliation and epistemic access are modified for the purposes of therapeutic work. The study also shows that the practices of responding to emotional experience in psychotherapy can deviate from the everyday practices of affiliation. The results of the study are also discussed in terms of concepts arising from clinical theory. These include empathy, validation of emotion, therapeutic alliance, interpretation, challenging beliefs, and therapeutic change. The therapist's approach described in this study involves practical integration of different clinical theories. In general terms, the study suggests that in the details of interaction, psychotherapy recurrently performs a dual task of empathy and challenging in relation to the patient's ways of describing their experiences. 
Methodologically, the study discusses the problem of identifying actions in conversation analysis of psychotherapy and emotional interaction, and the possibility of applying conversation analysis to the study of therapeutic change.
New Method for Delexicalization and its Application to Prosodic Tagging for Text-to-Speech Synthesis
Abstract:
This paper describes a new flexible delexicalization method based on a glottal-excited parametric speech synthesis scheme. The system utilizes inverse-filtered glottal flow and all-pole modelling of the vocal tract. The method makes it possible to retain and manipulate all relevant prosodic features of any kind of speech. Most importantly, the features include voice quality, which has not been properly modeled in earlier delexicalization methods. The functionality of the new method was tested in a prosodic tagging experiment aimed at providing word prominence data for a text-to-speech synthesis system. The experiment confirmed the usefulness of the method and further corroborated earlier evidence that linguistic factors influence the perception of prosodic prominence.
Abstract:
We propose a simple speech/music discriminator that uses features based on the HILN (Harmonics, Individual Lines and Noise) model. We tested the strength of the feature set on a standard database of 66 files and obtained an accuracy of around 97%. We also tested it on sung queries and polyphonic music, with very good results. The current algorithm is being used to discriminate between sung queries and played queries (using an instrument such as a flute) for a Query by Humming (QBH) system currently under development in the lab.