898 results for performativity of speech
Abstract:
Speech recognition technology is regarded as a key enabler for increasing the usability of applications deployed on mobile devices, which are becoming increasingly prevalent in modern hospital-based healthcare. Although the use of speech recognition is not new to the hospital-based healthcare domain, its use with mobile devices has thus far been limited. This paper presents the results of a literature review we conducted to observe how speech recognition technology has been used in hospital-based healthcare and to understand how this technology is being evaluated, in terms of its dependability and reliability, in healthcare settings. Our intent is that this review will help identify scope for future uses of speech recognition technologies in the healthcare domain and identify implications for the meaningful evaluation of such technologies given the specific context of use.
Abstract:
Purpose: Phonological accounts of reading implicate three aspects of phonological awareness tasks that underlie the relationship with reading: a) the language-based nature of the stimuli (words or nonwords), b) the verbal nature of the response, and c) the complexity of the stimuli (words can be segmented into units of speech). Yet it is uncertain which task characteristics are most important, as they are typically confounded. By systematically varying response type and stimulus complexity across speech and non-speech stimuli, the current study seeks to isolate the characteristics of phonological awareness tasks that drive the prediction of early reading. Method: Four sets of tasks were created: tone stimuli (simple non-speech) requiring a non-verbal response, phonemes (simple speech) requiring a non-verbal response, phonemes requiring a verbal response, and nonwords (complex speech) requiring a verbal response. Tasks were administered to 570 2nd grade children along with standardized tests of reading and non-verbal IQ. Results: Three structural equation models comparing matched sets of tasks were built. Each model consisted of two 'task' factors with a direct link to a reading factor. The following factors predicted unique variance in reading: a) simple speech and non-speech stimuli, b) simple speech requiring a verbal response but not simple speech requiring a non-verbal response, and c) complex and simple speech stimuli. Conclusions: Results suggest that the prediction of reading by phonological tasks is driven by the verbal nature of the response and not the complexity or 'speechness' of the stimuli. Findings highlight the importance of phonological output processes to early reading.
Abstract:
Auditory Training (AT) describes a regimen of varied listening exercises designed to improve an individual’s ability to perceive speech. The theory of AT is based on brain plasticity (the capacity of neurones in the central auditory system (CAS) to alter their structure and function) in response to auditory stimulation. The practice of repeatedly listening to the speech sounds included in AT exercises is believed to drive the development of more efficient neuronal pathways, thereby improving auditory processing and speech discrimination. This critical review aims to assess whether auditory training can improve speech discrimination in adults with mild-to-moderate sensorineural hearing loss (SNHL). The majority of patients attending Audiology services are adults with presbyacusis, and it is therefore important to evaluate evidence of any treatment effect of AT in aural rehabilitation. Ideally this review would seek to appraise evidence of neurophysiological effects of AT so as to verify whether it does induce change in the CAS. However, due to the absence of such studies on this particular patient group, the outcome measure of speech discrimination, as a behavioural indicator of treatment effect, is used instead. A review of available research was used to inform an argument for or against using AT in rehabilitative clinical practice. Six studies were identified, and although the preliminary evidence indicates an improvement gained from a range of AT paradigms, the treatment effect size was modest and there remains a lack of large-sample RCTs. Future investigation into the efficacy of AT needs to employ neurophysiological studies using auditory evoked potentials in hearing-impaired adults in order to explore effects of AT on the CAS.
Abstract:
An important aspect of speech perception is the ability to group or select formants using cues in the acoustic source characteristics; for example, fundamental frequency (F0) differences between formants promote their segregation. This study explored the role of more radical differences in source characteristics. Three-formant (F1+F2+F3) synthetic speech analogues were derived from natural sentences. In Experiment 1, F1+F3 were generated by passing a harmonic glottal source (F0 = 140 Hz) through second-order resonators (H1+H3); in Experiment 2, F1+F3 were tonal (sine-wave) analogues (T1+T3). F2 could take either form (H2 or T2). In some conditions, the target formants were presented alone, either monaurally or dichotically (left ear = F1+F3; right ear = F2). In others, they were accompanied by a competitor for F2 (F1+F2C+F3; F2), which listeners must reject to optimize recognition. Competitors (H2C or T2C) were created using the time-reversed frequency and amplitude contours of F2. Dichotic presentation of F2 and F2C ensured that the impact of the competitor arose primarily through informational masking. In the absence of F2C, the effect of a source mismatch between F1+F3 and F2 was relatively modest. When F2C was present, intelligibility was lowest when F2 was tonal and F2C was harmonic, irrespective of which type matched F1+F3. This finding suggests that source type and context, rather than similarity, govern the phonetic contribution of a formant. It is proposed that wideband harmonic analogues are more effective informational maskers than narrowband tonal analogues, and so become dominant in across-frequency integration of phonetic information when placed in competition.
Abstract:
Microposts are small fragments of social media content that have been published using a lightweight paradigm (e.g., Tweets, Facebook likes, foursquare check-ins). Microposts have been used for a variety of applications (e.g., sentiment analysis, opinion mining, trend analysis) by gleaning useful information, often using third-party concept extraction tools. There has been very large uptake of such tools in the last few years, along with the creation and adoption of new methods for concept extraction. However, the evaluation of such efforts has been largely confined to document corpora (e.g., news articles), which calls into question the suitability of concept extraction tools and methods for Micropost data. This report describes the Making Sense of Microposts Workshop (#MSM2013) Concept Extraction Challenge, hosted in conjunction with the 2013 World Wide Web conference (WWW'13). The Challenge dataset comprised a manually annotated training corpus of Microposts and an unlabelled test corpus. Participants were set the task of engineering a concept extraction system for a defined set of concepts. Out of a total of 22 complete submissions, 13 were accepted for presentation at the workshop; the submissions covered methods ranging from sequence mining algorithms for attribute extraction to part-of-speech tagging for Micropost cleaning and rule-based and discriminative models for token classification. In this report we describe the evaluation process and explain the performance of different approaches in different contexts.
Abstract:
It has been proposed that language impairments in children with Autism Spectrum Disorders (ASD) stem from atypical neural processing of speech and/or nonspeech sounds. However, the strength of this proposal is compromised by the unreliable outcomes of previous studies of speech and nonspeech processing in ASD. The aim of this study was to determine whether there was an association between poor spoken language and atypical event-related field (ERF) responses to speech and nonspeech sounds in children with ASD (n = 14) and controls (n = 18). Data from this developmental population (ages 6-14) were analysed using a novel combination of methods to maximize the reliability of our findings while taking into consideration the heterogeneity of the ASD population. The results showed that poor spoken language scores were associated with atypical left hemisphere brain responses (200 to 400 ms) to both speech and nonspeech in the ASD group. These data support the idea that some children with ASD may have an immature auditory cortex that affects their ability to process both speech and nonspeech sounds. Their poor speech processing may impair their ability to process the speech of other people, and hence reduce their ability to learn the phonology, syntax, and semantics of their native language.
Abstract:
Objective: The aim of this study was to design a novel experimental approach to investigate the morphological characteristics of auditory cortical responses elicited by rapidly changing synthesized speech sounds. Methods: Six sound-evoked magnetoencephalographic (MEG) responses were measured to a synthesized train of speech sounds using the vowels /e/ and /u/ in 17 normal hearing young adults. Responses were measured to: (i) the onset of the speech train; (ii) an F0 increment; (iii) an F0 decrement; (iv) an F2 decrement; (v) an F2 increment; and (vi) the offset of the speech train using short (jittered around 135 ms) and long (1500 ms) stimulus onset asynchronies (SOAs). The least squares (LS) deconvolution technique was used to disentangle the overlapping MEG responses in the short SOA condition only. Results: Comparison between the morphology of the recovered cortical responses in the short and long SOA conditions showed high similarity, suggesting that the LS deconvolution technique was successful in disentangling the MEG waveforms. Waveform latencies and amplitudes were different for the two SOA conditions and were influenced by the spectro-temporal properties of the sound sequence. The magnetic acoustic change complex (mACC) for the short SOA condition showed significantly lower amplitudes and shorter latencies compared to the long SOA condition. The F0 transition showed a larger reduction in amplitude from long to short SOA compared to the F2 transition. Lateralization of the cortical responses was observed under some stimulus conditions and appeared to be associated with the spectro-temporal properties of the acoustic stimulus. Conclusions: The LS deconvolution technique provides a new tool to study the properties of the auditory cortical response to rapidly changing sound stimuli. The presence of the cortical auditory evoked responses for rapid transitions of synthesized speech stimuli suggests that the temporal code is preserved at the level of the auditory cortex. Further, the reduced amplitudes and shorter latencies might reflect intrinsic properties of cortical neurons responding to rapidly presented sounds. Significance: This is the first demonstration of the separation of overlapping cortical responses to rapidly changing speech sounds and offers a potential new biomarker of discrimination of rapid transitions of sound.
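The idea behind least squares deconvolution can be illustrated with a toy example. The sketch below is not the study's MEG pipeline: it assumes a single event type, a noise-free recording, and known onset times, and every name in it (`ls_deconvolve`, `solve_linear_system`, the toy response, and the onset list) is hypothetical. It builds a design matrix that records where each sample of the unknown response lands in the recording, then solves the normal equations to separate responses that overlap in time.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def ls_deconvolve(recording, onsets, response_len):
    """Recover one event-locked response from overlapping repetitions."""
    n = len(recording)
    # design[t][k] counts events whose k-th response sample lands at time t.
    design = [[0.0] * response_len for _ in range(n)]
    for onset in onsets:
        for k in range(response_len):
            if onset + k < n:
                design[onset + k][k] += 1.0
    # Normal equations of the least squares problem: (X^T X) r = X^T y.
    xtx = [
        [sum(design[t][i] * design[t][j] for t in range(n)) for j in range(response_len)]
        for i in range(response_len)
    ]
    xty = [sum(design[t][i] * recording[t] for t in range(n)) for i in range(response_len)]
    return solve_linear_system(xtx, xty)


# Toy data (assumed): a 6-sample response repeated at jittered onsets,
# with gaps shorter than the response so that repetitions overlap.
true_response = [0.0, 1.0, 3.0, -2.0, -1.0, 0.5]
onsets = [0, 3, 7, 9, 14, 18, 21]
recording = [0.0] * (onsets[-1] + len(true_response))
for onset in onsets:
    for k, value in enumerate(true_response):
        recording[onset + k] += value

recovered = ls_deconvolve(recording, onsets, len(true_response))
```

With several event types, as in the study's six responses, the same approach would stack one block of design-matrix columns per event type; that extension is not shown here.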
Abstract:
Peer reviewed
Abstract:
This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.
In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.
In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.
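To make "polynomial phase coefficients" concrete, the sketch below generates a quadratic-phase (chirp) signal and recovers its second-order coefficient. It does not use the Weyl transform described in the dissertation; it substitutes a simple second-order phase-difference estimator, and the function names and parameter values are illustrative assumptions.

```python
import cmath
import math


def polynomial_phase_signal(coeffs, n_samples, sample_rate):
    """Return samples of exp(j*2*pi*(c0 + c1*t + c2*t^2 + ...))."""
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        phase = sum(c * t**k for k, c in enumerate(coeffs))
        signal.append(cmath.exp(2j * math.pi * phase))
    return signal


def estimate_chirp_rate(signal, sample_rate):
    """Estimate c2 of a quadratic-phase signal from second-order phase differences."""
    # With T = 1/sample_rate, the phase of s[n+1]*conj(s[n]) is
    # 2*pi*(c1*T + c2*T^2*(2n+1)), which grows linearly in n.
    first = [signal[n + 1] * signal[n].conjugate() for n in range(len(signal) - 1)]
    # Differencing again leaves the constant phase 4*pi*c2*T^2.
    second = [first[n + 1] * first[n].conjugate() for n in range(len(first) - 1)]
    mean_phase = cmath.phase(sum(second))
    return mean_phase * sample_rate**2 / (4 * math.pi)


# Assumed toy call: starts at 100 Hz and sweeps upward; the instantaneous
# frequency is c1 + 2*c2*t, so c2 = 50 corresponds to a 100 Hz/s sweep.
call = polynomial_phase_signal([0.0, 100.0, 50.0], 1000, 1000.0)
chirp_rate = estimate_chirp_rate(call, 1000.0)
```

The estimator is only unambiguous while |4π·c2/fs²| stays below π, and it ignores noise; it is meant to show that a handful of coefficients can summarize a call, not to stand in for the Weyl feature set.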
Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.
We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.
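A single layer of such a filter-bank network might look like the sketch below. It is an illustration under stated assumptions, not the dissertation's DCTNet: the filters are an orthonormal DCT-II basis, the filters are slid over the signal without flipping (correlation form), and taking the magnitude of each output is an assumed nonlinearity. The function names are hypothetical.

```python
import math


def dct_filter_bank(size):
    """Return `size` orthonormal DCT-II basis vectors, each of length `size`."""
    bank = []
    for k in range(size):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / size)
        bank.append(
            [scale * math.cos(math.pi * (n + 0.5) * k / size) for n in range(size)]
        )
    return bank


def filter_bank_layer(signal, bank):
    """Slide each filter over the signal (valid positions only) and keep magnitudes."""
    size = len(bank[0])
    n_frames = len(signal) - size + 1
    return [
        [
            abs(sum(filt[i] * signal[start + i] for i in range(size)))
            for start in range(n_frames)
        ]
        for filt in bank
    ]


# Assumed toy input: a constant signal excites only the k = 0 (DC) filter.
bank = dct_filter_bank(8)
features = filter_bank_layer([1.0] * 32, bank)
```

Each row of `features` is one frequency channel over time. Feeding those rows into a second bank of a different length is one way to obtain a different time-frequency scale at the next layer.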
Abstract:
A previous genome-wide association study (GWAS) of more than 100,000 individuals identified molecular-genetic predictors of educational attainment. We undertook in-depth life-course investigation of the polygenic score derived from this GWAS using the four-decade Dunedin Study (N = 918). There were five main findings. First, polygenic scores predicted adult economic outcomes even after accounting for educational attainments. Second, genes and environments were correlated: Children with higher polygenic scores were born into better-off homes. Third, children's polygenic scores predicted their adult outcomes even when analyses accounted for their social-class origins; social-mobility analysis showed that children with higher polygenic scores were more upwardly mobile than children with lower scores. Fourth, polygenic scores predicted behavior across the life course, from early acquisition of speech and reading skills through geographic mobility and mate choice and on to financial planning for retirement. Fifth, polygenic-score associations were mediated by psychological characteristics, including intelligence, self-control, and interpersonal skill. Effect sizes were small. Factors connecting DNA sequence with life outcomes may provide targets for interventions to promote population-wide positive development.
Abstract:
This study examines how one secondary school teacher’s use of purposeful oral mathematics language impacted her students’ language use and overall communication in written solutions while working with word problems in a grade nine academic mathematics class. Mathematics is often described as a distinct language. As with all languages, students must develop a sense for oral language before developing social practices such as listening, respecting others’ ideas, and writing. Effective writing is often seen in students who have strong oral language skills. Classroom observations, teacher and student interviews, and collected student work served as evidence to demonstrate the nature of both the teacher’s and the students’ use of oral mathematical language in the classroom, as well as the effect the discourse and language use had on students’ individual written solutions while working on word problems. Inductive coding for themes revealed that the teacher’s purposeful use of oral mathematical language had a positive impact on students’ written solutions. The teacher’s development of a mathematical discourse community created a space for the students to explore mathematical language and concepts that facilitated a deeper level of conceptual understanding of the learned material. The teacher’s oral language appeared to transfer into students’ written work, albeit not with the same complexity as the teacher’s oral expression of the mathematical register. Students who learn mathematical language and concepts better appear to have a growth mindset, feel they have ownership over their learning, use reorganizational strategies, and help develop a discourse community.
Abstract:
Although persuasion often occurs via oral communication, it remains a comparatively understudied area. This research tested the hypothesis that changes in three properties of voice influence perceptions of speaker confidence, which in turn differentially affect attitudes according to the different underlying psychological processes that the Elaboration Likelihood Model (ELM; Petty & Cacioppo, 1984) suggests should emerge under different levels of thought. Experiment 1 was a 2 (Elaboration: high vs. low) x 2 (Vocal speed: increased speed vs. decreased speed) x 2 (Vocal intonation: falling intonation vs. rising intonation) between-participants factorial design. Vocal speed and vocal intonation influenced perceptions of speaker confidence as predicted. In line with the ELM, under high elaboration, confidence biased thought favorability, which in turn influenced attitudes. Under low elaboration, confidence did not bias thoughts but rather directly influenced attitudes as a peripheral cue. Experiment 2 used a design similar to that of Experiment 1 but focused on vocal pitch. Results confirmed that pitch influenced perceptions of confidence as predicted. Importantly, we also replicated the bias and cue processes found in Experiment 1. Experiment 3 investigated the process by which a broader spectrum of speech rate influenced persuasion under moderate elaboration. In a 2 (Argument quality: strong vs. weak) x 4 (Vocal speed: extremely slow vs. moderately slow vs. moderately fast vs. extremely fast) between-participants factorial design, results confirmed the hypothesized non-linear relationship between speech rate and perceptions of confidence. In line with the ELM, speech rate influenced persuasion based on the amount of processing. Experiment 4 investigated the effects of a broader spectrum of vocal intonation on persuasion under moderate elaboration and used a design similar to that of Experiment 3. Results indicated a partial success of our vocal intonation manipulation. No evidence was found to support the hypothesized mechanism. These studies show that changes in several different properties of a speaker's voice can influence the extent to which others perceive the speaker as confident. Importantly, evidence suggests different vocal properties influence persuasion by the same bias and cue processes under high and low thought. Evidence also suggests that under moderate thought, speech rate influences persuasion based on the amount of processing.
Abstract:
During the civil war between Caesar and Pompey, the military oath that binds the soldier to his army was often openly violated. Yet despite these violations, commanders repeatedly required the oath of their men. Admittedly, this ritual act seems ineffective given the many desertions and mutinies recorded, but military leaders used its symbolic and sacred meaning to legitimize, on the one hand, their “anti-republican” actions and, on the other, armies fighting in a context deemed impius.
Abstract:
The article analyzes the figure of the prosumer from the perspective of visual studies, combining speech act theory and new media. The aim is to assess whether Michel de Certeau's distinction between producers and consumers, strategies and tactics, remains operative in the graphical interfaces of Scott Lash's global information culture. To do so, it distinguishes two types of performativity of speech acts: the top-down performativity of software, and the bottom-up performativity of language games and forms of life. These types are applied to a discourse analysis of the slogans that appear on the websites of “open” and collaborative-economy initiatives, since the former are devoted to the production of immaterial goods and the latter to the production of material goods. The discussion shows how the two types of performativity transform the textual analysis of literary and film studies into a methodology capable of investigating material actions, both human and non-human. The conclusions describe the emergence of new narrative conventions of power and control, foreign to fiction, that point to a “DIY society”.
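Abstract:
With the development of society, and especially of the internet, many languages see a considerable number of new words emerge every year. Words are the most basic and most important component of every language, so analyzing the structural features and word-formation methods of these new words is worthwhile. This article analyzes the new words that appeared in Chinese in 2014 and how they were formed; the aim of the paper is to better understand the development and characteristics of Chinese vocabulary. Using the new words of 2014 published in 《2014汉语新词语》 (2014 Chinese New Words) as its corpus, the analysis found two main features: first, compounding, derivation, and abbreviation are the main word-formation methods of the new words created in 2014; second, 72 percent of the new words are polysyllabic (containing three or more syllables), and 80 percent are nouns. These features illustrate the current characteristics and development trends of Chinese vocabulary, which differ from those of traditional Chinese vocabulary.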