946 results for speech features
Abstract:
We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.
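As an illustration of the MDL principle this abstract invokes, here is a minimal Python sketch, not the paper's actual algorithm: a candidate lexicon is scored by the cost of encoding the lexicon itself plus the corpus encoded as a sequence of lexicon entries, and a search over segmentations would keep the lexicon minimizing this sum. The per-character alphabet cost and the toy corpus are illustrative assumptions.

```python
import math
from collections import Counter

def description_length(corpus_tokens, lexicon):
    """Total cost (in bits) of a lexicon plus a corpus encoded with it.

    corpus_tokens: the corpus already segmented into lexicon entries.
    lexicon: set of entry strings, each encoded character by character.
    """
    # Cost of the lexicon: a crude per-character cost, here log2 of an
    # assumed 27-symbol alphabet, for every entry.
    lexicon_cost = sum(len(entry) * math.log2(27) for entry in lexicon)

    # Cost of the corpus: optimal code length -log2 p(w) per token,
    # with probabilities estimated from token frequencies.
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    corpus_cost = -sum(c * math.log2(c / total) for c in counts.values())
    return lexicon_cost + corpus_cost

tokens = ["the", "dog", "the", "cat", "the", "dog"]
print(description_length(tokens, set(tokens)))
```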
Abstract:
The Cypriot Greek variety (CG), spoken on the island of Cyprus, is relatively distinct from Standard Greek (SG) in all linguistic domains and especially in the area of pronunciation. Youth language, within the Greek-Cypriot context, is an area of study that has, until recently, received little attention. Tsiplakou (2004) makes reference to the emergence of a new slang among young Greek-Cypriots, influenced by new comedy series in which the actors make extensive use of ‘exaggeratedly peasant’ CG. As these comedy series become increasingly popular, the use of marked regional features becomes evident in the speech style of young Greek-Cypriots. A preliminary study has also revealed that marked CG linguistic features are equally evident in the online interactions of young internet users (Themistocleous 2005). In this study, I examine the use of CG phonological elements in a corpus of messages collected from channel #Cyprus of Internet Relay Chat (IRC). It is demonstrated that young Greek-Cypriots use language in creative ways in order to represent in writing phonological features typical of their informal speech.
Abstract:
What is already known on the subject? Multi-sensory treatment approaches have been shown to impact outcome measures positively, such as accuracy of speech movement patterns and speech intelligibility in adults with motor speech disorders, as well as in children with apraxia of speech, autism and cerebral palsy. However, there has been no empirical study using multi-sensory treatment for children with speech sound disorders (SSDs) who demonstrate motor control issues in the jaw and orofacial structures (e.g. jaw sliding, jaw overextension, inadequate lip rounding/retraction and decreased integration of speech movements). What this paper adds? Findings from this study indicate that, for speech production disorders where both the planning and production of spatiotemporal parameters of movement sequences for speech are disrupted, multi-sensory treatment programmes that integrate auditory, visual and tactile–kinesthetic information improve auditory and visual accuracy of speech production. Both the training words (practised in treatment) and the test words (not practised in treatment) showed positive change in most participants, indicating generalization of target features to untrained words. It is inferred that treatment that focuses on integrating multi-sensory information and normalizing parameters of speech movements is an effective method for treating children with SSDs who demonstrate speech motor control issues.
Abstract:
We describe three patients with a comparable deletion encompassing SLC25A43, SLC25A5, CXorf56, UBE2A, NKRF, and two non-coding RNA genes, U1 and LOC100303728. Moderate to severe intellectual disability (ID), psychomotor retardation, severely impaired/absent speech, seizures, and urogenital anomalies were present in all three patients. Facial dysmorphisms include ocular hypertelorism, synophrys, and a depressed nasal bridge. These clinical features overlap with those described in two patients from a family with a similar deletion at Xq24 that also includes UBE2A, and in several patients from Brazilian and Polish families with point mutations in UBE2A. Notably, all five patients with an Xq24 deletion have ventricular septal defects that are not present in patients with a point mutation, which might be attributed to the deletion of SLC25A5. Taken together, the UBE2A deficiency syndrome in male patients with a mutation in or a deletion of UBE2A is characterized by ID, absent speech, seizures, urogenital anomalies, frequently including a small penis, and skin abnormalities, which include generalized hirsutism, low posterior hairline, myxedematous appearance, widely spaced nipples, and hair whorls. Facial dysmorphisms include a wide face, a depressed nasal bridge, a large mouth with downturned corners, thin vermilion, and a short, broad neck. © 2010 Wiley-Liss, Inc.
Abstract:
This essay studies how dialectal speech is reflected in written literature and how this phenomenon functions in translation. With this purpose in mind, Styron's Sophie's Choice and Twain's The Adventures of Huckleberry Finn are analysed using samples of non-standard orthography which have been applied in order to reflect the dialect, or accent, of certain characters. In the same way, Lundgren's Swedish translation of Sophie's Choice and Ferres and Rolfe's Spanish version of The Adventures of Huckleberry Finn are analysed. The method consists of linguistically analysing a few text samples from each novel, establishing how dialect is represented through non-standard orthography, and thereafter comparing the same samples with their translations in order to establish whether dialectal features are also visible in the translated novels. It is concluded that non-standard orthography is applied in the novels in order to represent every possible linguistic level, including pronunciation, morphosyntax, and vocabulary. Furthermore, it is concluded that while Lundgren's translation intends to orthographically represent dialectal speech on most occasions where the original does so, Ferres and Rolfe's translation pays no attention to dialectology. The discussion following the data analysis establishes some possible reasons for the exclusion of dialectal features in the Spanish translation considered here. Finally, the contribution of this study to the field of dialectology is stated.
Abstract:
Background: Spondyloepiphyseal dysplasia-brachydactyly and distinctive speech (SED-BDS) is a syndrome characterized by short stature, disproportionately short limbs, peculiar face, thick and abundant hair, high-pitched and coarse voice, small epiphyses, brachymetacarpalia, brachymetatarsalia and brachyphalangia of fingers and toes, small pelvis and delayed carpal bone age, among other features. Case Report: We report a Brazilian patient whose father, brother and sister present with the same typical features of the syndrome. Clinically, he showed disproportionately short stature, rhizo-meso-acromelic shortness of the extremities, short hands and feet, a distinctive high-pitched voice, peculiar facies, and other features already reported as characteristic of this syndrome. Radiographic findings included shape anomalies of the vertebral bodies such as cuboid-shaped vertebral bodies, mild scoliosis, short and broad tubular bones, brachymetacarpalia, brachymetatarsalia, brachydactyly, lumbar hyperlordosis, generalized osteopenia, and hypoplastic iliac wings. Conclusions: Few cases have been described, as this is a rare skeletal dysplasia. This paper describes a new familial case of SED-BDS. © The American Journal of Case Reports.
Abstract:
Smith-Magenis syndrome (SMS) is a complex disorder whose clinical features include mild to severe intellectual disability with speech delay, growth failure, brachycephaly, flat midface, short broad hands, and behavioral problems. SMS is typically caused by a large deletion on 17p11.2 that encompasses multiple genes, including the retinoic acid induced 1 (RAI1) gene, or by a mutation in the RAI1 gene. Here we have evaluated 30 patients with suspected SMS and identified SMS-associated classical 17p11.2 deletions in six patients, an atypical deletion of ∼139 kb that partially deletes the RAI1 gene in one patient, and RAI1 gene nonsynonymous alterations of unknown significance in two unrelated patients. The RAI1 mutant proteins showed no significant alterations in molecular weight, subcellular localization and transcriptional activity. Clinical features of patients with or without 17p11.2 deletions and mutations involving the RAI1 gene were compared to identify phenotypes that may be useful in diagnosing patients with SMS. © 2012 Macmillan Publishers Limited. All rights reserved.
Abstract:
This study investigated whether there are differences in the speech-evoked auditory brainstem response (ABR) among children with Typical Development (TD), (Central) Auditory Processing Disorder ((C)APD), and Language Impairment (LI). The speech-evoked ABR was tested in 57 children (ages 6-12), placed into three groups: TD (n = 18), (C)APD (n = 18) and LI (n = 21). Speech-evoked ABRs were elicited using the five-formant syllable /da/. Three dimensions were defined for analysis: timing, harmonics, and pitch. A comparative analysis of the responses between the typically developing children and the children with (C)APD and LI revealed abnormal encoding of the speech acoustic features that are characteristic of speech perception in children with (C)APD and LI, although the two groups differed in their abnormalities. While the children with (C)APD may have had greater difficulty distinguishing stimuli based on timing cues, the children with LI had the additional difficulty of distinguishing speech harmonics, which are important to the identification of speech sounds. These data suggested that an inefficient representation of crucial components of speech sounds may contribute to the difficulties with language processing found in children with LI. Furthermore, these findings may indicate that the neural processes mediated by the auditory brainstem differ among children with auditory processing and speech-language disorders. © 2012 Elsevier B.V. All rights reserved.
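To make the three analysis dimensions concrete, here is a minimal NumPy sketch on a stand-in response waveform, not the study's actual pipeline: timing is indexed by peak latency, and harmonic/pitch encoding by spectral magnitude at the fundamental and its multiples. The sampling rate, fundamental frequency and 40 ms window are illustrative assumptions.

```python
import numpy as np

fs = 20_000                                   # sampling rate (Hz), assumed
f0 = 100                                      # assumed fundamental of /da/ (Hz)
response = np.random.randn(int(0.040 * fs))   # stand-in for a 40 ms averaged ABR

# Timing dimension: latency of the largest response peak, in milliseconds.
peak_latency_ms = np.argmax(np.abs(response)) / fs * 1000

# Harmonics/pitch dimensions: spectral magnitude at F0 and its multiples.
spectrum = np.abs(np.fft.rfft(response))
freqs = np.fft.rfftfreq(len(response), d=1 / fs)
harmonic_amps = [spectrum[np.argmin(np.abs(freqs - k * f0))] for k in (1, 2, 3, 4)]

print(peak_latency_ms, harmonic_amps)
```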
Abstract:
Background: Psychosis has various causes, including mania and schizophrenia. Since the differential diagnosis of psychosis is based exclusively on subjective assessments of oral interviews with patients, an objective quantification of the speech disturbances that characterize mania and schizophrenia is in order. In principle, such quantification could be achieved by the analysis of speech graphs. A graph represents a network with nodes connected by edges; in speech graphs, nodes correspond to words and edges correspond to semantic and grammatical relationships. Methodology/Principal Findings: To quantify speech differences related to psychosis, interviews with schizophrenics, manics and normal subjects were recorded and represented as graphs. Manics scored significantly higher than schizophrenics on ten graph measures. Psychopathological symptoms such as logorrhea, poor speech, and flight of thoughts were captured by the analysis even when verbosity differences were discounted. Binary classifiers based on speech graph measures sorted schizophrenics from manics with up to 93.8% sensitivity and 93.7% specificity. In contrast, sorting based on the scores of two standard psychiatric scales (BPRS and PANSS) reached only 62.5% sensitivity and specificity. Conclusions/Significance: The results demonstrate that alterations of the thought process manifested in the speech of psychotic patients can be objectively measured using graph-theoretical tools developed to capture specific features of the normal and dysfunctional flow of thought, such as divergence and recurrence. The quantitative analysis of speech graphs is not redundant with standard psychometric scales but rather complementary, as it yields a very accurate sorting of schizophrenics and manics. Overall, the results point to automated psychiatric diagnosis based not on what is said, but on how it is said.
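A minimal sketch of the speech-graph representation described above, using NetworkX: one node per word type, one directed edge per consecutive word pair. The transcript and the particular measures computed are illustrative; the study used its own set of ten graph measures.

```python
import networkx as nx

def speech_graph(words):
    """Directed graph: one node per word type, one edge per
    consecutive word pair in the transcribed speech."""
    g = nx.DiGraph()
    for a, b in zip(words, words[1:]):
        g.add_edge(a, b)
    return g

transcript = "i was at home and then i was at the market".split()
g = speech_graph(transcript)

# Example graph measures of the kind a binary classifier could use.
features = {
    "nodes": g.number_of_nodes(),
    "edges": g.number_of_edges(),
    "density": nx.density(g),
    "largest_scc": len(max(nx.strongly_connected_components(g), key=len)),
}
print(features)
```

Measures like recurrence (loops back to earlier nodes) and divergence (branching out of a node) fall out naturally from this structure, which is why verbosity can be discounted by sampling fixed-length word windows.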
Abstract:
The characteristics of aphasic speech in various languages have been the core of numerous studies, but Arabic in general, and Palestinian Arabic in particular, remains largely unexplored in this respect. However, it is of vital importance to have a clear picture of the specific aspects of Palestinian Arabic that might be affected in the speech of aphasics in order to establish screening, diagnosis and therapy programs based on a clinical linguistic database. Hence, the central questions of this study are: what are the main neurolinguistic features of Palestinian aphasics' speech at the phonetic-acoustic level, and to what extent do the results resemble those obtained from other languages? In general, this study is a survey of the most prominent features of Palestinian Broca's aphasics' speech. The main acoustic parameters of vowels and consonants, such as vowel duration, formant frequency, Voice Onset Time (VOT), intensity and frication duration, are analysed. The deviant patterns among the Broca's aphasics are displayed and compared with those of normal speakers. The nature of the deficit, whether phonetic or phonological, is also discussed. Moreover, the coarticulatory characteristics and some prosodic patterns of Broca's aphasics are addressed. Samples were collected from six Broca's aphasics from the same local region. The acoustic analysis conducted on a range of consonant and vowel parameters revealed differences between the speech patterns of Broca's aphasics and normal speakers. For example, impairments in the voicing contrast between voiced and voiceless stops were found in Broca's aphasics. This feature was absent for the fricatives produced by the Palestinian Broca's aphasics, and hence deviates from data obtained for aphasics' speech in other languages. The Palestinian Broca's aphasics displayed particular problems with the emphatic sounds. They exhibited deviant coarticulation patterns, another feature that is inconsistent with data obtained from studies of other languages. However, several other findings are in accordance with those reported from various other languages, such as impairments in VOT. The results are in accordance with the suggestion that speech production deficits in Broca's aphasics are related not to phoneme selection but rather to articulatory implementation, and that some speech output impairments are related to timing and planning deficits.
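As an illustration of the kind of VOT comparison such studies report, a minimal sketch with hypothetical measurements: VOT is simply voicing onset time minus stop burst time, and an impaired voicing contrast would surface as overlapping, highly variable group distributions. The values and the Welch t-test are assumptions for illustration, not the study's data or statistics.

```python
from scipy import stats

# VOT (ms) per token of a voiceless stop, e.g. /t/.
vot_controls = [55, 62, 58, 70, 65, 60]   # normal speakers: tight distribution
vot_aphasics = [30, 75, 20, 90, 45, 15]   # Broca's aphasics: variable, overlapping
                                          # the voiced-stop range

# Group comparison: a degraded voicing contrast shows up as a shift in
# means and, especially, inflated variance in the aphasic group.
t, p = stats.ttest_ind(vot_controls, vot_aphasics, equal_var=False)
print(f"t = {t:.2f}, p = {p:.3f}")
```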
Abstract:
Speech is often a multimodal process, presented audiovisually through a talking face. One area of speech perception influenced by visual speech is speech segmentation, or the process of breaking a stream of speech into individual words. Mitchel and Weiss (2013) demonstrated that a talking face contains specific cues to word boundaries and that subjects can correctly segment a speech stream when given a silent video of a speaker. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2013). In Experiment 1, subjects were found to spend the most time watching the eyes and mouth, with a trend suggesting that the mouth was viewed more than the eyes. Although subjects displayed significant learning of word boundaries, performance was not correlated with gaze duration on any individual feature, nor with a behavioral measure of autistic-like traits, the Social Responsiveness Scale (SRS). However, trends suggested that as autistic-like traits increased, gaze duration on the mouth increased and gaze duration on the eyes decreased, similar to significant trends seen in autistic populations (Boraston & Blakemore, 2007). In Experiment 2, the same video was modified so that a black bar covered the eyes or mouth. Both videos elicited learning of word boundaries equivalent to that seen in the first experiment. Again, no correlations were found between segmentation performance and SRS scores in either condition. These results, taken with those of Experiment 1, suggest that neither the eyes nor the mouth are critical to speech segmentation and that perhaps more global head movements indicate word boundaries (see Graf, Cosatto, Strom, & Huang, 2002). Future work will elucidate the contribution of individual features relative to global head movements, as well as extend these results to additional types of speech tasks.
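A minimal sketch of the gaze-duration measure behind these comparisons: fixations are assigned to rectangular areas of interest (AOIs) for the eyes and mouth, and dwell time is summed per AOI. The coordinates and fixations below are hypothetical, not the study's display geometry.

```python
# Each fixation: (x, y, duration_ms); AOIs as (x_min, y_min, x_max, y_max).
aois = {"eyes": (100, 50, 300, 120), "mouth": (150, 200, 250, 260)}
fixations = [(180, 80, 310), (200, 230, 540), (210, 90, 150), (40, 400, 90)]

dwell = {name: 0 for name in aois}
for x, y, dur in fixations:
    for name, (x0, y0, x1, y1) in aois.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            dwell[name] += dur   # total gaze duration per facial feature

print(dwell)  # -> {'eyes': 460, 'mouth': 540}
```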
Abstract:
Ninety-minute open-ended interviews with 38 patients were analyzed with respect to speech stylistics shown by Schucker and Jacobs to differentiate individuals with type A personality features from those with type B. In our patients, type A/B had been assessed with the Bortner Personality Inventory. The stylistics studied were: repeated words, swallowed words, interruptions, simultaneous speech, silence latency between question and answer (SL), speed of speech, uneven speed of speech (USS), explosive words (PW), uneven speech volume (USV), and speech volume. Correlations between the two raters were high for all speech categories. Positive correlations were found between extent of type A and SL (r = 0.33; p = 0.022), USS (r = 0.51; p = 0.002), PW (r = 0.46; p = 0.003) and USV (r = 0.39; p = 0.012). Our results indicate that the speech of type A individuals in non-stress open-ended interviews tends to show higher emotional tension (positive correlations for USS, PW and USV) and is more controlled in conversation (positive correlation for SL).
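A minimal sketch of the statistic reported above: a Pearson correlation between per-patient type A scores and one speech-stylistics measure, computed with scipy.stats.pearsonr, which returns r and p exactly as quoted in the abstract. The scores below are hypothetical stand-ins, not the study's data.

```python
from scipy.stats import pearsonr

# Hypothetical per-patient values: Bortner type A score vs. one
# speech-stylistics measure, e.g. silence latency (SL) in seconds.
type_a_scores = [120, 150, 135, 180, 160, 142, 171, 155]
silence_latency = [0.8, 1.2, 0.9, 1.6, 1.3, 1.0, 1.5, 1.1]

r, p = pearsonr(type_a_scores, silence_latency)
print(f"r = {r:.2f}, p = {p:.3f}")   # cf. SL: r = 0.33, p = 0.022 in the study
```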
Abstract:
Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, standard text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. Audio-visual documents are represented by the frequencies of the word analogues: the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1-vs.-n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification, and classification improves further when speech and non-speech audio are supplemented with video words. Optimal F-scores range between 62% and 94%, corresponding to 50%-84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio and video.
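A minimal scikit-learn sketch of the described pipeline: each modality is reduced to a token sequence, the modalities are concatenated into a single bag-of-words vector, and a linear SVM is trained in a one-vs-rest setting. The documents, "word analogue" vocabularies and category labels are hypothetical stand-ins for the syllable, video-word and audio-word streams.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# One string of word analogues per document: speech syllables plus
# "video words" and "audio words" from quantized low-level features.
docs = [
    "syl_po syl_li vid_17 vid_03 aud_42",   # politics item
    "syl_fu syl_bal vid_88 vid_88 aud_07",  # sports item
    "syl_po syl_tik vid_17 vid_05 aud_42",  # politics item
    "syl_go syl_al vid_88 vid_91 aud_07",   # sports item
]
labels = ["politics", "sport", "politics", "sport"]  # IPTC-style categories

X = CountVectorizer().fit_transform(docs)  # standard bag-of-words counts
clf = LinearSVC().fit(X, labels)           # one-vs-rest linear SVM
print(clf.predict(X[:1]))                  # -> ['politics']
```

Ablating modalities here is just a matter of dropping token prefixes before vectorizing, which mirrors how the per-modality and combined results could be compared.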