32 results for performativity of speech
Abstract:
Basing the conception of language on the sign also represents an obstacle to the awareness of certain elements of human life, especially to a full understanding of what language or art do. Henri Meschonnic’s poetics of the continuum and of rhythm criticizes the sign on the basis of Benveniste’s terms of rhythm and discourse, developing an anthropology of language. Rhythm, for Meschonnic, is not a formal metrical principle but a semantic one, each time unique and unforeseeable. As for Humboldt, his starting point is not the word but the ensemble of speech; language is not ergon but energeia. The poem, then, is not a literary form but a process of transformation that Meschonnic defines as the invention of a form of life by a form of language and vice versa. Thus a poem is a way of thinking, and rhythm is form in movement. The particular subject of art and literature is consequently not the author but a process of subjectivation – the contrary of the conception of the sign. By demonstrating the limits of the sign, Meschonnic’s poetics attempts to thematize the intelligibility of presence. Art and literature raise our awareness of this element of human life that we cannot grasp conceptually. Such poetical thinking is a necessary counterforce against all institutionalization.
Abstract:
This paper studies single-channel speech separation, assuming unknown, arbitrary temporal dynamics for the speech signals to be separated. A data-driven approach is described, which matches each mixed speech segment against a composite training segment to separate the underlying clean speech segments. To advance the separation accuracy, the new approach seeks and separates the longest mixed speech segments with matching composite training segments. Lengthening the mixed speech segments to match reduces the uncertainty of the constituent training segments, and hence the error of separation. For convenience, we call the new approach Composition of Longest Segments, or CLOSE. The CLOSE method includes a data-driven approach to model long-range temporal dynamics of speech signals, and a statistical approach to identify the longest mixed speech segments with matching composite training segments. Experiments are conducted on the Wall Street Journal database, for separating mixtures of two simultaneous large-vocabulary speech utterances spoken by two different speakers. The results are evaluated using various objective and subjective measures, including the challenge of large-vocabulary continuous speech recognition. It is shown that the new separation approach leads to significant improvement in all these measures.
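A minimal, hypothetical sketch of the "longest matching composite segment" idea described above: a mixed log-spectral segment is compared against a composite of two candidate clean training segments, and the match is extended frame by frame for as long as the fit holds. The log-max mixing approximation, the error threshold, and all names are illustrative assumptions, not the authors' CLOSE implementation.

```python
import numpy as np

def composite_error(mixed_seg, clean_a, clean_b):
    """Mean squared error between a mixed segment and a log-max composite of two clean segments."""
    composite = np.maximum(clean_a, clean_b)   # log-max approximation of additive mixing
    return np.mean((mixed_seg - composite) ** 2)

def longest_match(mixed, clean_a, clean_b, start, tol=1.0):
    """Extend a match starting at frame `start` while the composite still fits the mixture."""
    max_len = min(len(mixed) - start, len(clean_a), len(clean_b))
    best_len = 0
    for length in range(1, max_len + 1):
        err = composite_error(mixed[start:start + length],
                              clean_a[:length], clean_b[:length])
        if err > tol:
            break
        best_len = length   # longer matching segments reduce the uncertainty of the constituents
    return best_len
```

The sketch only shows why longer matches are preferred; the actual search over candidate training segments and the statistical identification step are omitted.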
Abstract:
Temporal dynamics and speaker characteristics are two important features of speech that distinguish speech from noise. In this paper, we propose a method to maximally extract these two features of speech for speech enhancement. We demonstrate that this can reduce the requirement for prior information about the noise, which can be difficult to estimate for fast-varying noise. Given noisy speech, the new approach estimates clean speech by recognizing long segments of the clean speech as whole units. In the recognition, clean speech sentences, taken from a speech corpus, are used as examples. Matching segments are identified between the noisy sentence and the corpus sentences. The estimate is formed by using the longest matching segments found in the corpus sentences. Longer speech segments as whole units contain more distinct dynamics and richer speaker characteristics, and can be identified more accurately from noise than shorter speech segments. Therefore, estimation based on the longest recognized segments increases the noise immunity and hence the estimation accuracy. The new approach consists of a statistical model to represent up to sentence-long temporal dynamics in the corpus speech, and an algorithm to identify the longest matching segments between the noisy sentence and the corpus sentences. The algorithm is made more robust to noise uncertainty by introducing missing-feature based noise compensation into the corpus sentences. Experiments have been conducted on the TIMIT database for speech enhancement from various types of nonstationary noise including song, music, and crosstalk speech. The new approach has shown improved performance over conventional enhancement algorithms in both objective and subjective evaluations.
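A hedged sketch of the missing-feature idea mentioned above: a noisy segment is compared to a corpus segment using only "reliable" feature bins that stand sufficiently above an estimated noise floor. The mask rule, margin, and distance measure are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def reliable_mask(noisy_frame, noise_floor, margin=3.0):
    """Mark feature bins that exceed the estimated noise floor by a safety margin."""
    return noisy_frame > (noise_floor + margin)

def masked_distance(noisy_seg, corpus_seg, noise_floor):
    """Average squared distance between segments, computed over reliable bins only."""
    total, count = 0.0, 0
    for noisy_frame, corpus_frame in zip(noisy_seg, corpus_seg):
        mask = reliable_mask(noisy_frame, noise_floor)
        if mask.any():
            total += np.sum((noisy_frame[mask] - corpus_frame[mask]) ** 2)
            count += mask.sum()
    return total / max(count, 1)
```

Restricting the comparison to reliable bins is what makes segment matching less sensitive to uncertainty in the noise estimate.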
Abstract:
Historically, political song has often been perceived negatively, as a disturbance of the peace, summed up by the legendary line from Goethe’s Faust: “Politisches Lied – ein garstiges Lied”. In the period of the Vormärz in Germany (from 1815 up to the revolution of March 1848), however, we see how this perception began to change as political song increasingly became a means of self-expression in public life. This was the era of Restoration, in which broader sections of German society were striving for political emancipation from the princes and kings. A whole host of political themes emerge in the songs (Freiheitslieder) of that period, in which a new oppositional political consciousness is reflected. The themes range from freedom of speech, freedom from censorship, and the need for democratic and national self-determination to critiques of injustice and hunger, and parodies of political convention and opportunism. Sources of reception give indications about the social and political milieus in which these songs circulated. Such sources include broadsheets, handwritten manuscripts, song collections, commemoration events, advertisements in the political press, memoirs, police reports and general literature of the time. In many cases we see how these songs reflect the emerging social and political identities of those who sing them. One also sees the use of well-known melodies in the popular dissemination of these songs. An intertextual function of music often becomes apparent in the practice of contrafacture, whereby melodies with particular semantic associations are used either to underline the message or to parody the subject of the song.
Abstract:
Three experiments measured the effects of age on informational masking of speech by competing speech. The experiments were designed to minimize the energetic contributions of the competing speech so that informational masking could be measured without large corrections for energetic masking. Experiment 1 used a "speech-in-speech-in-noise" design, in which the competing speech was presented in noise at a signal-to-noise ratio (SNR) of -4 dB. This ensured that the noise contributed primarily energetic masking while the competing speech contributed the informational masking. Equal amounts of informational masking (3 dB) were observed for young and elderly listeners, although less was found for hearing-impaired listeners. Experiment 2 tested a range of SNRs in this design and showed that informational masking increased with SNR up to about -4 dB, but decreased thereafter. Experiment 3 further reduced the energetic contribution of the competing speech by filtering it into frequency bands different from those of the target speech. The elderly listeners again showed approximately the same amount of informational masking (4-5 dB), although some elderly listeners had particular difficulty understanding these stimuli in any condition. On the whole, these results suggest that young and elderly listeners were equally susceptible to informational masking. © 2009 Acoustical Society of America.
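As an illustrative calculation only (not taken from the paper), the following shows how a masking noise could be scaled so that a speech signal sits at a chosen SNR such as -4 dB; the function name and approach are assumptions.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_p_noise = p_speech / (10 ** (snr_db / 10.0))   # noise power needed for the target SNR
    return speech + noise * np.sqrt(target_p_noise / p_noise)
```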
Abstract:
In this paper, I critically assess John Rawls' repeated claim that the duty of civility is only a moral duty and should not be enforced by law. In the first part of the paper, I examine and reject the view that Rawls' position may be due to the practical difficulties that the legal enforcement of the duty of civility might entail. I thus claim that Rawls' position must be driven by deeper normative reasons grounded in a conception of free speech. In the second part of the paper, I therefore examine various arguments for free speech and critically assess whether they are consistent with Rawls' political liberalism. I first focus on the arguments from truth and self-fulfilment. Both arguments, I argue, rely on comprehensive doctrines and therefore cannot provide a freestanding political justification for free speech. Freedom of speech, I claim, can be justified instead on the basis of Rawls' political conception of the person and of the two moral powers. However, Rawls' wide view of public reason already allows scope for the kind of free speech necessary for the exercise of the two moral powers and therefore cannot explain Rawls' opposition to the legal enforcement of the duty of civility. Such opposition, I claim, can only be explained on the basis of a defence of unconstrained freedom of speech grounded in the ideas of democracy and political legitimacy. Yet, I conclude, while public reason and the duty of civility are essential to political liberalism, unconstrained freedom of speech is not. Rawls and political liberals could therefore renounce unconstrained freedom of speech, and endorse the legal enforcement of the duty of civility, while remaining faithful to political liberalism.
Abstract:
Objective
To determine the optimal transcranial magnetic stimulation (TMS) coil direction for inducing motor responses in the tongue in a group of non-neurologically impaired participants.
Methods
Single-pulse TMS was delivered using a figure-of-eight Magstim 200² TMS coil. Study 1 investigated the effect of eight different TMS coil directions on the motor-evoked potentials elicited in the tongue in eight adults. Study 2 examined active motor threshold levels at the optimal TMS coil direction compared to the customarily used ventral-caudal direction. Study 3 repeated the procedure of Study 1 at five different sites across the tongue motor cortex in one adult.
Results
Inter-individual variability in optimal direction was observed, with an optimal range of directions determined for the group. Active motor threshold was reduced when a participant's own optimal TMS coil direction was used compared to the ventral-caudal direction. A restricted range of optimal directions was identified across the five cortical positions tested.
Conclusions
There is a need to identify each individual's own optimal TMS coil direction in investigating tongue motor cortex function. A recommended procedure for determining optimal coil direction is described.
Significance
Optimized TMS procedures are needed so that TMS can be utilized in determining the underlying neurophysiological basis of various motor speech disorders.
Abstract:
Language experience clearly affects the perception of speech, but little is known about whether these differences in perception extend to non-speech sounds. In this study, we investigated rhythmic perception of non-linguistic sounds in speakers of French and German using a grouping task, in which complexity (variability in sounds, presence of pauses) was manipulated. In this task, participants grouped sequences of auditory chimeras formed from musical instruments. These chimeras mimic the complexity of speech without being speech. We found that, while showing the same overall grouping preferences, the German speakers showed stronger biases than the French speakers in grouping complex sequences. Sound variability reduced all participants' biases, resulting in the French group showing no grouping preference for the most variable sequences, though this reduction was attenuated by musical experience. In sum, this study demonstrates that linguistic experience, musical experience, and complexity affect rhythmic grouping of non-linguistic sounds and suggests that experience with acoustic cues in a meaningful context (language or music) is necessary for developing a robust grouping preference that survives acoustic variability.
Abstract:
Williams syndrome is a genetic disorder that, it has been claimed, results in an unusual pattern of linguistic strengths and weaknesses. The current study investigated the hypothesis that there is a reduced influence of lexical knowledge on phonological short-term memory in Williams syndrome. Fourteen children with Williams syndrome and 2 vocabulary-matched control groups, 20 typically developing children and 13 children with learning difficulties, were tested on 2 probed serial-recall tasks. On the basis of previous findings, it was predicted that children with Williams syndrome would demonstrate (a) a reduced effect of lexicality on the recall of list items, (b) relatively poorer recall of list items compared with recall of serial order, and (c) a reduced tendency to produce lexicalization errors in the recall of nonwords. In fact, none of these predictions was supported. Alternative explanations for previous findings and implications for accounts of language development in Williams syndrome are discussed.
Abstract:
Effects of vowel variation on interaction are considered, with particular relevance to their role in conversational breakdown. The effect of speaker knowledge and experience is noted as a variable in developmental progress which must inform profiling decisions, and the need for appropriate taxonomies of speech varieties is emphasized as a precursor to clinical and educational assessments. It is also noted that a shared sociolinguistic background between speaker and listener does not always resolve difficulties arising from non-target realizations, casting some doubt on the idea that assessors always possess a reliable sense of phonological variability and its effects. Hence, an informed understanding of phonological variation, rather than mere awareness that such variation exists, is advocated.
Abstract:
Studies in sensory neuroscience reveal the critical importance of accurate sensory perception for cognitive development. There is considerable debate concerning the possible sensory correlates of phonological processing, the primary cognitive risk factor for developmental dyslexia. Across languages, children with dyslexia have a specific difficulty with the neural representation of the phonological structure of speech. The identification of a robust sensory marker of phonological difficulties would enable early identification of risk for developmental dyslexia and early targeted intervention. Here, we explore whether phonological processing difficulties are associated with difficulties in processing acoustic cues to speech rhythm. Speech rhythm is used across languages by infants to segment the speech stream into words and syllables. Early difficulties in perceiving auditory sensory cues to speech rhythm and prosody could lead developmentally to impairments in phonology. We compared matched samples of children with and without dyslexia, learning three very different spoken and written languages: English, Spanish, and Chinese. The key sensory cue measured was rate of onset of the amplitude envelope (rise time), known to be critical for the rhythmic timing of speech. Despite phonological and orthographic differences, for each language, rise time sensitivity was a significant predictor of phonological awareness, and rise time was the only consistent predictor of reading acquisition. The data support a language-universal theory of the neural basis of developmental dyslexia grounded in rhythmic perception and syllable segmentation. They also suggest that novel remediation strategies based on rhythm and music may offer benefits for phonological and linguistic development.
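A rough, assumption-laden sketch of how an envelope "rise time" might be measured: extract a smoothed amplitude envelope with the Hilbert transform and take the 10%-90% rise preceding its largest peak. The filter settings and the 10/90 criterion are illustrative and are not the stimuli or analysis used in the study.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def envelope_rise_time(signal, fs, cutoff_hz=30.0):
    """Crude 10%-90% rise time (in seconds) of the smoothed amplitude envelope's main onset."""
    env = np.abs(hilbert(signal))                      # amplitude envelope
    b, a = butter(2, cutoff_hz / (fs / 2), btype="low")
    env = filtfilt(b, a, env)                          # smooth the envelope
    peak = int(np.argmax(env))
    t90 = peak
    while t90 > 0 and env[t90] > 0.9 * env[peak]:      # walk back to the 90% point
        t90 -= 1
    t10 = t90
    while t10 > 0 and env[t10] > 0.1 * env[peak]:      # continue back to the 10% point
        t10 -= 1
    return (t90 - t10) / fs
```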
Abstract:
In this paper we present a novel method for performing speaker recognition with very limited training data and in the presence of background noise. Similarity-based speaker recognition is considered so that speaker models can be created with limited training speech data. The proposed similarity is a form of cosine similarity used as a distance measure between speech feature vectors. Each speech frame is modelled using subband features, and into this framework, multicondition training and optimal feature selection are introduced, making the system capable of performing speaker recognition in the presence of realistic, time-varying noise, which is unknown during training. Speaker identification experiments were carried out using the SPIDRE database. The performance of the proposed new system for noise compensation is compared to that of an oracle model; the speaker identification accuracy for clean speech by the new system trained with limited training data is compared to that of a GMM trained with several minutes of speech. Both comparisons have demonstrated the effectiveness of the new model. Finally, experiments were carried out to test the new model for speaker identification given limited training data and with differing levels and types of realistic background noise. The results have demonstrated the robustness of the new system.
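A minimal sketch of the cosine-similarity scoring idea described above: each test frame is scored against a speaker's training frames and the frame scores are pooled. Subband feature extraction, optimal feature selection, and multicondition training are omitted, and the pooling rule and names are assumptions rather than the paper's exact procedure.

```python
import numpy as np

def cosine_similarity(x, y, eps=1e-12):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + eps))

def score_speaker(test_frames, speaker_frames):
    """Average, over test frames, of the best cosine match to the speaker's training frames."""
    return np.mean([max(cosine_similarity(t, s) for s in speaker_frames)
                    for t in test_frames])

def identify(test_frames, speaker_models):
    """Pick the speaker (dict name -> training frames) whose frames best match the test frames."""
    return max(speaker_models, key=lambda spk: score_speaker(test_frames, speaker_models[spk]))
```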
Abstract:
This paper presents a novel method of audio-visual feature-level fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new multimodal feature representation and a modified cosine similarity are introduced to combine and compare bimodal features with limited training data, as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal dataset created from the SPIDRE speaker recognition database and the AR face recognition database with variable noise corruption of speech and occlusion in the face images. The system's speaker identification performance on the SPIDRE database, and facial identification performance on the AR database, is comparable with the literature. Combining both modalities using the new method of multimodal fusion leads to significantly improved accuracy over the unimodal systems, even when both modalities have been corrupted. The new method also shows improved identification accuracy compared with the bimodal systems based on multicondition model training or missing-feature decoding alone.
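A hedged illustration of feature-level fusion in general: normalize each modality's feature vector, concatenate them with a weighting, and compare fused vectors with cosine similarity. The weighting and normalization are assumptions for illustration and do not reproduce the paper's multimodal representation or modified similarity.

```python
import numpy as np

def fuse(speech_vec, face_vec, w_speech=0.5, eps=1e-12):
    """Length-normalize each modality and concatenate with a modality weighting."""
    s = speech_vec / (np.linalg.norm(speech_vec) + eps)
    f = face_vec / (np.linalg.norm(face_vec) + eps)
    return np.concatenate([w_speech * s, (1.0 - w_speech) * f])

def fused_similarity(test_speech, test_face, ref_speech, ref_face, eps=1e-12):
    """Cosine similarity between two fused audio-visual feature vectors."""
    a = fuse(test_speech, test_face)
    b = fuse(ref_speech, ref_face)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```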