999 results for Poetic Speech


Relevance: 20.00%

Abstract:

The support for typically out-of-vocabulary query terms such as names, acronyms, and foreign words is an important requirement of many speech indexing applications. However, to date many unrestricted vocabulary indexing systems have struggled to provide a balance between good detection rate and fast query speeds. This paper presents a fast and accurate unrestricted vocabulary speech indexing technique named Dynamic Match Lattice Spotting (DMLS). The proposed method augments the conventional lattice spotting technique with dynamic sequence matching, together with a number of other novel algorithmic enhancements, to obtain a system that is capable of searching hours of speech in seconds while maintaining excellent detection performance.
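The abstract does not spell out how the dynamic sequence matching works, so the following is only a minimal sketch of one plausible reading: each phone sequence read from a lattice is compared with the query's phone sequence by minimum edit distance, and sequences within a cost threshold count as hits. The function names, the uniform costs, the threshold, and the representation of the lattice as a plain list of phone paths are all assumptions made for illustration, not details from the paper.

```python
def min_edit_distance(target, observed, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
    """Minimum edit distance between two phone sequences.

    The uniform costs are an assumption; a real system could use
    phone-dependent costs instead.
    """
    rows, cols = len(target) + 1, len(observed) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * del_cost          # delete every target phone
    for j in range(1, cols):
        dist[0][j] = j * ins_cost          # insert every observed phone
    for i in range(1, rows):
        for j in range(1, cols):
            step = 0.0 if target[i - 1] == observed[j - 1] else sub_cost
            dist[i][j] = min(
                dist[i - 1][j - 1] + step,  # match or substitution
                dist[i - 1][j] + del_cost,  # deletion
                dist[i][j - 1] + ins_cost,  # insertion
            )
    return dist[-1][-1]


def spot(query, lattice_paths, max_cost=1.0):
    """Return (path_index, cost) for every path within max_cost of the query.

    lattice_paths is a hypothetical flat list of phone sequences standing in
    for paths extracted from a phone lattice.
    """
    hits = []
    for index, path in enumerate(lattice_paths):
        cost = min_edit_distance(query, path)
        if cost <= max_cost:
            hits.append((index, cost))
    return hits


if __name__ == "__main__":
    paths = [["k", "ae", "t"], ["k", "eh", "t"], ["d", "ao", "g"]]
    print(spot(["k", "ae", "t"], paths))
```

Allowing a small non-zero cost is what lets a query survive phone recognition errors, at the price of more false alarms as the threshold grows.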

Relevance: 20.00%

Abstract:

China’s biggest search engine has a constitutional right to filter its search results, a US court found last month. But that’s just the start of the story. Eight New York-based pro-democracy activists sued Baidu Inc in 2011, seeking damages because Baidu prevents their work from showing up in search results. Baidu follows Chinese law that requires it to censor politically sensitive results. But in what the plaintiffs’ lawyer has dubbed a “perfect paradox”, US District Judge Jesse Furman has dismissed the challenge, explaining that to hold Baidu liable for its decisions to censor pro-democracy content would itself infringe the right to free speech.

Relevance: 20.00%

Abstract:

To identify and categorize complex stimuli such as familiar objects or speech, the human brain integrates information that is abstracted at multiple levels from its sensory inputs. Using cross-modal priming for spoken words and sounds, this functional magnetic resonance imaging study identified 3 distinct classes of visuoauditory incongruency effects: visuoauditory incongruency effects were selective for 1) spoken words in the left superior temporal sulcus (STS), 2) environmental sounds in the left angular gyrus (AG), and 3) both words and sounds in the lateral and medial prefrontal cortices (IFS/mPFC). From a cognitive perspective, these incongruency effects suggest that prior visual information influences the neural processes underlying speech and sound recognition at multiple levels, with the STS being involved in phonological, AG in semantic, and mPFC/IFS in higher conceptual processing. In terms of neural mechanisms, effective connectivity analyses (dynamic causal modeling) suggest that these incongruency effects may emerge via greater bottom-up effects from early auditory regions to intermediate multisensory integration areas (i.e., STS and AG). This is consistent with a predictive coding perspective on hierarchical Bayesian inference in the cortex where the domain of the prediction error (phonological vs. semantic) determines its regional expression (middle temporal gyrus/STS vs. AG/intraparietal sulcus).

Relevance: 20.00%

Abstract:

Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating noncritical in-car systems. Under such conditions, however, speech recognition accuracy degrades significantly, and techniques such as speech enhancement are required to improve it. Likelihood-maximizing (LIMA) frameworks optimize speech enhancement algorithms based on recognized state sequences rather than traditional signal-level criteria such as maximizing signal-to-noise ratio. LIMA frameworks typically require calibration utterances to generate optimized enhancement parameters that are used for all subsequent utterances. Under such a scheme, suboptimal recognition performance occurs in noise conditions that are significantly different from those present during the calibration session, a serious problem in rapidly changing noise environments out on the open road. In this chapter, we propose a dialog-based design that allows regular optimization iterations in order to track the ever-changing noise conditions. Experiments using Mel-filterbank noise subtraction (MFNS) are performed to determine the optimization requirements for vehicular environments and show that minimal optimization is required to improve speech recognition, avoid over-optimization, and ultimately assist with semi-real-time operation. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session only.

Relevance: 20.00%

Abstract:

This paper explores the experiences of older community-dwelling Australians evacuated from their homes during the 2011 and 2013 Queensland floods, applying the novel creative methodology of poetic inquiry as an analysis and interpretative tool. As well as exploring how older adults managed during a natural disaster, the paper documents the process and potential of poetic inquiry in gerontological research. The first and second poems highlight the different social resources older people have to draw on in their lives, especially during a crisis. Poem 1 (“Nobody came to help me”) illustrates how one older resident felt all alone during the flood, whereas Poem 2 (“They came from everywhere”), Poem 3 (“The Girls”) and Poem 5 (“Man in Blue Shirt”) show how supported other older residents felt, by both family and the wider community. Poem 4 (“I can’t swim”) highlights one participant’s fear as the water rises. To date, few studies have explicitly explored older adults’ disaster experiences, and this paper is the first to utilise a poetic lens. We argue that poetic presentation enhances understanding of older residents’ unique experiences during a disaster, and may better engage a wider audience of policy-makers, practitioners, the general community and older people themselves in discussion about, and reflection on, the impact and experience of disasters.

Relevance: 20.00%

Abstract:

We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
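The clustering stage described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the per-segment log-likelihood pairs are taken as given (GMM training and scoring are omitted), the dissimilarity measure is assumed to be Euclidean distance between score pairs because the abstract does not define it, and the rule for deciding which cluster is speech (higher mean speech-minus-non-speech margin) is likewise hypothetical. At least two segments are assumed.

```python
import math


def complete_linkage_two_clusters(points):
    """Agglomerative clustering with complete linkage, stopped at two clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 2:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge the closest pair
        del clusters[y]
    return clusters


def label_segments(scores):
    """Label each segment True (speech) or False (non-speech).

    scores holds one (loglik_speech, loglik_nonspeech) pair per segment.
    """
    clusters = complete_linkage_two_clusters(scores)

    def mean_margin(cluster):
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    # Assumed decision rule: the cluster favouring the speech model is speech.
    speech = set(max(clusters, key=mean_margin))
    return [i in speech for i in range(len(scores))]


if __name__ == "__main__":
    demo = [(-10, -20), (-11, -19), (-25, -12), (-24, -13), (-9, -21)]
    print(label_segments(demo))
```

Clustering the segments of one recording against each other, rather than thresholding each score independently, lets the decision boundary adapt to the noise level of that particular recording.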

Relevance: 20.00%

Abstract:

For most people, speech production is relatively effortless and error-free. Yet it has long been recognized that we need some type of control over what we are currently saying and what we plan to say. Precisely how we monitor our internal and external speech has been a topic of research interest for several decades. The predominant approach in psycholinguistics has assumed monitoring of both is accomplished via systems responsible for comprehending others' speech. This special topic aimed to broaden the field, firstly by examining proposals that speech production might also engage more general systems, such as those involved in action monitoring. A second aim was to examine proposals for a production-specific, internal monitor. Both aims require that we also specify the nature of the representations subject to monitoring.

Relevance: 20.00%

Abstract:

This large-scale longitudinal population study provided a rare opportunity to consider the interface between multilingualism and speech-language competence on children’s academic and social-emotional outcomes and to determine whether differences between groups at 4 to 5 years persist, deepen, or disappear with time and schooling. Four distinct groups were identified from the Kindergarten cohort of the Longitudinal Study of Australian Children (LSAC): (1) English-only + typical speech and language (n = 2,012); (2) multilingual + typical speech and language (n = 476); (3) English-only + speech and language concern (n = 643); and (4) multilingual + speech and language concern (n = 109). Two analytic approaches were used to compare these groups. First, a matched case-control design was used to randomly match multilingual children with speech and language concern (group 4, n = 109) to children in groups 1, 2, and 3 on gender, age, and family socio-economic position in a cross-sectional comparison of vocabulary, school readiness, and behavioral adjustment. Next, analyses were applied to the whole sample to determine longitudinal effects of group membership on teachers’ ratings of literacy, numeracy, and behavioral adjustment at ages 6 to 7 and 8 to 9 years. At 4 to 5 years, multilingual children with speech and language concern did as well as or better than English-only children (with or without speech and language concern) on school readiness tests but performed more poorly on measures of English vocabulary and behavior. At ages 6 to 7 and 8 to 9, the early gap between English-only and multilingual children had closed. Multilingualism was not found to contribute to differences in literacy and numeracy outcomes at school; instead, outcomes were more related to concerns about children’s speech and language in early childhood. There were no group differences for socio-emotional outcomes.
Early evidence for the combined risks of multilingualism plus speech and language concern was not upheld into the school years.

Relevance: 20.00%

Abstract:

Automatic speech recognition from multiple distant microphones poses significant challenges because of noise and reverberations. The quality of speech acquisition may vary between microphones because of movements of speakers and channel distortions. This paper proposes a channel selection approach for selecting reliable channels based on a selection criterion operating in the short-term modulation spectrum domain. The proposed approach quantifies the relative strength of speech from each microphone and speech obtained from beamforming modulations. The new technique is compared experimentally in real reverberant conditions in terms of perceptual evaluation of speech quality (PESQ) measures and word error rate (WER). Overall improvement in recognition rate is observed using delay-sum and superdirective beamformers compared to the case when the channel is selected randomly using circular microphone arrays.

Relevance: 20.00%

Abstract:

Speech rhythm is an essential part of speech processing. It is the outcome of the workings of a combination of linguistic and non-linguistic parameters, many of which also have other functions in speech. This study focusses on the acoustic and auditive realization of two linguistic parameters of rhythm: (1) sentence stress, and (2) speech rate and pausing. The aim was to find out how well Finnish comprehensive school pupils realize these two parameters in English and how native speakers of English react to Finnish pupils' English rhythm. The material was elicited by means of a story-telling task and questionnaires. Three female and three male pupils representing different levels of oral skills in English were selected as the experimental group. The control group consisted of two female and two male native speakers of English. The stories were analysed acoustically and auditorily with respect to interstress intervals, weak forms, fundamental frequency, pausing, and speech as well as articulation rate. In addition, 52 native speakers of English were asked to rate the intelligibility of the Finnish pupils' English with respect to speech rhythm and to give their impressions of what the pupils sounded like. Results showed that Finnish pupils can produce isochronous interstress intervals in English, but that too large a proportion of these intervals contain pauses. A closer analysis of the pauses revealed that Finnish pupils pause too frequently and in inappropriate places when they speak English. Frequent pausing was also found to cause slow speech rates. The findings of the fundamental frequency (F0) measurements indicate that Finnish pupils tend to make a slightly narrower F0 difference between stressed and unstressed syllables than the native speakers of English. Furthermore, Finnish pupils appear to know how to reduce the duration and quality of unstressed sounds, but they fail to do it frequently enough.
Native listeners gave lower intelligibility and attitude scores to pupils with more anomalous speech rhythm. Finnish pupils' rhythm anomalies seemed to derive from various learning- or learner-related factors rather than from the differences between English and Finnish. This study demonstrates that pausing may be a more important component of English speech rhythm than sentence stress as far as Finnish adolescents are concerned and that interlanguage development is affected by various factors and characterised by jumps or periods of stasis. Other theoretical, methodological and pedagogical implications of the results are also discussed.

Relevance: 20.00%

Abstract:

Titled "An Essay on Antimetaphoric Resistance", the dissertation investigates what is here being called "Counter-figures": a term which has in this context a certain variety of applications. Any other-than-image or other-than-figure, anything that cannot be exhausted by figuration (and that is, more or less, anything at all, except perhaps the reproducible images and figures themselves) can be considered "counter-figurative" with regard to the formation of images and figures, ideas and schemas, "any graven image, or any likeness of any thing". Singularity and radical alterity, as well as temporality and its peculiar mode of uniqueness are key issues here, and an ethical dimension is implied by, or intertwined with, the aesthetic. In terms borrowed from Paul Celan's "Meridian" speech, poetry may "allow the most idiosyncratic quality of the Other, its time, to participate in the dialogue". This connection between singularity, alterity and temporality is one of the reasons why Celan so strongly objects to the application of the traditional concept of metaphor to poetry. As Celan says, "carrying over [übertragen]" by metaphor may imply an unwillingness to "bear with [mittragen]" and to "endure [ertragen]" the poem. The thesis is divided into two main parts. The first consists of five distinct prolegomena which all address the mentioned variety of applications of the term "counter-figures", and especially the rejection or critique of either metaphor (by Aristotle, for instance) or the concept of metaphor (defined by Aristotle, and sometimes deemed "anti-poetic" by both theorists and poets). Even if we restrict ourselves to the traditional rhetorico-poetical terms, we may see how, for instance, metonymy can be a counter-figure for metaphor, allegory for symbol, and irony for any single trope or for any piece of discourse at all. 
The limits of figurality may indeed be located at these points of intersection between different types of tropes or figures, and even between figures or tropes and the "non-figurative trope" or "pseudo-figure" called catachresis. The second part, following on from the open-ended prolegomena, concentrates on Paul Celan's poetry and poetics. According to Celan, true poetry is "essentially anti-metaphoric". I argue that inasmuch as we are willing to pay attention to the "will" of the poetic images themselves (the tropes and metaphors in a poem) to be "carried ad absurdum", as Celan invites us to do, we may find alternative ways of reading poetry and approaching its "secret of the encounter", precisely when the traditional rhetorical instruments, and especially the notion of metaphor, become inapplicable or suspicious — and even where they still seem to impose themselves.

Relevance: 20.00%

Abstract:

Maila Pylkkönen (1931–1986) was one of the most important modernist poets in Finland and a central figure in developing the dramatic monologue in Finnish literature. The study examines Pylkkönen's poetic work Arvo. Vanhaäiti puhuu runonsa (Value. An old woman speaks her poem, 1959) as an example of the dramatic monologue, approaching it from three perspectives: its generic features and background, and the poetic framework to which it connects in the context of Pylkkönen's poetry. In addition to methods of literary scholarship, the poetic analysis benefits from a linguistic approach. The study shows that the dramatic monologue genre drives Pylkkönen's first work, Klassilliset tunteet (Classical feelings, 1957), in a context of finding poetic identity, characterised by the expression "to be the words of a living creature". The study demonstrates that important generic features of the dramatic monologue, namely, a poem representing a speech-event and a hierarchical structure, are also Arvo's most significant generic features. Arvo's poems as speech-events are examined for their internal progressive, pragmatic unity constructed through single line units; for their function as narratives dealing with the life story of an old woman, Arvo's speaker; and from the perspective of the communication between the old woman and the poems' other characters. Arvo's speech-events can also be seen as semantic shifts from one poem to another: the poems construct semantic stages representing different phases of the old woman's life. The study demonstrates that analysis of Arvo's hierarchical structure, that is, the relationship between the speaker and the rhetorical levels, reveals the work's structural and ideological wholeness by focusing on the old woman's emotions: longing, loneliness and alienation from the world. In other words, the contradictions between the explicit level of the speaker and an implied rhetorical level open up the tragedy of an old woman's daily life.
Study of Arvo's hierarchical structure also highlights the special position of the reader in the framework of a dramatic monologue. The elements of a dramatic present in which the old woman's emotions are conveyed, an italicized opening poem, and the work's title Value invite the reader to consider Arvo as a structural and ideological whole. The function of Arvo's hierarchical structure is to ask the reader to recognise the hopelessness of the old woman's situation, understand it, and even identify with it.

Relevance: 20.00%

Abstract:

This dissertation consists of four articles and an introduction. The five parts address the same topic, nonverbal predication in Erzya, from different perspectives. The work is at the same time linguistic typology and Uralic studies. The findings based on a large corpus of empirical Erzya data, which was collected using several different methods and included recordings of the spoken language, made it possible for the present study to apply, then test and finally discuss the previous theories based on cross-linguistic data. Erzya makes use of multiple predication patterns which vary from totally analytic to the morphologically very complex. Nonverbal predicate clause types are classified on the basis of propositional acts in clauses denoting class-membership, identity, property and location. The predicates of these clauses are nouns, adjectives and locational expressions, respectively. The following three predication strategies in Erzya nonverbal predication can be identified: i. the zero-copula construction, ii. the predicative suffix construction and iii. the copula construction. It has been suggested that verbs and nouns cannot be clearly distinguished on morphological grounds when functioning as predicates in Erzya. This study shows that even though predicativity must not be considered a sufficient tool for defining parts of speech in any language, the Erzya lexical classes of adjective, noun and verb can be distinguished from each other also in predicate position. The relative frequency and degree of obligation for using the predicative suffix construction decreases when moving left to right on the scale verb > adjective/locative > noun (> identificational statement). The predicative suffix is the main pattern in the present tense over the whole domain of nonverbal predication in Standard Erzya, but if it is replaced it is most likely to be with a zero-copula construction in a nominal predication.
This study exploits the theory of (a)symmetry for the first time in order to describe verbal vs. nonverbal predication. It is shown that the asymmetry of paradigms and constructions differentiates the lexical classes. Asymmetrical structures are motivated by functional level asymmetry. Variation in predication as such adds to the complexity of the grammar. When symmetric structures are employed, the functional complexity of grammar decreases, even though morphological complexity increases. The genre affects the employment of predication strategies in Erzya. There are differences in the relative frequency of the patterns, and some patterns are totally lacking from some of the data. The clearest difference is that the past tense predicative suffix construction occurs relatively frequently in Standard Erzya, while it occurs infrequently in the other data. Also, the predicative suffixes of the present tense are used more regularly in written Standard Erzya than in any other genre. The genre also affects the incidence of the translative in uľ(ń)ems copula constructions. In translations from Russian to Erzya the translative case is employed relatively frequently in comparison to other data. This study reveals differences between the two Mordvinic languages Erzya and Moksha. The predicative suffixes (bound person markers) of the present tense are used more regularly in Moksha in all kinds of nonverbal predicate clauses compared to Erzya. It should further be observed that identificational statements are encoded with a predicative suffix in Moksha, but seldom in Erzya. Erzya clauses are more frequently encoded using zero-constructions, displaying agreement in number only.

Relevance: 20.00%

Abstract:

We address the problem of jointly using multiple noisy speech patterns for automatic speech recognition (ASR), given that they come from the same class. If the user utters a word K times, the ASR system should try to use the information content in all K patterns of the word simultaneously and improve its recognition accuracy compared to that of single-pattern-based speech recognition. To address this problem, we recently proposed a Multi Pattern Dynamic Time Warping (MPDTW) algorithm to align the K patterns by finding the least distortion path between them. A Constrained Multi Pattern Viterbi algorithm was used on this aligned path for isolated word recognition (IWR). In this paper, we explore the possibility of using only the MPDTW algorithm for IWR. We also study the properties of the MPDTW algorithm. We show that using only 2 noisy test patterns (10 percent burst noise at -5 dB SNR) reduces the noisy speech recognition error rate by 37.66 percent when compared to single-pattern recognition using the Dynamic Time Warping algorithm.
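For orientation, the single-pattern baseline named at the end of the abstract can be sketched as below. This is the classic two-sequence Dynamic Time Warping recurrence applied to isolated word recognition, not the MPDTW algorithm itself, whose K-pattern alignment is not specified here. The Euclidean local distance, the unconstrained step pattern, and the names `dtw_distance` and `recognize` are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    a and b are lists of equal-dimension feature vectors (lists of floats).
    The local distance is Euclidean, which is an assumed choice.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            # Extend the cheapest of the three predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def recognize(test_pattern, templates):
    """Return the template word with the smallest DTW cost to the test pattern.

    templates is a hypothetical dict mapping each word to one reference pattern.
    """
    return min(templates, key=lambda word: dtw_distance(test_pattern, templates[word]))


if __name__ == "__main__":
    templates = {
        "yes": [[0.0], [1.0], [2.0]],
        "no": [[5.0], [5.0], [5.0]],
    }
    print(recognize([[0.0], [0.0], [1.0], [2.0]], templates))
```

A multi-pattern method would replace the two-dimensional cost grid with a joint alignment over all K utterances, so that noise corrupting one utterance can be compensated by the others.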