998 resultados para Speech segmentation


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Autism and Asperger syndrome (AS) are neurodevelopmental disorders characterised by deficient social and communication skills, as well as restricted, repetitive patterns of behaviour. The language development in individuals with autism is significantly delayed and deficient, whereas in individuals with AS, the structural aspects of language develop quite normally. Both groups, however, have semantic-pragmatic language deficits. The present thesis investigated auditory processing in individuals with autism and AS. In particular, the discrimination of and orienting to speech and non-speech sounds was studied, as well as the abstraction of invariant sound features from speech-sound input. Altogether five studies were conducted with auditory event-related brain potentials (ERP); two studies also included a behavioural sound-identification task. In three studies, the subjects were children with autism, in one study children with AS, and in one study adults with AS. In children with autism, even the early stages of sound encoding were deficient. In addition, these children had altered sound-discrimination processes characterised by enhanced spectral but deficient temporal discrimination. The enhanced pitch discrimination may partly explain the auditory hypersensitivity common in autism, and it may compromise the filtering of relevant auditory information from irrelevant information. Indeed, it was found that when sound discrimination required abstracting invariant features from varying input, children with autism maintained their superiority in pitch processing, but lost it in vowel processing. Finally, involuntary orienting to sound changes was deficient in children with autism in particular with respect to speech sounds. This finding is in agreement with previous studies on autism suggesting deficits in orienting to socially relevant stimuli. In contrast to children with autism, the early stages of sound encoding were fairly unimpaired in children with AS. However, sound discrimination and orienting were rather similarly altered in these children as in those with autism, suggesting correspondences in the auditory phenotype in these two disorders which belong to the same continuum. Unlike children with AS, adults with AS showed enhanced processing of duration changes, suggesting developmental changes in auditory processing in this disorder.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article examines whether cluster analysis can be used to identify groups of Finnish residents with similar housing preferences. Because homebuilders in Finland have been providing relatively homogeneous products to an increasingly diverse population, current housing may not represent the occupiers' preferences so a segmentation approach relying on socioeconomic characteristics and expressed preferences may not be sufficient. We use data collected via questionnaire in a principal component analysis followed by a hierarchical cluster analysis to determine whether different combinations of housing attributes are important to groups of residents. We can identify four clusters of housing residents based on important characteristics when looking for a house. The clusters describe Finnish people in different phases of the life cycle and with different preferences based on their recreational activities and financial expenditures. Mass customization of housing could be used to better appeal to these different clusters of consumers who share similar preferences, increasing consumer satisfaction and improving profitability.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this presentation, I reflect upon the global landscape surrounding the governance and classification of media content, at a time of rapid change in media platforms and services for content production and distribution, and contested cultural and social norms. I discuss the tensions and contradictions arising in the relationship between national, regional and global dimensions of media content distribution, as well as the changing relationships between state and non-state actors. These issues will be explored through consideration of issues such as: recent debates over film censorship; the review of the National Classification Scheme conducted by the Australian Law Reform Commission; online controversies such as the future of the Reddit social media site; and videos posted online by the militant group ISIS.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Typed caption: Festakt in der Stadthalle zur Eroeffnung der Heidelberger Festpiele 1929. Thomas Mann hielt die Festrede. Unsere Aufnahme zeigt einen Ausschnitt aus der Festversammlung, an der die geistige Prominenz Heidelbergs und zahlreiche bedeutende Persoenlichkeiten des deutschen Kulturlebends teilnahmen. Von links nach rechts in der 1. Reihe: Rene Schickele, Rudolf Rittner, Gustav Hargung, Prof. Dr. Martin Dibelius, Gerhart Hauptmann, Dr. Rudolf H. Goldschmidt, Oberbuergermeister Dr. Neinhaus, Thomas Mann und Frau, Kultusminister Leers, Rudolf G. Binding; In der 3. Reihe: Oberbaurat L. Schmieder, Nobelpreistraeger Dr. F. Bergius, Bankdirektor Fremerey, Prof. Hellpach, Prof. Radbruch

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Compressive sensing (CS) has been proposed for signals with sparsity in a linear transform domain. We explore a signal dependent unknown linear transform, namely the impulse response matrix operating on a sparse excitation, as in the linear model of speech production, for recovering compressive sensed speech. Since the linear transform is signal dependent and unknown, unlike the standard CS formulation, a codebook of transfer functions is proposed in a matching pursuit (MP) framework for CS recovery. It is found that MP is efficient and effective to recover CS encoded speech as well as jointly estimate the linear model. Moderate number of CS measurements and low order sparsity estimate will result in MP converge to the same linear transform as direct VQ of the LP vector derived from the original signal. There is also high positive correlation between signal domain approximation and CS measurement domain approximation for a large variety of speech spectra.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose a novel technique for robust voiced/unvoiced segment detection in noisy speech, based on local polynomial regression. The local polynomial model is well-suited for voiced segments in speech. The unvoiced segments are noise-like and do not exhibit any smooth structure. This property of smoothness is used for devising a new metric called the variance ratio metric, which, after thresholding, indicates the voiced/unvoiced boundaries with 75% accuracy for 0dB global signal-to-noise ratio (SNR). A novelty of our algorithm is that it processes the signal continuously, sample-by-sample rather than frame-by-frame. Simulation results on TIMIT speech database (downsampled to 8kHz) for various SNRs are presented to illustrate the performance of the new algorithm. Results indicate that the algorithm is robust even in high noise levels.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We investigate the use of a two stage transform vector quantizer (TSTVQ) for coding of line spectral frequency (LSF) parameters in wideband speech coding. The first stage quantizer of TSTVQ, provides better matching of source distribution and the second stage quantizer provides additional coding gain through using an individual cluster specific decorrelating transform and variance normalization. Further coding gain is shown to be achieved by exploiting the slow time-varying nature of speech spectra and thus using inter-frame cluster continuity (ICC) property in the first stage of TSTVQ method. The proposed method saves 3-4 bits and reduces the computational complexity by 58-66%, compared to the traditional split vector quantizer (SVQ), but at the expense of 1.5-2.5 times of memory.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We are addressing the novel problem of jointly evaluating multiple speech patterns for automatic speech recognition and training. We propose solutions based on both the non-parametric dynamic time warping (DTW) algorithm, and the parametric hidden Markov model (HMM). We show that a hybrid approach is quite effective for the application of noisy speech recognition. We extend the concept to HMM training wherein some patterns may be noisy or distorted. Utilizing the concept of ``virtual pattern'' developed for joint evaluation, we propose selective iterative training of HMMs. Evaluating these algorithms for burst/transient noisy speech and isolated word recognition, significant improvement in recognition accuracy is obtained using the new algorithms over those which do not utilize the joint evaluation strategy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Segmentation defects of the vertebrae (SDV) are caused by aberrant somite formation during embryogenesis and result in irregular formation of the vertebrae and ribs. The Notch signal transduction pathway plays a critical role in somite formation and patterning in model vertebrates. In humans, mutations in several genes involved in the Notch pathway are associated with SDV, with both autosomal recessive (MESP2, DLL3, LFNG, HES7) and autosomal dominant (TBX6) inheritance. However, many individuals with SDV do not carry mutations in these genes. Using whole-exome capture and massive parallel sequencing, we identified compound heterozygous mutations in RIPPLY2 in two brothers with multiple regional SDV, with appropriate familial segregation. One novel mutation (c.A238T:p.Arg80*) introduces a premature stop codon. In transiently transfected C2C12 mouse myoblasts, the RIPPLY2 mutant protein demonstrated impaired transcriptional repression activity compared with wild-type RIPPLY2 despite similar levels of expression. The other mutation (c.240-4T>G), with minor allele frequency <0.002, lies in the highly conserved splice site consensus sequence 5' to the terminal exon. Ripply2 has a well-established role in somitogenesis and vertebral column formation, interacting at both gene and protein levels with SDV-associated Mesp2 and Tbx6. We conclude that compound heterozygous mutations in RIPPLY2 are associated with SDV, a new gene for this condition. © The Author 2014.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Speech has both auditory and visual components (heard speech sounds and seen articulatory gestures). During all perception, selective attention facilitates efficient information processing and enables concentration on high-priority stimuli. Auditory and visual sensory systems interact at multiple processing levels during speech perception and, further, the classical motor speech regions seem also to participate in speech perception. Auditory, visual, and motor-articulatory processes may thus work in parallel during speech perception, their use possibly depending on the information available and the individual characteristics of the observer. Because of their subtle speech perception difficulties possibly stemming from disturbances at elemental levels of sensory processing, dyslexic readers may rely more on motor-articulatory speech perception strategies than do fluent readers. This thesis aimed to investigate the neural mechanisms of speech perception and selective attention in fluent and dyslexic readers. We conducted four functional magnetic resonance imaging experiments, during which subjects perceived articulatory gestures, speech sounds, and other auditory and visual stimuli. Gradient echo-planar images depicting blood oxygenation level-dependent contrast were acquired during stimulus presentation to indirectly measure brain hemodynamic activation. Lip-reading activated the primary auditory cortex, and selective attention to visual speech gestures enhanced activity within the left secondary auditory cortex. Attention to non-speech sounds enhanced auditory cortex activity bilaterally; this effect showed modulation by sound presentation rate. A comparison between fluent and dyslexic readers' brain hemodynamic activity during audiovisual speech perception revealed stronger activation of predominantly motor speech areas in dyslexic readers during a contrast test that allowed exploration of the processing of phonetic features extracted from auditory and visual speech. The results show that visual speech perception modulates hemodynamic activity within auditory cortex areas once considered unimodal, and suggest that the left secondary auditory cortex specifically participates in extracting the linguistic content of seen articulatory gestures. They are strong evidence for the importance of attention as a modulator of auditory cortex function during both sound processing and visual speech perception, and point out the nature of attention as an interactive process (influenced by stimulus-driven effects). Further, they suggest heightened reliance on motor-articulatory and visual speech perception strategies among dyslexic readers, possibly compensating for their auditory speech perception difficulties.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We are addressing a new problem of improving automatic speech recognition performance, given multiple utterances of patterns from the same class. We have formulated the problem of jointly decoding K multiple patterns given a single Hidden Markov Model. It is shown that such a solution is possible by aligning the K patterns using the proposed Multi Pattern Dynamic Time Warping algorithm followed by the Constrained Multi Pattern Viterbi Algorithm The new formulation is tested in the context of speaker independent isolated word recognition for both clean and noisy patterns. When 10 percent of speech is affected by a burst noise at -5 dB Signal to Noise Ratio (local), it is shown that joint decoding using only two noisy patterns reduces the noisy speech recognition error rate to about 51 percent, when compared to the single pattern decoding using the Viterbi Algorithm. In contrast a simple maximization of individual pattern likelihoods, provides only about 7 percent reduction in error rate.