945 results for Pathological Speech Signal Analysis


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
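
The search step described in this abstract can be illustrated with a minimal Viterbi decoder for a small discrete HMM. This is only a sketch under stated assumptions: outputs are attached to states rather than to transitions (the abstract describes the latter), the two-state model and its probabilities are invented toy values, and the names `viterbi`, `log_table`, `start`, `trans` and `emit` are hypothetical. Log-probabilities are used so that long index strings do not underflow.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log_emit[s][observations[t]]
            back[t][s] = prev
    # Trace the best final state back to the start.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


def log_table(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Toy model (assumed values, for illustration only).
states = ("s1", "s2")
start = {"s1": math.log(0.6), "s2": math.log(0.4)}
trans = log_table({"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}})
emit = log_table({
    "s1": {"x": 0.5, "y": 0.4, "z": 0.1},
    "s2": {"x": 0.1, "y": 0.3, "z": 0.6},
})

log_prob, path = viterbi(["x", "y", "z"], states, start, trans, emit)
print(path, math.exp(log_prob))
```

In a full recognizer the score of each hypothesis would also include the language model probability as a second factor; the sketch covers only the acoustic path search.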

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A MATLAB-based computer code has been developed for the simultaneous wavelet analysis and filtering of several environmental time series, with a particular focus on the analysis of cave monitoring data. The continuous wavelet transform, the discrete wavelet transform and the discrete wavelet packet transform have been implemented to provide a fast and precise time–period examination of the time series at different period bands. Moreover, statistical methods to examine the relation between two signals have been included. Finally, methods based on the entropy of curves and on splines have also been developed for segmenting and modeling the analyzed time series. All these methods together provide a user-friendly and fast program for environmental signal analysis, with useful, practical and understandable results.
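
As a rough illustration of the discrete wavelet transform mentioned above, the following sketch computes one level of the orthonormal Haar transform and its inverse. It is not the MATLAB code the abstract describes: the choice of the Haar wavelet, the Python language, and the names `haar_dwt` and `haar_idwt` are assumptions made for this example.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT of an even-length sequence."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    # Approximation: scaled pairwise sums (the slowly varying, long-period part).
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    # Detail: scaled pairwise differences (the rapidly varying, short-period part).
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    scale = 1.0 / math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * scale)
        signal.append((a - d) * scale)
    return signal


series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt(series)
restored = haar_idwt(approx, detail)
```

Applying `haar_dwt` again to `approx` gives the next, coarser period band; setting selected detail coefficients to zero before calling `haar_idwt` is one simple way to filter a series.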

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The goal of my study is to investigate the relationship between selected deictic shields on the pronoun ‘I’ and the involvement/detachment dichotomy in a sample of television news interviews. I focus on the use of personal pronouns in political discourse. Drawing upon Caffi’s (2007) classification of mitigating devices into bushes, hedges and shields, I focus on deictic shields on the pronoun ‘I’: I examine the way a selection of ‘I’-related deictic shields is employed in a collection of news interviews broadcast during the electoral campaign prior to the UK 2015 General Election. My purpose is to uncover the frequencies of each of the linguistic items selected and the pragmatic functions of those linguistic items in the involvement/detachment dichotomy. The research is structured as follows. Chapter 1 provides an account of previous studies on the three main areas of research: speech event analysis, institutional interaction and the news interview, and the UK 2015 General Election television programmes. Chapter 2 is centred on the involvement/detachment dichotomy: I provide an overview of nonlinguistic and linguistic features of involvement and detachment at all levels of sentence structure. Chapter 3 contains a detailed account of the data collection and data analysis process. Chapter 4 provides an accurate description of results in three steps: quantitative analysis, qualitative analysis and discussion of the pragmatic functions of the selected linguistic features of involvement and detachment. Chapter 5 includes a brief summary of the investigation, reviews the main findings, and indicates limitations of the study and possible inputs for further research. The results of the analysis confirm that, while some of the linguistic items examined point toward involvement, others have a detaching effect. I therefore conclude that deictic shields on the pronoun ‘I’ permit the realisation of the involvement/detachment dichotomy in the speech genre of the news interview.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This master's thesis presents methods for the intelligent analysis and visualization of 3D ECG signals, with the aim of making ECG analysis more efficient by extracting additional data. Visualization is treated as part of the signal analysis task, and imaging techniques and their mathematical description are considered. Algorithms for calculating and visualizing signal attributes have been developed and are described using mathematical methods and signal mining tools. A pattern search model was constructed to compare the accuracy of the methods, clustering and classification problems were solved, and a data visualization program was developed. The work confirms that this approach gives the highest accuracy in the intelligent analysis task. The visualization and analysis techniques considered are also applicable to multidimensional signals of other kinds.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: To investigate laryngeal function and phonatory disturbance in children with traumatic brain injury (TBI), using both perceptual and instrumental techniques. Design and participants: The performance of 16 individuals with moderate to severe TBI acquired in childhood and 16 non-neurologically impaired control subjects was compared on a battery of perceptual (Frenchay Dysarthria Assessment, speech sample analysis) and instrumental (Aerophone II, laryngograph) assessments. Results and conclusions: As a group, the children with TBI demonstrated normal or only minimally impaired laryngeal function when compared with the control group, which contrasts with the significant laryngeal impairment noted in adults after TBI. Several reasons for the different findings in relation to laryngeal function in adults and children after TBI are postulated: (1) differing types of injury usually incurred by adults and children may result in a relatively decreased degree of neurologic impairment in these children, (2) recovery potential may differ between adults and children, and (3) the pediatric larynx is still developing, hence it may be better able to compensate for any impairment incurred.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Primary objective: To investigate the articulatory function of a group of children with traumatic brain injury (TBI), using both perceptual and instrumental techniques. Research design: The performance of 24 children with TBI was assessed on a battery of perceptual (Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech and speech sample analysis) and instrumental (lip and tongue pressure transduction systems) assessments and compared with that of 24 non-neurologically impaired children matched for age and sex. Main outcomes: Perceptual assessment identified consonant and vowel imprecision, increased length of phonemes and overall reduction in speech intelligibility, while instrumental assessment revealed significant impairment in lip and tongue function in the TBI group, with rate and pressure in repetitive lip and tongue tasks particularly impaired. Significant negative correlations were identified between the degree of deviance of perceptual articulatory features and decreased function on many non-speech measures of lip function, as well as maximum tongue pressure and fine force tongue control at 20% of maximum tongue pressure. Additionally, sub-clinical articulatory deficits were identified in the children with TBI who were non-dysarthric. Conclusion: The results of the instrumental assessment of lip and tongue function support the finding of substantial articulatory dysfunction in this group of children following TBI. Hence, remediation of articulatory function should be a therapeutic priority in these children.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The need for low bit-rate speech coding is the result of growing demand on the available radio bandwidth for mobile communications, both for military purposes and for the public sector. To meet this growing demand, the available bandwidth must be utilized in the most economic way to accommodate more services. Two low bit-rate speech coders have been built and tested in this project. The two coders combine predictive coding with delta modulation, a property which enables them to achieve simultaneously the low bit-rate and good speech quality requirements. To enhance their efficiency, the predictor coefficients and the quantizer step size are updated periodically in each coder. This enables the coders to keep up with changes in the characteristics of the speech signal with time and with changes in the dynamic range of the speech waveform. However, the two coders differ in the method of updating their predictor coefficients. One updates the coefficients once every one hundred sampling periods and extracts the coefficients from input speech samples. This is known in this project as the Forward Adaptive Coder. Since the coefficients are extracted from input speech samples, these must be transmitted to the receiver to reconstruct the transmitted speech sample, thus adding to the transmission bit rate. The other updates its coefficients every sampling period, based on information from the output data. This coder is known as the Backward Adaptive Coder. Results of subjective tests showed both coders to be reasonably robust to quantization noise. Both were rated quite good, with the Forward Adaptive performing slightly better, but with a slightly higher transmission bit rate for the same speech quality, than its Backward counterpart. The coders yielded acceptable speech quality at 9.6 kbps for the Forward Adaptive and 8 kbps for the Backward Adaptive.
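
The idea of updating the quantizer step size from data that the receiver also has can be sketched with a basic adaptive delta modulator. This is not either of the coders built in the project: it uses a fixed first-order predictor (the previous reconstructed sample) rather than adaptive predictor coefficients, and the adaptation rule, the parameter values and the names `adm_encode`, `adm_decode` and `_adapt` are assumptions chosen for illustration. Because the step size depends only on the transmitted bits, the decoder can repeat the adaptation without any side information, which is the property that keeps a backward adaptive scheme's bit rate low.

```python
import math


def _adapt(step, bit, prev_bit, factor, min_step, max_step):
    """Grow the step after two equal bits, shrink it after two different bits."""
    if prev_bit is not None:
        step = step * factor if bit == prev_bit else step / factor
    return min(max(step, min_step), max_step)


def adm_encode(samples, step=0.05, factor=1.5, min_step=0.01, max_step=0.5):
    """Return (bits, reconstruction) using one bit per input sample."""
    bits, recon = [], []
    estimate, prev_bit = 0.0, None
    for x in samples:
        bit = 1 if x >= estimate else 0  # sign of the prediction error
        step = _adapt(step, bit, prev_bit, factor, min_step, max_step)
        estimate += step if bit else -step
        bits.append(bit)
        recon.append(estimate)
        prev_bit = bit
    return bits, recon


def adm_decode(bits, step=0.05, factor=1.5, min_step=0.01, max_step=0.5):
    """Rebuild the waveform from the bit stream alone."""
    recon = []
    estimate, prev_bit = 0.0, None
    for bit in bits:
        step = _adapt(step, bit, prev_bit, factor, min_step, max_step)
        estimate += step if bit else -step
        recon.append(estimate)
        prev_bit = bit
    return recon


# Slowly varying test tone: 200 samples per period, two periods.
tone = [math.sin(2.0 * math.pi * n / 200.0) for n in range(400)]
bits, local = adm_encode(tone)
decoded = adm_decode(bits)
mean_abs_error = sum(abs(a - b) for a, b in zip(tone, decoded)) / len(tone)
```

A forward adaptive variant would instead analyze a block of input samples, derive its parameters from them, and send those parameters to the receiver alongside the bits.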

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The primary purpose of this thesis was to present a theoretical large-signal analysis to study the power gain and efficiency of a microwave power amplifier for LS-band communications using software simulation. Power gain, efficiency, reliability, and stability are important characteristics in the power amplifier design process. These characteristics affect advanced wireless systems, which require low-cost device amplification without sacrificing system performance. Large-signal modeling and input and output matching components are used in this thesis. Motorola's Electro Thermal LDMOS model is a new transistor model that includes self-heating effects and is capable of both small-signal and large-signal simulations. It allows most of the design considerations to focus on stability, power gain, bandwidth, and DC requirements. The matching technique allows the gain to be maximized at a specific target frequency. Calculations and simulations for the microwave power amplifier design were performed using Matlab and Microwave Office, respectively. The study demonstrated that Motorola's Electro Thermal LDMOS transistor is a viable solution in the microwave power amplifier design process for common-source amplifier applications in high-power base stations. The MET-LDMOS met the stability requirements for the specified frequency range without a stability-improvement model. The power gain of the amplifier circuit was improved through proper microwave matching design using input/output-matching techniques. The gain and efficiency of the amplifier improved by approximately 4 dB and 7.27%, respectively. The gain value is roughly 0.89 dB higher than the maximum gain specified by the MRF21010 data sheet. This work can lead to efficient modeling and development of high-power LDMOS transistor implementations in commercial and industrial applications.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Here we use two filtered speech tasks to investigate children’s processing of slow (<4 Hz) versus faster (∼33 Hz) temporal modulations in speech. We compare groups of children with either developmental dyslexia (Experiment 1) or speech and language impairments (SLIs, Experiment 2) to groups of typically-developing (TD) children age-matched to each disorder group. Ten nursery rhymes were filtered so that their modulation frequencies were either low-pass filtered (<4 Hz) or band-pass filtered (22–40 Hz). Recognition of the filtered nursery rhymes was tested in a picture recognition multiple choice paradigm. Children with dyslexia aged 10 years showed equivalent recognition overall to TD controls for both the low-pass and band-pass filtered stimuli, but showed significantly impaired acoustic learning during the experiment from low-pass filtered targets. Children with oral SLIs aged 9 years showed significantly poorer recognition of band-pass filtered targets compared to their TD controls, and showed comparable acoustic learning effects to TD children during the experiment. The SLI samples were also divided into children with and without phonological difficulties. The children with both SLI and phonological difficulties were impaired in recognizing both kinds of filtered speech. These data are suggestive of impaired temporal sampling of the speech signal at different modulation rates by children with different kinds of developmental language disorder. Both SLI and dyslexic samples showed impaired discrimination of amplitude rise times. Implications of these findings for a temporal sampling framework for understanding developmental language disorders are discussed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Current hearing-assistive technology performs poorly in noisy multi-talker conditions. The goal of this thesis was to establish the feasibility of using EEG to guide acoustic processing in such conditions. To attain this goal, this research developed a model via the constructive research method, relying on literature review. Several approaches have revealed improvements in the performance of hearing-assistive devices under multi-talker conditions, namely beamforming spatial filtering, model-based sparse coding shrinkage, and onset enhancement of the speech signal. Prior research has shown that electroencephalography (EEG) signals contain information about whether the person is actively listening, what the listener is listening to, and where the attended sound source is. This thesis constructed a model for using EEG information to control beamforming, model-based sparse coding shrinkage, and onset enhancement of the speech signal. The purpose of this model is to propose a framework for using EEG signals to control sound processing to select a single talker in a noisy environment containing multiple talkers speaking simultaneously. On a theoretical level, the model showed that EEG can control acoustical processing. An analysis of the model identified a requirement for real-time processing and showed that the model inherits the computationally intensive properties of acoustical processing, although the model itself is of low complexity, placing a relatively small load on computational resources. A research priority is to develop a prototype that controls hearing-assistive devices with EEG. This thesis concludes by highlighting challenges for future research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Animal welfare has been an important research topic in animal production, mainly with respect to how it is assessed. Vocalization is an interesting tool for evaluating welfare because it provides data in a non-invasive way and allows the process to be automated easily. The objective of the present research was to implement an algorithm based on an artificial neural network with the potential to identify vocalizations related to indicators of welfare patterns. The research was done in two parts: the first was the development of the algorithm, and the second its validation with data from the field. Previous records allowed the development of the algorithm from behaviors observed in sows housed in farrowing cages. Matlab® software was used for implementing the network. A backpropagation gradient algorithm was selected for training the network, with the following stop criteria: a maximum of 5,000 iterations or a sum of squared errors smaller than 0.1. Validation was done with sows and piglets housed in a commercial farm. Among the usual behaviors, the ones that deserved emphasis were the feed dispute at farrowing and the eventual risk of involuntary aggression between the piglets or between them and the sow. Through the noise intensity, the algorithm was able to identify the inherent risk situation of reduced piglet welfare.
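
The two stop criteria named in this abstract (at most 5,000 iterations, or a sum of squared errors below 0.1) can be shown in a minimal backpropagation loop. This is a sketch only: the network size, learning rate, sigmoid units, per-sample weight updates, the XOR toy data and the function name `train` are all assumptions, and none of it reflects the actual Matlab network or the vocalization data used in the study.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(inputs, targets, hidden=3, rate=0.5,
          max_iterations=5000, sse_goal=0.1, seed=0):
    """Train a one-hidden-layer network; return (iterations_used, last_sse)."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    # The last weight in every row multiplies a constant bias input of 1.0.
    w1 = [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden + 1)]
    sse = float("inf")
    for iteration in range(1, max_iterations + 1):
        sse = 0.0  # accumulated over this pass while the weights are being updated
        for x, target in zip(inputs, targets):
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w2, hb)))
            error = y - target
            sse += error * error
            # Backpropagate the error through the output and hidden layers.
            delta_out = error * y * (1.0 - y)
            delta_hidden = [delta_out * w2[j] * h[j] * (1.0 - h[j])
                            for j in range(hidden)]
            for j, v in enumerate(hb):
                w2[j] -= rate * delta_out * v
            for j in range(hidden):
                for i, v in enumerate(xb):
                    w1[j][i] -= rate * delta_hidden[j] * v
        if sse < sse_goal:  # first stop criterion: error goal reached
            return iteration, sse
    return max_iterations, sse  # second stop criterion: iteration limit reached


xor_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
xor_targets = [0.0, 1.0, 1.0, 0.0]
iterations, final_sse = train(xor_inputs, xor_targets)
```

Whichever criterion triggers first ends training, so the returned iteration count shows whether the error goal was reached or the limit was exhausted.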

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The feasibility of characterizing the dynamics of a spouted bed based on acoustic emission (AE) signals is evaluated. Acoustic emission signals were measured in a semi-cylindrical Plexiglas column of diameter 150 mm and height 1000 mm with a conical base of internal angle 60 degrees and 25 mm inlet orifice diameter. Data were obtained for U/U(ms) from 0.3 to 2.0, static bed heights from 250 to 500 mm, and glass beads of diameter 1.2 and 2.4 mm. AE signals reflected the effects of particle size and U/U(ms), but in general were insensitive to bed depth, even when there were drastic changes in spouting flow patterns. The results indicate that the AE signals were insensitive to the spouted bed hydrodynamics for the conditions studied. Overall, it appears that the AE analysis is unlikely to be a suitable technique for discriminating spouted bed flow regimes, at least for the range of frequencies and operating conditions investigated.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In an open channel, a hydraulic jump is the rapid transition from super- to sub-critical flow associated with strong turbulence and air bubble entrainment in the mixing layer. New experiments were performed at relatively large Reynolds numbers using phase-detection probes. Some new signal analysis provided characteristic air-water time and length scales of the vortical structures advecting the air bubbles in the developing shear flow. An analysis of the longitudinal air-water flow structure suggested little bubble clustering in the mixing layer, although an interparticle arrival time analysis showed some preferential bubble clustering for small bubbles with chord times below 3 ms. Correlation analyses yielded longitudinal air-water time scales Txx*V1/d1 of about 0.8 on average. The transverse integral length scale Z/d1 of the eddies advecting entrained bubbles was typically between 0.25 and 0.4, irrespective of the inflow conditions within the range of the investigations. Overall the findings highlighted the complicated nature of the air-water flow