990 resultados para speech analysis
Resumo:
HIA Report to the Minister for Health The Department of health requested that the Health Insurance Authority provide a report to the Minister ‘in anticipation of the enactment of the health Insurance (Amendment) Bill 2012.” Click here to download PDF 8.87MB The Health Insurance (Amendment) Bill 2012 Link to the Minister’s Second Stage Speech to the Dáil on 14 November 2012     Â
Resumo:
It has been demonstrated in earlier studies that patients with a cochlear implant have increased abilities for audio-visual integration because the crude information transmitted by the cochlear implant requires the persistent use of the complementary speech information from the visual channel. The brain network for these abilities needs to be clarified. We used an independent components analysis (ICA) of the activation (H2 (15) O) positron emission tomography data to explore occipito-temporal brain activity in post-lingually deaf patients with unilaterally implanted cochlear implants at several months post-implantation (T1), shortly after implantation (T0) and in normal hearing controls. In between-group analysis, patients at T1 had greater blood flow in the left middle temporal cortex as compared with T0 and normal hearing controls. In within-group analysis, patients at T0 had a task-related ICA component in the visual cortex, and patients at T1 had one task-related ICA component in the left middle temporal cortex and the other in the visual cortex. The time courses of temporal and visual activities during the positron emission tomography examination at T1 were highly correlated, meaning that synchronized integrative activity occurred. The greater involvement of the visual cortex and its close coupling with the temporal cortex at T1 confirm the importance of audio-visual integration in more experienced cochlear implant subjects at the cortical level.
Resumo:
Although both are fundamental terms in the humanities and social sciences, discourse and knowledge have seldom been explicitly related, and even less so in critical discourse studies. After a brief summary of what we know about these relationships in linguistics, psychology, epistemology and the social sciences, with special emphasis on the role of knowledge in the formation of mental models as a basis for discourse, I examine in more detail how a critical study of discourse and knowledge may be articulated in critical discourse studies. Thus, several areas of critical epistemic discourse analysis are identified, and then applied in a study of Tony Blair’s Iraq speech on March 18, 2003, in which he sought to legitimatize his decision to go to war in Iraq with George Bush. The analysis shows the various modes of how knowledge is managed and manipulated of all levels of discourse of this speech.
Resumo:
This case study presents corpus data gathered from a Spanish-English bilingual child with expressive language delay. Longitudinal data on the child’s linguistic development was collected from the onset of productive speech at age 1;1 until age 4 over the course of 28 video-taped sessions with the child’s principal caregivers. A literature review focused on the relationship between language delay and persisting disorders—including a discussion of the frequent difficulty in distinguishing between the two at early stages of bilingual development—is followed by an analysis of the child’s productive development in 2 distinct phases. An attempt is made to assess the child’s speech at age 4 for preliminary signs of SLI and to consider techniques for identifying ‘at risk’ bilingual children (that is, those with productive language delay, poor oral fluency, and family history of language problems) based on samples of recorded and transcribed speech.
Resumo:
The physiological basis of human cerebral asymmetry for language remains mysterious. We have used simultaneous physiological and anatomical measurements to investigate the issue. Concentrating on neural oscillatory activity in speech-specific frequency bands and exploring interactions between gestural (motor) and auditory-evoked activity, we find, in the absence of language-related processing, that left auditory, somatosensory, articulatory motor, and inferior parietal cortices show specific, lateralized, speech-related physiological properties. With the addition of ecologically valid audiovisual stimulation, activity in auditory cortex synchronizes with left-dominant input from the motor cortex at frequencies corresponding to syllabic, but not phonemic, speech rhythms. Our results support theories of language lateralization that posit a major role for intrinsic, hardwired perceptuomotor processing in syllabic parsing and are compatible both with the evolutionary view that speech arose from a combination of syllable-sized vocalizations and meaningful hand gestures and with developmental observations suggesting phonemic analysis is a developmentally acquired process.
Resumo:
In a system where tens of thousands of words are made up of a limited number of phonemes, many words are bound to sound alike. This similarity of the words in the lexicon as characterized by phonological neighbourhood density (PhND) has been shown to affect speed and accuracy of word comprehension and production. Whereas there is a consensus about the interfering nature of neighbourhood effects in comprehension, the language production literature offers a more contradictory picture with mainly facilitatory but also interfering effects reported on word production. Here we report both of these two types of effects in the same study. Multiple regression mixed models analyses were conducted on PhND effects on errors produced in a naming task by a group of 21 participants with aphasia. These participants produced more formal errors (interfering effect) for words in dense phonological neighbourhoods, but produced fewer nonwords and semantic errors (a facilitatory effect) with increasing density. In order to investigate the nature of these opposite effects of PhND, we further analysed a subset of formal errors and nonword errors by distinguishing errors differing on a single phoneme from the target (corresponding to the definition of phonological neighbours) from those differing on two or more phonemes. This analysis confirmed that only formal errors that were phonological neighbours of the target increased in dense neighbourhoods, while all other errors decreased. Based on additional observations favouring a lexical origin of these formal errors (they exceeded the probability of producing a real-word error by chance, were of a higher frequency, and preserved the grammatical category of the targets), we suggest that the interfering effect of PhND is due to competition between lexical neighbours and target words in dense neighbourhoods.
Resumo:
This paper analyzes applications of cumulant analysis in speech processing. A special focus is made on different second-order statistics. A dominant role is played by an integral representation for cumulants by means of integrals involving cyclic products of kernels.
Resumo:
The purpose of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from non-invasive intelligent methods. The methods selected in this case are speech biomarkers oriented to Sponta-neous Speech and Emotional Response Analysis. Thus the main goal of the present work is feature search in Spontaneous Speech oriented to pre-clinical evaluation for the definition of test for AD diagnosis by One-class classifier. One-class classifi-cation problem differs from multi-class classifier in one essen-tial aspect. In one-class classification it is assumed that only information of one of the classes, the target class, is available. In this work we explore the problem of imbalanced datasets that is particularly crucial in applications where the goal is to maximize recognition of the minority class as in medical diag-nosis. The use of information about outlier and Fractal Dimen-sion features improves the system performance.
Resumo:
tThis paper deals with the potential and limitations of using voice and speech processing to detect Obstruc-tive Sleep Apnea (OSA). An extensive body of voice features has been extracted from patients whopresent various degrees of OSA as well as healthy controls. We analyse the utility of a reduced set offeatures for detecting OSA. We apply various feature selection and reduction schemes (statistical rank-ing, Genetic Algorithms, PCA, LDA) and compare various classifiers (Bayesian Classifiers, kNN, SupportVector Machines, neural networks, Adaboost). S-fold crossvalidation performed on 248 subjects showsthat in the extreme cases (that is, 127 controls and 121 patients with severe OSA) voice alone is able todiscriminate quite well between the presence and absence of OSA. However, this is not the case withmild OSA and healthy snoring patients where voice seems to play a secondary role. We found that thebest classification schemes are achieved using a Genetic Algorithm for feature selection/reduction.
Resumo:
This action research observes a second year Japanese class at a university where foreign language courses are elective for undergraduate students. In this study, using the six strategies to teach Japanese speech acts that Ishihara and Cohen (2006) suggested, I conducted three classes and analyzed my teaching practice with a critical friend. These strategies assist learners toward the development of their understanding of the following Japanese speech acts and also keep the learners to use them in a manner appropriate to the context: (I) invitation and refusal; (2) compliments; and (3) asking for a permission. The aim of this research is not only to improve my instruction in relation to second language (L2) pragmatic development, but also to raise further questions and to develop future research. The findings are analyzed and the data derived from my journals, artifacts, students' work, observation sheets, interviews with my critical friend, and pretests and posttests are coded and presented. The analysis shows that (I) after my critical friend encouraged my study and my students gave me some positive comments after each lesson, I gained confidence in teaching the suggested speech acts; (2) teaching involved explaining concepts and strategies, creating the visual material (a video) showing the strategies, and explaining the relationship between the strategy and grammatical forms and samples of misusing the forms; (3) students' background and learning styles influenced lessons; and (4) pretest and posttests showed that the students' Icvel of their L2 appropriate pragmatics dramatically improved after each instruction. However, after careful observation, it was noted that some factors prevented students from producing the correct output even though they understood the speech act differences.
Resumo:
Timely detection of sudden change in dynamics that adversely affect the performance of systems and quality of products has great scientific relevance. This work focuses on effective detection of dynamical changes of real time signals from mechanical as well as biological systems using a fast and robust technique of permutation entropy (PE). The results are used in detecting chatter onset in machine turning and identifying vocal disorders from speech signal.Permutation Entropy is a nonlinear complexity measure which can efficiently distinguish regular and complex nature of any signal and extract information about the change in dynamics of the process by indicating sudden change in its value. Here we propose the use of permutation entropy (PE), to detect the dynamical changes in two non linear processes, turning under mechanical system and speech under biological system.Effectiveness of PE in detecting the change in dynamics in turning process from the time series generated with samples of audio and current signals is studied. Experiments are carried out on a lathe machine for sudden increase in depth of cut and continuous increase in depth of cut on mild steel work pieces keeping the speed and feed rate constant. The results are applied to detect chatter onset in machining. These results are verified using frequency spectra of the signals and the non linear measure, normalized coarse-grained information rate (NCIR).PE analysis is carried out to investigate the variation in surface texture caused by chatter on the machined work piece. Statistical parameter from the optical grey level intensity histogram of laser speckle pattern recorded using a charge coupled device (CCD) camera is used to generate the time series required for PE analysis. Standard optical roughness parameter is used to confirm the results.Application of PE in identifying the vocal disorders is studied from speech signal recorded using microphone. Here analysis is carried out using speech signals of subjects with different pathological conditions and normal subjects, and the results are used for identifying vocal disorders. Standard linear technique of FFT is used to substantiate thc results.The results of PE analysis in all three cases clearly indicate that this complexity measure is sensitive to change in regularity of a signal and hence can suitably be used for detection of dynamical changes in real world systems. This work establishes the application of the simple, inexpensive and fast algorithm of PE for the benefit of advanced manufacturing process as well as clinical diagnosis in vocal disorders.
Resumo:
Median filtering is a simple digital non—linear signal smoothing operation in which median of the samples in a sliding window replaces the sample at the middle of the window. The resulting filtered sequence tends to follow polynomial trends in the original sample sequence. Median filter preserves signal edges while filtering out impulses. Due to this property, median filtering is finding applications in many areas of image and speech processing. Though median filtering is simple to realise digitally, its properties are not easily analysed with standard analysis techniques,
Resumo:
Speech is the most natural means of communication among human beings and speech processing and recognition are intensive areas of research for the last five decades. Since speech recognition is a pattern recognition problem, classification is an important part of any speech recognition system. In this work, a speech recognition system is developed for recognizing speaker independent spoken digits in Malayalam. Voice signals are sampled directly from the microphone. The proposed method is implemented for 1000 speakers uttering 10 digits each. Since the speech signals are affected by background noise, the signals are tuned by removing the noise from it using wavelet denoising method based on Soft Thresholding. Here, the features from the signals are extracted using Discrete Wavelet Transforms (DWT) because they are well suitable for processing non-stationary signals like speech. This is due to their multi- resolutional, multi-scale analysis characteristics. Speech recognition is a multiclass classification problem. So, the feature vector set obtained are classified using three classifiers namely, Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Naive Bayes classifiers which are capable of handling multiclasses. During classification stage, the input feature vector data is trained using information relating to known patterns and then they are tested using the test data set. The performances of all these classifiers are evaluated based on recognition accuracy. All the three methods produced good recognition accuracy. DWT and ANN produced a recognition accuracy of 89%, SVM and DWT combination produced an accuracy of 86.6% and Naive Bayes and DWT combination produced an accuracy of 83.5%. ANN is found to be better among the three methods.
Resumo:
This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).