961 results for Automatic speech recognition (ASR)
Abstract:
OBJECTIVE: To identify and quantify sources of variability in scores on the speech, spatial, and qualities of hearing scale (SSQ) and its short forms among normal-hearing and hearing-impaired subjects using a French-language version of the SSQ. DESIGN: Multi-regression analyses of SSQ scores were performed using age, gender, years of education, hearing loss, and hearing-loss asymmetry as predictors. Similar analyses were performed for each subscale (Speech, Spatial, and Qualities), for several SSQ short forms, and for differences in subscale scores. STUDY SAMPLE: One hundred normal-hearing subjects (NHS) and 230 hearing-impaired subjects (HIS). RESULTS: Hearing loss in the better ear and hearing-loss asymmetry were the two main predictors of scores on the overall SSQ, the three main subscales, and the SSQ short forms. The greatest difference between the NHS and HIS was observed for the Speech subscale, and the NHS showed scores well below the maximum of 10. An age effect was observed mostly on the Speech subscale items, and the number of years of education had a significant influence on several Spatial and Qualities subscale items. CONCLUSION: Strong similarities between SSQ scores obtained across different populations and languages, and between SSQ and short forms, underline their potential international use.
Abstract:
This paper gives a full description of the phonetics and phonology of Traditional Cockney and Popular London speech, treating these varieties as a continuum rather than as two separate dialects. The vowels, diphthongs and consonants are exemplified, both in isolated words and in connected speech, along with their range of variation. The frequencies of the vowels have been charted on the basis of the pronunciation of three elderly male speakers. Regarding the consonants, there are detailed observations on the features typically associated with the linguistic varieties examined: strong aspiration of unvoiced plosives, glottalization, H-dropping, L-vocalization and TH-fronting. A section on prosody covers lexical stress, rhythm and intonation. The paper takes into account up-to-date research on these phenomena, but does not deal with the most recent vowel shifts, some of which form part of Multi-cultural London English.
Abstract:
In fetal brain MRI, most high-resolution reconstruction algorithms rely on brain segmentation as a preprocessing step. Manual brain segmentation, however, is highly time-consuming and therefore not a realistic solution. In this work, we assess on a large dataset the performance of Multiple Atlas Fusion (MAF) strategies to address this problem automatically. Firstly, we show that MAF significantly increases the accuracy of brain segmentation compared with a single-atlas strategy. Secondly, we show that MAF compares favorably with the most recent approach (Dice above 0.90). Finally, we show that MAF could in turn provide an enhancement in terms of reconstruction quality.
Abstract:
The recognition of prior experiential learning (RPEL) involves the assessment of skills and knowledge acquired by an individual through previous experience, which is not necessarily related to an academic context. RPEL practices are far from generalised in higher education, and there is a lack of specific guidelines on how to implement RPL programs in particular settings, such as management education or online programs. The RPEL pilot program developed in a Spanish virtual university is used throughout the article as the basis for further reflection on the design and implementation of RPEL in online postgraduate education in the business field. The role of competences as a central theoretical foundation for RPEL is explained, and the context and characteristics of the RPEL program are described. Special attention is paid to the key elements of the program's design and to the practical aspects of its implementation. The results of the program are assessed, and general conclusions and suggestions for further research are discussed.
Abstract:
In this paper, we propose a new supervised linear feature extraction technique for multiclass classification problems that is specially suited to the nearest neighbor (NN) classifier. The problem of finding the optimal linear projection matrix is defined as a classification problem, and the Adaboost algorithm is used to compute it iteratively. This strategy allows a multitask learning (MTL) criterion to be introduced into the method and results in a solution that makes no assumptions about the data distribution and is especially appropriate for solving the small sample size problem. The performance of the method is illustrated by an application to the face recognition problem. The experiments show that the representation obtained with the multitask approach improves on the classic feature extraction algorithms when using the NN classifier, especially when only a few examples from each class are available.
Abstract:
Peer-reviewed
Abstract:
Behavior-based navigation of autonomous vehicles requires the recognition of navigable areas and potential obstacles. In this paper we describe a model-based object recognition system which is part of an image interpretation system intended to assist the navigation of autonomous vehicles that operate in industrial environments. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. In this approach, different low-level vision techniques extract a multitude of image descriptors, which are then analyzed by a rule-based reasoning system to interpret the image content. The system has been implemented using a rule-based cooperative expert system.
Abstract:
This Master's thesis addresses the design and implementation of an optical character recognition (OCR) system for a mobile device running the Symbian operating system. The developed OCR system, named OCRCapriccio, emphasizes modularity, effective extensibility and reuse. The system consists of two parts: the graphical user interface and the OCR engine, which was implemented as a plug-in. The plug-in includes two implementations of the OCR engine, enabling two types of recognition: bitmap-comparison-based recognition and statistical recognition. The implementation results have shown that the approach based on bitmap comparison is more suitable for the Symbian environment because of its nature. Although the current implementation of bitmap comparison is lacking in accuracy, further development should be done in its direction. The biggest challenges of this work were related to developing an OCR scheme suitable for Symbian OS smartphones, which have limited computational power and restricted resources.
Abstract:
The degradation of the catalytic filaments is the main factor limiting the industrial implementation of the hot wire chemical vapor deposition (HWCVD) technique. Up to now, no solution has been found to protect the catalytic filaments used in HWCVD without compromising their catalytic activity. Probably, the definitive solution relies on the automatic replacement of the catalytic filaments. In this work, the results of the validation tests of a new apparatus for the automatic replacement of the catalytic filaments are reported. The functionalities of the different parts have been validated using a 0.2 mm diameter tungsten filament under µc-Si:H deposition conditions.
Abstract:
In liberalized electricity markets, which have been introduced in many countries around the world, electricity distribution companies operate under competitive conditions. Therefore, accurate information about customers' energy consumption plays an essential role in the distribution company's budgeting and in the correct planning and operation of the distribution network. This master's thesis describes the possible benefits of automatic meter reading (AMR) systems for electric utilities and residential customers. The major benefits of AMR illustrated in the thesis are distribution network management, power quality monitoring, load modelling, and detection of illegal electricity usage. Using power system state estimation as an example, the thesis shows that even partial installation of AMR on the customer side leads to more accurate data about voltage and power levels in the whole network. The thesis also describes the present state of AMR integration in Russia.
Abstract:
Language acquisition is a complex process that requires the synergic involvement of different cognitive functions, which include extracting and storing the words of the language and their embedded rules for progressive acquisition of grammatical information. As has been shown in other fields that study learning processes, synchronization mechanisms between neuronal assemblies might have a key role during language learning. In particular, studying these dynamics may help uncover whether different oscillatory patterns sustain more item-based learning of words and rule-based learning from speech input. Therefore, we tracked the modulation of oscillatory neural activity during the initial exposure to an artificial language, which contained embedded rules. We analyzed both spectral power variations, as a measure of local neuronal ensemble synchronization, and phase coherence patterns, as an index of the long-range coordination of these local groups of neurons. Synchronized activity in the gamma band (20-40 Hz), previously reported to be related to the engagement of selective attention, showed a clear dissociation of local power and phase coherence between distant regions. In this frequency range, local synchrony characterized the subjects who were focused on word identification and was accompanied by increased coherence in the theta band (4-8 Hz). Only those subjects who were able to learn the embedded rules showed increased gamma band phase coherence between frontal, temporal, and parietal regions.
Abstract:
In this paper, we present the Melodic Analysis of Speech method (MAS) that enables us to carry out complete and objective descriptions of a language's intonation, from a phonetic (melodic) point of view as well as from a phonological point of view. It is based on the acoustic-perceptive method by Cantero (2002), which has already been used in research on prosody in different languages. In this case, we present the results of its application in Spanish and Catalan.