587 results for Phonetic alphabet.
Abstract:
Previous research found that feeling-of-knowing (FOK) and feeling-of-not-knowing (FOnK) are two different cognitive processes. Processing depth had a beneficial effect on FOK judgment but little effect on FOnK judgment, and may even have decreased the accuracy of FOnK judgment. Building on this research, the experiment examined how processing depth and different kinds of memory material affect FOK and FOnK judgment. The first aim was to determine whether processing depth affects FOK judgment and FOnK judgment differently. The second aim was to determine whether two kinds of memory material, paired Chinese words and paired Chinese phonetic alphabet items, produce differences in the grade and accuracy of FOK judgment, and whether the two kinds of material affect FOK judgment and FOnK judgment differently. The third aim was to test for an interaction between processing depth and kind of memory material. The experiment used paired Chinese words and paired Chinese phonetic alphabet items as materials, with processing depth at the encoding stage and kind of memory material as the independent variables. The dependent variables were the validity of memory, the grade of FOK judgment, the accuracy of FOK judgment, and the accuracy of FOnK judgment. The experiment adopted the standard recall-judgment-recognition ("RJR") paradigm for FOK judgment proposed by Hart. The results showed that under deep processing at the encoding stage, the validity of memory, the grade of FOK judgment, and the accuracy of FOK judgment were higher than under superficial processing, but processing depth had little effect on the accuracy of FOnK judgment. FOK judgment and FOnK judgment were two different cognitive processes.
The kind of memory material produced clear differences in the validity of memory, the grade of FOK judgment, and the accuracy of FOK judgment, and likewise had little effect on the accuracy of FOnK judgment. Processing depth and kind of memory material interacted in their effects on FOK judgment. With the accuracy of recall, the percentage of "feeling of knowing" responses, the percentage of "feeling of not knowing" responses, and the grade of FOK judgment as the dependent variables, the kind of memory material had little effect under superficial processing at the encoding stage, but under deep processing at the encoding stage, scores for Chinese characters were higher than for the Chinese phonetic alphabet.
Abstract:
Between 1700 and 1850, per-capita income doubled in Europe while falling in the rest of Eurasia. Neither geography nor economic institutions can explain this sudden divergence. Here the consequences of differences in communications technology are examined. For the first time, there appeared in Europe a combination of a standardized medium (national vernaculars with a phonetic alphabet) and a non-standardized message (competing religious, political and scientific ideas). The result was an unprecedented fall in the cost of combining ideas and a burst of productivity-raising innovation. Elsewhere, decreasing standardization of the medium and increasing standardization of the message blocked innovation.
Abstract:
An abridgement of the 5th part was published by the English Dialect Society, 1899, in v. 24 under title: "English dialects, their sounds and homes."
Abstract:
A collection of miscellaneous pamphlets.
Abstract:
Mode of access: Internet.
Abstract:
Mode of access: Internet.
Abstract:
Spoken term detection (STD) popularly involves performing word or sub-word level speech recognition and indexing the result. This work challenges the assumption that improved speech recognition accuracy implies better indexing for STD. Using an index derived from phone lattices, this paper examines the effect of language model selection on the relationship between phone recognition accuracy and STD accuracy. Results suggest that language models usually improve phone recognition accuracy but their inclusion does not always translate to improved STD accuracy. The findings suggest that using phone recognition accuracy to measure the quality of an STD index can be problematic, and highlight the need for an alternative that is more closely aligned with the goals of the specific detection task.
Abstract:
While spoken term detection (STD) systems based on word indices provide good accuracy, there are several practical applications where it is infeasible or too costly to employ an LVCSR engine. An STD system is presented, which is designed to incorporate a fast phonetic decoding front-end and be robust to decoding errors whilst still allowing for rapid search speeds. This goal is achieved through mono-phone open-loop decoding coupled with fast hierarchical phone lattice search. Results demonstrate that an STD system that is designed with the constraint of a fast and simple phonetic decoding front-end requires a compromise to be made between search speed and search accuracy.
Abstract:
This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection (STD). The FOM is a popular measure of STD accuracy, making it an ideal candidate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phone classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but such approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks, on the other hand, optimise the parameters of speech enhancement algorithms based on state sequences generated by a speech recogniser for utterances of known transcriptions. Previous applications of LIMA frameworks have generated a set of global enhancement parameters for all model states without taking into account the distribution of model occurrence, making optimisation susceptible to favouring frequently occurring models, in particular silence. In this paper, we demonstrate the existence of highly disproportionate phonetic distributions on two corpora with distinct speech tasks, and propose to normalise the influence of each phone based on a priori occurrence probabilities. Likelihood analysis and speech recognition experiments verify this approach for improving ASR performance in noisy environments.
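The idea of normalising each phone's influence by its a priori occurrence probability can be illustrated with a small sketch. This is not the authors' LIMA implementation: the function names, the frame-level inputs, and the exact weighting (each frame weighted by the inverse of its phone's prior, rescaled so that all phones contribute equally) are assumptions chosen only to show how frequent models such as silence stop dominating the objective.

```python
from collections import Counter


def phone_priors(frame_phones):
    """Estimate a priori occurrence probabilities from frame-level phone labels."""
    counts = Counter(frame_phones)
    total = len(frame_phones)
    return {phone: n / total for phone, n in counts.items()}


def normalised_log_likelihood(frame_phones, frame_loglikes, priors):
    """Weighted sum of frame log-likelihoods with weights proportional to 1 / prior.

    Hypothetical weighting scheme: the weights are scaled so that each phone
    receives the same total weight, which makes the result the average of the
    per-phone mean log-likelihoods.
    """
    n_phones = len(priors)
    n_frames = len(frame_phones)
    total = 0.0
    for phone, loglike in zip(frame_phones, frame_loglikes):
        weight = 1.0 / (n_phones * priors[phone] * n_frames)
        total += weight * loglike
    return total
```

With eight silence frames at log-likelihood -1.0 and two "ae" frames at -5.0, the plain frame average is -1.8, dominated by silence, whereas the normalised objective averages the two per-phone means and gives -3.0.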
Abstract:
Automatic spoken Language Identification (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identification systems provide. A prominent application arises in call centers dealing with speakers speaking different languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable features for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Phonetic speech information is extracted using existing speech recognition technology. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by different speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
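As a rough illustration of linear data fusion, the sketch below combines per-language scores from two subsystems with a weighted sum and picks the highest fused score. The subsystem names, score values, and weights are all hypothetical; the abstract does not state how the fusion weights were chosen.

```python
def fuse_scores(system_scores, weights):
    """Linearly combine per-language scores from several subsystems.

    system_scores: dict mapping subsystem name -> {language: score}
    weights: dict mapping subsystem name -> fusion weight (assumed values)
    """
    fused = {}
    for name, scores in system_scores.items():
        for language, score in scores.items():
            fused[language] = fused.get(language, 0.0) + weights[name] * score
    return fused


def identify_language(system_scores, weights):
    """Return the language with the highest fused score."""
    fused = fuse_scores(system_scores, weights)
    return max(fused, key=fused.get)
```

For example, if an acoustic subsystem favours English and a phonetic subsystem favours Spanish, the decision depends on the weights: weighting the acoustic scores heavily returns English, while equal weights can return Spanish.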
Abstract:
For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes: first, accommodating phone recognition errors and, second, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing.
Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
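Approximate phone sequence matching with a phone error cost model, as mentioned in the abstract above, can be pictured as a weighted edit distance between the target term's phone sequence and a phone sequence found in the index. The sketch below is a generic dynamic-programming illustration, not the DMLS implementation; the function name, the cost table, and the default insertion and deletion costs are assumptions.

```python
def match_cost(target, observed, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Minimum edit cost to turn the target phone sequence into the observed one.

    sub_cost maps (target_phone, observed_phone) to a substitution cost;
    pairs missing from the table fall back to a cost of 1.0 (assumed default).
    """
    rows, cols = len(target) + 1, len(observed) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, rows):
        for j in range(1, cols):
            t, o = target[i - 1], observed[j - 1]
            substitution = 0.0 if t == o else sub_cost.get((t, o), 1.0)
            dist[i][j] = min(
                dist[i - 1][j - 1] + substitution,  # match or substitute
                dist[i - 1][j] + del_cost,          # target phone missing
                dist[i][j - 1] + ins_cost,          # extra observed phone
            )
    return dist[-1][-1]
```

A search would then report a detection wherever the cost falls below some threshold. Giving acoustically confusable phone pairs a low substitution cost lets the search tolerate likely recognition errors while still rejecting unrelated sequences.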