14 results for Speaker Recognition, Text-constrained, Multilingual, Speaker Verification, HMMs

in Aston University Research Archive


Relevance:

50.00%

Publisher:

Abstract:

The research presented in this paper is part of an ongoing investigation into how best to incorporate speech-based input within mobile data collection applications. In our previous work [1], we evaluated the ability of a single speech recognition engine to support accurate, mobile, speech-based data input. Here, we build on our previous research to compare the achievable speaker-independent accuracy rates of a variety of speech recognition engines; we also consider the relative effectiveness of different speech recognition engine and microphone pairings in terms of their ability to support accurate text entry under realistic mobile conditions of use. Our intent is to provide some initial empirical data derived from mobile, user-based evaluations to support technological decisions faced by developers of mobile applications that would benefit from, or require, speech-based data entry facilities.

Relevance:

40.00%

Publisher:

Abstract:

This paper uses a feminist post-structuralist approach to examine the gendered identities of a sample of British business leaders. While recent national surveys offer many material reasons why women are acutely under-represented as business leaders, the role of language is rarely addressed. This paper explores the ways in which ten senior women and men construct their sense of leadership identities through the medium of interview narratives. Drawing upon two poststructuralist models of analysis (Derrida’s 1987 theory of deconstruction and Bakhtin’s 1927/1981 concept of double-voiced discourse), the paper shows how both females and males are able to shift pragmatically between interwoven corporate discourses, which demand competing cultural allegiances from one moment to the next, allegiances constantly tested by the rapid change and uncertainty that characterise global business. While male leaders experience a relative freedom of movement between different cultural discourses, female leaders are circumscribed by negative and reductive representations of female speech and behaviour. In sum, senior women are required constantly to observe, review, police and repair their use of leadership language, which potentially undermines their confidence and authority as leaders.

Relevance:

40.00%

Publisher:

Abstract:

The present thesis investigates mode related aspects in biology lecture discourse and attempts to identify the position of this variety along the spontaneous spoken versus planned written language continuum. Nine lectures (of 43,000 words) consisting of three sets of three lectures each, given by three lecturers at Aston University, make up the corpus. The indeterminacy of the results obtained from the investigation of grammatical complexity as measured in subordination motivates the need to take the analysis beyond sentence level to the study of mode related aspects in the use of sentence-initial connectives, sub-topic shifting and paraphrase. It is found that biology lecture discourse combines features typical of speech and writing at sentence as well as discourse level: thus, subordination is more used than co-ordination, but one degree complexity sentence is favoured; some sentence initial connectives are only found in uses typical of spoken language but sub-topic shift signalling (generally introduced by a connective) typical of planned written language is a major feature of the lectures; syntactic and lexical revision and repetition, interrupted structures are found in the sub-topic shift signalling utterance and paraphrase, but the text is also amenable to analysis into sentence like units. On the other hand, it is also found that: (1) while there are some differences in the use of a given feature, inter-speaker variation is on the whole not significant; (2) mode related aspects are often motivated by the didactic function of the variety; and (3) the structuring of the text follows a sequencing whose boundaries are marked by sub-topic shifting and the summary paraphrase. This study enables us to draw four theoretical conclusions: (1) mode related aspects cannot be approached as a simple dichotomy since a combination of aspects of both speech and writing is found in a given feature. It is necessary to go to the level of textual features to identify mode related aspects; (2) homogeneity is dominant in this sample of lectures which suggests that there is a high level of standardization in this variety; (3) the didactic function of the variety is manifested in some mode related aspects; (4) the features studied play a role in the structuring of the text.

Relevance:

40.00%

Publisher:

Abstract:

Master of Arts dissertation

Relevance:

40.00%

Publisher:

Abstract:

This article investigates potential effects which (the recontextualisation of) interpreted discourse can have on the positioning of participants. The discursive events which form the basis of the analysis are international press conferences which bring politicians and journalists together. The dominant question addressed is: (How) do interpreter-mediated encounters influence the positioning of participants and thus the construction of interactional and social roles? The article illustrates that methods of (critical) discourse analysis can be used to identify positioning strategies which are employed by participants in such triadic exchanges. The data come from press conferences which involve English, German, and French as source and target languages.

Relevance:

30.00%

Publisher:

Abstract:

Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. Of the large number of methodologies available in the literature, only a few are able to handle both single and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach using a voting mechanism. We evaluated the six approaches using two different corpora and show how the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well using the Genia corpus (a standard life science corpus). This indicates that choice and design of corpus has a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion in certain domains. As a result, algorithms that ignore single-word terms may cause problems for tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and the structured aspects and this means information extraction techniques need to be integrated into the term recognition process.
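
The abstract above does not say which voting rule the combined approach uses. As a rough illustration only, the following Python sketch assumes that each term recogniser returns a ranked list of candidate terms and that votes are weighted by reciprocal rank; the function name `vote_terms` and the sample rankings are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def vote_terms(rankings, top_n=None):
    """Combine ranked term lists by summing reciprocal ranks.

    The reciprocal-rank weighting is an assumption made for this sketch,
    not the voting rule described in the paper.
    """
    scores = defaultdict(float)
    for ranked_terms in rankings:
        for position, term in enumerate(ranked_terms, start=1):
            scores[term] += 1.0 / position
    # Sort by descending score, then alphabetically for a deterministic order.
    combined = sorted(scores, key=lambda term: (-scores[term], term))
    return combined if top_n is None else combined[:top_n]


# Hypothetical outputs of three term recognisers on the same text.
rankings = [
    ["gene expression", "protein", "cell line"],
    ["protein", "gene expression", "binding site"],
    ["gene expression", "binding site", "protein"],
]
print(vote_terms(rankings, top_n=2))
```

Because single-word and multi-word candidates are scored in the same pool, a rule of this kind keeps single-word terms such as "protein" in the combined ranking.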

Relevance:

30.00%

Publisher:

Abstract:

This study is concerned with the linguistic situation in the town of Kirkuk in north eastern Iraq. In this town there are three main ethnic groups: Kurds, Arabs and Turkmans, with some very small minorities such as Chaldeans, Assyrians and Armenians. The languages spoken by these three ethnic groups belong to different language family groups. In the first part of the study the historical background of the population, a review of the literature, both of the present linguistic situation in Kirkuk and of relevant sociolinguistics in general, and the theoretical framework, have been discussed in detail in order to provide background to this study, which is mainly concerned with the following areas: 1. The relationships existing between ethnic background and language usage and language loyalty in Kirkuk. 2. The attitudes of Kirkukians towards language maintenance and language shift in Kirkuk. 3. Bilingual, multilingual individual communicative competence of Kurds, Arabs and Turkmans in the languages concerned, including the degree to which such a speaker is bilingual or multilingual and the nature of bilingualism or multilingualism in different domains and situations in Kirkuk. To throw light on these areas a situationally-oriented language survey was conducted; the relevant data was collected by randomly distributed questionnaire, by personal interview, and by personal observation of language use and language attitudes in this town. The data were subjected to computer analysis and the results proved that there were no significant and substantial correlations between language use, attitudes and competence based on the socio-economic status of respondents in this town; on the other hand, the correlations between ethnic background and language use, attitudes and competence are indisputable.

Relevance:

30.00%

Publisher:

Abstract:

This thesis addresses the viability of automatic speech recognition for control room systems; with careful system design, automatic speech recognition (ASR) devices can be useful means for human computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed to critical incidents, due to possible problems of stress on the operators' speech. It is proposed that using ASR will require operators to adapt a commonly used skill to cater for a novel use of speech. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to better performance than instructions. Thus, a relatively cheap and very efficient form of operator training can be supplied by demonstration by experienced ASR operators. From a series of studies into speech based interaction with computers, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, task specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen based feedback, rather than auditory feedback, for control room operation. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require task feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than by using error handling dialogues. This method of error correction is held to be non intrusive to primary command and control operations. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction.

Relevance:

30.00%

Publisher:

Abstract:

Three experiments assessed the development of children's part and configural (part-relational) processing in object recognition during adolescence. In total, 312 school children aged 7-16 years and 80 adults were tested in 3-alternative forced choice (3-AFC) tasks. They judged the correct appearance of upright and inverted presented familiar animals, artifacts, and newly learned multipart objects, which had been manipulated either in terms of individual parts or part relations. Manipulation of part relations was constrained to either metric (animals, artifacts, and multipart objects) or categorical (multipart objects only) changes. For animals and artifacts, even the youngest children were close to adult levels for the correct recognition of an individual part change. By contrast, it was not until 11-12 years of age that they achieved similar levels of performance with regard to altered metric part relations. For the newly learned multipart objects, performance was equivalent throughout the tested age range for upright presented stimuli in the case of categorical part-specific and part-relational changes. In the case of metric manipulations, the results confirmed the data pattern observed for animals and artifacts. Together, the results provide converging evidence, with studies of face recognition, for a surprisingly late consolidation of configural-metric relative to part-based object recognition.

Relevance:

30.00%

Publisher:

Abstract:

Full text: Several Lancet publications have questioned the value of glycaemic control in diabetic patients. For example, in their Comment (Sept 29, p 1103) [1], John Cleland and Stephen Atkin state that “Improved glycaemic control is not a surrogate for effective care of patients who have diabetes”, and Victor Montori and colleagues (p 1104) [2] claim that “HbA1c loses its validity as a surrogate marker when patients have a constellation of metabolic abnormalities”. We are concerned that the reaction against “glucocentricity” in the field of diabetes has gone too far. Even the UK's National Prescribing Centre website, carrying the National Health Service logo, includes comments that undermine the value of glycaemic control. For example, referring to the United Kingdom Prospective Diabetes Study (UKPDS), this site states that “Compared with ‘conventional control’ there was no benefit from tight control of blood glucose with sulphonylureas or insulin with regard to total mortality, diabetes-related death, macrovascular outcomes or microvascular outcomes, including all the most serious ones such as blindness or kidney failure” [3]. It is well established that better glycaemic control reduces long-term microvascular complications in type 1 and type 2 diabetes [4]. In type 2 diabetes, the UKPDS reported that a composite microvascular endpoint (retinopathy requiring photocoagulation, vitreous haemorrhage, and fatal or non-fatal renal failure) was reduced by 25% in patients randomised to intensive glucose control (p=0·0099) [4]. To imply that these are not patient-relevant outcomes is to distort the evidence. Many studies have also found that improved glycaemic control reduces macrovascular complications [5]. Do not be misled: glycaemic control remains a crucial component in the care of people with diabetes. The authors have received research support and undertaken ad hoc consultancies and speaker engagements for several pharmaceutical companies.

Relevance:

30.00%

Publisher:

Abstract:

As one of the most popular deep learning models, the convolutional neural network (CNN) has achieved huge success in image information extraction. Traditionally, a CNN is trained by a supervised learning method with labeled data and used as a classifier by adding a classification layer at the end. Its capability of extracting image features is largely limited due to the difficulty of setting up a large training dataset. In this paper, we propose a new unsupervised learning CNN model, which uses a so-called convolutional sparse auto-encoder (CSAE) algorithm to pre-train the CNN. Instead of using labeled natural images for CNN training, the CSAE algorithm can be used to train the CNN with unlabeled artificial images, which enables easy expansion of training data and unsupervised learning. The CSAE algorithm is especially designed for extracting complex features from specific objects such as Chinese characters. After the features of artificial images are extracted by the CSAE algorithm, the learned parameters are used to initialize the first CNN convolutional layer, and then the CNN model is fine-trained by scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and is evaluated with a multilingual image dataset, which labels Chinese, English and numeral texts separately. A detection precision gain of more than 10% is observed over two CNN models.

Relevance:

30.00%

Publisher:

Abstract:

We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circles/lines exclusion and word-block level segmentation. We then align and match characters in a flexible sized gallery with the segmented regions, using parallelised normalised cross-correlation. The experimental results over the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded and noisy samples, in addition to those with a significant amount of missing data. The algorithm, which achieves an 84.0% classification rate with a false positive rate of 0.16 over the dataset, does not require training samples and generates compelling results compared with the training-based approaches that have used the same benchmark.
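
The matching step in the abstract above relies on normalised cross-correlation (NCC). As a minimal, non-parallelised sketch of that measure in plain Python, assuming grey-level images stored as nested lists, the code below slides one template over an image and reports the best-scoring position. The function names `ncc` and `best_match` and the toy data are hypothetical; the paper's gallery alignment and pre-processing are not reproduced here.

```python
import math


def ncc(patch, template):
    """Normalised cross-correlation of two equal-sized 2D lists, in [-1, 1]."""
    a = [value for row in patch for value in row]
    b = [value for row in template for value in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [value - mean_a for value in a]
    db = [value - mean_b for value in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        # A flat patch or template has no defined correlation; treat as no match.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best fit."""
    t_height, t_width = len(template), len(template[0])
    best = (-2.0, 0, 0)  # below any valid NCC score
    for r in range(len(image) - t_height + 1):
        for c in range(len(image[0]) - t_width + 1):
            patch = [row[c:c + t_width] for row in image[r:r + t_height]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best


# Toy example: the template pattern is embedded at row 1, column 1.
image = [
    [0, 0, 0, 0],
    [0, 9, 1, 0],
    [0, 1, 9, 0],
    [0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, row, col = best_match(image, template)
print(score, row, col)
```

Subtracting the means and dividing by the norms makes the score insensitive to uniform brightness and contrast changes, which is one reason NCC suits noisy scanned documents.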