36 resultados para trained incapacity


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets. © 2010 Springer-Verlag.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language given a large dataset of semantically-annotated utterances in some other source language. The aim is to reduce the cost associated with porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models both of which are trained from unaligned data to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type, however individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the field of motor control, two hypotheses have been controversial: whether the brain acquires internal models that generate accurate motor commands, or whether the brain avoids this by using the viscoelasticity of musculoskeletal system. Recent observations on relatively low stiffness during trained movements support the existence of internal models. However, no study has revealed the decrease in viscoelasticity associated with learning that would imply improvement of internal models as well as synergy between the two hypothetical mechanisms. Previously observed decreases in electromyogram (EMG) might have other explanations, such as trajectory modifications that reduce joint torques. To circumvent such complications, we required strict trajectory control and examined only successful trials having identical trajectory and torque profiles. Subjects were asked to perform a hand movement in unison with a target moving along a specified and unusual trajectory, with shoulder and elbow in the horizontal plane at the shoulder level. To evaluate joint viscoelasticity during the learning of this movement, we proposed an index of muscle co-contraction around the joint (IMCJ). The IMCJ was defined as the summation of the absolute values of antagonistic muscle torques around the joint and computed from the linear relation between surface EMG and joint torque. The IMCJ during isometric contraction, as well as during movements, was confirmed to correlate well with joint stiffness estimated using the conventional method, i.e., applying mechanical perturbations. Accordingly, the IMCJ during the learning of the movement was computed for each joint of each trial using estimated EMG-torque relationship. At the same time, the performance error for each trial was specified as the root mean square of the distance between the target and hand at each time step over the entire trajectory. The time-series data of IMCJ and performance error were decomposed into long-term components that showed decreases in IMCJ in accordance with learning with little change in the trajectory and short-term interactions between the IMCJ and performance error. A cross-correlation analysis and impulse responses both suggested that higher IMCJs follow poor performances, and lower IMCJs follow good performances within a few successive trials. Our results support the hypothesis that viscoelasticity contributes more when internal models are inaccurate, while internal models contribute more after the completion of learning. It is demonstrated that the CNS regulates viscoelasticity on a short- and long-term basis depending on performance error and finally acquires smooth and accurate movements while maintaining stability during the entire learning process.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents an agenda-based user simulator which has been extended to be trainable on real data with the aim of more closely modelling the complex rational behaviour exhibited by real users. The train-able part is formed by a set of random decision points that may be encountered during the process of receiving a system act and responding with a user act. A sample-based method is presented for using real user data to estimate the parameters that control these decisions. Evaluation results are given both in terms of statistics of generated user behaviour and the quality of policies trained with different simulators. Compared to a handcrafted simulator, the trained system provides a much better fit to corpus data and evaluations suggest that this better fit should result in improved dialogue performance. © 2010 Association for Computational Linguistics.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In recent years, the use of morphological decomposition strategies for Arabic Automatic Speech Recognition (ASR) has become increasingly popular. Systems trained on morphologically decomposed data are often used in combination with standard word-based approaches, and they have been found to yield consistent performance improvements. The present article contributes to this ongoing research endeavour by exploring the use of the 'Morphological Analysis and Disambiguation for Arabic' (MADA) tools for this purpose. System integration issues concerning language modelling and dictionary construction, as well as the estimation of pronunciation probabilities, are discussed. In particular, a novel solution for morpheme-to-word conversion is presented which makes use of an N-gram Statistical Machine Translation (SMT) approach. System performance is investigated within a multi-pass adaptation/combination framework. All the systems described in this paper are evaluated on an Arabic large vocabulary speech recognition task which includes both Broadcast News and Broadcast Conversation test data. It is shown that the use of MADA-based systems, in combination with word-based systems, can reduce the Word Error Rates by up to 8.1 relative. © 2012 Elsevier Ltd. All rights reserved.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In standard Gaussian Process regression input locations are assumed to be noise free. We present a simple yet effective GP model for training on input points corrupted by i.i.d. Gaussian noise. To make computations tractable we use a local linear expansion about each input point. This allows the input noise to be recast as output noise proportional to the squared gradient of the GP posterior mean. The input noise variances are inferred from the data as extra hyperparameters. They are trained alongside other hyperparameters by the usual method of maximisation of the marginal likelihood. Training uses an iterative scheme, which alternates between optimising the hyperparameters and calculating the posterior gradient. Analytic predictive moments can then be found for Gaussian distributed test points. We compare our model to others over a range of different regression problems and show that it improves over current methods.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Pavement condition assessment is essential when developing road network maintenance programs. In practice, the data collection process is to a large extent automated. However, pavement distress detection (cracks, potholes, etc.) is mostly performed manually, which is labor-intensive and time-consuming. Existing methods either rely on complete 3D surface reconstruction, which comes along with high equipment and computation costs, or make use of acceleration data, which can only provide preliminary and rough condition surveys. In this paper we present a method for automated pothole detection in asphalt pavement images. In the proposed method an image is first segmented into defect and non-defect regions using histogram shape-based thresholding. Based on the geometric properties of a defect region the potential pothole shape is approximated utilizing morphological thinning and elliptic regression. Subsequently, the texture inside a potential defect shape is extracted and compared with the texture of the surrounding non-defect pavement in order to determine if the region of interest represents an actual pothole. This methodology has been implemented in a MATLAB prototype, trained and tested on 120 pavement images. The results show that this method can detect potholes in asphalt pavement images with reasonable accuracy.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Several research studies have been recently initiated to investigate the use of construction site images for automated infrastructure inspection, progress monitoring, etc. In these studies, it is always necessary to extract material regions (concrete or steel) from the images. Existing methods made use of material's special color/texture ranges for material information retrieval, but they do not sufficiently discuss how to find these appropriate color/texture ranges. As a result, users have to define appropriate ones by themselves, which is difficult for those who do not have enough image processing background. This paper presents a novel method of identifying concrete material regions using machine learning techniques. Under the method, each construction site image is first divided into regions through image segmentation. Then, the visual features of each region are calculated and classified with a pre-trained classifier. The output value determines whether the region is composed of concrete or not. The method was implemented using C++ and tested over hundreds of construction site images. The results were compared with the manual classification ones to indicate the method's validity.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A growing number of people are now entering the elderly age category in Japan; this raises the likelihood of more persons with dementia, as the probability of becoming cognitively impaired increases with age. There is an increasing need for caregivers who are well trained and experienced and who can pay special attention to the needs of people with dementia. Technology can play an important role in helping such people and their caregivers. A lack of mutual understanding between caregivers and researchers regarding the appropriate uses of assistive technologies is another problem. We have described the relationship between information and communication technology (ICT), especially assistive technologies, and social issues as a first step towards developing a technology roadmap. © 2012 IEEE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linearly combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the ’same’ vowel would differ enormously. The differences are mainly due to the differences in vocal tract length (VTL) and glottal pulse rate (GPR) which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children if they have been trained on the speech of an adult. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A recent trend in spoken dialogue research is the use of reinforcement learning to train dialogue systems in a simulated environment. Past researchers have shown that the types of errors that are simulated can have a significant effect on simulated dialogue performance. Since modern systems typically receive an N-best list of possible user utterances, it is important to be able to simulate a full N-best list of hypotheses. This paper presents a new method for simulating such errors based on logistic regression, as well as a new method for simulating the structure of N-best lists of semantics and their probabilities, based on the Dirichlet distribution. Off-line evaluations show that the new Dirichlet model results in a much closer match to the receiver operating characteristics (ROC) of the live data. Experiments also show that the logistic model gives confusions that are closer to the type of confusions observed in live situations. The hope is that these new error models will be able to improve the resulting performance of trained dialogue systems. © 2012 IEEE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Current commercial dialogue systems typically use hand-crafted grammars for Spoken Language Understanding (SLU) operating on the top one or two hypotheses output by the speech recogniser. These systems are expensive to develop and they suffer from significant degradation in performance when faced with recognition errors. This paper presents a robust method for SLU based on features extracted from the full posterior distribution of recognition hypotheses encoded in the form of word confusion networks. Following [1], the system uses SVM classifiers operating on n-gram features, trained on unaligned input/output pairs. Performance is evaluated on both an off-line corpus and on-line in a live user trial. It is shown that a statistical discriminative approach to SLU operating on the full posterior ASR output distribution can substantially improve performance both in terms of accuracy and overall dialogue reward. Furthermore, additional gains can be obtained by incorporating features from the previous system output. © 2012 IEEE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We describe our work on developing a speech recognition system for multi-genre media archives. The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we present Multi-level Adaptive Networks (MLAN), a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, with relative WER reductions of 15% over a PLP baseline, 9% over in-domain tandem features and 8% over the best out-of-domain tandem features. © 2012 IEEE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Statistical approaches for building non-rigid deformable models, such as the Active Appearance Model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. This approach employs the idea of initially learning the correspondences between landmarks in a frontal image and a set of training images with a face in arbitrary poses. Using this learner, virtual images of unseen faces at any arbitrary pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used for automatically annotating unseen images, including images of different facial expressions, at any random pose within the maximum range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases. © 2009 IEEE.