874 resultados para Language acquisition.


Relevância:

20.00% 20.00%

Publicador:

Resumo:

NMR-based approach to metabolomics typically involves the collection of two-dimensional (2D) heteronuclear correlation spectra for identification and assignment of metabolites. In case of spectral overlap, a 3D spectrum becomes necessary, which is hampered by slow data acquisition for achieving sufficient resolution. We describe here a method to simultaneously acquire three spectra (one 3D and two 2D) in a single data set, which is based on a combination of different fast data acquisition techniques such as G-matrix Fourier transform (GFT) NMR spectroscopy, parallel data acquisition and non-uniform sampling. The following spectra are acquired simultaneously: (1) C-13 multiplicity edited GFT (3,2)D HSQC-TOCSY, (2) 2D H-1- H-1] TOCSY and (3) 2D C-13- H-1] HETCOR. The spectra are obtained at high resolution and provide high-dimensional spectral information for resolving ambiguities. While the GFT spectrum has been shown previously to provide good resolution, the editing of spin systems based on their CH multiplicities further resolves the ambiguities for resonance assignments. The experiment is demonstrated on a mixture of 21 metabolites commonly observed in metabolomics. The spectra were acquired at natural abundance of C-13. This is the first application of a combination of three fast NMR methods for small molecules and opens up new avenues for high-throughput approaches for NMR-based metabolomics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high quality corpora, especially in Indian languages, makes this problem hard, e.g. state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other ``auxiliary'' languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Graph algorithms have been shown to possess enough parallelism to keep several computing resources busy-even hundreds of cores on a GPU. Unfortunately, tuning their implementation for efficient execution on a particular hardware configuration of heterogeneous systems consisting of multicore CPUs and GPUs is challenging, time consuming, and error prone. To address these issues, we propose a domain-specific language (DSL), Falcon, for implementing graph algorithms that (i) abstracts the hardware, (ii) provides constructs to write explicitly parallel programs at a higher level, and (iii) can work with general algorithms that may change the graph structure (morph algorithms). We illustrate the usage of our DSL to implement local computation algorithms (that do not change the graph structure) and morph algorithms such as Delaunay mesh refinement, survey propagation, and dynamic SSSP on GPU and multicore CPUs. Using a set of benchmark graphs, we illustrate that the generated code performs close to the state-of-the-art hand-tuned implementations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Following transmission, HIV-1 adapts in the new host by acquiring mutations that allow it to escape from the host immune response at multiple epitopes. It also reverts mutations associated with epitopes targeted in the transmitting host but not in the new host. Moreover, escape mutations are often associated with additional compensatory mutations that partially recover fitness costs. It is unclear whether recombination expedites this process of multi-locus adaptation. To elucidate the role of recombination, we constructed a detailed population dynamics model that integrates viral dynamics, host immune response at multiple epitopes through cytotoxic T lymphocytes, and viral evolution driven by mutation, recombination, and selection. Using this model, we compute the expected waiting time until the emergence of the strain that has gained escape and compensatory mutations against the new host's immune response, and reverted these mutations at epitopes no longer targeted. We find that depending on the underlying fitness landscape, shaped by both costs and benefits of mutations, adaptation proceeds via distinct dominant pathways with different effects of recombination, in particular distinguishing escape and reversion. When adaptation at a single epitope is involved, recombination can substantially accelerate immune escape but minimally affects reversion. When multiple epitopes are involved, recombination can accelerate or inhibit adaptation depending on the fitness landscape. Specifically, recombination tends to delay adaptation when a purely uphill fitness landscape is accessible at each epitope, and accelerate it when a fitness valley is associated with each epitope. Our study points to the importance of recombination in shaping the adaptation of HIV-1 following its transmission to new hosts, a process central to T cell-based vaccine strategies. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper investigates unsupervised test-time adaptation of language models (LM) using discriminative methods for a Mandarin broadcast speech transcription and translation task. A standard approach to adapt interpolated language models to is to optimize the component weights by minimizing the perplexity on supervision data. This is a widely made approximation for language modeling in automatic speech recognition (ASR) systems. For speech translation tasks, it is unclear whether a strong correlation still exists between perplexity and various forms of error cost functions in recognition and translation stages. The proposed minimum Bayes risk (MBR) based approach provides a flexible framework for unsupervised LM adaptation. It generalizes to a variety of forms of recognition and translation error metrics. LM adaptation is performed at the audio document level using either the character error rate (CER), or translation edit rate (TER) as the cost function. An efficient parameter estimation scheme using the extended Baum-Welch (EBW) algorithm is proposed. Experimental results on a state-of-the-art speech recognition and translation system are presented. The MBR adapted language models gave the best recognition and translation performance and reduced the TER score by up to 0.54% absolute. © 2007 IEEE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In speech recognition systems language model (LMs) are often constructed by training and combining multiple n-gram models. They can be either used to represent different genres or tasks found in diverse text sources, or capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article presents a new method for acquiring three-dimensional (3-D) volumes of ultrasonic axial strain data. The method uses a mechanically-swept probe to sweep out a single volume while applying a continuously varying axial compression. Acquisition of a volume takes 15-20 s. A strain volume is then calculated by comparing frame pairs throughout the sequence. The method uses strain quality estimates to automatically pick out high quality frame pairs, and so does not require careful control of the axial compression. In a series of in vitro and in vivo experiments, we quantify the image quality of the new method and also assess its ease of use. Results are compared with those for the current best alternative, which calculates strain between two complete volumes. The volume pair approach can produce high quality data, but skillful scanning is required to acquire two volumes with appropriate relative strain. In the new method, the automatic quality-weighted selection of image pairs overcomes this difficulty and the method produces superior quality images with a relatively relaxed scanning technique.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Existing devices for communicating information to computers are bulky, slow to use, or unreliable. Dasher is a new interface incorporating language modelling and driven by continuous two-dimensional gestures, e.g. a mouse, touchscreen, or eye-tracker. Tests have shown that this device can be used to enter text at a rate of up to 34 words per minute, compared with typical ten-finger keyboard typing of 40-60 words per minute. Although the interface is slower than a conventional keyboard, it is small and simple, and could be used on personal data assistants and by motion-impaired computer users.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple subsystems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. In normal cross adaptation it is assumed that useful diversity among systems exists only at acoustic level. However, complimentary features among complex LVCSR systems also manifest themselves in other layers of modelling hierarchy, e.g., subword and word level. It is thus interesting to also cross adapt language models (LM) to capture them. In this paper cross adaptation of multi-level LMs modelling both syllable and word sequences was investigated to improve LVCSR system combination. Significant error rate gains up to 6.7% rel. were obtained over ROVER and acoustic model only cross adaptation when combining 13 Chinese LVCSR subsystems used in the 2010 DARPA GALE evaluation. © 2010 ISCA.