17 results for visual sub-system
in Cambridge University Engineering Department Publications Database
Abstract:
State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple sub-systems that may even be developed at different sites. Cross system adaptation, in which model adaptation is performed using the outputs from another sub-system, can be used as an alternative to hypothesis level combination schemes such as ROVER. Normally, cross adaptation is performed only on the acoustic models. However, there are many other levels in an LVCSR system's modelling hierarchy where complementary features may be exploited, for example the sub-word and the word level, to further improve cross adaptation based system combination. It is thus interesting to also cross adapt language models (LMs) to capture these additional useful features. In this paper cross adaptation is applied to three forms of language models: a multi-level LM that models both syllable and word sequences, a word level neural network LM, and the linear combination of the two. Significant error rate reductions of 4.0-7.1% relative were obtained over ROVER and acoustic model only cross adaptation when combining a range of Chinese LVCSR sub-systems used in the 2010 and 2011 DARPA GALE evaluations. © 2012 Elsevier Ltd. All rights reserved.
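As an illustration of the linear combination referred to above, the interpolated LM probability can be sketched as follows (the interpolation weight and the component-LM notation are illustrative, not taken from the paper):

    P(w_i \mid h_i) = \lambda \, P_{\mathrm{multi}}(w_i \mid h_i) + (1 - \lambda) \, P_{\mathrm{NN}}(w_i \mid h_i)

where P_multi is the multi-level (syllable and word) LM, P_NN is the word level neural network LM, and \lambda \in [0, 1] would typically be tuned on held-out data.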
Abstract:
Placing a gene of interest under the control of an inducible promoter greatly aids the purification, localization and functional analysis of proteins but usually requires the sub-cloning of the gene of interest into an appropriate expression vector. Here, we describe an alternative approach employing in vitro transposition of Tn Omega P(BAD) to place the highly regulable, arabinose inducible P(BAD) promoter upstream of the gene to be expressed. The method is rapid, simple and facilitates the optimization of expression by producing constructs with variable distances between the P(BAD) promoter and the gene. To illustrate the use of this approach, we describe the construction of a strain of Escherichia coli in which growth at low temperatures on solid media is dependent on threshold levels of arabinose. Other uses of the transposable promoter are also discussed.
Abstract:
A significant proportion of the processing delays within the visual system are luminance dependent. Placing an attenuating filter over one eye therefore causes a temporal delay between the eyes, and hence an illusion of motion in depth for objects moving in the fronto-parallel plane, known as the Pulfrich effect. We have used this effect to study adaptation to such an interocular delay in two normal subjects wearing 75% attenuating neutral density filters over one eye. In two separate experimental periods both subjects showed about 60% adaptation over 9 days. Reciprocal effects were seen on removal of the filters. To isolate the site of adaptation we also measured the subjects' flicker fusion frequencies (FFFs) and contrast sensitivity functions (CSFs). Both subjects showed significant adaptation in their FFFs. An attempt to model the Pulfrich and FFF adaptation curves with a change in a single parameter in Kelly's [(1971) Journal of the Optical Society of America, 71, 537-546] retinal model was only partially successful. Although we have demonstrated adaptation in normal subjects to induced time delays in the visual system, we postulate that this may at least partly represent retinal adaptation to the change in mean luminance.
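For reference, the standard geometric account of the Pulfrich effect (a generic sketch, not an analysis taken from this study) treats the interocular delay \Delta t as an effective binocular disparity:

    \delta \approx \omega \, \Delta t, \qquad \Delta z \approx \frac{z^{2} \, \delta}{I}

where \omega is the angular velocity of the target in the fronto-parallel plane, z is the viewing distance, I is the interocular separation, \delta is the equivalent disparity, and \Delta z is the resulting apparent displacement in depth.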
Abstract:
This paper advances the proposition that in many electronic products, the partitioning scheme adopted and the interconnection system used to interconnect the sub-assemblies or components are intimately related to the economic benefits, and hence the attractiveness, of reuse of these items. An architecture has been developed in which the residual values of the connectors, components and sub-assemblies are maximized, and opportunities for take-back and reuse of redundant items are greatly enhanced. The system described also offers significant manufacturing cost benefits in terms of ease of assembly, compactness and robustness.
Abstract:
State-of-the-art large vocabulary continuous speech recognition (LVCSR) systems often combine outputs from multiple sub-systems developed at different sites. Cross system adaptation can be used as an alternative to direct hypothesis level combination schemes such as ROVER. The standard approach involves only cross adapting acoustic models. To fully exploit the complementary features among sub-systems, language model (LM) cross adaptation techniques can be used. Previous research on multi-level n-gram LM cross adaptation is extended in this paper to further include the cross adaptation of neural network LMs. Using this improved LM cross adaptation framework, significant error rate gains of 4.0%-7.1% relative were obtained over acoustic model only cross adaptation when combining a range of Chinese LVCSR sub-systems used in the 2010 and 2011 DARPA GALE evaluations. Copyright © 2011 ISCA.
Abstract:
The visual system must learn to infer the presence of objects and features in the world from the images it encounters, and as such it must, either implicitly or explicitly, model the way these elements interact to create the image. Do the response properties of cells in the mammalian visual system reflect this constraint? To address this question, we constructed a probabilistic model in which the identity and attributes of simple visual elements were represented explicitly and learnt the parameters of this model from unparsed, natural video sequences. After learning, the behaviour and grouping of variables in the probabilistic model corresponded closely to functional and anatomical properties of simple and complex cells in the primary visual cortex (V1). In particular, feature identity variables were activated in a way that resembled the activity of complex cells, while feature attribute variables responded much like simple cells. Furthermore, the grouping of the attributes within the model closely parallelled the reported anatomical grouping of simple cells in cat V1. Thus, this generative model makes explicit an interpretation of complex and simple cells as elements in the segmentation of a visual scene into basic independent features, along with a parametrisation of their moment-by-moment appearances. We speculate that such a segmentation may form the initial stage of a hierarchical system that progressively separates the identity and appearance of more articulated visual elements, culminating in view-invariant object recognition.
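A schematic form of such a generative model (an illustrative sketch only; the paper's actual parametrisation may differ) factors each image frame into feature identities and their appearances:

    x_t = \sum_i z_{i,t} \sum_j y_{ij,t} \, A_{ij} + \epsilon_t

where z_{i,t} \in \{0, 1\} indicates whether visual element i is present (the complex-cell-like identity variable), y_{ij,t} are its appearance attributes (the simple-cell-like variables), A_{ij} are basis functions learnt from the video sequences, and \epsilon_t is observation noise.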
Abstract:
Action potential (AP) patterns of sensory cortex neurons encode a variety of stimulus features, but how can a neuron change the feature to which it responds? Here, we show in vivo that a spike-timing-dependent plasticity (STDP) protocol, consisting of pairing a postsynaptic AP with visually driven presynaptic inputs, modifies a neuron's AP response in a bidirectional way that depends on the relative AP timing during pairing. Whereas postsynaptic APs repeatedly following presynaptic activation can convert subthreshold into suprathreshold responses, APs repeatedly preceding presynaptic activation reduce AP responses to visual stimulation. These changes were paralleled by a restructuring of the neuron's responses to surround stimulus locations and of the membrane-potential time course. Computational simulations could reproduce the observed subthreshold voltage changes only when presynaptic temporal jitter was included. Together, this shows that STDP rules can modify the output patterns of sensory neurons and that the timing of single APs plays a crucial role in sensory coding and plasticity. DOI: http://dx.doi.org/10.7554/eLife.00012.001
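A minimal sketch of a pair-based STDP update consistent with the bidirectional timing dependence described above (the exponential window shape and all parameter values are generic textbook assumptions, not measurements from this study):

    import numpy as np

    def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012,
                           tau_plus=20.0, tau_minus=20.0):
        """Pair-based STDP window.
        dt_ms = t_post - t_pre in milliseconds.
        Post following pre (dt_ms > 0) -> potentiation;
        post preceding pre (dt_ms < 0) -> depression.
        """
        if dt_ms > 0:
            return a_plus * np.exp(-dt_ms / tau_plus)
        if dt_ms < 0:
            return -a_minus * np.exp(dt_ms / tau_minus)
        return 0.0

    # Repeated post-before-pre pairings (dt_ms = -10 ms) depress the synaptic
    # weight, mirroring the reduced AP responses reported above.
    w = 0.5
    for _ in range(100):
        w = max(0.0, w + stdp_weight_change(-10.0))
    print(w)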
Abstract:
The task of word-level confidence estimation (CE) for automatic speech recognition (ASR) systems stands to benefit from the combination of suitably defined input features from multiple information sources. However, the information sources of interest may not necessarily operate at the same level of granularity as the underlying ASR system. The research described here builds on previous work on confidence estimation for ASR systems using features extracted from word-level recognition lattices, by incorporating information at the sub-word level. Furthermore, the use of Conditional Random Fields (CRFs) with hidden states is investigated as a technique to combine information for word-level CE. Performance improvements are shown using the sub-word-level information in linear-chain CRFs with appropriately engineered feature functions, as well as when applying the hidden-state CRF model at the word level.
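For context, a linear-chain CRF of the kind referred to above scores a sequence of word-level confidence labels y given the observation sequence x as (standard formulation; the specific feature functions engineered from lattice- and sub-word-level information are those described in the paper):

    p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_k \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Big)

where the f_k are feature functions, \lambda_k their learnt weights, and Z(x) the normalising partition function; the hidden-state variant additionally marginalises over latent state sequences.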
Abstract:
We demonstrate a mid-infrared Raman-soliton continuum extending from 1.9 to 3 μm in a highly germanium-doped silica-clad fiber, pumped by a nanotube mode-locked thulium-doped fiber system, delivering 12 kW sub-picosecond pulses at 1.95 μm. This simple and robust source of light covers a portion of the atmospheric transmission window. © 2013 Optical Society of America.
Abstract:
This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems. © 2013 IEEE.
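For reference, a standard active appearance model of the kind extended here represents a face through linear shape and texture models (the generic AAM formulation; the pose and blink normalization extensions are specific to the paper):

    s = \bar{s} + P_s \, b_s, \qquad g = \bar{g} + P_g \, b_g

where \bar{s} and \bar{g} are the mean shape and texture, P_s and P_g are bases learnt from training data, and the parameter vectors b_s and b_g are the quantities the synthesizer would drive over time to animate the talking head.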
Abstract:
Strategic planning can be an arduous and complex task, and once a plan has been devised, it is often quite a challenge to effectively communicate the principal missions and key priorities to the array of different stakeholders. The communication challenge can be addressed through the application of a clearly and concisely designed visualisation of the strategic plan; to that end, this paper proposes the use of a roadmapping framework to structure a visual canvas. The canvas provides a template in the form of a single composite visual output that essentially allows a 'plan-on-a-page' to be generated. Such a visual representation provides a high-level depiction of the future context, end-state capabilities and the system-wide transitions needed to realise the strategic vision. To demonstrate this approach, an illustrative case study based on the Australian Government's Defence White Paper and the Royal Australian Navy's fleet plan is presented. The visual plan plots the in-service upgrades for addressing the capability shortfalls and gaps in the Navy's fleet as it transitions from its current configuration to its future end-state vision. It also provides a visualisation of project timings in terms of the decision gates (approval, service release) and specific phases (proposal, contract, delivery), together with how these projects are rated against the key performance indicators relating to the technology acquisition process and associated management activities. © 2013 Taylor & Francis.