6 resultados para Input and outputs
em National Center for Biotechnology Information - NCBI
Resumo:
Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent real-time speech recognition, and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words, and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology.
Resumo:
Neuronal responses are conspicuously variable. We focus on one particular aspect of that variability: the precision of action potential timing. We show that for common models of noisy spike generation, elementary considerations imply that such variability is a function of the input, and can be made arbitrarily large or small by a suitable choice of inputs. Our considerations are expected to extend to virtually any mechanism of spike generation, and we illustrate them with data from the visual pathway. Thus, a simplification usually made in the application of information theory to neural processing is violated: noise is not independent of the message. However, we also show the existence of error-correcting topologies, which can achieve better timing reliability than their components.
Resumo:
The visual responses of neurons in the cerebral cortex were first adequately characterized in the 1960s by D. H. Hubel and T. N. Wiesel [(1962) J. Physiol. (London) 160, 106-154; (1968) J. Physiol. (London) 195, 215-243] using qualitative analyses based on simple geometric visual targets. Over the past 30 years, it has become common to consider the properties of these neurons by attempting to make formal descriptions of these transformations they execute on the visual image. Most such models have their roots in linear-systems approaches pioneered in the retina by C. Enroth-Cugell and J. R. Robson [(1966) J. Physiol. (London) 187, 517-552], but it is clear that purely linear models of cortical neurons are inadequate. We present two related models: one designed to account for the responses of simple cells in primary visual cortex (V1) and one designed to account for the responses of pattern direction selective cells in MT (or V5), an extrastriate visual area thought to be involved in the analysis of visual motion. These models share a common structure that operates in the same way on different kinds of input, and instantiate the widely held view that computational strategies are similar throughout the cerebral cortex. Implementations of these models for Macintosh microcomputers are available and can be used to explore the models' properties.
Resumo:
The scientific bases for human-machine communication by voice are in the fields of psychology, linguistics, acoustics, signal processing, computer science, and integrated circuit technology. The purpose of this paper is to highlight the basic scientific and technological issues in human-machine communication by voice and to point out areas of future research opportunity. The discussion is organized around the following major issues in implementing human-machine voice communication systems: (i) hardware/software implementation of the system, (ii) speech synthesis for voice output, (iii) speech recognition and understanding for voice input, and (iv) usability factors related to how humans interact with machines.
Resumo:
This paper provides an overview of the colloquium's discussion session on natural language understanding, which followed presentations by M. Bates [Bates, M. (1995) Proc. Natl. Acad. Sci. USA 92, 9977-9982] and R. C. Moore [Moore, R. C. (1995) Proc. Natl. Acad. Sci. USA 92, 9983-9988]. The paper reviews the dual role of language processing in providing understanding of the spoken input and an additional source of constraint in the recognition process. To date, language processing has successfully provided understanding but has provided only limited (and computationally expensive) constraint. As a result, most current systems use a loosely coupled, unidirectional interface, such as N-best or a word network, with natural language constraints as a postprocess, to filter or resort the recognizer output. However, the level of discourse context provides significant constraint on what people can talk about and how things can be referred to; when the system becomes an active participant, it can influence this order. But sources of discourse constraint have not been extensively explored, in part because these effects can only be seen by studying systems in the context of their use in interactive problem solving. This paper argues that we need to study interactive systems to understand what kinds of applications are appropriate for the current state of technology and how the technology can move from the laboratory toward real applications.
Resumo:
The role of intrinsic cortical connections in processing sensory input and in generating behavioral output is poorly understood. We have examined this issue in the context of the tuning of neuronal responses in cortex to the orientation of a visual stimulus. We analytically study a simple network model that incorporates both orientation-selective input from the lateral geniculate nucleus and orientation-specific cortical interactions. Depending on the model parameters, the network exhibits orientation selectivity that originates from within the cortex, by a symmetry-breaking mechanism. In this case, the width of the orientation tuning can be sharp even if the lateral geniculate nucleus inputs are only weakly anisotropic. By using our model, several experimental consequences of this cortical mechanism of orientation tuning are derived. The tuning width is relatively independent of the contrast and angular anisotropy of the visual stimulus. The transient population response to changing of the stimulus orientation exhibits a slow "virtual rotation." Neuronal cross-correlations exhibit long time tails, the sign of which depends on the preferred orientations of the cells and the stimulus orientation.