25 results for Visual Speech Recognition, Multiple Views, Frontal View, Profile View


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Colloquium on Human-Machine Communication by Voice highlighted the global technical community's focus on the problems and promise of voice-processing technology, particularly speech recognition and speech synthesis. Clearly, many areas in both the research and development of these technologies can be advanced significantly. However, many applications of these technologies can be commercialized now. Early successful commercialization of new technology is vital to ensure continuing interest in its development. This paper addresses efforts to commercialize speech technologies in two markets: telecommunications and aids for the handicapped.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

As the telecommunications industry evolves over the next decade to provide the products and services that people will desire, several key technologies will become commonplace. Two of these, automatic speech recognition and text-to-speech synthesis, will provide users with more freedom on when, where, and how they access information. While these technologies are currently in their infancy, their capabilities are rapidly increasing and their deployment in today's telephone network is expanding. The economic impact of just one application, the automation of operator services, is well over $100 million per year. Yet there still are many technical challenges that must be resolved before these technologies can be deployed ubiquitously in products and services throughout the worldwide telephone network. These challenges include: (i) High level of accuracy. The technology must be perceived by the user as highly accurate, robust, and reliable. (ii) Easy to use. Speech is only one of several possible input/output modalities for conveying information between a human and a machine, much like a computer terminal or Touch-Tone pad on a telephone. It is not the final product. Therefore, speech technologies must be hidden from the user. That is, the burden of using the technology must be on the technology itself. (iii) Quick prototyping and development of new products and services. The technology must support the creation of new products and services based on speech in an efficient and timely fashion. In this paper I present a vision of the voice-processing industry with a focus on the areas with the broadest base of user penetration: speech recognition, text-to-speech synthesis, natural language processing, and speaker recognition technologies. The current and future applications of these technologies in the telecommunications industry will be examined in terms of their strengths, limitations, and the degree to which user needs have been or have yet to be met. Although noteworthy gains have been made in areas with potentially small user bases and in the more mature speech-coding technologies, these subjects are outside the scope of this paper.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a range of opportunities for military and government applications of human-machine communication by voice, based on visits and contacts with numerous user organizations in the United States. The applications include some that appear to be feasible by careful integration of current state-of-the-art technology and others that will require a varying mix of advances in speech technology and in integration of the technology into applications environments. Applications that are described include (1) speech recognition and synthesis for mobile command and control; (2) speech processing for a portable multifunction soldier's computer; (3) speech- and language-based technology for naval combat team tactical training; (4) speech technology for command and control on a carrier flight deck; (5) control of auxiliary systems, and alert and warning generation, in fighter aircraft and helicopters; and (6) voice check-in, report entry, and communication for law enforcement agents or special forces. A phased approach for transfer of the technology into applications is advocated, where integration of applications systems is pursued in parallel with advanced research to meet future needs.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The deployment of systems for human-to-machine communication by voice requires overcoming a variety of obstacles that affect the speech-processing technologies. Problems encountered in the field might include variation in speaking style, acoustic noise, ambiguity of language, or confusion on the part of the speaker. The diversity of these practical problems encountered in the "real world" leads to the perceived gap between laboratory and "real-world" performance. To answer the question "What applications can speech technology support today?" the concept of the "degree of difficulty" of an application is introduced. The degree of difficulty depends not only on the demands placed on the speech recognition and speech synthesis technologies but also on the expectations of the user of the system. Experience has shown that deployment of effective speech communication systems requires an iterative process. This paper discusses general deployment principles, which are illustrated by several examples of human-machine communication systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes the state of the art in applications of voice-processing technologies. In the first part, technologies concerning the implementation of speech recognition and synthesis algorithms are described. Hardware technologies such as microprocessors and DSPs (digital signal processors) are discussed. The software development environment, a key technology for developing application software ranging from DSP software to support software, is also described. In the second part, the state of the art of algorithms from the standpoint of applications is discussed. Several issues concerning evaluation of speech recognition/synthesis algorithms are covered, as well as issues concerning the robustness of algorithms in adverse conditions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This talk, which was the keynote address of the NAS Colloquium on Human-Machine Communication by Voice, discusses the past, present, and future of human-machine communications, especially speech recognition and speech synthesis. Progress in these technologies is reviewed in the context of the general progress in computer and communications technologies.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., König, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334–337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈π/2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing.
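As a rough reading aid for the phase-gradient figure quoted in this abstract, the sketch below converts a spatial phase gradient into a wavelength and a phase speed using the standard plane-wave relations λ = 2π/k and v = 2πf/k. The function names are hypothetical, and pairing the reported average gradient with exactly 10 Hz and 20 Hz is an assumption made only for illustration; the abstract itself does not report wavelengths or speeds.

```python
import math


def wavelength_mm(phase_gradient_rad_per_mm: float) -> float:
    """Spatial wavelength (mm) of a plane wave with phase gradient k (rad/mm)."""
    return 2 * math.pi / phase_gradient_rad_per_mm


def phase_speed_mm_per_s(frequency_hz: float, phase_gradient_rad_per_mm: float) -> float:
    """Phase speed v = omega / k = 2*pi*f / k, in mm/s."""
    return 2 * math.pi * frequency_hz / phase_gradient_rad_per_mm


# Average phase gradient quoted in the abstract (approximately pi/2 rad/mm).
k = math.pi / 2

# Assumed frequencies: the two spectral peaks mentioned in the abstract.
for f in (10.0, 20.0):
    print(
        f"f = {f:.0f} Hz: wavelength = {wavelength_mm(k):.1f} mm, "
        f"phase speed = {phase_speed_mm_per_s(f, k):.1f} mm/s"
    )
```

Under these assumptions a gradient of π/2 radians/mm corresponds to one full cycle every 4 mm of cortex, so the implied phase speed scales linearly with whichever frequency the wave actually has.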

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Stimulus recognition in monkeys is severely impaired by destruction or dysfunction of the perirhinal cortex and also by systemic administration of the cholinergic-muscarinic receptor blocker, scopolamine. These two effects are shown here to be linked: Stimulus recognition was found to be significantly impaired after bilateral microinjection of scopolamine directly into the perirhinal cortex, but not after equivalent injections into the laterally adjacent visual area TE or into the dentate gyrus of the overlying hippocampal formation. The results suggest that the formation of stimulus memories depends critically on cholinergic-muscarinic activation of the perirhinal area, providing a new clue to how stimulus representations are stored.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

At the forefront of debates on language are new data demonstrating infants' early acquisition of information about their native language. The data show that infants perceptually “map” critical aspects of ambient language in the first year of life before they can speak. Statistical properties of speech are picked up through exposure to ambient language. Moreover, linguistic experience alters infants' perception of speech, warping perception in the service of language. Infants' strategies are unexpected and unpredicted by historical views. A new theoretical position has emerged, and six postulates of this position are described.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Although proteases related to the interleukin 1 beta-converting enzyme (ICE) are known to be essential for apoptotic execution, the number of enzymes involved, their substrate specificities, and their specific roles in the characteristic biochemical and morphological changes of apoptosis are currently unknown. These questions were addressed using cloned recombinant ICE-related proteases (IRPs) and a cell-free model system for apoptosis (S/M extracts). First, we compared the substrate specificities of two recombinant human IRPs, CPP32 and Mch2 alpha. Both enzymes cleaved poly(ADP-ribose) polymerase, albeit with different efficiencies. Mch2 alpha also cleaved recombinant and nuclear lamin A at a conserved VEID↓NG sequence located in the middle of the coiled-coil rod domain, producing a fragment that was indistinguishable from the lamin A fragment observed in S/M extracts and in apoptotic cells. In contrast, CPP32 did not cleave lamin A. The cleavage of lamin A by Mch2 alpha and by S/M extracts was inhibited by millimolar concentrations of Zn2+, which had a minimal effect on cleavage of poly(ADP-ribose) polymerase by CPP32 and by S/M extracts. We also found that N-(acetyltyrosinylvalinyl-N epsilon-biotinyllysyl)aspartic acid [(2,6-dimethylbenzoyl)oxy]methyl ketone, which derivatizes the larger subunit of active ICE, can affinity label up to five active IRPs in S/M extracts. Together, these observations indicate that the processing of nuclear proteins in apoptosis involves multiple IRPs having distinct preferences for their apoptosis-associated substrates.