10 results for In-vehicle speech technology
in National Center for Biotechnology Information - NCBI
Abstract:
This paper introduces the session "Technology in the Year 2001" and is the first of four papers dealing with the future of human-machine communication by voice. In looking to the future it is important to recognize both the difficulties of technological forecasting and the frailties of the technology as it exists today--frailties that are manifestations of our limited scientific understanding of human cognition. The technology to realize truly advanced applications does not yet exist and cannot be supported by our presently incomplete science of speech. To achieve this long-term goal, the authors advocate a fundamental research program using a cybernetic approach substantially different from more conventional synthetic approaches. In a cybernetic approach, feedback control systems will allow a machine to adapt to a linguistically rich environment using reinforcement learning.
Abstract:
Research in speech recognition and synthesis over the past several decades has brought speech technology to a point where it is being used in "real-world" applications. However, despite the progress, the perception remains that the current technology is not flexible enough to allow easy voice communication with machines. The focus of speech research is now on producing systems that are accurate and robust but that do not impose unnecessary constraints on the user. This chapter takes a critical look at the shortcomings of the current speech recognition and synthesis algorithms, discusses the technical challenges facing research, and examines the new directions that research in speech recognition and synthesis must take in order to form the basis of new solutions suitable for supporting a wide range of applications.
Abstract:
In the past decade, tremendous advances in the state of the art of automatic speech recognition by machine have taken place. A reduction in the word error rate by more than a factor of 5 and an increase in recognition speeds by several orders of magnitude (brought about by a combination of faster recognition search algorithms and more powerful computers) have combined to make high-accuracy, speaker-independent, continuous speech recognition for large vocabularies possible in real time, on off-the-shelf workstations, without the aid of special hardware. These advances promise to make speech recognition technology readily available to the general public. This paper focuses on the speech recognition advances made through better speech modeling techniques, chiefly through more accurate mathematical modeling of speech sounds.
Abstract:
A methodology, fluorescence-intensity distribution analysis, has been developed for confocal microscopy studies in which the fluorescence intensity of a sample with a heterogeneous brightness profile is monitored. An adjustable formula, modeling the spatial brightness distribution, and the technique of generating functions for calculation of theoretical photon count number distributions serve as the two cornerstones of the methodology. The method permits the simultaneous determination of concentrations and specific brightness values of a number of individual fluorescent species in solution. Accordingly, we present an extremely sensitive tool to monitor the interaction of fluorescently labeled molecules or other microparticles with their respective biological counterparts that should find a wide application in life sciences, medicine, and drug discovery. Its potential is demonstrated by studying the hybridization of 5′-(6-carboxytetramethylrhodamine)-labeled and nonlabeled complementary oligonucleotides and the subsequent cleavage of the DNA hybrids by restriction enzymes.
Abstract:
Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.
Abstract:
“TKO” is an expression vector that knocks out the activity of a transcription factor in vivo under genetic control. We describe a successful test of this concept that used a sea urchin transcription factor of known function, P3A2, as the target. The TKO cassette employs modular cis-regulatory elements to express an encoded single-chain antibody that prevents the P3A2 protein from binding DNA in vivo. In normal development, one of the functions of the P3A2 transcription factor is to repress directly the expression of the CyIIIa cytoskeletal actin gene outside the aboral ectoderm of the embryo. Ectopic expression in oral ectoderm occurs if P3A2 sites are deleted from CyIIIa expression constructs, and we show here that introduction of an αP3A2⋅TKO expression cassette causes exactly the same ectopic oral expression of a coinjected wild-type CyIIIa construct. Furthermore, the αP3A2⋅TKO cassette derepresses the endogenous CyIIIa gene in the oral ectoderm and in the endoderm. αP3A2⋅TKO thus abrogates the function of the endogenous SpP3A2 transcription factor with respect to spatial repression of the CyIIIa gene. Widespread expression of αP3A2⋅TKO in the endoderm has the additional lethal effect of disrupting morphogenesis of the archenteron, revealing a previously unsuspected function of SpP3A2 in endoderm development. In principle, TKO technology could be utilized for spatially and temporally controlled blockade of any transcription factor in any biological system amenable to gene transfer.
Abstract:
Pseudomonas aeruginosa, an opportunistic human pathogen, is a major causative agent of mortality and morbidity in immunocompromised patients and those with the genetic disease cystic fibrosis. To identify new virulence genes of P. aeruginosa, a selection system was developed based on the in vivo expression technology (IVET) that was first reported in a Salmonella system. An adenine-requiring auxotrophic mutant strain of P. aeruginosa was isolated and found to be avirulent in neutropenic mice. A DNA fragment that can complement the mutant strain, containing the purEK operon that is required for de novo biosynthesis of purine, was sequenced and used in the IVET vector construction. By applying the IVET selection system to a neutropenic mouse infection model, genetic loci that are specifically induced in vivo were identified. Twenty-two such loci were partially sequenced and analyzed. One of them was a well-studied virulence factor, pyochelin receptor (FptA), that is involved in iron acquisition. Fifteen showed significant homology to reported sequences in GenBank, while the remaining six did not. One locus, designated np20, encodes an open reading frame that shares amino acid sequence homology with transcriptional regulators, especially the ferric uptake regulator (Fur) proteins of other bacteria. An insertional np20 null mutant strain of P. aeruginosa did not show a growth defect on laboratory media; however, its virulence in neutropenic mice was significantly reduced compared with that of a wild-type parent strain, demonstrating the importance of the np20 locus in bacterial virulence. The successful isolation of genetic loci that affect bacterial virulence demonstrates the utility of the IVET system in identification of new virulence genes of P. aeruginosa.
Abstract:
The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding.
Abstract:
This paper describes a range of opportunities for military and government applications of human-machine communication by voice, based on visits and contacts with numerous user organizations in the United States. The applications include some that appear to be feasible by careful integration of current state-of-the-art technology and others that will require a varying mix of advances in speech technology and in integration of the technology into applications environments. Applications that are described include (1) speech recognition and synthesis for mobile command and control; (2) speech processing for a portable multifunction soldier's computer; (3) speech- and language-based technology for naval combat team tactical training; (4) speech technology for command and control on a carrier flight deck; (5) control of auxiliary systems, and alert and warning generation, in fighter aircraft and helicopters; and (6) voice check-in, report entry, and communication for law enforcement agents or special forces. A phased approach for transfer of the technology into applications is advocated, where integration of applications systems is pursued in parallel with advanced research to meet future needs.
Abstract:
The deployment of systems for human-to-machine communication by voice requires overcoming a variety of obstacles that affect the speech-processing technologies. Problems encountered in the field might include variation in speaking style, acoustic noise, ambiguity of language, or confusion on the part of the speaker. The diversity of these practical problems encountered in the "real world" leads to the perceived gap between laboratory and "real-world" performance. To answer the question "What applications can speech technology support today?" the concept of the "degree of difficulty" of an application is introduced. The degree of difficulty depends not only on the demands placed on the speech recognition and speech synthesis technologies but also on the expectations of the user of the system. Experience has shown that deployment of effective speech communication systems requires an iterative process. This paper discusses general deployment principles, which are illustrated by several examples of human-machine communication systems.