Digital Library

28 results for Speech processing systems.

in National Center for Biotechnology Information - NCBI

Processing of speech signals for physical and sensory disabilities.

Relevance:

100.00%

Publisher:

Abstract:

Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

Commercial applications of speech interface technology: an industry at the threshold.

Relevance:

100.00%

Publisher:

Abstract:

Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loud-speakers to give easy accessibility to increasingly intelligent machines.

Deployment of human-machine dialogue systems.

Relevance:

100.00%

Publisher:

Abstract:

The deployment of systems for human-to-machine communication by voice requires overcoming a variety of obstacles that affect the speech-processing technologies. Problems encountered in the field might include variation in speaking style, acoustic noise, ambiguity of language, or confusion on the part of the speaker. The diversity of these practical problems encountered in the "real world" leads to the perceived gap between laboratory and "real-world" performance. To answer the question "What applications can speech technology support today?" the concept of the "degree of difficulty" of an application is introduced. The degree of difficulty depends not only on the demands placed on the speech recognition and speech synthesis technologies but also on the expectations of the user of the system. Experience has shown that deployment of effective speech communication systems requires an iterative process. This paper discusses general deployment principles, which are illustrated by several examples of human-machine communication systems.

Syntax processing by auditory cortical neurons in the FM–FM area of the mustached bat Pteronotus parnellii

Relevance:

90.00%

Publisher:

Abstract:

Syntax denotes a rule system that allows one to predict the sequencing of communication signals. Despite its significance for both human speech processing and animal acoustic communication, the representation of syntactic structure in the mammalian brain has not been studied electrophysiologically at the single-unit level. In the search for a neuronal correlate for syntax, we used playback of natural and temporally destructured complex species-specific communication calls—so-called composites—while recording extracellularly from neurons in a physiologically well defined area (the FM–FM area) of the mustached bat’s auditory cortex. Even though this area is known to be involved in the processing of target distance information for echolocation, we found that units in the FM–FM area were highly responsive to composites. The finding that neuronal responses were strongly affected by manipulation in the time domain of the natural composite structure lends support to the hypothesis that syntax processing in mammals occurs at least at the level of the nonprimary auditory cortex.

Research in speech communication.

Relevance:

90.00%

Publisher:

Abstract:

Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker.

Models of natural language understanding.

Relevance:

90.00%

Publisher:

Abstract:

This paper surveys some of the fundamental problems in natural language (NL) understanding (syntax, semantics, pragmatics, and discourse) and the current approaches to solving them. Some recent developments in NL processing include increased emphasis on corpus-based rather than example- or intuition-based work, attempts to measure the coverage and effectiveness of NL systems, dealing with discourse and dialogue phenomena, and attempts to use both analytic and stochastic knowledge. Critical areas for the future include grammars that are appropriate to processing large amounts of real language; automatic (or at least semi-automatic) methods for deriving models of syntax, semantics, and pragmatics; self-adapting systems; and integration with speech processing. Of particular importance are techniques that can be tuned to such requirements as full versus partial understanding and spoken language versus text. Portability (the ease with which one can configure an NL system for a particular application) is one of the largest barriers to application of this technology.

Military and government applications of human-machine communication by voice.

Relevance:

90.00%

Publisher:

Abstract:

This paper describes a range of opportunities for military and government applications of human-machine communication by voice, based on visits and contacts with numerous user organizations in the United States. The applications include some that appear to be feasible by careful integration of current state-of-the-art technology and others that will require a varying mix of advances in speech technology and in integration of the technology into applications environments. Applications that are described include (1) speech recognition and synthesis for mobile command and control; (2) speech processing for a portable multifunction soldier's computer; (3) speech- and language-based technology for naval combat team tactical training; (4) speech technology for command and control on a carrier flight deck; (5) control of auxiliary systems, and alert and warning generation, in fighter aircraft and helicopters; and (6) voice check-in, report entry, and communication for law enforcement agents or special forces. A phased approach for transfer of the technology into applications is advocated, where integration of applications systems is pursued in parallel with advanced research to meet future needs.

The universal ancestor

Relevance:

80.00%

Publisher:

Abstract:

A genetic annealing model for the universal ancestor of all extant life is presented; the name of the model derives from its resemblance to physical annealing. The scenario pictured starts when “genetic temperatures” were very high, cellular entities (progenotes) were very simple, and information processing systems were inaccurate. Initially, both mutation rate and lateral gene transfer levels were elevated. The latter was pandemic and pervasive to the extent that it, not vertical inheritance, defined the evolutionary dynamic. As increasingly complex and precise biological structures and processes evolved, both the mutation rate and the scope and level of lateral gene transfer, i.e., evolutionary temperature, dropped, and the evolutionary dynamic gradually became that characteristic of modern cells. The various subsystems of the cell “crystallized,” i.e., became refractory to lateral gene transfer, at different stages of “cooling,” with the translation apparatus probably crystallizing first. Organismal lineages, and so organisms as we know them, did not exist at these early stages. The universal phylogenetic tree, therefore, is not an organismal tree at its base but gradually becomes one as its peripheral branchings emerge. The universal ancestor is not a discrete entity. It is, rather, a diverse community of cells that survives and evolves as a biological unit. This communal ancestor has a physical history but not a genealogical one. Over time, this ancestor refined into a smaller number of increasingly complex cell types with the ancestors of the three primary groupings of organisms arising as a result.

Integration of speech with natural language understanding.

Relevance:

40.00%

Publisher:

Abstract:

The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Language Program sponsored by the Advanced Research Projects Agency (ARPA). I begin by reviewing some of the ways that spontaneous spoken language differs from standard written language and discuss methods of coping with the difficulties of spontaneous speech. I then look at how systems cope with errors in speech recognition and at attempts to use natural language information to reduce recognition errors. Finally, I discuss how prosodic information in the speech signal might be used to improve understanding.

New trends in natural language processing: statistical natural language processing.

Relevance:

40.00%

Publisher:

Abstract:

The field of natural language processing (NLP) has seen a dramatic shift in both research direction and methodology in the past several years. In the past, most work in computational linguistics tended to focus on purely symbolic methods. Recently, more and more work is shifting toward hybrid methods that combine new empirical corpus-based methods, including the use of probabilistic and information-theoretic techniques, with traditional symbolic methods. This work is made possible by the recent availability of linguistic databases that add rich linguistic annotation to corpora of natural language text. Already, these methods have led to a dramatic improvement in the performance of a variety of NLP systems with similar improvement likely in the coming years. This paper focuses on these trends, surveying in particular three areas of recent progress: part-of-speech tagging, stochastic parsing, and lexical semantics.

Proteolytic Processing and Ca2+-binding Activity of Dense-Core Vesicle Polypeptides in Tetrahymena

Relevance:

30.00%

Publisher:

Abstract:

Formation and discharge of dense-core secretory vesicles depend on controlled rearrangement of the core proteins during their assembly and dispersal. The ciliate Tetrahymena thermophila offers a simple system in which the mechanisms may be studied. Here we show that most of the core consists of a set of polypeptides derived proteolytically from five precursors. These share little overall amino acid identity but are nonetheless predicted to have structural similarity. In addition, sites of proteolytic processing are notably conserved and suggest that specific endoproteases as well as carboxypeptidase are involved in core maturation. In vitro binding studies and sequence analysis suggest that the polypeptides bind calcium in vivo. Core assembly and postexocytic dispersal are compartment-specific events. Two likely regulatory factors are proteolytic processing and exposure to calcium. We asked whether these might directly influence the conformations of core proteins. Results using an in vitro chymotrypsin accessibility assay suggest that these factors can induce sequential structural rearrangements. Such progressive changes in polypeptide folding may underlie the mechanisms of assembly and of rapid postexocytic release. The parallels between dense-core vesicles in different systems suggest that similar mechanisms are widespread in this class of organelles.

Length suppression in histone messenger RNA 3′-end maturation: Processing defects of insertion mutant premessenger RNAs can be compensated by insertions into the U7 small nuclear RNA

Relevance:

30.00%

Publisher:

Abstract:

Efficient 3′-end processing of cell cycle-regulated mammalian histone premessenger RNAs (pre-mRNAs) requires an upstream stem–loop and a histone downstream element (HDE) that base pairs with the U7 small ribonucleoprotein. Insertions between these elements have two effects: the site of cleavage moves in concert with the HDE and processing efficiency declines. We used Xenopus oocytes to ask whether compensatory length insertions in the human U7 RNA could restore the fidelity and efficiency of processing of mouse histone insertion pre-mRNAs. An insertion of 5 nt into U7 RNA that extends its complementarity to the HDE compensated for both defects in processing of a 5-nt insertion substrate; a noncomplementary insertion into U7 did not. Yet, the noncomplementary insertion mutant U7 was shown to be active on insertion substrates further mutated to allow base pairing. Our results suggest that the histone pre-mRNA becomes rigidified upstream of its HDE, allowing the bound U7 small ribonucleoprotein to measure from the HDE to the cleavage site. Such a mechanism may be common to other RNA measuring systems. To our knowledge, this is the first demonstration of length suppression in an RNA processing system.

Cortical auditory signal processing in poor readers

Relevance:

30.00%

Publisher:

Abstract:

Magnetoencephalographic responses recorded from auditory cortex evoked by brief and rapidly successive stimuli differed between adults with poor vs. good reading abilities in four important ways. First, the response amplitude evoked by short-duration acoustic stimuli was stronger in the post-stimulus time range of 150–200 ms in poor readers than in normal readers. Second, response amplitude to rapidly successive and brief stimuli that were identical or that differed significantly in frequency was substantially weaker in poor readers compared with controls, for interstimulus intervals of 100 or 200 ms, but not for an interstimulus interval of 500 ms. Third, this neurological deficit closely paralleled subjects’ ability to distinguish between and to reconstruct the order of presentation of those stimulus sequences. Fourth, the average distributed response coherence evoked by rapidly successive stimuli was significantly weaker in the β- and γ-band frequency ranges (20–60 Hz) in poor readers, compared with controls. These results provide direct electrophysiological evidence supporting the hypothesis that reading disabilities are correlated with the abnormal neural representation of brief and rapidly successive sensory inputs, manifested in this study at the entry level of the cortical auditory/aural speech representational system(s).

Cortical processing of change detection: Dissociation between natural vowels and two-frequency complex tones

Relevance:

30.00%

Publisher:

Abstract:

We compared magnetoencephalographic responses for natural vowels and for sounds consisting of two pure tones that represent the two lowest formant frequencies of these vowels. Our aim was to determine whether spectral changes in successive stimuli are detected differently for speech and nonspeech sounds. The stimuli were presented in four blocks applying an oddball paradigm (20% deviants, 80% standards): (i) /α/ tokens as deviants vs. /i/ tokens as standards; (ii) /e/ vs. /i/; (iii) complex tones representing /α/ formants vs. /i/ formants; and (iv) complex tones representing /e/ formants vs. /i/ formants. Mismatch fields (MMFs) were calculated by subtracting the source waveform produced by standards from that produced by deviants. As expected, MMF amplitudes for the complex tones reflected acoustic deviation: the amplitudes were stronger for the complex tones representing /α/ than /e/ formants, i.e., when the spectral difference between standards and deviants was larger. In contrast, MMF amplitudes for the vowels were similar despite their different spectral composition, whereas the MMF onset time was longer for /e/ than for /α/. Thus the degree of spectral difference between standards and deviants was reflected by the MMF amplitude for the nonspeech sounds and by the MMF latency for the vowels.
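
The subtraction that defines the mismatch field in the abstract above can be illustrated with a minimal Python sketch. It assumes each trial is an equal-length list of source-waveform samples; the function names are hypothetical, and the peak latency is used only as a simple stand-in for the onset time that the study reports.

```python
from statistics import fmean


def average_waveform(trials):
    """Average equal-length trials sample by sample."""
    return [fmean(samples) for samples in zip(*trials)]


def mismatch_field(deviant_trials, standard_trials):
    """Return the averaged deviant waveform minus the averaged standard waveform."""
    deviant = average_waveform(deviant_trials)
    standard = average_waveform(standard_trials)
    return [d - s for d, s in zip(deviant, standard)]


def peak(waveform, sample_period_ms):
    """Return (absolute peak amplitude, peak latency in ms) of a waveform.

    Assumption: peak latency approximates the MMF timing; the study itself
    reports onset time, which would need a threshold criterion instead.
    """
    index = max(range(len(waveform)), key=lambda i: abs(waveform[i]))
    return abs(waveform[index]), index * sample_period_ms
```

In an oddball block with 20% deviants, `deviant_trials` would hold roughly one quarter as many trials as `standard_trials`; averaging each set first keeps the unequal trial counts from biasing the difference.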

Pitch perception: A dynamical-systems perspective

Relevance:

30.00%

Publisher:

Abstract:

Two and a half millennia ago Pythagoras initiated the scientific study of the pitch of sounds; yet our understanding of the mechanisms of pitch perception remains incomplete. Physical models of pitch perception try to explain from elementary principles why certain physical characteristics of the stimulus lead to particular pitch sensations. There are two broad categories of pitch-perception models: place or spectral models consider that pitch is mainly related to the Fourier spectrum of the stimulus, whereas for periodicity or temporal models its characteristics in the time domain are more important. Current models from either class are usually computationally intensive, implementing a series of steps more or less supported by auditory physiology. However, the brain has to analyze and react in real time to an enormous amount of information from the ear and other senses. How is all this information efficiently represented and processed in the nervous system? A proposal of nonlinear and complex systems research is that dynamical attractors may form the basis of neural information processing. Because the auditory system is a complex and highly nonlinear dynamical system, it is natural to suppose that dynamical attractors may carry perceptual and functional meaning. Here we show that this idea, scarcely developed in current pitch models, can be successfully applied to pitch perception.
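
As a rough illustration of the periodicity (temporal) class of models mentioned in the abstract above, the Python sketch below estimates pitch from the lag that maximizes the signal's autocorrelation. It is a textbook-style simplification under stated assumptions, not the dynamical-attractor approach the paper proposes; the function names, the sample rate, and the search range are all hypothetical choices.

```python
import math


def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of the signal at a given lag (in samples)."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))


def estimate_pitch_hz(signal, sample_rate, min_hz, max_hz):
    """Estimate pitch as sample_rate divided by the best-matching lag.

    Only lags corresponding to pitches between min_hz and max_hz are searched.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(signal, lag))
    return sample_rate / best_lag


# Assumed test tone: harmonics at 400, 600 and 800 Hz, sampled at 8 kHz for 0.1 s.
SAMPLE_RATE = 8000
tone = [sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in (400, 600, 800))
        for n in range(800)]
```

Calling `estimate_pitch_hz(tone, SAMPLE_RATE, 133, 500)` on this tone gives 200 Hz, the common repetition rate of the three harmonics, even though the spectrum contains no 200 Hz component. A purely spectral reading of the same signal would have to infer that value from the harmonic spacing, which is one way to see the difference between the two model classes.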
