15 results for Text-to-speech systems
in National Center for Biotechnology Information - NCBI
Abstract:
Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.
Abstract:
The conversion of text to speech is seen as an analysis of the input text to obtain a common underlying linguistic description, followed by a synthesis of the output speech waveform from this fundamental specification. Hence, the comprehensive linguistic structure serving as the substrate for an utterance must be discovered by analysis from the text. The pronunciation of individual words in unrestricted text is determined by morphological analysis or letter-to-sound conversion, followed by specification of the word-level stress contour. In addition, many text character strings, such as titles, numbers, and acronyms, are abbreviations for normal words, which must be derived. To further refine these pronunciations and to discover the prosodic structure of the utterance, word part of speech must be computed, followed by a phrase-level parsing. From this structure the prosodic structure of the utterance can be determined, which is needed in order to specify the durational framework and fundamental frequency contour of the utterance. In discourse contexts, several factors such as the specification of new and old information, contrast, and pronominal reference can be used to further modify the prosodic specification. When the prosodic correlates have been computed and the segmental sequence is assembled, a complete input suitable for speech synthesis has been determined. Lastly, multilingual systems utilizing rule frameworks are mentioned, and future directions are characterized.
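A minimal sketch of the text-normalization step this abstract describes, in which abbreviations and numbers are expanded into ordinary words before pronunciation is determined. The abbreviation table, the function names, and the 0-99 number range are illustrative assumptions and are not taken from the cited work.

```python
import re

# Hypothetical abbreviation table. A real system needs a much larger lexicon
# and context: "St." can mean "street" or "saint", which is one reason the
# abstract calls for part-of-speech tagging and phrase-level parsing.
ABBREVIATIONS = {"Dr.": "doctor", "Mr.": "mister", "St.": "street"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]


def normalize(text):
    """Turn raw text into a list of lowercase words ready for pronunciation lookup."""
    words = []
    for token in text.split():
        if token in ABBREVIATIONS:
            # Known abbreviation: replace it with the word it stands for.
            words.append(ABBREVIATIONS[token])
        elif re.fullmatch(r"[0-9]{1,2}", token):
            # One- or two-digit number: spell it out.
            words.extend(number_to_words(int(token)).split())
        else:
            # Ordinary word (or a longer number, which this sketch leaves
            # as digits): strip punctuation and lowercase it.
            cleaned = re.sub(r"[^\w']", "", token).lower()
            if cleaned:
                words.append(cleaned)
    return words
```

For example, `normalize("Dr. Smith lives at 42 Elm St.")` yields the word sequence doctor, smith, lives, at, forty, two, elm, street. Letter-to-sound conversion, stress assignment, and prosody would operate on this output in later stages.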
Abstract:
The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community.
Abstract:
As the telecommunications industry evolves over the next decade to provide the products and services that people will desire, several key technologies will become commonplace. Two of these, automatic speech recognition and text-to-speech synthesis, will provide users with more freedom on when, where, and how they access information. While these technologies are currently in their infancy, their capabilities are rapidly increasing and their deployment in today's telephone network is expanding. The economic impact of just one application, the automation of operator services, is well over $100 million per year. Yet there still are many technical challenges that must be resolved before these technologies can be deployed ubiquitously in products and services throughout the worldwide telephone network. These challenges include: (i) High level of accuracy. The technology must be perceived by the user as highly accurate, robust, and reliable. (ii) Easy to use. Speech is only one of several possible input/output modalities for conveying information between a human and a machine, much like a computer terminal or Touch-Tone pad on a telephone. It is not the final product. Therefore, speech technologies must be hidden from the user. That is, the burden of using the technology must be on the technology itself. (iii) Quick prototyping and development of new products and services. The technology must support the creation of new products and services based on speech in an efficient and timely fashion. In this paper I present a vision of the voice-processing industry with a focus on the areas with the broadest base of user penetration: speech recognition, text-to-speech synthesis, natural language processing, and speaker recognition technologies. The current and future applications of these technologies in the telecommunications industry will be examined in terms of their strengths, limitations, and the degree to which user needs have been or have yet to be met. 
Although noteworthy gains have been made in areas with potentially small user bases and in the more mature speech-coding technologies, these subjects are outside the scope of this paper.
Abstract:
The activation of plant defensive genes in leaves of tomato plants in response to herbivore damage or mechanical wounding is mediated by a mobile 18-amino acid polypeptide signal called systemin. Systemin is derived from a larger, 200-amino acid precursor called prosystemin, similar to polypeptide hormones and soluble growth factors in animals. Systemin activates a lipid-based signaling cascade, also analogous to signaling systems found in animals. In plants, linolenic acid is released from membranes and is converted to the oxylipins phytodienoic acid and jasmonic acid through the octadecanoid pathway. Plant oxylipins are structural analogs of animal prostaglandins which are derived from arachidonic acid in response to various signals, including polypeptide factors. Constitutive overexpression of the prosystemin gene in transgenic tomato plants resulted in the overproduction of prosystemin and the abnormal release of systemin, conferring a constitutive overproduction of several systemic wound-response proteins (SWRPs). The data indicate that systemin is a master signal for defense against attacking herbivores. The same defensive proteins induced by wounding are synthesized in response to oligosaccharide elicitors that are generated in leaf cells in response to pathogen attacks. Inhibitors of the octadecanoid pathway, and a mutation that interrupts this pathway, block the induction of SWRPs by wounding, systemin, and oligosaccharide elicitors, indicating that the octadecanoid pathway is essential for the activation of defense genes by all of these signals. The tomato mutant line that is functionally deficient in the octadecanoid pathway is highly susceptible to attacks by Manduca sexta larvae. The similarities between the defense signaling pathway in tomato leaves and the defense signaling pathways of macrophages and mast cells in animals suggest that both the plant and animal pathways may have evolved from a common ancestral origin.
Abstract:
A hierarchy of enzyme-catalyzed positive feedback loops is examined by mathematical and numerical analysis. Four systems are described, from the simplest, in which an enzyme catalyzes its own formation from an inactive precursor, to the most complex, in which two sequential feedback loops act in a cascade. In the latter we also examine the function of a long-range feedback, in which the final enzyme produced in the second loop activates the initial step in the first loop. When the enzymes generated are subject to inhibition or inactivation, all four systems exhibit threshold properties akin to excitable systems like neuron firing. For those that are amenable to mathematical analysis, expressions are derived that relate the excitation threshold to the kinetics of enzyme generation and inhibition and the initial conditions. For the most complex system, it was expedient to employ numerical simulation to demonstrate threshold behavior, and in this case long-range feedback was seen to have two distinct effects. At sufficiently high catalytic rates, this feedback is capable of exciting an otherwise subthreshold system. At lower catalytic rates, where the long-range feedback does not significantly affect the threshold, it nonetheless has a major effect in potentiating the response above the threshold. In particular, oscillatory behavior observed in simulations of sequential feedback loops is abolished when a long-range feedback is present.
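The threshold behavior described for the simplest system (an enzyme that catalyzes its own formation from an inactive precursor while being inactivated) can be illustrated with a small numerical sketch. The rate equations below are an assumed mass-action form with first-order inactivation, dE/dt = k_act·E·Z − k_inact·E and dZ/dt = −k_act·E·Z; they are not the equations of the cited paper, and all parameter names and values are hypothetical.

```python
def simulate_peak(e0, z0, k_act, k_inact, dt=0.001, t_end=50.0):
    """Return the peak active-enzyme level for the assumed model.

    Assumed model (not taken from the cited paper):
        dE/dt = k_act * E * Z - k_inact * E   (autocatalysis minus inactivation)
        dZ/dt = -k_act * E * Z                (precursor consumption)
    Integrated with the forward Euler method.
    """
    e, z = e0, z0
    peak = e
    for _ in range(int(t_end / dt)):
        activation = k_act * e * z
        # Update both species from the same old state.
        e, z = e + dt * (activation - k_inact * e), z - dt * activation
        peak = max(peak, e)
    return peak


# In this model E can only grow while k_act * Z > k_inact, so the initial
# precursor level sets the threshold at Z0 = k_inact / k_act.
below = simulate_peak(e0=0.01, z0=0.5, k_act=1.0, k_inact=1.0)  # subthreshold
above = simulate_peak(e0=0.01, z0=2.0, k_act=1.0, k_inact=1.0)  # suprathreshold
```

With these illustrative values, the subthreshold run never rises above its starting enzyme level, while the suprathreshold run produces a transient burst of active enzyme many times larger than the initial stimulus before the precursor is depleted.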
Abstract:
An EPR "spectroscopic ruler" was developed using a series of alpha-helical polypeptides, each modified with two nitroxide spin labels. The EPR line broadening due to electron-electron dipolar interactions in the frozen state was determined using the Fourier deconvolution method. These dipolar spectra were then used to estimate the distances between the two nitroxides separated by 8-25 Å. Results agreed well with a simple alpha-helical model. The standard deviation from the model system was 0.9 Å in the range of 8-25 Å. This technique is applicable to complex systems such as membrane receptors and channels, which are difficult to access with high-resolution NMR or x-ray crystallography, and is expected to be particularly useful for systems for which optical methods are hampered by the presence of light-interfering membranes or chromophores.
Abstract:
In behavior reminiscent of the responsiveness of human infants to speech, young songbirds innately recognize and prefer to learn the songs of their own species. The acoustic and physiological bases for innate recognition were investigated in fledgling white-crowned sparrows lacking song experience. A behavioral test revealed that the complete conspecific song was not essential for innate recognition: songs composed of single white-crowned sparrow phrases and songs played in reverse elicited vocal responses as strongly as did normal song. In all cases, these responses surpassed those to other species’ songs. Although auditory neurons in the song nucleus HVc and the underlying neostriatum of fledglings did not prefer conspecific song over foreign song, some neurons responded strongly to particular phrase types characteristic of white-crowned sparrows and, thus, could contribute to innate song recognition.
Abstract:
By using a novel, extremely sensitive and specific gas chromatography-mass spectrometry technique we demonstrate in Pinus sylvestris (L.) trees the existence of a steep radial concentration gradient of the endogenous auxin, indole-3-acetic acid, over the lateral meristem responsible for the bulk of plant secondary growth, the vascular cambium. This is the first evidence that plant morphogens, such as indole-3-acetic acid, occur in concentration gradients over developing tissues. This finding gives evidence for a regulatory system in plants based on positional signaling, similar to animal systems.
Abstract:
Eubacterial transducers are transmembrane, methyl-accepting proteins central to chemotaxis systems and share common structural features. We identified a large family of transducer proteins in the Archaeon Halobacterium salinarium using a site-specific multiple antigenic peptide antibody raised against 23 amino acids, representing the highest homology region of eubacterial transducers. This immunological observation was confirmed by isolating 13 methyl-accepting taxis genes using a 27-mer oligonucleotide probe, corresponding to conserved regions between the eubacterial and first halobacterial phototaxis transducer gene htrI. On the basis of the comparison of the predicted structural domains of these transducers, we propose that at least three distinct subfamilies of transducers exist in the Archaeon H. salinarium: (i) a eubacterial chemotaxis transducer type with two hydrophobic membrane-spanning segments connecting sizable domains in the periplasm and cytoplasm; (ii) a cytoplasmic domain and two or more hydrophobic transmembrane segments without periplasmic domains; and (iii) a cytoplasmic domain without hydrophobic transmembrane segments. We fractionated the halobacterial cell lysate into soluble and membrane fractions and localized different halobacterial methyl-accepting taxis proteins in both fractions.
Abstract:
The pars triangularis is a portion of Broca's area. The convolutions that form the inferior and caudal extent of the pars triangularis include the anterior horizontal and anterior ascending rami of the sylvian fissure, respectively. To learn if there are anatomic asymmetries of the pars triangularis, these convolutions were measured on volumetric magnetic resonance imaging scans of 11 patients who had undergone selective hemispheric anesthesia (Wada testing) to determine hemispheric speech and language lateralization. Of the 10 patients with language lateralized to the left hemisphere, 9 had a leftward asymmetry of the pars triangularis. The 1 patient with language lateralized to the right hemisphere had a significant rightward asymmetry of the pars triangularis. Our data suggest that asymmetries of the pars triangularis may be related to speech-language lateralization.
Abstract:
A transposon based on the transposable element Minos from Drosophila hydei was introduced into the genome of Drosophila melanogaster using transformation mediated by the Minos transposase. The transposon carries a wild-type version of the white gene (w) of Drosophila inserted into the second exon of Minos. Transformation was obtained by injecting the transposon into preblastoderm embryos that were expressing transposase either from a Hsp70-Minos fusion inserted into the genome via P-element-mediated transformation or from a coinjected plasmid carrying the Hsp70-Minos fusion. Between 1% and 6% of the fertile injected individuals gave transformed progeny. Four of the insertions were cloned and the DNA sequences flanking the transposon ends were determined. The "empty" sites corresponding to three of the insertions were amplified from the recipient strain by PCR, cloned, and sequenced. In all cases, the transposon has inserted into a TA dinucleotide and has created the characteristic TA target site duplication. In the absence of transposase, the insertions were stable in the soma and the germ line. However, in the presence of the Hsp70-Minos gene the Minos-w transposon excises, resulting in mosaic eyes and germ-line reversion to the white phenotype. Minos could be utilized as an alternative to existing systems for transposon tagging and enhancer trapping in Drosophila; it might also be of use as a germ-line transformation vector for non-Drosophila insects.
Abstract:
Regional cerebral blood flow was measured with positron emission tomography during the performance of a verbal free recall task, a verbal paired associate task, and tasks that required the production of verbal responses either by speaking or writing. Examination of the differences in regional cerebral blood flow between these conditions demonstrated that the left ventrolateral frontal cortical area 45 is involved in the recall of verbal information from long-term memory, in addition to its contribution to speech. The act of writing activated a network of areas involving posterior parietal cortex and sensorimotor areas but not ventrolateral frontal cortex.
Abstract:
Molecular imprinting of morphine and the endogenous neuropeptide [Leu5]enkephalin (Leu-enkephalin) in methacrylic acid-ethylene glycol dimethacrylate copolymers is described. Such molecular imprints possess the capacity to mimic the binding activity of opioid receptors. The recognition properties of the resultant imprints were analyzed by radioactive ligand binding analysis. We demonstrate that imprinted polymers also show high binding affinity and selectivity in aqueous buffers. This is a major breakthrough for molecular imprinting technology, since the binding reaction occurs under conditions relevant to biological systems. The antimorphine imprints showed high binding affinity for morphine, with Kd values as low as 10⁻⁷ M, and levels of selectivity similar to those of antibodies. Preparation of imprints against Leu-enkephalin was greatly facilitated by the use of the anilide derivative rather than the free peptide as the print molecule, due to improved solubility in the polymerization mixture. Free Leu-enkephalin was efficiently recognized by this polymer (Kd values as low as 10⁻⁷ M were observed). Four tetra- and pentapeptides, with unrelated amino acid sequences, were not bound. The imprints showed only weak affinity for two D-amino acid-containing analogues of Leu-enkephalin. Enantioselective recognition of the L-enantiomer of phenylalanylglycine anilide, a truncated analogue of the N-terminal end of enkephalin, was observed.
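To give a sense of what a dissociation constant of 10⁻⁷ M means, the sketch below computes the fraction of binding sites occupied at a given free ligand concentration. It assumes a single class of independent sites (a simple Langmuir isotherm), which is an idealization introduced here and is not claimed by the abstract; imprinted polymers may well have heterogeneous sites. The function name is hypothetical.

```python
def fraction_bound(ligand_molar, kd_molar):
    """Fraction of sites occupied for an assumed single-site binding model.

    occupancy = [L] / ([L] + Kd), with both quantities in mol/L.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_molar / (ligand_molar + kd_molar)


KD = 1e-7  # mol/L, the lowest Kd value reported in the abstract

half = fraction_bound(1e-7, KD)  # free ligand equal to Kd
high = fraction_bound(9e-7, KD)  # free ligand at nine times Kd
```

Under this model, half of the sites are occupied when the free ligand concentration equals Kd, and about 90% are occupied at nine times Kd.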
Abstract:
The focus of the Children's Vaccine Initiative is to encourage the discovery of technology that will make vaccines more readily available to developing countries. Our strategy has been to genetically engineer plants so that they can be used as inexpensive alternatives to fermentation systems for production of subunit antigens. In this paper we report on the immunological response elicited in vivo by using recombinant hepatitis B surface antigen (rHBsAg) purified from transgenic tobacco leaves. The anti-hepatitis B response to the tobacco-derived rHBsAg was qualitatively similar to that obtained by immunizing mice with yeast-derived rHBsAg (commercial vaccine). Additionally, T cells obtained from mice primed with the tobacco-derived rHBsAg could be stimulated in vitro by the tobacco-derived rHBsAg, yeast-derived rHBsAg, and by a synthetic peptide that represents part of the a determinant located in the S region (139-147) of HBsAg. Further support for the integrity of the T-cell epitope of the tobacco-derived rHBsAg was obtained by testing the ability of the primed T cells to proliferate in vitro after stimulation with a monoclonal anti-idiotype and an anti-idiotype-derived peptide, both of which mimic the group-specific a determinant of HBsAg. In total, we have conclusively demonstrated that both B- and T-cell epitopes of HBsAg are preserved when the antigen is expressed in a transgenic plant.