587 results for Phonetic alphabet.
Abstract:
Speech has both auditory and visual components (heard speech sounds and seen articulatory gestures). During all perception, selective attention facilitates efficient information processing and enables concentration on high-priority stimuli. Auditory and visual sensory systems interact at multiple processing levels during speech perception and, further, the classical motor speech regions seem also to participate in speech perception. Auditory, visual, and motor-articulatory processes may thus work in parallel during speech perception, their use possibly depending on the information available and the individual characteristics of the observer. Because of their subtle speech perception difficulties possibly stemming from disturbances at elemental levels of sensory processing, dyslexic readers may rely more on motor-articulatory speech perception strategies than do fluent readers. This thesis aimed to investigate the neural mechanisms of speech perception and selective attention in fluent and dyslexic readers. We conducted four functional magnetic resonance imaging experiments, during which subjects perceived articulatory gestures, speech sounds, and other auditory and visual stimuli. Gradient echo-planar images depicting blood oxygenation level-dependent contrast were acquired during stimulus presentation to indirectly measure brain hemodynamic activation. Lip-reading activated the primary auditory cortex, and selective attention to visual speech gestures enhanced activity within the left secondary auditory cortex. Attention to non-speech sounds enhanced auditory cortex activity bilaterally; this effect showed modulation by sound presentation rate. 
A comparison between fluent and dyslexic readers' brain hemodynamic activity during audiovisual speech perception revealed stronger activation of predominantly motor speech areas in dyslexic readers during a contrast test that allowed exploration of the processing of phonetic features extracted from auditory and visual speech. The results show that visual speech perception modulates hemodynamic activity within auditory cortex areas once considered unimodal, and suggest that the left secondary auditory cortex specifically participates in extracting the linguistic content of seen articulatory gestures. They are strong evidence for the importance of attention as a modulator of auditory cortex function during both sound processing and visual speech perception, and point out the nature of attention as an interactive process (influenced by stimulus-driven effects). Further, they suggest heightened reliance on motor-articulatory and visual speech perception strategies among dyslexic readers, possibly compensating for their auditory speech perception difficulties.
Abstract:
A construction is presented for a family of sequences over the 8-ary AM-PSK constellation whose maximum nontrivial correlation magnitude is bounded as $\theta_{\max} \lesssim \sqrt{N}$. The family is asymptotically optimal with respect to the Welch bound on maximum magnitude of correlation. The 8-ary AM-PSK constellation is a subset of the 16-QAM constellation. We also construct two families of sequences over 16-QAM with $\theta_{\max} \lesssim \sqrt{2}\sqrt{N}$. These families are constructed by interleaving sets of sequences. A construction for a family of low-correlation sequences over a QAM alphabet of size $2^{2m}$ is presented with maximum nontrivial normalized correlation parameter bounded above by $\lesssim a\sqrt{N}$, where N is the period of the sequences in the family and where a ranges from 1.61 in the case of 16-QAM modulation to 2.76 for large m. When used in a CDMA setting, the family will permit each user to modulate the code sequence with 2m bits of data. Interestingly, the construction permits users on the reverse link of the CDMA channel to communicate using varying data rates by switching between sequence families associated with different values of the parameter m. Other features of the sequence families are improved Euclidean distance between different data symbols in comparison with PSK signaling and compatibility of the QAM sequence families with sequences belonging to the large quaternary sequence families {S(p)}.
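To make the correlation measure concrete, the following minimal Python sketch computes the maximum nontrivial periodic correlation magnitude of a sequence family. It does not reproduce the constructions described in the abstract; the function names and the period-7 binary m-sequence used as input are illustrative assumptions.

```python
import math


def periodic_correlation(a, b, shift):
    """Periodic correlation of two equal-length sequences at a cyclic shift."""
    n = len(a)
    return sum(
        complex(a[(t + shift) % n]) * complex(b[t]).conjugate()
        for t in range(n)
    )


def theta_max(family):
    """Largest correlation magnitude over all nontrivial (pair, shift) cases."""
    n = len(family[0])
    worst = 0.0
    for i, a in enumerate(family):
        for j, b in enumerate(family):
            for shift in range(n):
                if i == j and shift == 0:
                    continue  # skip the trivial in-phase autocorrelation
                worst = max(worst, abs(periodic_correlation(a, b, shift)))
    return worst


# Illustrative input (an assumption, not taken from the abstract):
# a binary m-sequence of period 7 mapped to +1/-1 symbols.
bits = [1, 1, 1, 0, 1, 0, 0]
sequence = [(-1) ** b for b in bits]

# Print the measured value next to sqrt(N), the scale that appears in the bounds.
print(theta_max([sequence]), math.sqrt(len(sequence)))
```

The same `theta_max` function accepts complex-valued symbols, so it could be applied to QAM or AM-PSK sequences once a concrete family is available.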
Abstract:
Although immensely complex, speech is also a very efficient means of communication between humans. Understanding how we acquire the skills necessary for perceiving and producing speech remains an intriguing goal for research. However, while learning is likely to begin as soon as we start hearing speech, the tools for studying the language acquisition strategies in the earliest stages of development remain scarce. One prospective strategy is statistical learning. In order to investigate its role in language development, we designed a new research method. The method was tested in adults using magnetoencephalography (MEG) as a measure of cortical activity. Neonatal brain activity was measured with electroencephalography (EEG). Additionally, we developed a method for assessing the integration of seen and heard syllables in the developing brain as well as a method for assessing the role of visual speech when learning phoneme categories. The MEG study showed that adults learn statistical properties of speech during passive listening of syllables. The amplitude of the N400m component of the event-related magnetic fields (ERFs) reflected the location of syllables within pseudowords. The amplitude was also enhanced for syllables in a statistically unexpected position. The results suggest a role for the N400m component in statistical learning studies in adults. Using the same research design with sleeping newborn infants, the auditory event-related potentials (ERPs) measured with EEG reflected the location of syllables within pseudowords. The results were successfully replicated in another group of infants. The results show that even newborn infants have a powerful mechanism for automatic extraction of statistical characteristics from speech. We also found that 5-month-old infants integrate some auditory and visual syllables into a fused percept, whereas other syllable combinations are not fully integrated. 
Auditory syllables were paired with visual syllables possessing a different phonetic identity, and the ERPs for these artificial syllable combinations were compared with the ERPs for normal syllables. For congruent auditory-visual syllable combinations, the ERPs did not differ from those for normal syllables. However, for incongruent auditory-visual syllable combinations, we observed a mismatch response in the ERPs. The results show an early ability to perceive speech cross-modally. Finally, we exposed two groups of 6-month-old infants to artificially created auditory syllables located between two stereotypical English syllables in the formant space. The auditory syllables followed, equally for both groups, a unimodal statistical distribution, suggestive of a single phoneme category. The visual syllables combined with the auditory syllables, however, were different for the two groups, one group receiving visual stimuli suggestive of two separate phoneme categories, the other receiving visual stimuli suggestive of only one phoneme category. After a short exposure, we observed different learning outcomes for the two groups of infants. The results thus show that visual speech can influence learning of phoneme categories. Altogether, the results demonstrate that complex language learning skills exist from birth. They also suggest a role for the visual component of speech in the learning of phoneme categories.
Abstract:
The stability of scheduled multiaccess communication with random coding and independent decoding of messages is investigated. The number of messages that may be scheduled for simultaneous transmission is limited to a given maximum value, and the channels from transmitters to receiver are quasistatic, flat, and have independent fades. Requests for message transmissions are assumed to arrive according to an i.i.d. arrival process. Then, we show the following: (1) in the limit of large message alphabet size, the stability region has an interference limited information-theoretic capacity interpretation, (2) state-independent scheduling policies achieve this asymptotic stability region, and (3) in the asymptotic limit corresponding to immediate access, the stability region for non-idling scheduling policies is shown to be identical irrespective of received signal powers.
Abstract:
A repetitive sequence collection is one where portions of a base sequence of length n are repeated many times with small variations, forming a collection of total length N. Examples of such collections are version control data and genome sequences of individuals, where the differences can be expressed by lists of basic edit operations. Flexible and efficient data analysis on such a typically huge collection is plausible using suffix trees. However, a suffix tree occupies O(N log N) bits, which very soon inhibits in-memory analyses. Recent advances in full-text self-indexing reduce the space of a suffix tree to O(N log σ) bits, where σ is the alphabet size. In practice, the space reduction is more than 10-fold, for example on the suffix tree of the human genome. However, this reduction factor remains constant when more sequences are added to the collection. We develop a new family of self-indexes suited for the repetitive sequence collection setting. Their expected space requirement depends only on the length n of the base sequence and the number s of variations in its repeated copies. That is, the space reduction factor is no longer constant, but depends on N / n. We believe the structures developed in this work will provide a fundamental basis for storage and retrieval of individual genomes as they become available due to rapid progress in sequencing technologies.
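The asymptotic space figures above can be compared with a rough back-of-the-envelope sketch in Python. It ignores the constants hidden in the O-notation, and the genome length of about 3 billion bases and the alphabet size of 4 are assumed illustrative values, not figures from the abstract.

```python
import math


def suffix_tree_bits(total_length):
    """N log2 N bits: the suffix tree bound with hidden constants ignored."""
    return total_length * math.log2(total_length)


def self_index_bits(total_length, alphabet_size):
    """N log2(sigma) bits: the self-index bound with hidden constants ignored."""
    return total_length * math.log2(alphabet_size)


def reduction_factor(total_length, alphabet_size):
    """Ratio of the two bounds, which simplifies to log2(N) / log2(sigma)."""
    return suffix_tree_bits(total_length) / self_index_bits(total_length, alphabet_size)


genome_length = 3_000_000_000  # assumed order of magnitude for one genome
sigma = 4                      # assumed DNA alphabet size

# The factor changes only logarithmically as more copies are added.
for copies in (1, 10, 100):
    total = copies * genome_length
    print(copies, round(reduction_factor(total, sigma), 1))
```

Under these assumptions the ratio barely moves as copies are added, which is the behaviour the abstract contrasts with indexes whose space depends on n and s rather than on N.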
Abstract:
Abstract (Mig or mej, själ or sjel? Problems and solutions in the transcription of Swedish song texts): In this article I point out and discuss problems and solutions concerning the phonetic transcription of Swedish song texts. My material consists of 66 phonetically transcribed Swedish songs. The transcriptions were published by The Academy of Finnish Art Song in 2009. The first issue was which level of accuracy should be chosen. The transcriptions were created to be clear at a glance and suitable for the interpretive needs of non-Swedish-speaking singers. The principle was to use as few signs and symbols as possible without sacrificing accuracy. Certain songs were provided with additional information whenever there was a chance of misinterpretation. The second issue was which geographic variety of the language should be visible in the transcription, Standard Swedish or Finland-Swedish. The songs in the volume are a selection of well-known works that are also of international interest. Most were composed by Jean Sibelius (1865–1957), a substantial number of whose songs were based on poems written by Finland’s national poet, Johan Ludvig Runeberg (1804–1877). Thus I chose to use the variety of Swedish spoken in Finland, in order to reflect the cultural origin of the songs. This variety differs slightly from the variety spoken in Sweden on both the prosodic and the phonetic level. In singing, the musical notation gives the interpreter enough information about prosody. The differences concern mostly the phonemes. A fully consistent transcription was, however, difficult to make, due to vocal requirements. For example, in an unstressed final syllable the vowel was often indicated as a central vowel, which in singing is given a more direct emphasis than in a literal pronunciation, even if this central vowel does not occur in spoken Finland-Swedish.
Abstract:
We study the problem of guessing the realization of a finite alphabet source, when some side information is provided, in a setting where the only knowledge the guesser has about the source and the correlated side information is that the joint source is one among a family. We define a notion of redundancy, identify a quantity that measures this redundancy, and study its properties. We then identify good guessing strategies that minimize the supremum redundancy (over the family). The minimum value measures the richness of the uncertainty class.
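As a hedged illustration of the guessing setup, the Python sketch below compares the expected number of guesses when the guesser knows the true distribution with the expected number when it orders guesses by a mismatched distribution. The gap between the two is only a simple stand-in for the redundancy notion, which the abstract does not spell out; the distributions and function names are hypothetical.

```python
def guessing_order(pmf):
    """Guess symbols in decreasing order of assumed probability."""
    return sorted(pmf, key=pmf.get, reverse=True)


def expected_guesses(true_pmf, order):
    """Expected number of guesses when symbols are tried in the given order."""
    return sum(true_pmf[symbol] * (rank + 1) for rank, symbol in enumerate(order))


# Hypothetical source and a mismatched belief about it (illustrative values only).
true_pmf = {"a": 0.5, "b": 0.3, "c": 0.2}
assumed_pmf = {"a": 0.2, "b": 0.3, "c": 0.5}

matched = expected_guesses(true_pmf, guessing_order(true_pmf))
mismatched = expected_guesses(true_pmf, guessing_order(assumed_pmf))

# The difference is the cost of not knowing which member of the family is in force.
print(matched, mismatched, mismatched - matched)
```

With side information, the same ordering rule would be applied to the conditional distribution for each observed side-information value.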
Abstract:
In this paper we address the problem of distributed transmission of functions of correlated sources over a fast fading multiple access channel (MAC). This is a basic building block in a hierarchical sensor network used in estimating a random field where the cluster head is interested only in estimating a function of the observations. The observations are transmitted to the cluster head through a fast fading MAC. We provide sufficient conditions for lossy transmission when the encoders and decoders are provided with partial information about the channel state. Furthermore, signal side information may be available at the encoders and the decoder. Various previous studies are shown as special cases. Efficient joint source-channel coding schemes are discussed for transmission of discrete and continuous alphabet sources to recover function values.
Abstract:
Constellation Constrained (CC) capacity regions of a two-user Gaussian Multiple Access Channel (GMAC) have been recently reported. For such a channel, code pairs based on trellis coded modulation are proposed in this paper with M-PSK and M-PAM alphabet pairs, for arbitrary values of M, to achieve sum rates close to the CC sum capacity of the GMAC. In particular, the structure of the sum alphabets of M-PSK and M-PAM alphabet pairs is exploited to prove that, for certain angles of rotation between the alphabets, Ungerboeck labelling on the trellis of each user maximizes the guaranteed squared Euclidean distance of the sum trellis. Hence, such a labelling scheme can be used systematically to construct trellis code pairs that achieve sum rates close to the CC sum capacity. More importantly, it is shown for the first time that ML decoding complexity at the destination is significantly reduced when M-PAM alphabet pairs are employed, with almost no loss in the sum capacity.
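The role of the rotation angle between the two users' alphabets can be seen in a small Python sketch that builds the sum alphabet of two unit-energy M-PSK constellations and measures its minimum Euclidean distance. The helper names and the specific angle of π/8 are illustrative assumptions; the sketch does not implement the trellis codes or the labelling proof.

```python
import cmath
import math


def psk(m):
    """Unit-energy M-PSK constellation points."""
    return [cmath.exp(2j * math.pi * k / m) for k in range(m)]


def sum_alphabet(first, second, theta):
    """All sums x1 + exp(j*theta) * x2 seen at the receiver (noise ignored)."""
    rotation = cmath.exp(1j * theta)
    return [x + rotation * y for x in first for y in second]


def min_distance(points):
    """Smallest Euclidean distance between any two points of the list."""
    return min(
        abs(p - q) for i, p in enumerate(points) for q in points[i + 1:]
    )


qpsk = psk(4)

# Without rotation, distinct symbol pairs collide in the sum alphabet.
print(min_distance(sum_alphabet(qpsk, qpsk, 0.0)))

# With an assumed rotation of pi/8, all 16 sum points are distinct.
print(min_distance(sum_alphabet(qpsk, qpsk, math.pi / 8)))
```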
Abstract:
A novel system for recognition of handprinted alphanumeric characters has been developed and tested. The system can be employed for recognition of either alphabetic or numeric characters by contextually switching to the corresponding branch of the recognition algorithm. The two major components of the system are the multistage feature extractor and the decision logic tree-type categorizer. The importance of "good" features over sophistication in the classification procedures was recognized, and the feature extractor is designed to extract features based on a variety of topological, morphological and similar properties. An information feedback path is provided between the decision logic and the feature extractor units to facilitate an interleaved or recursive mode of operation. This ensures that only those features essential to the recognition of a particular sample are extracted each time. Test implementation has demonstrated the reliability of the system in recognizing a variety of handprinted alphanumeric characters with close to 100% accuracy.
Abstract:
In the field of second language (L2) acquisition, the term 'foreign accent' is often used to refer to speech characteristics that differ from the pronunciation of native speakers. Foreign accent may affect the intelligibility and perceived comprehensibility of speech and it is also sometimes associated with negative attitudes. The degree of L2 learners' foreign accent and the speech characteristics that account for it have previously been studied through speech perception experiments and acoustic measurements. Perception experiments have shown that native listeners are easily able to identify foreign accent in speech. However, to date, no studies have been done on the assessment of foreign accent in the speech of non-native speakers of Finnish. The aim of this study is to examine how native speakers of Finnish rate the degree of foreign accentedness in the speech of Russian L2 learners of Finnish. Furthermore, phonetic analysis is used to study the characteristics of speech that affect the perceived strength of foreign accent. Altogether 96 native speakers of Finnish listened to excerpts of read-aloud and spontaneous Finnish speech from ten Russian and six Finnish female speakers. The Russian speakers were intermediate and advanced learners of Finnish and had all immigrated to Finland as adults. Among the listeners was a group of teachers of Finnish as an L2, and it was presumed that these teachers had been exposed to foreign accent in Finnish and were used to hearing it. The temporal aspects and segmental properties of speech were phonetically analysed in the speech of the Russian speakers in order to measure their effect on the perceived degree of accent. Although wide differences were observed in the use of the rating scale among the listeners, they were still quite unanimous on which speakers had the strongest foreign accent and which had the mildest. 
The listeners' background factors had little effect on their ratings, and the ratings of the teachers of Finnish as an L2 did not differ from those of the other listeners. However, a clear difference was noted in the ratings of the two types of stimuli used in the perception experiment: the read-aloud speech was rated as more strongly accented than the spontaneous speech. It is important to note that the assessment of foreign accent is affected by many factors and their complex interactions in the experimental setting. Further, the study found that both the temporal aspects of speech, often associated with fluency, and the number of single deviant phonetic segments contributed to the perceived degree of accentedness in the speech of the native Russian speakers.
Abstract:
Structure comparison tools can be used to align related protein structures to identify structurally conserved and variable regions and to infer functional and evolutionary relationships. While the conserved regions often superimpose well, the variable regions appear non-superimposable. Differences in homologous protein structures are thought to be due to evolutionary plasticity to accommodate diverged sequences during evolution. One kind of difference between 3-D structures of homologous proteins is rigid body displacement. A clear example is equivalent regions of homologous proteins that adopt an α-helical conformation but superimpose poorly because of their different spatial orientations. In a rigid body superimposition, these regions would appear variable although they may contain local similarity. Also, due to high spatial deviation in the variable region, one-to-one correspondence at the residue level cannot be determined accurately. Another kind of difference is conformational variability, and the most common example is topologically equivalent loops of two homologues with different conformations. In the current study, we present a refined view of the "structurally variable" regions, which may contain local similarity obscured in global alignment of homologous protein structures. Because a structural alphabet can describe local structures of proteins precisely through the Protein Blocks approach, conformational similarity has been identified in a substantial number of "variable" regions in a large data set of protein structural alignments; optimal residue-residue equivalences could be achieved on the basis of Protein Blocks, which led to improved local alignments. Also, through an example, we have demonstrated how the additional information on local backbone structures through Protein Blocks can aid in comparative modeling of a loop region. In addition, understanding of sequence-structure relationships can be enhanced through our approach. 
This has been illustrated through examples where the equivalent regions in homologous protein structures share sequence similarity to varied extent but do not preserve local structure.
Abstract:
With the immense growth in the number of available protein structures, fast and accurate structure comparison has become essential. We propose an efficient method for structure comparison, based on a structural alphabet. Protein Blocks (PBs) is a widely used structural alphabet with 16 pentapeptide conformations that can fairly approximate a complete protein chain. Thus a 3D structure can be translated into a 1D sequence of PBs. With a simple Needleman-Wunsch approach and a raw PB substitution matrix, PB-based structural alignments were better than many popular methods. The iPBA web server presents an improved alignment approach using (i) specialized PB Substitution Matrices (SM) and (ii) anchor-based alignment methodology. With these developments, the quality of ~88% of alignments was improved. iPBA alignments were also better than DALI, MUSTANG and GANGSTA+ in > 80% of the cases. The web server is designed for both pairwise comparisons and database searches. Outputs are given as sequence alignment and superposed 3D structures displayed using PyMol and Jmol. A local alignment option for detecting sub-structural similarity is also embedded. As a fast and efficient "sequence-based" structure comparison tool, we believe that it will be quite useful to the scientific community. iPBA can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/ipba/.
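The idea of aligning 1D sequences of PBs with a Needleman-Wunsch approach can be sketched in Python as follows. This is a minimal illustration, not the iPBA method: the match/mismatch scoring function is a toy stand-in for a real PB substitution matrix, the gap penalty is an assumed value, and the letter strings are made-up inputs.

```python
def toy_pb_score(x, y):
    """Toy stand-in for a PB substitution matrix: +1 for a match, -1 otherwise."""
    return 1 if x == y else -1


def needleman_wunsch_score(seq_a, seq_b, score=toy_pb_score, gap=-1):
    """Global alignment score of two sequences with a linear gap penalty."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),  # align pair
                table[i - 1][j] + gap,                                    # gap in seq_b
                table[i][j - 1] + gap,                                    # gap in seq_a
            )
    return table[-1][-1]


# Made-up PB-like strings, one letter per block (illustrative only).
print(needleman_wunsch_score("mmmmnopacd", "mmmnopabcd"))
```

A traceback over `table` would recover the alignment itself; replacing `toy_pb_score` with a lookup into an actual substitution matrix is where a PB-specific method would differ from this sketch.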
Abstract:
Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages, whose alphabet set is large. It is expected that the complexity of the feature extraction process increases with the number of classes. Though the best set of features cannot be ascertained through any quantitative measure, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both scripts into subsets by the extraction of certain spatial and structural features. Three features, viz. geometric moments, DCT-based features and wavelet-transform-based features, are extracted from the grouped symbols, and a linear transformation is performed on them for the purpose of efficient representation in the feature space. The transformation is obtained by the maximization of certain criterion functions. Three techniques (principal component analysis, maximization of Fisher's ratio and maximization of a divergence measure) have been employed to estimate the transformation matrix. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and that there is an appreciable rise in the recognition accuracy as a result of the transformations.
Abstract:
The problem of guessing a random string is revisited. The relationship between guessing without distortion and compression is extended to the case when the source alphabet size is countably infinite. Further, a similar relationship is established for the case when distortion is allowed, by establishing a tight relationship between rate distortion codes and guessing strategies.