747 results for Sounds.
Abstract:
This paper reports the first systematic study of acoustic signals during social interactions of the Chinese alligator (Alligator sinensis). Sound pressure level (SPL) measurements revealed that Chinese alligators have an elaborate acoustic communication system with both a long-distance signal (the bellow) and short-distance signals that include tooting, bubble blowing, hissing, mooing, head slapping and whining. Bellows have a high SPL and appear to play an important role in the alligator's long-range communication. Sounds characterized by low SPL are short-distance signals used when alligators are in close spatial proximity to one another. Spectrographic analysis showed that the acoustic signals of Chinese alligators have a very low dominant frequency, less than 500 Hz. These frequencies are consistent with adaptation to a habitat with high-density vegetation: a sound with a low dominant frequency attenuates less and, by diffraction, can cover a larger spatial range in a densely vegetated environment than a sound with a higher dominant frequency. (C) 2007 Acoustical Society of America.
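The frequency-dependent attenuation argument can be made concrete with a small sketch. Assuming spherical spreading (20 log10 of distance) plus a linear excess-attenuation term, and using purely hypothetical attenuation coefficients chosen only to illustrate the trend that excess attenuation in dense vegetation rises with frequency:

```python
import math

def received_spl(spl_source, distance_m, alpha_db_per_m):
    """SPL at a given distance, assuming spherical spreading
    (20*log10 r) plus linear excess attenuation alpha (dB/m),
    e.g. from vegetation. Illustrative only."""
    return spl_source - 20 * math.log10(distance_m) - alpha_db_per_m * distance_m

# Hypothetical comparison at 100 m from a 120 dB source:
# the alpha values are invented, not measured for any habitat.
low_f = received_spl(120, 100, alpha_db_per_m=0.01)   # low dominant frequency
high_f = received_spl(120, 100, alpha_db_per_m=0.05)  # higher dominant frequency
print(low_f, high_f)  # the low-frequency signal arrives stronger
```

Under these assumed coefficients the low-frequency signal retains 4 dB more level at 100 m, which is the qualitative point the abstract makes about bellows in vegetated habitat.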
Abstract:
Recently, sonar signals and other sounds produced by cetaceans have been used for acoustic detection of individuals and groups in the wild. However, the detection probability, as ascertained by a concomitant visual survey, has not been demonstrated extensively. The finless porpoise (Neophocaena phocaenoides) produces narrow-band, high-frequency sonar signals that are distinctive from background noise. Underwater sound monitoring with hydrophones (B&K 8103) placed along the sides of a research vessel, concurrent with visual observations, was conducted in the Yangtze River from Wuhan to Poyang Lake, China, in 1998. The peak-to-peak detection threshold was set at 133 dB re 1 mu Pa; at this level, porpoises could be detected reliably within 300 m of the hydrophone. Over a cruise totaling 774 km, 588 finless porpoises were sighted by visual observation and 44 864 ultrasonic pulses were recorded by the acoustical observation system. The acoustic monitoring system detected the presence of finless porpoises 82% of the time, with a false-alarm frequency of 0.9%. High-frequency acoustical observation is suggested as an effective method for field surveys of small cetaceans that produce high-frequency sonar signals. (C) 2001 Acoustical Society of America.
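The core of such an acoustic survey is a peak-to-peak threshold detector. The sketch below is a crude stand-in for the survey's pulse counting, with amplitudes in arbitrary linear units rather than calibrated pascals; the window length and threshold are invented for the example:

```python
import numpy as np

def detect_clicks(signal, threshold_pp, win=64):
    """Flag windows whose peak-to-peak amplitude exceeds a detection
    threshold. A toy sketch of counting ultrasonic pulses above a
    fixed level; units are arbitrary, not dB re 1 uPa."""
    hits = []
    for i in range(len(signal) // win):
        w = signal[i * win:(i + 1) * win]
        if w.max() - w.min() >= threshold_pp:
            hits.append(i)
    return hits

# Synthetic example: background noise with one strong pulse injected
# into window 5 of 10.
rng = np.random.default_rng(0)
noise = rng.normal(0, 0.1, 64 * 10)
noise[5 * 64 + 10] += 5.0  # the "click"
print(detect_clicks(noise, threshold_pp=2.0))
```

Real systems add band-pass filtering around the porpoise click band to reject broadband noise, which is how false alarms are kept near the 0.9% the abstract reports.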
Abstract:
In speaker-independent speech recognition, the disadvantage of the most widespread technology (hidden Markov models, HMMs) is not only the need for many more training samples but also the long training time required. This paper describes the use of Biomimetic Pattern Recognition (BPR) for recognizing Mandarin continuous speech in a speaker-independent manner. A speech database was developed for the study; its vocabulary consists of the names of 15 Chinese dishes, each four Chinese characters long. Neural networks (NNs) based on the Multi-Weight Neuron (MWN) model are used to train on and recognize the speech sounds, and the number of MWNs was varied to find the optimal performance of the NN-based BPR. The system, which is based on BPR and performs real-time recognition, reaches a recognition rate of 98.14% for the first option and 99.81% for the first two options for speakers from different provinces of China speaking standard Chinese. Experiments were also carried out to compare continuous-density hidden Markov models (CDHMM), dynamic time warping (DTW) and BPR for speech recognition. The experimental results show that BPR outperforms CDHMM and DTW, especially in the case of small training samples.
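Of the three methods compared, DTW is the simplest to state. The following is the textbook dynamic-time-warping distance on 1-D feature sequences, given only as a reference point for the baseline, not as the authors' implementation (which would operate on multi-dimensional acoustic feature vectors):

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two sequences,
    allowing elements to be stretched (matched repeatedly) so that
    differently timed utterances can be compared."""
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A time-stretched copy of a sequence stays at distance zero under DTW,
# which is what makes it usable for isolated-word matching.
x = [0, 1, 2, 3, 2, 1, 0]
y = [0, 1, 1, 2, 3, 3, 2, 1, 0]
print(dtw_distance(x, y))  # → 0.0
```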
Abstract:
In speaker-independent speech recognition, the disadvantage of the most widespread technology (hidden Markov models) is not only the need for many more training samples but also the long training time required. This paper describes the use of Biomimetic Pattern Recognition (BPR) for recognizing Mandarin speech in a speaker-independent manner. The vocabulary of the system consists of the names of 15 Chinese dishes. Neural networks based on the Multi-Weight Neuron (MWN) model are used to train on and recognize the speech sounds. Experimental results show that the system, which performs real-time recognition for speakers from different provinces speaking standard Chinese, outperforms HMMs, especially in the case of small training samples.
Abstract:
Using methods from musical physics and mathematics, this paper reveals the secret of the four tones of Mandarin Chinese: a frequency ratio of 3:2, which the author names the "gem ratio" (宝石配比). This finding provides an important scientific basis for phonetics research, pronunciation teaching and computer speech recognition.
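For scale: a 3:2 frequency ratio is, in musical terms, a just perfect fifth. Its size in equal-tempered semitones follows directly from the logarithmic definition of the semitone:

```python
import math

# Size of the 3:2 ratio in equal-tempered semitones:
# 12 * log2(f2/f1), with f2/f1 = 3/2.
semitones = 12 * math.log2(3 / 2)
print(round(semitones, 2))  # → 7.02
```

That is, the paper's claimed ratio for the tonal frequency excursion corresponds to an interval just over seven semitones.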
Abstract:
Since anxiety exists as a certain kind of relation between the subject and his stimuli, the present study proposed the criterion that each structural component of anxiety must reflect properties of that relation, not properties of the subject alone, nor of the stimuli alone, and that each component must form a bridge from the subject to the object. Several structural models of anxiety were evaluated against this criterion, and three suitable factors were identified: controllability, importance and urgency. Then, through interviews, questionnaires, laboratory experiments and event ratings, the research examined the relations between these three factors and normal anxiety, and explored how controllability, importance and urgency interact when they work together. The results were as follows: 1. The hypothesis that controllability, importance and urgency are the structural components of normal anxiety appears reasonable, and was supported to varying degrees by several sub-studies using different methods. 2. Using an item-construction method entirely different from that of the prevalent integrative anxiety inventories, an effective normal-anxiety inventory for middle school students was developed on this theoretical basis. Reliability and validity studies showed high homogeneity and test-retest reliability, and good construct and criterion validity; the inventory is a useful scale for measuring normal anxiety in middle school students. 3. It is through their interaction that controllability and importance affect the level of anxiety produced by the relevant event. 4. Anxiety rises as importance or urgency increases. Although there is a very close connection between controllability and anxiety, the concrete situation is complicated.
Not only is the level of anxiety affected by the interaction of controllability and importance, but when controllability becomes extremely high or extremely low, importance may also begin to vary with controllability. Moreover, dispositional optimism and self-efficacy exert influences on the relationship between controllability and anxiety that cannot be ignored. All the results indicate that it is reasonable, and to some extent necessary if the aim is to clarify the structure of anxiety, to regard and study anxiety as a kind of relation between the subject and his environment. Since every other emotion also exists as a kind of relation between the subject and its stimuli, the wholeness principle first proposed and recommended in this study should be of value to related researchers, and the interaction between controllability and importance may suggest a valid coping strategy for anxiety.
Abstract:
A considerable body of research finds that developmental dyslexia is associated with deficits in phonological processing skills, especially phonological awareness. To explore the nature of the phonological awareness deficit in dyslexia, researchers have begun to investigate the role of speech perception, but findings on speech perception abilities in dyslexics are inconsistent; the heterogeneity of dyslexia may be responsible for this inconsistency. Given the general suggestion that phonological awareness deficits in dyslexia are attributable to categorical perception deficits, it is more direct to examine whether children with phonological awareness difficulties, or with phonological dyslexia, consistently show speech categorization deficits. The present study therefore investigated whether Chinese children with phonological awareness deficits or phonological dyslexia show abnormal speech perception. The study consisted of two parts. Part I screened children with phonological awareness deficits from third-year kindergarten classes and examined their perception of a native-category continuum, non-native category contrasts and non-speech sound series. Part II selected phonological dyslexics from an elementary school as participants and further explored the relation between phonological deficits and speech perception: its first two experiments examined, respectively, the labeling of stimuli in a native-category continuum and of brief stop segments in different contexts, and the last experiment investigated adaptation effects in the different participant groups. The main conclusions are as follows: 1) Children with phonological dyslexia showed categorical perception deficits: they labeled within-category stimuli less consistently than controls, especially stimuli that were not natural sounds. 2) Children with phonological dyslexia exhibited a general difficulty in perceiving brief stop segments taken from different contexts.
3) Children with phonological dyslexia did not show adaptation to repeatedly presented stimuli. Based on these conclusions and the findings of previous studies, we suggest that the representations of sound stimuli in phonological dyslexics' brains differ from those in normal children's: in dyslexics' cortical neural networks, the representations of sound stimuli are more diffuse and inconsistent.
Abstract:
Does knowledge of language consist of symbolic rules? How do children learn and use their linguistic knowledge? To elucidate these questions, we present a computational model that acquires phonological knowledge from a corpus of common English nouns and verbs. In our model the phonological knowledge is encapsulated as boolean constraints operating on classical linguistic representations of speech sounds in terms of distinctive features. The learning algorithm compiles a corpus of words into increasingly sophisticated constraints. The algorithm is incremental, greedy, and fast. It yields one-shot learning of phonological constraints from a few examples. Our system exhibits behavior similar to that of young children learning phonological knowledge. As a bonus, the constraints can be interpreted as classical linguistic rules. The computational model can be implemented by a surprisingly simple hardware mechanism. Our mechanism also sheds light on a fundamental AI question: How are signals related to symbols?
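What "boolean constraints on distinctive features" means can be illustrated with a textbook example from English morphophonology: the regular plural suffix agrees in voicing with a stem-final obstruent (cat+s with /s/, dog+s with /z/). The feature bundles and the rule below are a standard pedagogical illustration, not the constraints this particular model induces:

```python
# Toy distinctive-feature bundles for a few stem-final consonants.
# Only the features needed for the voicing-agreement illustration.
FEATURES = {
    "t": {"voice": False},
    "k": {"voice": False},
    "d": {"voice": True},
    "g": {"voice": True},
}

def plural_suffix(final_segment):
    """A boolean constraint as code: the suffix's [voice] feature
    must equal the stem-final segment's [voice] feature."""
    return "z" if FEATURES[final_segment]["voice"] else "s"

print(plural_suffix("t"), plural_suffix("g"))  # cat+/s/, dog+/z/
```

The point of the paper's approach is that such agreements are learned as constraints from a handful of examples rather than hand-written as rules; the code above only shows what a single learned constraint amounts to once stated over features.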
Abstract:
Humans rapidly and reliably learn many kinds of regularities and generalizations. We propose a novel model of fast learning that exploits the properties of sparse representations and the constraints imposed by a plausible hardware mechanism. To demonstrate our approach we describe a computational model of acquisition in the domain of morphophonology. We encapsulate phonological information as bidirectional boolean constraint relations operating on the classical linguistic representations of speech sounds in terms of distinctive features. The performance model is described as a hardware mechanism that incrementally enforces the constraints. Phonological behavior arises from the action of this mechanism. Constraints are induced from a corpus of common English nouns and verbs. The induction algorithm compiles the corpus into increasingly sophisticated constraints. The algorithm yields one-shot learning from a few examples. Our model has been implemented as a computer program. The program exhibits phonological behavior similar to that of young children. As a bonus, the constraints that are acquired can be interpreted as classical linguistic rules.
Abstract:
Moon Palace I takes its title from the novel "Moon Palace" by Paul Auster and is loosely influenced by the following quotation from the novel: "I had jumped off the edge, and then, at the very last moment, something reached out and caught me in midair. That something is what I define as love. It is the one thing that can stop a man from falling, the one thing powerful enough to negate the laws of gravity." (Auster has authorised reproduction of the quotation.) The opening pitches of Moon Palace I were composed while sitting at the piano exploring the sound of my tinnitus, or inner ringing. From an initial rather delicate, graceful presentation of these pitches, the music intensifies rhythmically and dynamically, becoming more aggressive and emphasising the invasive quality inherent in tinnitus. A study at any one time of the pitches and rhythms present in my tinnitus can yield interesting results, the relationship between the sounds heard in each ear sometimes producing unison pitches or clashing dissonances. However, for all the fascination and intrigue, tinnitus can be relentless and disturbing, interrupting concentration and hindering sleep. Moon Palace I is an exploration of these two opposing elements. Laurina Sableviciute gave the first performance at St John's Church, Edinburgh, in May 2006. It has also been performed by Tricia Dawn Williams at the Manoel Theatre, Valletta, Malta, in November 2011.
Abstract:
Postgraduate project/dissertation presented to Universidade Fernando Pessoa in partial fulfilment of the requirements for the degree of Master in Dental Medicine.
Abstract:
Auditory signals of speech are speaker-dependent, but representations of language meaning are speaker-independent. Such a transformation enables speech to be understood from different speakers. A neural model is presented that performs speaker normalization to generate a pitch-independent representation of speech sounds, while also preserving information about speaker identity. This speaker-invariant representation is categorized into unitized speech items, which input to sequential working memories whose distributed patterns can be categorized, or chunked, into syllable and word representations. The proposed model fits into an emerging model of auditory streaming and speech categorization. The auditory streaming and speaker normalization parts of the model both use multiple strip representations and asymmetric competitive circuits, thereby suggesting that these two circuits arose from similar neural designs. The normalized speech items are rapidly categorized and stably remembered by Adaptive Resonance Theory circuits. Simulations use synthesized steady-state vowels from the Peterson and Barney [J. Acoust. Soc. Am. 24, 175-184 (1952)] vowel database and achieve accuracy rates similar to those achieved by human listeners. These results are compared to behavioral data and other speaker normalization models.
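The effect that speaker normalization must achieve can be shown with a simple statistical stand-in: log-mean formant normalization, a common technique in phonetics. This is not the neural circuit the model proposes, only a demonstration of why normalization makes vowels from different-sized vocal tracts comparable; the formant values are illustrative:

```python
import math

def log_mean_normalize(formants_hz):
    """Subtract the speaker's mean log formant from each log formant,
    removing a uniform vocal-tract scale factor. A statistical stand-in
    for speaker normalization, NOT the paper's neural mechanism."""
    logs = [math.log(f) for f in formants_hz]
    mean = sum(logs) / len(logs)
    return [l - mean for l in logs]

# Two speakers producing the "same" vowel: the child's formants are a
# uniform 1.25x scaling of the adult's (shorter vocal tract).
adult = [700.0, 1200.0, 2600.0]
child = [f * 1.25 for f in adult]
print(log_mean_normalize(adult))
print(log_mean_normalize(child))  # identical after normalization
```

Because a uniform frequency scaling becomes an additive constant in the log domain, subtracting the mean cancels it exactly, which is the sense in which the normalized representation is speaker-invariant while the discarded mean still carries speaker information.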
Abstract:
This article describes a neural network model that addresses the acquisition of speaking skills by infants and the subsequent motor-equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex-region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor-equivalent capabilities. Simulations verify the model's ability to compensate, automatically and without new learning, for constraints or perturbations applied to the articulators, and to explain contextual variability seen in human speech production.
Abstract:
In an attempt to provide an analytical entry point into my compositional practice, I have identified eight themes which are significantly recurrent: reduction – the selection of a small number of elements; imperfection – a damaged or warped characteristic of sound; hierarchy – a concern with the roles of instruments with regard to their relative prominence; motion – the fine internal movement within apparently static sound masses; listener perception – the way expectations for change influence the experience of affect; translation – the transitioning of electronic sounds to the acoustic realm, and vice versa; immersion – the creation of an accommodating soundscape; blurring – the smearing and overlapping of sounds or genres. Each of these eight factors is associated with relevant precedents in the history and theory of music that have been influential on my work. These include the minimalist compositions of Steve Reich and Arvo Pärt; the lo-fi aesthetic of Boards of Canada and My Bloody Valentine; concerns with political hierarchy in the work of Louis Andriessen; the variations of dynamics and microtonal shifts of Giacinto Scelsi; Leonard B. Meyer's account of expectation in music; cross-fertilisation of the acoustic and electronic in pieces by Gérard Grisey and György Ligeti; the immersive technique of Brian Eno's ambient music; and the overlapping sounds of Aphex Twin. These eight factors are variously applicable to the eleven submitted pieces, which are individually analysed with reference to the most significant of the categories. Together they form a musical language that sustains the interaction of a variety of techniques, concepts and genres.
Abstract:
Figer (to congeal, to solidify) is a quadraphonic electroacoustic composition, completed in the fall of 2003. Several software programs were used in creating and assembling the piece (Csound, Grain Mill, AL/Erwin (a grain generator), Sound Forge and Acid Music). The sounds used in the piece are of two general types, synthesized and sampled, both of which were subjected to various processing techniques. The most important of these techniques, and one that formally defines large portions of the piece, is granular synthesis. Form: the notion of time perception is of great importance in this piece, and Figer addresses it in several ways. In one sense, the form of Figer is simple. There are three layers of activity (see diagram). Layer 1 is continuous and non-sectional and supplies a backdrop (not necessarily a background) for the other two. The second and third layers overlap and interrupt one another; each consists of two blocks of sound. The layers, and the blocks within them, relate to each other in various ways. Layer 1 is formally continuous. Layer 2 consists of well-defined columns of sound that evolve from soft and mild to loud and abrasive; the layer is, in reality, a whole that is simply cut into two parts (block 1 and block 2). In contrast, the blocks of layer 3 do not constitute a whole: each is a complete unit with its own self-contained evolutionary path. Those paths, however, do cross the paths of other units (layers, blocks), influencing them and absorbing some of their essence. At the heart of Figer lies a constant process of presenting materials or ideas and immediately, or at times simultaneously, commenting on, reflecting on, or reinterpreting that material. All of the layers of the piece deal, at both local and global levels, with the problem of time and its perception relative to the materials, sonic or otherwise, that occupy it, and with the manner in which those materials unfold and relate to each other.
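Granular synthesis, the technique named as formally defining large portions of the piece, can be sketched in a few lines: short windowed grains of a source sound are scattered across an output buffer. This is a generic illustration of the technique, not a reconstruction of Figer's material; grain length, grain count, and source are arbitrary choices:

```python
import math
import random

def granulate(source, grain_len=512, n_grains=200, out_len=8192):
    """Minimal granular synthesis: copy Hann-windowed grains from
    random positions in the source to random positions in the output,
    summing overlaps into a 'grain cloud'."""
    out = [0.0] * out_len
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (grain_len - 1))
              for i in range(grain_len)]
    rng = random.Random(1)  # fixed seed for a repeatable cloud
    for _ in range(n_grains):
        src = rng.randrange(0, len(source) - grain_len)
        dst = rng.randrange(0, out_len - grain_len)
        for i in range(grain_len):
            out[dst + i] += source[src + i] * window[i]
    return out

# Source: one second of a 440 Hz sine at 44.1 kHz.
sr = 44100
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(sr)]
cloud = granulate(tone)
```

Varying grain length, density, and the distribution of grain positions over time is what lets the technique stretch, freeze, or smear a sound's perceived time, which connects it to the piece's concern with time perception.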