406 results for Speech-processing technologies
Abstract:
SoundCipher is a software library written in Java that adds music and sound features to the Processing environment, which is widely used by media artists and is otherwise oriented toward computational graphics. This article introduces the SoundCipher library and its features, describes its influences and design intentions, and positions it within the field of computer music programming tools. SoundCipher makes the rich history of algorithmic music techniques accessible within one of today’s most popular media art platforms. It also provides an accessible means for learning to create algorithmic music and sound programs.
Abstract:
Many jurisdictions have developed mature infrastructures, both administratively and legislatively, to promote competition. Substantial funds have been expended to monitor activities that are anticompetitive and many jurisdictions also have adopted a form of "Cartel Leniency Program", first developed by the US Federal Trade Commission, to assist in cartel detection. Further, some jurisdictions are now criminalizing cartel behaviour so that cartel participants can be held criminally liable with substantial custodial penalties imposed. Notwithstanding these multijurisdictional approaches, a new form of possibly anticompetitive behaviour is looming. Synergistic monopolies ("synopolies") involve not competitors within a horizontal market but complementors within separate vertical markets. Where two complementary corporations are monopolists in their own markets they can, through various technologies, assist each other to expand their respective monopolies, thus creating a barrier to new entrants and/or blocking existing participants from further participation in that market. The nature of the technologies involved means that it is easy for this potentially anticompetitive activity to enter and affect the global marketplace. Competition regulators need to be aware of this potential for abuse and ensure that their respective competition frameworks appropriately address this activity. This paper discusses how new technologies can be used to create a synopoly.
Abstract:
This correspondence presents a microphone array shape calibration procedure for diffuse noise environments. The procedure estimates intermicrophone distances by fitting the measured noise coherence with its theoretical model and then estimates the array geometry using classical multidimensional scaling. The technique is validated on noise recordings from two office environments.
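The two steps named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the spherically isotropic (sinc) coherence model for diffuse noise, replaces whatever fitting procedure the paper uses with a simple grid search, and uses hypothetical names (`diffuse_coherence`, `fit_distance`, `classical_mds`) and a hypothetical four-microphone layout.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def diffuse_coherence(freqs_hz, distance_m):
    """Theoretical coherence of a spherically isotropic noise field (assumed model)."""
    # np.sinc(x) is sin(pi x) / (pi x), so this equals sin(2 pi f d / c) / (2 pi f d / c).
    return np.sinc(2.0 * freqs_hz * distance_m / SPEED_OF_SOUND)


def fit_distance(freqs_hz, measured, candidates_m):
    """Pick the candidate distance whose model coherence best matches the measurement."""
    errors = [np.sum((measured - diffuse_coherence(freqs_hz, d)) ** 2)
              for d in candidates_m]
    return float(candidates_m[int(np.argmin(errors))])


def classical_mds(distances, dims=2):
    """Recover coordinates (up to rotation, reflection, translation) from distances."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (d ** 2) @ centring
    # Keep the largest eigenvalues and their eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def pairwise_distances(points):
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


# Step 1: fit one intermicrophone distance from a (here synthetic) coherence curve.
freqs = np.linspace(50.0, 4000.0, 200)
measured = diffuse_coherence(freqs, 0.2)
fitted = fit_distance(freqs, measured, np.arange(0.01, 0.5, 0.001))

# Step 2: recover the geometry of a hypothetical planar array (metres) from distances.
true_positions = np.array([[0.0, 0.0], [0.2, 0.0], [0.2, 0.1], [0.0, 0.1]])
estimated = classical_mds(pairwise_distances(true_positions), dims=2)
```

The recovered coordinates match the true layout only up to a rigid transformation, so a comparison should be made on pairwise distances rather than on raw coordinates.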
Abstract:
Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory-only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory-only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentences under the auditory-only condition. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias. Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated cataracts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.
Abstract:
This paper proposes a clustered approach for blind beamforming from ad-hoc microphone arrays. In such arrangements, microphone placement is arbitrary and the speaker may be close to one, all or a subset of microphones at a given time. Practical issues with such a configuration mean that some microphones might be better discarded due to poor input signal-to-noise ratio (SNR) or undesirable spatial aliasing effects from large inter-element spacings when beamforming. Large inter-microphone spacings may also lead to inaccuracies in delay estimation during blind beamforming. In such situations, using a cluster of microphones (i.e., a sub-array), closely located both to each other and to the desired speech source, may provide more robust enhancement than the full array. This paper proposes a method for blind clustering of microphones based on the magnitude squared coherence function, and evaluates the method on a database recorded using various ad-hoc microphone arrangements.
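As a rough illustration of coherence-based microphone clustering, the sketch below estimates the magnitude squared coherence (MSC) between channel pairs and groups channels whose mean MSC exceeds a threshold. The abstract does not describe the clustering rule itself, so the threshold-and-merge rule, the 0.5 threshold, the frame length and the function names are all assumptions made for this example, and the signals are synthetic.

```python
import numpy as np


def magnitude_squared_coherence(x, y, frame_len=256):
    """Welch-style MSC estimate from non-overlapping Hann-windowed frames."""
    n_frames = min(len(x), len(y)) // frame_len
    window = np.hanning(frame_len)
    xf = np.fft.rfft(np.reshape(x[:n_frames * frame_len], (n_frames, frame_len)) * window, axis=1)
    yf = np.fft.rfft(np.reshape(y[:n_frames * frame_len], (n_frames, frame_len)) * window, axis=1)
    pxx = np.mean(np.abs(xf) ** 2, axis=0)      # auto-spectrum of x
    pyy = np.mean(np.abs(yf) ** 2, axis=0)      # auto-spectrum of y
    pxy = np.mean(xf * np.conj(yf), axis=0)     # cross-spectrum
    return np.abs(pxy) ** 2 / (pxx * pyy + 1e-12)


def cluster_by_coherence(signals, threshold=0.5):
    """Merge channels whose mean MSC exceeds the (assumed) threshold; return one label per channel."""
    labels = list(range(len(signals)))
    for i in range(len(signals)):
        for j in range(i + 1, len(signals)):
            msc = magnitude_squared_coherence(signals[i], signals[j])
            if np.mean(msc) > threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels


# Synthetic example: microphones 0 and 1 hear source A, microphones 2 and 3 hear source B.
rng = np.random.default_rng(0)
source_a = rng.standard_normal(16384)
source_b = rng.standard_normal(16384)
signals = [
    source_a + 0.1 * rng.standard_normal(16384),
    source_a + 0.1 * rng.standard_normal(16384),
    source_b + 0.1 * rng.standard_normal(16384),
    source_b + 0.1 * rng.standard_normal(16384),
]
labels = cluster_by_coherence(signals)
```

In this synthetic case microphones 0 and 1 end up with one label and microphones 2 and 3 with another. Real recordings would involve reverberation and propagation delays, so the threshold and frame length would need tuning.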
Abstract:
While spatial determinants of emmetropization have been examined extensively in animal models and spatial processing of human myopes has also been studied, there have been few studies investigating temporal aspects of emmetropization and temporal processing in human myopia. The influence of temporal light modulation on eye growth and refractive compensation has been observed in animal models and there is evidence of temporal visual processing deficits in individuals with high myopia or other pathologies. Given this, the aims of this work were to examine the relationships between myopia (i.e. degree of myopia and progression status) and temporal visual performance and to consider any temporal processing deficits in terms of the parallel retinocortical pathways. Three psychophysical studies investigating temporal processing performance were conducted in young adult myopes and non-myopes: (1) backward visual masking, (2) dot motion perception and (3) phantom contour. For each experiment there were approximately 30 young emmetropes, 30 low myopes (myopia less than 5 D) and 30 high myopes (5 to 12 D). In the backward visual masking experiment, myopes were also classified according to their progression status (30 stable myopes and 30 progressing myopes). The first study was based on the observation that the visibility of a target is reduced by a second target, termed the mask, presented quickly after the first target. Myopes were more affected by the mask when the task was biased towards the magnocellular pathway; myopes had a 25% mean reduction in performance compared with emmetropes. However, there was no difference in the effect of the mask when the task was biased towards the parvocellular system. For all test conditions, there was no significant correlation between backward visual masking task performance and either the degree of myopia or myopia progression status. 
The dot motion perception study measured detection thresholds for the minimum displacement of moving dots, the maximum displacement of moving dots and the degree of motion coherence required to correctly determine the direction of motion. The visual processing of these tasks is dominated by the magnocellular pathway. Compared with emmetropes, high myopes had reduced ability to detect the minimum displacement of moving dots for stimuli presented at the fovea (20% higher mean threshold) and possibly at the inferior nasal retina. The minimum displacement threshold was significantly and positively correlated with myopia magnitude and axial length, and significantly and negatively correlated with retinal thickness for the inferior nasal retina. The performance of emmetropes and myopes on all the other dot motion perception tasks was similar. In the phantom contour study, the highest temporal frequency of the flickering phantom pattern at which the contour was visible was determined. Myopes had significantly lower flicker detection limits (21.8 ± 7.1 Hz) than emmetropes (25.6 ± 8.8 Hz) for tasks biased towards the magnocellular pathway for both high (99%) and low (5%) contrast stimuli. There was no difference in flicker limits for a phantom contour task biased towards the parvocellular pathway. For all phantom contour tasks, there was no significant correlation between flicker detection thresholds and magnitude of myopia. Of the psychophysical temporal tasks studied here, those primarily involving processing by the magnocellular pathway revealed differences in performance between the refractive error groups. While there are a number of interpretations of these data, they suggest that there may be a temporal processing deficit in some myopes that is selective for the magnocellular system. The minimum displacement dot motion perception task appears to be the most sensitive test, of those studied, for investigating changes in visual temporal processing in myopia.
Data from the visual masking and phantom contour tasks suggest that the alterations to temporal processing occur at an early stage of myopia development. In addition, the link between increased minimum displacement threshold and decreasing retinal thickness suggests that there is a retinal component to the observed modifications in temporal processing.
Abstract:
This paper examines the role of intuition in the way that people operate unfamiliar devices. Intuition is a type of cognitive processing that is often non-conscious and utilises stored experiential knowledge. Intuitive interaction involves the use of knowledge gained from other products and/or experiences. Two initial experimental studies revealed that prior exposure to products employing similar features helped participants to complete set tasks more quickly and intuitively, and that familiar features were intuitively used more often than unfamiliar ones. A third experiment confirmed that performance is affected by a person's level of familiarity with similar technologies, and also revealed that appearance (shape, size and labelling of features) seems to be the variable that most affects time spent on a task and intuitive uses during that time. Age also seems to have an effect. These results and their implications are discussed.
Abstract:
The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more dispersed representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mismatched to the evaluation conditions.
Abstract:
This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using SVM feature sets other than the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying feature spaces to provide added robustness.
Abstract:
Defibrillator is a 16’41” musical work for solo performer, laptop computer and electric guitar. The electric guitar is processed in real time by a digital signal processing network in software, with gestural control provided by a foot-operated pedal board. The work is informed by a range of ideas from the genres of electroacoustic music, western art music, popular music and cinematic sound. It seeks to fluidly cross and hybridise musical practices from these diverse sonic traditions and to develop a compositional language that draws upon multiple genres, but at the same time resists the ability to be located within a singular genre. Musical structures and sonic markers which form genre are ruptured at strategic levels of the musical structure in order to allow for a cross flow of concepts between genres. The process of rupture is facilitated by the practical implementation of music and sound reception theories into the compositional process. The piece exhibits the by-products of a composer born into a media-saturated environment, drawing on a range of musical and sonic traditions, actively seeking to explore the liminal space in between these traditions. The project stems from the author's research interests in locating points of connection between traditions of experimentation in diverse musical and sonic traditions arising from the broad uptake of media technologies in the early 20th century.
Abstract:
E-commerce technologies such as websites, email and web browsers enable access to large amounts of information, facilitate communication and provide niche companies with an effective mechanism for competing with larger organisations worldwide. However, recent literature has shown that Australian SMEs have been slow in the uptake of these technologies. The aim of this research was to determine which factors were important in small firms' decision making in respect of information technology and e-commerce adoption. Findings indicate that, generally, the more a firm was concerned about its competitive position, the more likely it was to develop a web site. Moreover, the 'Industry and Skill Demands' dimension suggested that as the formal education of the owner/manager increased, coupled with the likelihood that the firm was in the transport and storage or communication services industries, and with the realisation that the cost of IT adoption was in effect an investment, such a firm would be inclined to develop a web site.