987 resultados para sound processing
Resumo:
British Petroleum (89A-1204); Defense Advanced Research Projects Agency (N00014-92-J-4015); National Science Foundation (IRI-90-00530); Office of Naval Research (N00014-91-J-4100); Air Force Office of Scientific Research (F49620-92-J-0225)
Resumo:
An improved Boundary Contour System (BCS) and Feature Contour System (FCS) neural network model of preattentive vision is applied to large images containing range data gathered by a synthetic aperture radar (SAR) sensor. The goal of processing is to make structures such as motor vehicles, roads, or buildings more salient and more interpretable to human observers than they are in the original imagery. Early processing by shunting center-surround networks compresses signal dynamic range and performs local contrast enhancement. Subsequent processing by filters sensitive to oriented contrast, including short-range competition and long-range cooperation, segments the image into regions. The segmentation is performed by three "copies" of the BCS and FCS, of small, medium, and large scales, wherein the "short-range" and "long-range" interactions within each scale occur over smaller or larger distances, corresponding to the size of the early filters of each scale. A diffusive filling-in operation within the segmented regions at each scale produces coherent surface representations. The combination of BCS and FCS helps to locate and enhance structure over regions of many pixels, without the resulting blur characteristic of approaches based on low spatial frequency filtering alone.
Resumo:
This article describes a nonlinear model of neural processing in the vertebrate retina, comprising model photoreceptors, model push-pull bipolar cells, and model ganglion cells. Previous analyses and simulations have shown that with a choice of parameters that mimics beta cells, the model exhibits X-like linear spatial summation (null response to contrast-reversed gratings) in spite of photoreceptor nonlinearities; on the other hand, a choice of parameters that mimics alpha cells leads to Y-like frequency doubling. This article extends the previous work by showing that the model can replicate qualitatively many of the original findings on X and Y cells with a fixed choice of parameters. The results generally support the hypothesis that X and Y cells can be seen as functional variants of a single neural circuit. The model also suggests that both depolarizing and hyperpolarizing bipolar cells converge onto both ON and OFF ganglion cell types. The push-pull connectivity enables ganglion cells to remain sensitive to deviations about the mean output level of nonlinear photoreceptors. These and other properties of the push-pull model are discussed in the general context of retinal processing of spatiotemporal luminance patterns.
Resumo:
An improved Boundary Contour System (BCS) and Feature Contour System (FCS) neural network model of preattentive vision is applied to two large images containing range data gathered by a synthetic aperture radar (SAR) sensor. The goal of processing is to make structures such as motor vehicles, roads, or buildings more salient and more interpretable to human observers than they are in the original imagery. Early processing by shunting center-surround networks compresses signal dynamic range and performs local contrast enhancement. Subsequent processing by filters sensitive to oriented contrast, including short-range competition and long-range cooperation, segments the image into regions. Finally, a diffusive filling-in operation within the segmented regions produces coherent visible structures. The combination of BCS and FCS helps to locate and enhance structure over regions of many pixels, without the resulting blur characteristic of approaches based on low spatial frequency filtering alone.
Resumo:
A computational model of visual processing in the vertebrate retina provides a unified explanation of a range of data previously treated by disparate models. Three results are reported here: the model proposes a functional explanation for the primary feed-forward retinal circuit found in vertebrate retinae, it shows how this retinal circuit combines nonlinear adaptation with the desirable properties of linear processing, and it accounts for the origin of parallel transient (nonlinear) and sustained (linear) visual processing streams as simple variants of the same retinal circuit. The retina, owing to its accessibility and to its fundamental role in the initial transduction of light into neural signals, is among the most extensively studied neural structures in the nervous system. Since the pioneering anatomical work by Ramón y Cajal at the turn of the last century[1], technological advances have abetted detailed descriptions of the physiological, pharmacological, and functional properties of many types of retinal cells. However, the relationship between structure and function in the retina is still poorly understood. This article outlines a computational model developed to address fundamental constraints of biological visual systems. Neurons that process nonnegative input signals-such as retinal illuminance-are subject to an inescapable tradeoff between accurate processing in the spatial and temporal domains. Accurate processing in both domains can be achieved with a model that combines nonlinear mechanisms for temporal and spatial adaptation within three layers of feed-forward processing. The resulting architecture is structurally similar to the feed-forward retinal circuit connecting photoreceptors to retinal ganglion cells through bipolar cells. This similarity suggests that the three-layer structure observed in all vertebrate retinae[2] is a required minimal anatomy for accurate spatiotemporal visual processing. This hypothesis is supported through computer simulations showing that the model's output layer accounts for many properties of retinal ganglion cells[3],[4],[5],[6]. Moreover, the model shows how the retina can extend its dynamic range through nonlinear adaptation while exhibiting seemingly linear behavior in response to a variety of spatiotemporal input stimuli. This property is the basis for the prediction that the same retinal circuit can account for both sustained (X) and transient (Y) cat ganglion cells[7] by simple morphological changes. The ability to generate distinct functional behaviors by simple changes in cell morphology suggests that different functional pathways originating in the retina may have evolved from a unified anatomy designed to cope with the constraints of low-level biological vision.
Resumo:
A model of pitch perception, called the Spatial Pitch Network or SPINET model, is developed and analyzed. The model neurally instantiates ideas front the spectral pitch modeling literature and joins them to basic neural network signal processing designs to simulate a broader range of perceptual pitch data than previous spectral models. The components of the model arc interpreted as peripheral mechanical and neural processing stages, which arc capable of being incorporated into a larger network architecture for separating multiple sound sources in the environment. The core of the new model transforms a spectral representation of an acoustic source into a spatial distribution of pitch strengths. The SPINET model uses a weighted "harmonic sieve" whereby the strength of activation of a given pitch depends upon a weighted sum of narrow regions around the harmonics of the nominal pitch value, and higher harmonics contribute less to a pitch than lower ones. Suitably chosen harmonic weighting functions enable computer simulations of pitch perception data involving mistuned components, shifted harmonics, and various types of continuous spectra including rippled noise. It is shown how the weighting functions produce the dominance region, how they lead to octave shifts of pitch in response to ambiguous stimuli, and how they lead to a pitch region in response to the octave-spaced Shepard tone complexes and Deutsch tritones without the use of attentional mechanisms to limit pitch choices. An on-center off-surround network in the model helps to produce noise suppression, partial masking and edge pitch. Finally, it is shown how peripheral filtering and short term energy measurements produce a model pitch estimate that is sensitive to certain component phase relationships.
Resumo:
We describe a 42.6 Gbit/s all-optical pattern recognition system which uses semiconductor optical amplifiers (SOAs). A circuit with three SOA-based logic gates is used to identify the presence of specific port numbers in an optical packet header.
Resumo:
Great demand in power optimized devices shows promising economic potential and draws lots of attention in industry and research area. Due to the continuously shrinking CMOS process, not only dynamic power but also static power has emerged as a big concern in power reduction. Other than power optimization, average-case power estimation is quite significant for power budget allocation but also challenging in terms of time and effort. In this thesis, we will introduce a methodology to support modular quantitative analysis in order to estimate average power of circuits, on the basis of two concepts named Random Bag Preserving and Linear Compositionality. It can shorten simulation time and sustain high accuracy, resulting in increasing the feasibility of power estimation of big systems. For power saving, firstly, we take advantages of the low power characteristic of adiabatic logic and asynchronous logic to achieve ultra-low dynamic and static power. We will propose two memory cells, which could run in adiabatic and non-adiabatic mode. About 90% dynamic power can be saved in adiabatic mode when compared to other up-to-date designs. About 90% leakage power is saved. Secondly, a novel logic, named Asynchronous Charge Sharing Logic (ACSL), will be introduced. The realization of completion detection is simplified considerably. Not just the power reduction improvement, ACSL brings another promising feature in average power estimation called data-independency where this characteristic would make power estimation effortless and be meaningful for modular quantitative average case analysis. Finally, a new asynchronous Arithmetic Logic Unit (ALU) with a ripple carry adder implemented using the logically reversible/bidirectional characteristic exhibiting ultra-low power dissipation with sub-threshold region operating point will be presented. The proposed adder is able to operate multi-functionally.
Resumo:
As a by-product of the ‘information revolution’ which is currently unfolding, lifetimes of man (and indeed computer) hours are being allocated for the automated and intelligent interpretation of data. This is particularly true in medical and clinical settings, where research into machine-assisted diagnosis of physiological conditions gains momentum daily. Of the conditions which have been addressed, however, automated classification of allergy has not been investigated, even though the numbers of allergic persons are rising, and undiagnosed allergies are most likely to elicit fatal consequences. On the basis of the observations of allergists who conduct oral food challenges (OFCs), activity-based analyses of allergy tests were performed. Algorithms were investigated and validated by a pilot study which verified that accelerometer-based inquiry of human movements is particularly well-suited for objective appraisal of activity. However, when these analyses were applied to OFCs, accelerometer-based investigations were found to provide very poor separation between allergic and non-allergic persons, and it was concluded that the avenues explored in this thesis are inadequate for the classification of allergy. Heart rate variability (HRV) analysis is known to provide very significant diagnostic information for many conditions. Owing to this, electrocardiograms (ECGs) were recorded during OFCs for the purpose of assessing the effect that allergy induces on HRV features. It was found that with appropriate analysis, excellent separation between allergic and nonallergic subjects can be obtained. These results were, however, obtained with manual QRS annotations, and these are not a viable methodology for real-time diagnostic applications. Even so, this was the first work which has categorically correlated changes in HRV features to the onset of allergic events, and manual annotations yield undeniable affirmation of this. Fostered by the successful results which were obtained with manual classifications, automatic QRS detection algorithms were investigated to facilitate the fully automated classification of allergy. The results which were obtained by this process are very promising. Most importantly, the work that is presented in this thesis did not obtain any false positive classifications. This is a most desirable result for OFC classification, as it allows complete confidence to be attributed to classifications of allergy. Furthermore, these results could be particularly advantageous in clinical settings, as machine-based classification can detect the onset of allergy which can allow for early termination of OFCs. Consequently, machine-based monitoring of OFCs has in this work been shown to possess the capacity to significantly and safely advance the current state of clinical art of allergy diagnosis
Resumo:
The synthetic utilities of the diazo and diazonium groups are matched only by their reputation for explosive decomposition. Continuous processing technology offers new opportunities to make and use these versatile intermediates at a range of scales with improved safety over traditional batch processes. In this minireview, the state of the art in the continuous flow processing of reactive diazo and diazonium species is discussed.
Resumo:
This thesis details an experimental and simulation investigation of some novel all-optical signal processing techniques for future optical communication networks. These all-optical techniques include modulation format conversion, phase discrimination and clock recovery. The methods detailed in this thesis use the nonlinearities associated with semiconductor optical amplifiers (SOA) to manipulate signals in the optical domain. Chapter 1 provides an introduction into the work detailed in this thesis, discusses the increased demand for capacity in today’s optical fibre networks and finally explains why all-optical signal processing may be of interest for future optical networks. Chapter 2 discusses the relevant background information required to fully understand the all-optical techniques demonstrated in this thesis. Chapter 3 details some pump-probe measurement techniques used to calculate the gain and phase recovery times of a long SOA. A remarkably fast gain recovery is observed and the wavelength dependent nature of this recovery is investigated. Chapter 4 discusses the experimental demonstration of an all-optical modulation conversion technique which can convert on-off- keyed data into either duobinary or alternative mark inversion. In Chapter 5 a novel phase sensitive frequency conversion scheme capable of extracting the two orthogonal components of a quadrature phase modulated signal into two separate frequencies is demonstrated. Chapter 6 investigates a novel all-optical clock recovery technique for phase modulated optical orthogonal frequency division multiplexing superchannels and finally Chapter 7 provides a brief conclusion.
Resumo:
Reflecting on Gus Van Sant’s films Gerry (2003) Elephant (2004) and Last Days (2005), the director’s long-term sound-designer Leslie Shatz observed that “You have to get into the totality of the experience and not just the dialogue”. Shatz’s comment expresses something fundamental about the experimental approach to cinema and to soundscapes undertaken by Van Sant in these three films, unofficially known as the “Death Trilogy”. This thesis contends that Van Sant makes deliberate aesthetic choices which do indicate a distinctly “auteurist” leaning. However, I also argue that intertextual elements, prior knowledge, and audience participation in meaningmaking enhance the experience of, and reveal the nuances in, the soundtracks themselves. This thesis aims to contribute to a growing body of work within filmmusic scholarship concerned with resisting a traditional bias in the field: that film music should be understood as a means of characterisation and as emotional signifier. The films of the “Death Quartet”, which includes Paranoid Park (2007), I believe, offer fertile ground on which to explore these new approaches. It is my contention that these films deconstruct the traditional approach to soundtracking and the relationship between soundtrack and character, and that only an approach sensitive to the aesthetic and philosophical functions of music and sound can adequately acknowledge their unique cinematic qualities.
Resumo:
The Lucumi religion (also Santeria and Regla de Ocha) developed in 19th-century colonial Cuba, by syncretizing elements of Catholicism with the Yoruba worship of orisha. When fully initiated, santeros (priests) actively participate in religious ceremonies by periodically being possessed or "mounted" by a patron saint or orisha, usually within the context of a drumming ritual, known as a toque de santo, bembe, or tambor. Within these rituals, there is a clearly defined goal of trance possession, though its manifestation is not the sole measure of success or failure. Rather than focusing on the fleeting, exciting moments that immediately precede the arrival of an orisha in the form of a possession trance, this thesis investigates the entire four- to six-hour musical performance that is central to the ceremony. It examines the brief pauses, the moments of reduced intensity, the slow but deliberate build-ups of energy and excitement, and even the periods when novices are invited to perform the sacred bata drums, and places these moments on an equal footing with the more dynamic periods where possession is imminent or in progress. This document approaches Lucumi ritual from the viewpoint of bata drummers, ritual specialists who, during the course of a toque de santo, exercise wide latitude in determining the shape of the event. Known as omo Ana (children of the orisha Ana who is manifest in drums and rhythms), bata drummers comprise a fraternity that is accessible only through ritual initiation. Though they are sensitive to the desires of the many participants during a toque de santo, and indeed make their living by satisfying the expectations of their hosts, many of the drummers' activities are inwardly focused on the cultivation and preservation of this fraternity. Occasionally interfering with spirit possession, and other expectations of the participants, these aberrant activities include teaching and learning, developing group identity or signature sound, and achieving a state of intimacy among the musicians known as "communitas."
Resumo:
Figer (to congeal, to solidify) is a quadraphonic electroacoustic composition. It was completed in the fall of 2003. Several software programs were used in creating and assembling the piece (C-Sound, Grain Mill, AL/Erwin (grain generator), Sound Forge and Acid Music). The sounds used in the piece are of two general types: synthesized and sampled, both of which were subjected to various processing techniques. The most important of these techniques, and one that formally defines large portions of the piece, is granular synthesis. Form The notion of time perception is of great importance in this piece. Figer addresses this question in several ways. In one sense, the form of Figer is simple. There are three layers of activity (see diagram). Layer 1 is continuous and non-sectional and supplies a backdrop (not necessarily a background) for the other two. The second and third layers overlap and interrupt one another. Each consists of two blocks of sound. The layers, and blocks within, relate to each other in various ways. Layer 1 is formally continuous. Layer 2 consists of well-defined columns of sound that evolve from soft and mild to loud and abrasive. The layer is, in reality, a whole that is simply cut into two parts (block 1 and block 2). In contrast, the blocks of layer 3 do not constitute a whole. Each is a complete unit and has its own self-contained evolutionary path. Those paths, however, do cross the paths of other units (layers, blocks), influencing them and absorbing some of their essence. At the heart of Figer lies a constant process of presenting materials or ideas and immediately, or, at times, simultaneously, commenting, reflecting on, or reinterpreting that material. All of the layers of this piece deal, both at local and global levels, with the problem of time and its perception relative to the materials, sonic or otherwise, that occupy it and the manner in which they unfold and relate to each other.
Resumo:
INTRODUCTION: The characterization of urinary calculi using noninvasive methods has the potential to affect clinical management. CT remains the gold standard for diagnosis of urinary calculi, but has not reliably differentiated varying stone compositions. Dual-energy CT (DECT) has emerged as a technology to improve CT characterization of anatomic structures. This study aims to assess the ability of DECT to accurately discriminate between different types of urinary calculi in an in vitro model using novel postimage acquisition data processing techniques. METHODS: Fifty urinary calculi were assessed, of which 44 had >or=60% composition of one component. DECT was performed utilizing 64-slice multidetector CT. The attenuation profiles of the lower-energy (DECT-Low) and higher-energy (DECT-High) datasets were used to investigate whether differences could be seen between different stone compositions. RESULTS: Postimage acquisition processing allowed for identification of the main different chemical compositions of urinary calculi: brushite, calcium oxalate-calcium phosphate, struvite, cystine, and uric acid. Statistical analysis demonstrated that this processing identified all stone compositions without obvious graphical overlap. CONCLUSION: Dual-energy multidetector CT with postprocessing techniques allows for accurate discrimination among the main different subtypes of urinary calculi in an in vitro model. The ability to better detect stone composition may have implications in determining the optimum clinical treatment modality for urinary calculi from noninvasive, preprocedure radiological assessment.