997 results for Image-sound disjunctions


Relevance:

20.00% 20.00%

Publisher:

Abstract:

A non-linear supervised learning architecture, the Specialized Mapping Architecture (SMA), and its application to articulated body pose reconstruction from single monocular images are described. The architecture is formed by a number of specialized mapping functions, each of which maps certain portions (connected or not) of the input space, and a feedback matching process. A probabilistic model for the architecture is described along with a mechanism for learning its parameters. The learning problem is approached using a maximum likelihood estimation framework; we present Expectation Maximization (EM) algorithms for two different instances of the likelihood probability. Performance is characterized by estimating human body postures from low-level visual features, showing promising results.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. 
The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
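
The retrieval idea in the abstract above (embed objects into a vector space with a cheap distance, retrieve approximate candidates there, then apply the exact distance only to those candidates) can be illustrated with a minimal filter-and-refine sketch. This is not the BoostMap algorithm: the reference objects are hand-picked rather than learned, edit distance stands in for an expensive distance measure, and every name below (`embed`, `filter_and_refine`, the toy word database) is a hypothetical choice made for illustration.

```python
import heapq


def edit_distance(a, b):
    """Levenshtein distance, standing in for an expensive distance measure."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def embed(obj, references, dist):
    """Map an object to the vector of its distances to the reference objects."""
    return [dist(obj, r) for r in references]


def l1(u, v):
    """Cheap vector-space distance used in the filter step."""
    return sum(abs(a - b) for a, b in zip(u, v))


def filter_and_refine(query, database, db_vectors, references, dist, p, k):
    """Keep the p closest candidates in the embedding, then rank them exactly."""
    q_vec = embed(query, references, dist)
    # Filter: compare vectors only; no exact distances to database objects.
    candidates = heapq.nsmallest(
        p, range(len(database)), key=lambda i: l1(q_vec, db_vectors[i]))
    # Refine: exact distance on the p surviving candidates.
    return heapq.nsmallest(
        k, (database[i] for i in candidates), key=lambda o: dist(query, o))


database = ["kitten", "sitting", "mitten", "bitten",
            "written", "smitten", "fitting"]
references = ["kitten", "sitting"]  # assumption: hand-picked, not learned
# Database vectors would be computed once, offline.
db_vectors = [embed(o, references, edit_distance) for o in database]

result = filter_and_refine("sittin", database, db_vectors, references,
                           edit_distance, p=3, k=1)
print(result)
```

In this toy setup a query costs one exact distance per reference object plus `p` exact distances in the refine step, instead of one per database object. How much that saves, and how often the true nearest neighbor survives the filter, depends on the choice of references and of `p`.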

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Spectral methods of graph partitioning have been shown to provide a powerful approach to the image segmentation problem. In this paper, we adopt a different approach, based on estimating the isoperimetric constant of an image graph. Our algorithm produces the high quality segmentations and data clustering of spectral methods, but with improved speed and stability.
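
The abstract does not define the isoperimetric constant, so the sketch below assumes one common graph-theoretic definition: the smallest ratio of boundary weight to volume over all node subsets holding at most half of the total volume, where volume is the sum of weighted degrees. The brute-force search is only feasible for tiny graphs and is not the paper's estimation algorithm; the function names and the toy graph are hypothetical.

```python
from itertools import combinations


def volume(weights, subset):
    """Sum of weighted degrees of the nodes in `subset`."""
    return sum(w for (u, v), w in weights.items()
               for n in (u, v) if n in subset)


def boundary(weights, subset):
    """Total weight of edges with exactly one endpoint in `subset`."""
    return sum(w for (u, v), w in weights.items()
               if (u in subset) != (v in subset))


def brute_force_partition(nodes, weights):
    """Try every subset with at most half the total volume; keep the best ratio."""
    half = volume(weights, set(nodes)) / 2
    best = None
    for size in range(1, len(nodes)):
        for combo in combinations(nodes, size):
            subset = set(combo)
            vol = volume(weights, subset)
            if 0 < vol <= half:
                ratio = boundary(weights, subset) / vol
                if best is None or ratio < best[0]:
                    best = (ratio, subset)
    return best


# Toy "image graph": two tightly connected triangles joined by one weak edge.
nodes = [0, 1, 2, 3, 4, 5]
weights = {(0, 1): 10, (1, 2): 10, (0, 2): 10,
           (3, 4): 10, (4, 5): 10, (3, 5): 10,
           (2, 3): 1}

best_ratio, best_subset = brute_force_partition(nodes, weights)
print(best_ratio, best_subset)
```

On this graph the best subset is one of the two triangles, since cutting the single weak edge gives a small boundary relative to the volume on either side.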

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Office of Naval Research (N00014-01-1-0624)

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Classifying novel terrain or objects from sparse, complex data may require the resolution of conflicting information from sensors working at different times, locations, and scales, and from sources with different goals and situations. Information fusion methods can help resolve inconsistencies, as when evidence variously suggests that an object's class is car, truck, or airplane. The methods described here consider a complementary problem, supposing that information from sensors and experts is reliable though inconsistent, as when evidence suggests that an object's class is car, vehicle, and man-made. Underlying relationships among objects are assumed to be unknown to the automated system or the human user. The ARTMAP information fusion system uses distributed code representations that exploit the neural network's capacity for one-to-many learning in order to produce self-organizing expert systems that discover hierarchical knowledge structures. The system infers multi-level relationships among groups of output classes, without any supervised labeling of these relationships.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Air Force Office of Scientific Research (F49620-01-1-0423); National Geospatial-Intelligence Agency (NMA 201-01-1-2016); National Science Foundation (SBE-035437, DEG-0221680); Office of Naval Research (N00014-01-1-0624)

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Under natural viewing conditions, small movements of the eye, head, and body prevent the maintenance of a steady direction of gaze. It is known that stimuli tend to fade when they are stabilized on the retina for several seconds. However, it is unclear whether the physiological motion of the retinal image serves a visual purpose during the brief periods of natural visual fixation. This study examines the impact of fixational instability on the statistics of the visual input to the retina and on the structure of neural activity in the early visual system. We show that fixational instability introduces a component in the retinal input signals that, in the presence of natural images, lacks spatial correlations. This component strongly influences neural activity in a model of the LGN. It decorrelates cell responses even if the contrast sensitivity functions of simulated cells are not perfectly tuned to counterbalance the power-law spectrum of natural images. A decorrelation of neural activity at the early stages of the visual system has been proposed to be beneficial for discarding statistical redundancies in the input signals. The results of this study suggest that fixational instability might contribute to establishing efficient representations of natural stimuli.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A fast and efficient segmentation algorithm based on the Boundary Contour System/Feature Contour System (BCS/FCS) of Grossberg and Mingolla [3] is presented. This implementation is based on the FFT algorithm and the parallelism of the system.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The What-and-Where filter forms part of a neural network architecture for spatial mapping, object recognition, and image understanding. The Where filter responds to an image figure that has been separated from its background. It generates a spatial map whose cell activations simultaneously represent the position, orientation, and size of all the figures in a scene (where they are). This spatial map may be used to direct spatially localized attention to these image features. A multiscale array of oriented detectors, followed by competitive and interpolative interactions between position, orientation, and size scales, is used to define the Where filter. This analysis discloses several issues that need to be dealt with by a spatial mapping system that is based upon oriented filters, such as the role of cliff filters with and without normalization, the double peak problem of maximum orientation across size scale, and the different self-similar interpolation properties across orientation than across size scale. Several computationally efficient Where filters are proposed. The Where filter may be used for parallel transformation of multiple image figures into invariant representations that are insensitive to the figures' original position, orientation, and size. These invariant figural representations form part of a system devoted to attentive object learning and recognition (what it is). Unlike some alternative models where serial search for a target occurs, a What and Where representation can be used to rapidly search in parallel for a desired target in a scene. Such a representation can also be used to learn multidimensional representations of objects and their spatial relationships for purposes of image understanding.
The What-and-Where filter is inspired by neurobiological data showing that a Where processing stream in the cerebral cortex is used for attentive spatial localization and orientation, whereas a What processing stream is used for attentive object learning and recognition.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A feedforward neural network for invariant image preprocessing is proposed that represents the position, orientation, and size of an image figure (where it is) in a multiplexed spatial map. This map is used to generate an invariant representation of the figure that is insensitive to position, orientation, and size for purposes of pattern recognition (what it is). A multiscale array of oriented filters followed by competition between orientations and scales is used to define the Where filter.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Reflecting on Gus Van Sant’s films Gerry (2003), Elephant (2004), and Last Days (2005), the director’s long-term sound designer Leslie Shatz observed that “You have to get into the totality of the experience and not just the dialogue”. Shatz’s comment expresses something fundamental about the experimental approach to cinema and to soundscapes undertaken by Van Sant in these three films, unofficially known as the “Death Trilogy”. This thesis contends that Van Sant makes deliberate aesthetic choices which do indicate a distinctly “auteurist” leaning. However, I also argue that intertextual elements, prior knowledge, and audience participation in meaning-making enhance the experience of, and reveal the nuances in, the soundtracks themselves. This thesis aims to contribute to a growing body of work within film music scholarship concerned with resisting a traditional bias in the field: that film music should be understood as a means of characterisation and as emotional signifier. The films of the “Death Quartet”, which includes Paranoid Park (2007), I believe, offer fertile ground on which to explore these new approaches. It is my contention that these films deconstruct the traditional approach to soundtracking and the relationship between soundtrack and character, and that only an approach sensitive to the aesthetic and philosophical functions of music and sound can adequately acknowledge their unique cinematic qualities.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Lucumi religion (also Santeria and Regla de Ocha) developed in 19th-century colonial Cuba, by syncretizing elements of Catholicism with the Yoruba worship of orisha. When fully initiated, santeros (priests) actively participate in religious ceremonies by periodically being possessed or "mounted" by a patron saint or orisha, usually within the context of a drumming ritual, known as a toque de santo, bembe, or tambor. Within these rituals, there is a clearly defined goal of trance possession, though its manifestation is not the sole measure of success or failure. Rather than focusing on the fleeting, exciting moments that immediately precede the arrival of an orisha in the form of a possession trance, this thesis investigates the entire four- to six-hour musical performance that is central to the ceremony. It examines the brief pauses, the moments of reduced intensity, the slow but deliberate build-ups of energy and excitement, and even the periods when novices are invited to perform the sacred bata drums, and places these moments on an equal footing with the more dynamic periods where possession is imminent or in progress. This document approaches Lucumi ritual from the viewpoint of bata drummers, ritual specialists who, during the course of a toque de santo, exercise wide latitude in determining the shape of the event. Known as omo Ana (children of the orisha Ana who is manifest in drums and rhythms), bata drummers comprise a fraternity that is accessible only through ritual initiation. Though they are sensitive to the desires of the many participants during a toque de santo, and indeed make their living by satisfying the expectations of their hosts, many of the drummers' activities are inwardly focused on the cultivation and preservation of this fraternity. 
Occasionally interfering with spirit possession, and other expectations of the participants, these aberrant activities include teaching and learning, developing group identity or signature sound, and achieving a state of intimacy among the musicians known as "communitas."

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The artistic play of light seen on a pyramid in some Mayan ruins located in Cancun, Mexico provided the inspiration for Vision of Equinox. On both the spring and autumn equinox days, the sunlight projected on the pyramid forms a shape which looks like a serpent moving on the stairway of the pyramid. Vision of Equinox was composed with an image of light as the model for the artistic transfiguration of sound. The light image of sound changes its shape in each stage of the piece, using the orchestra in different ways - sometimes like a chamber ensemble, sometimes like one big instrument. The image of light casting on a pyramid is expressed by descending melodic lines that can be heard several times in the piece. At the final climax of the work, a complete and embodied artistic figure is formed and stated, expressing the appearance of the Mayan god Quetzalcoatl, the serpent, in my own imagination. The light and shadow which comprise this pyramid art are treated as two contrasting elements in my composition and become the two main motives in this piece. To express these two contrasting elements, I picked the numbers "5" and "2," and used them as "key numbers" in this piece. As a result, the intervals of a fifth and a second (sometimes inverted as a seventh) are the two main intervals used in the structure. The interval of a fifth was taken into account for the construction of the pyramid, which has five points of contact. The interval of a second was selected as a contrasting sonority to the fifth. Further, the numbers "5" and "2" are used as the number of notes which form the main motives in this piece; quintuplets are used throughout this piece, and the short motive made by two sixteenth notes is used as one of the main motives in this piece. Moreover, the shape of the pyramid provided a concept of symmetry, which is expressed by the setting of a central point of the music (pitch center) as well as the use of retrograde and inversion in this piece.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: Body image (BI) and body satisfaction may be important in understanding weight loss behaviors, particularly during the postpartum period. We assessed these constructs among African American and white overweight postpartum women. METHODS: The sample included 162 women (73 African American and 89 white) in the intervention arm 6 months into the Active Mothers Postpartum (AMP) Study, a nutritional and physical activity weight loss intervention. BIs, self-reported using the Stunkard figure rating scale, were compared assessing mean values by race. Body satisfaction was measured using body discrepancy (BD), calculated as perceived current image minus ideal image (BD<0: desire to be heavier; BD>0: desire to be lighter). BD was assessed by race for: BD(Ideal) (current image minus the ideal image) and BD(Ideal Mother) (current image minus ideal mother image). RESULTS: Compared with white women, African American women were younger and were less likely to report being married, having any college education, or residing in households with annual incomes >$30,000 (all p < 0.01). They also had a higher mean body mass index (BMI) (p = 0.04), although perceived current BI did not differ by race (p = 0.21). African Americans had higher mean ideal (p = 0.07) and ideal mother (p = 0.001) BIs compared with whites. African Americans' mean BDs (adjusting for age, BMI, education, income, marital status, and interaction terms) were significantly lower than those of whites, indicating greater body satisfaction among African Americans (BD(Ideal): 1.7 vs. 2.3, p = 0.005; BD(Ideal Mother): 1.1 vs. 1.8, p = 0.0002). CONCLUSIONS: Racial differences exist in postpartum weight, ideal images, and body satisfaction. Healthcare providers should consider tailored messaging that accounts for these racially different perceptions and factors when designing weight loss programs for overweight mothers.
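
The body discrepancy measure defined in the METHODS above is a simple difference, which a short worked sketch makes concrete. The function names and the example ratings are hypothetical, and the label returned for a discrepancy of exactly zero is an assumption, since the abstract only interprets BD<0 and BD>0.

```python
def body_discrepancy(current_image, ideal_image):
    """BD as defined in the abstract: perceived current image minus ideal image."""
    return current_image - ideal_image


def interpret(bd):
    """Interpretation from the abstract; the BD == 0 label is an assumption."""
    if bd < 0:
        return "desire to be heavier"
    if bd > 0:
        return "desire to be lighter"
    return "no discrepancy"


# Hypothetical figure ratings for one participant (not data from the study).
current, ideal, ideal_mother = 6, 4, 5

bd_ideal = body_discrepancy(current, ideal)
bd_ideal_mother = body_discrepancy(current, ideal_mother)
print(bd_ideal, interpret(bd_ideal))
print(bd_ideal_mother, interpret(bd_ideal_mother))
```

A smaller positive BD means the perceived current figure is closer to the ideal, which is how the abstract reads the lower adjusted means as greater body satisfaction.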

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The ability to isolate a single sound source among concurrent sources and reverberant energy is necessary for understanding the auditory world. The precedence effect describes a related experimental finding, that when presented with identical sounds from two locations with a short onset asynchrony (on the order of milliseconds), listeners report a single source with a location dominated by the lead sound. Single-cell recordings in multiple animal models have indicated that there are low-level mechanisms that may contribute to the precedence effect, yet psychophysical studies in humans have provided evidence that top-down cognitive processes have a great deal of influence on the perception of simulated echoes. In the present study, event-related potentials evoked by click pairs at and around listeners' echo thresholds indicate that perception of the lead and lag sound as individual sources elicits a negativity between 100 and 250 msec, previously termed the object-related negativity (ORN). Even for physically identical stimuli, the ORN is evident when listeners report hearing, as compared with not hearing, a second sound source. These results define a neural mechanism related to the conscious perception of multiple auditory objects.