6 results for auditory scene analysis

at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

90.00%

Publisher:

Abstract:

We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with top-down control, which uses previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides the cores of the objects, which are used to acquire more accurate object models; growing these cores with class-specific active regions then yields accurate recognition of the known regions. Next, a general segmentation stage segments the unknown regions with a bottom-up strategy. Finally, the last stage fuses the known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal.
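The abstract gives no implementation details, but the top-down stage — combining a per-pixel appearance likelihood with an absolute-location prior and keeping only high-confidence "core" pixels — can be sketched as follows. The function name, the multiplicative combination, and the confidence threshold are our own simplifying assumptions, not the authors' method.

```python
import numpy as np

def classify_pixels(appearance_lik, location_prior, core_threshold=0.9):
    """Toy top-down pixel classification: combine per-pixel appearance
    likelihoods with absolute-location priors (both of shape
    (H, W, n_classes)), return the per-pixel MAP class map and a 'core'
    mask of high-confidence pixels that would seed region growing."""
    posterior = appearance_lik * location_prior
    posterior = posterior / posterior.sum(axis=-1, keepdims=True)  # normalise per pixel
    labels = posterior.argmax(axis=-1)            # MAP class per pixel
    core = posterior.max(axis=-1) >= core_threshold  # confident seeds only
    return labels, core
```

Region growing from the `core` mask and the later bottom-up segmentation of unknown regions would build on these seeds.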

Relevance:

30.00%

Publisher:

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature, here applied to a bag-of-visual-words representation of each image, and subsequently training a multiway classifier on the topic distribution vector of each image. We compare this approach to representing each image directly by a bag-of-visual-words vector and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve classification performance superior to recent publications that used a bag-of-visual-words representation, in all cases using the authors' own data sets and testing protocols. We also investigate the gain from adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
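The dimensionality-reduction step can be illustrated with a minimal pLSA fitted by expectation-maximization on a document-word (here, image-visual-word) count matrix. This is the standard textbook EM for pLSA, not the authors' code; variable names, iteration count, and initialization are our own choices. The returned per-image topic mixtures P(z|d) are the reduced feature vectors that a kNN or SVM classifier would then consume.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=30, seed=0):
    """Minimal pLSA via EM on a (n_docs, n_words) count matrix.
    Returns P(w|z) of shape (n_topics, n_words) and the per-document
    topic mixtures P(z|d) of shape (n_docs, n_topics)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (docs, topics, words)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-12)
        weighted = counts[:, None, :] * resp          # n(d,w) * P(z|d,w)
        # M-step: re-estimate both conditional distributions
        p_w_z = weighted.sum(axis=0)
        p_w_z /= np.maximum(p_w_z.sum(axis=1, keepdims=True), 1e-12)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= np.maximum(p_z_d.sum(axis=1, keepdims=True), 1e-12)
    return p_w_z, p_z_d
```

In the paper's setting each "document" is an image and each "word" a quantized dense color SIFT descriptor; the classifier then operates on `p_z_d` rows instead of raw word counts.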

Relevance:

30.00%

Publisher:

Abstract:

Since its origins, the European Union has striven to be an actor on the international scene and to play a role in conflict management. Yet the EU's lack of activity cannot be justified by a mere lack of capacities: the EU possesses numerous political, economic and, since 2003, civilian and military instruments that should allow it to undertake a comprehensive conflict response. This publication describes these instruments and analyzes the actual use the Union makes of them in the different stages of a conflict. Examples illustrate the EU's main weakness: providing a comprehensive and timely response when a conflict breaks out.

Relevance:

30.00%

Publisher:

Abstract:

Tone mapping is the problem of compressing the range of a high-dynamic-range (HDR) image so that it can be displayed on a low-dynamic-range (LDR) screen without losing details or introducing new ones: the final image should produce in the observer a sensation as close as possible to the perception produced by the real-world scene. We propose a tone mapping operator with two stages. The first stage is a global method that implements visual adaptation, based on experiments on human perception; in particular, we point out the importance of cone saturation. The second stage performs local contrast enhancement, based on a variational model inspired by color vision phenomenology. We evaluate this method with a metric validated by psychophysical experiments, and, in terms of this metric, our method compares very well with the state of the art.
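The two-stage structure can be sketched with generic stand-ins: a Naka-Rushton-style sigmoidal compression for the global visual-adaptation stage, and a crude unsharp-mask for the local contrast stage. Both are textbook placeholders under our own assumed parameters — the paper's actual operator models cone saturation explicitly and uses a variational model, neither of which is reproduced here.

```python
import numpy as np

def global_adaptation(lum, exponent=0.75):
    """Naka-Rushton-style compression L^n / (L^n + s^n), with s set to the
    log-average ('key') luminance. A generic stand-in for the paper's
    perception-based global stage; the exponent is a hypothetical choice."""
    s = np.exp(np.mean(np.log(lum + 1e-6)))      # log-average luminance
    ln, sn = lum ** exponent, s ** exponent
    return ln / (ln + sn)                        # output in (0, 1)

def local_contrast(img, strength=0.3, radius=2):
    """Crude local contrast boost via an unsharp mask with a box blur
    (the paper uses a variational model instead; this is only a sketch)."""
    k = 2 * radius + 1
    pad = np.pad(img, radius, mode="edge")
    blur = np.zeros_like(img)
    for dy in range(k):                          # box-filter accumulation
        for dx in range(k):
            blur += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    blur /= k * k
    return np.clip(img + strength * (img - blur), 0.0, 1.0)
```

A full pipeline would run `local_contrast(global_adaptation(lum))` on the luminance channel and reattach chromaticity afterwards.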

Relevance:

30.00%

Publisher:

Abstract:

The mismatch negativity (MMN) is an electrophysiological marker of auditory change detection in the event-related brain potential and has been proposed to reflect an automatic comparison process between an incoming stimulus and the representation of prior items in a sequence. There is evidence for two main functional subcomponents of the MMN, generated by temporal and frontal brain areas, respectively. Using data obtained in an MMN paradigm, we performed time-frequency analysis to reveal the changes in oscillatory neural activity in the theta band. The results suggest that the frontal component of the MMN is brought about by an increase in theta power for the deviant trials and, possibly, by an additional contribution of theta phase alignment. By contrast, the temporal component of the MMN, best seen in recordings from mastoid electrodes, is generated by phase resetting of the theta rhythm with no concomitant power modulation. Thus, frontal and temporal MMN components not only differ in their functional significance but also appear to be generated by distinct neurophysiological mechanisms.
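The two quantities that dissociate the mechanisms above — trial-averaged theta power (power increase) versus inter-trial phase coherence (phase resetting) — have standard textbook estimators via a Morlet wavelet at a single theta frequency. The sketch below is that generic formulation, not the authors' exact pipeline; frequency, cycle count, and function name are our assumptions.

```python
import numpy as np

def theta_power_and_itc(trials, fs, freq=6.0, n_cycles=5):
    """Single-frequency Morlet-wavelet estimate of trial-averaged theta
    power and inter-trial phase coherence (ITC) from an array of trials
    of shape (n_trials, n_samples) sampled at fs Hz. ITC near 1 indicates
    phase locking across trials; power can change independently of ITC."""
    n_trials, n_samples = trials.shape
    t = (np.arange(n_samples) - n_samples // 2) / fs
    sigma = n_cycles / (2 * np.pi * freq)        # Gaussian envelope width
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2 * sigma**2))
    wavelet /= np.abs(wavelet).sum()             # normalise wavelet energy
    analytic = np.array([np.convolve(tr, wavelet, mode="same") for tr in trials])
    power = (np.abs(analytic) ** 2).mean(axis=0)                 # mean power
    itc = np.abs(np.exp(1j * np.angle(analytic)).mean(axis=0))   # phase locking
    return power, itc
```

A frontal-MMN-like effect would show elevated `power` for deviants with modest `itc`; a temporal-component-like effect would show high `itc` without a power change.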

Relevance:

30.00%

Publisher:

Abstract:

This project addresses methodological and technological challenges in the development of multi-modal data acquisition and analysis methods for the representation of instrumental playing technique in music performance through auditory-motor patterning models. The case study is violin playing: a multi-modal database of violin performances has been constructed by recording different musicians while playing short exercises on different violins. The exercise set and recording protocol have been designed to sample the space defined by dynamics (from piano to forte) and tone (from sul tasto to sul ponticello), for each bow-stroke type played on each of the four strings (three different pitches per string) at two different tempi. The data, containing audio, video, and motion-capture streams, have been processed and segmented to facilitate upcoming analyses. From the acquired motion data, the positions of the instrument string ends and the bow hair ribbon ends are tracked and processed to obtain a number of bowing descriptors suited for a detailed description and analysis of the bow motion patterns taking place during performance. Likewise, a number of perceptual sound attributes are computed from the audio streams. Besides the methodology and the implementation of a number of data acquisition tools, this project presents preliminary results from analyzing bowing technique on a multi-modal violin performance database that is unique in its class. A further contribution of this project is the data itself, which will be made available to the scientific community through the repovizz platform.
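Two simple examples of the kind of bowing descriptors derivable from the tracked marker positions are the normalized bow position along the hair ribbon and the bow velocity along that axis. The sketch below is our own simplification (it projects a single string contact point onto the hair axis); the actual descriptor set and naming in the project are not specified by the abstract.

```python
import numpy as np

def bowing_descriptors(hair_a, hair_b, string_mid, fs):
    """Toy bowing descriptors from motion-capture trajectories, all of
    shape (T, 3): hair_a/hair_b are the bow-hair ribbon ends (frog, tip)
    and string_mid the approximate bow-string contact point. Returns the
    normalised bow position along the hair (0 = frog, 1 = tip) and the
    bow velocity along the hair axis in metres per second."""
    axis = hair_b - hair_a                            # frog -> tip, per frame
    length = np.linalg.norm(axis, axis=1, keepdims=True)
    unit = axis / length                              # unit hair-axis vector
    rel = string_mid - hair_a
    pos = (rel * unit).sum(axis=1) / length[:, 0]     # normalised position
    vel = np.gradient(pos * length[:, 0], 1.0 / fs)   # finite-difference speed
    return pos, vel
```

Descriptors like these, sampled per frame, are what a detailed analysis of bow-stroke types across dynamics and tempi would operate on.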