Sound Texture Modelling


Autoria(s): Kersten, Stefan
Contribuinte(s)

Agència de Gestió d'Ajuts Universitaris i de Recerca

Data(s)

02/05/2012

Resumo

In the PhD thesis “Sound Texture Modeling” we deal with statistical modelling or textural sounds like water, wind, rain, etc. For synthesis and classification. Our initial model is based on a wavelet tree signal decomposition and the modeling of the resulting sequence by means of a parametric probabilistic model, that can be situated within the family of models trainable via expectation maximization (hidden Markov tree model ). Our model is able to capture key characteristics of the source textures (water, rain, fire, applause, crowd chatter ), and faithfully reproduces some of the sound classes. In terms of a more general taxonomy of natural events proposed by Graver, we worked on models for natural event classification and segmentation. While the event labels comprise physical interactions between materials that do not have textural propierties in their enterity, those segmentation models can help in identifying textural portions of an audio recording useful for analysis and resynthesis. Following our work on concatenative synthesis of musical instruments, we have developed a pattern-based synthesis system, that allows to sonically explore a database of units by means of their representation in a perceptual feature space. Concatenative syntyhesis with “molecules” built from sparse atomic representations also allows capture low-level correlations in perceptual audio features, while facilitating the manipulation of textural sounds based on their physical and perceptual properties. We have approached the problem of sound texture modelling for synthesis from different directions, namely a low-level signal-theoretic point of view through a wavelet transform, and a more high-level point of view driven by perceptual audio features in the concatenative synthesis setting. The developed framework provides unified approach to the high-quality resynthesis of natural texture sounds. Our research is embedded within the Metaverse 1 European project (2008-2011), where our models are contributting as low level building blocks within a semi-automated soundscape generation system.

Formato

34 p.

Identificador

http://hdl.handle.net/2072/183816

Idioma(s)

eng

Relação

Els ajuts de l'AGAUR;2011FI_B200078

Direitos

info:eu-repo/semantics/openAccess

L'accés als continguts d'aquest document queda condicionat a l'acceptació de les condicions d'ús establertes per la següent llicència Creative Commons: http://creativecommons.org/licenses/by-nc-nd/3.0/es/

Fonte

RECERCAT (Dipòsit de la Recerca de Catalunya)

Palavras-Chave #So -- Tractament per ordinador #Models lineals (Estadística) #62 - Enginyeria. Tecnologia
Tipo

info:eu-repo/semantics/article