2 results for music semantic model
in AMS Tesi di Laurea - Alm@DL - Universit
Abstract:
We create and study a generative model for Irish traditional music based on Variational Autoencoders and analyze the learned latent space, looking for musically significant correlations in the distributions of the latent codes that could support musical analysis of the data. We train two kinds of models: one on a dataset of Irish folk melodies and one on bars extracted from that dataset, each in five variations of increasing size. We conduct the following experiments: we inspect the latent space of tunes and bars in relation to key, time signature, and estimated harmonic function of bars; we search for links between tunes in a particular style (e.g. "reels") and their positioning in latent space relative to other tunes; we compute distances between embedded bars in a tune to gain insight into the model's understanding of the similarity between bars. Finally, we show and evaluate generative examples. We find that the learned latent space does not explicitly encode musical information and is thus unusable for musical analysis of data, while generative results are generally good and not strictly dependent on the musical coherence of the model's internal representation.
Abstract:
Artificial Intelligence is reshaping the fashion industry in different ways. E-commerce retailers exploit their data through AI to enhance their search engines, make outfit suggestions and forecast the success of a specific fashion product. However, this is a challenging endeavour, as the data they possess is huge, complex and multi-modal. The most common way to search for fashion products online is by matching keywords with phrases in product descriptions, which are often cluttered, inadequate and inconsistent across collections and sellers. A customer may also browse an online store's taxonomy, although this is time-consuming and does not guarantee relevant items. With the advent of Deep Learning architectures, particularly Vision-Language models, ad-hoc solutions have been proposed to model both the product image and description to solve these problems. However, the suggested solutions do not effectively exploit the semantic or syntactic information of these modalities, or the unique qualities and relations of clothing items. In this thesis, a novel approach is proposed to address these issues: it models and processes images and text descriptions as graphs in order to exploit the relations inside and between each modality, and employs specific techniques to extract syntactic and semantic information. The results obtained show promising performance on different tasks when compared to the present state-of-the-art deep learning architectures.