ABOUT THE PROJECT
This project implements an embedding space for audio that is learned using only unsupervised machine learning. The approach adapts triplet loss to an unsupervised setting by relying on the distributional hypothesis from natural language processing, which states that words appearing in similar contexts tend to have similar meanings. The idea is based on Tile2Vec, which applies the same hypothesis to geographical tiles. The resulting embedding space is evaluated on two datasets: the DCASE 2018 Task 5 dataset and a music dataset.
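To make the idea concrete, here is a minimal sketch of how a triplet could be sampled and scored without labels. It assumes that "context" for audio means closeness in time within one recording, so a frame near the anchor acts as the positive and a frame far away acts as the negative. That assumption, the function names `sample_triplet` and `triplet_loss`, the `neighborhood` and `margin` parameters, and the use of Euclidean distance are all illustrative and not taken from the project's code.

```python
import numpy as np


def sample_triplet(num_frames, neighborhood, rng):
    """Pick (anchor, neighbor, distant) frame indices from one recording.

    Assumption: frames within `neighborhood` steps of the anchor share its
    context; frames further away do not. No labels are used.
    """
    anchor = int(rng.integers(0, num_frames))
    lo = max(0, anchor - neighborhood)
    hi = min(num_frames - 1, anchor + neighborhood)
    # Candidates close to the anchor (excluding the anchor itself).
    near = [i for i in range(lo, hi + 1) if i != anchor]
    # Candidates outside the anchor's neighborhood.
    far = [i for i in range(num_frames) if abs(i - anchor) > neighborhood]
    if not near or not far:
        raise ValueError("recording too short for this neighborhood size")
    neighbor = int(rng.choice(near))
    distant = int(rng.choice(far))
    return anchor, neighbor, distant


def triplet_loss(z_anchor, z_neighbor, z_distant, margin=1.0):
    """Hinge-style triplet loss on three embedding vectors.

    The loss is zero once the distant embedding is at least `margin`
    further from the anchor than the neighbor embedding is.
    """
    d_pos = np.linalg.norm(z_anchor - z_neighbor)
    d_neg = np.linalg.norm(z_anchor - z_distant)
    return float(max(0.0, d_pos - d_neg + margin))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for encoder outputs: 100 frames, 16-dimensional embeddings.
    embeddings = rng.normal(size=(100, 16))
    a, n, d = sample_triplet(num_frames=100, neighborhood=5, rng=rng)
    loss = triplet_loss(embeddings[a], embeddings[n], embeddings[d])
    print(f"anchor={a} neighbor={n} distant={d} loss={loss:.3f}")
```

In a real training loop the random `embeddings` array would be replaced by the output of a trainable encoder, and the loss would be averaged over many sampled triplets before each gradient step.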
MADE FOR
Fabian Gröger / Hochschule Luzern Informatik