Supervised Deep Hashing for Highly Efficient Cover Song Detection

Title: Supervised Deep Hashing for Highly Efficient Cover Song Detection
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Ye, Z., Choi, J., & Friedland, G.
Published in: Proceedings of the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)

This paper proposes a supervised deep hashing approach for highly efficient and effective cover song detection. Our system consists of two identical sub-networks, each with a hash layer that learns a binary representation of the input audio, which is supplied as spectral features. A loss function joins the outputs of the two sub-networks by minimizing the Hamming distance between a pair of audio files covering the same musical work. We further enhance system performance through loudness embedding, beat synchronization, and early fusion of input audio features. The 128-bit hash output reaches state-of-the-art performance in terms of mean pairwise accuracy. The system demonstrates that memory-efficient, real-time cover song detection with satisfactory accuracy is possible at large scale.
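
The role of the hash layer and the Hamming distance can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the thresholding at zero, the margin-based loss for non-cover pairs, and the 4-value toy vectors are all assumptions made for illustration. The sketch relies only on the fact that, for codes with entries in {-1, +1}, the squared Euclidean distance equals four times the Hamming distance, so pulling relaxed codes together also lowers the Hamming distance of the binarized codes.

```python
def binarize(activations):
    """Threshold real-valued hash-layer outputs at 0 and pack the bits into one int.

    Thresholding at 0 is an assumption; the abstract does not state the rule.
    """
    code = 0
    for a in activations:
        code = (code << 1) | (1 if a > 0 else 0)
    return code


def hamming(code_a, code_b):
    """Count the bit positions where two packed codes differ."""
    return bin(code_a ^ code_b).count("1")


def contrastive_loss(u, v, is_cover, margin=2.0):
    """Hypothetical pairwise loss on relaxed (real-valued) codes.

    Cover pairs are pulled together; non-cover pairs are pushed apart
    until their Euclidean distance exceeds `margin`.
    """
    d2 = sum((x - y) ** 2 for x, y in zip(u, v))
    if is_cover:
        return d2
    return max(0.0, margin - d2 ** 0.5) ** 2


# Toy 4-dimensional outputs of the two sub-networks (made-up values).
original = [0.9, -0.8, 0.7, -0.6]
cover = [0.8, -0.9, 0.6, 0.5]
print(hamming(binarize(original), binarize(cover)))  # → 1
```

With 128 outputs instead of 4, each track is stored as a 16-byte code, and comparing two tracks reduces to one XOR followed by a bit count, which is what makes the retrieval step cheap in both memory and time.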


This project is partially funded by an AWS Research Grant and a collaborative Strategic Initiative grant led by Lawrence Livermore National Laboratory (U.S. Dept. of Energy contract DE-AC52-07NA27344). Any findings and conclusions are the authors' and do not necessarily reflect the views of the funders.

ICSI Research Group: Audio and Multimedia