A Fast and Robust Audio Fingerprinting Method Using Constant Q Transform and Clustering Strategy

SIPLab
Sep 5, 2018

Audio fingerprints make it possible to identify audio content in a database. Audio fingerprinting matches a query recording against the audio content stored in that database. The approach first calculates the similarity between the fingerprints of the recording and those of the stored content, and then matches the recording to the content by counting the number of similar fingerprints in a time-ordered list. To generate audio fingerprints, the algorithm first reads the audio signal from a file and then transforms it into a spectrogram. After that, the algorithm extracts features from the spectrogram and encodes their positions as fingerprints.
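The matching step described above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes each fingerprint is a hypothetical `(hash, frame_time)` pair, treats two fingerprints as similar only when their hashes are equal, and scores a reference item by the number of matches that share the same time offset. The names `build_index` and `best_alignment` are invented for this example.

```python
from collections import Counter, defaultdict


def build_index(fingerprints):
    """Map each hash to the list of frame times at which it occurs."""
    index = defaultdict(list)
    for h, t in fingerprints:
        index[h].append(t)
    return index


def best_alignment(query, reference_index):
    """Return (offset, votes) for the most common time offset, or (None, 0)."""
    offsets = Counter()
    for h, t_query in query:
        # Every reference occurrence of the same hash votes for one offset.
        for t_ref in reference_index.get(h, ()):
            offsets[t_ref - t_query] += 1
    if not offsets:
        return None, 0
    return offsets.most_common(1)[0]


# Toy data (assumed): hashes are arbitrary integers, times are frame indices.
reference = [(0xA1, 10), (0xB2, 12), (0xC3, 15), (0xD4, 19), (0xA1, 30)]
query = [(0xA1, 0), (0xB2, 2), (0xC3, 5), (0xEE, 7)]

offset, votes = best_alignment(query, build_index(reference))
print(offset, votes)
```

Matches that are consistent in time pile up on a single offset, while accidental hash collisions scatter across many offsets, which is why counting per offset separates a true match from noise.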

In this thesis, we propose a fast and robust audio fingerprinting method using the constant Q transform (CQT) and a clustering strategy. We use the CQT to generate the spectrogram, which presents the intensity of the signal more clearly. In audio fingerprinting, matching the audio recording against the audio content is the time-consuming step. To accelerate this process, we use a two-step search algorithm with an Intelligent K-means clustering strategy. Our proposed approach first selects the candidate audio content, then matches the audio recording against these candidates. In addition, our algorithm supports GPU acceleration.
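The two-step search can be sketched as follows. This is a simplified illustration under several assumptions, not the method from the thesis: the clustering itself is not shown (the centroids and cluster membership are taken as precomputed), each track is summarized by a hypothetical two-dimensional feature vector, and the detailed match is reduced to counting shared hashes. All names and data are invented for the example.

```python
import math


def nearest_cluster(vector, centroids):
    """Index of the centroid closest to the vector (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(vector, centroids[i]))


def two_step_search(query_vector, query_hashes, centroids, clusters, tracks):
    # Step 1: keep only the tracks in the cluster nearest to the query.
    candidates = clusters[nearest_cluster(query_vector, centroids)]
    # Step 2: run the (here simplified) detailed match on those candidates.
    scores = {name: len(query_hashes & tracks[name]) for name in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (assumed): two precomputed clusters and three reference tracks.
centroids = [(0.0, 0.0), (10.0, 10.0)]
clusters = {0: ["track_a", "track_b"], 1: ["track_c"]}
tracks = {
    "track_a": {1, 2, 3, 4},
    "track_b": {3, 4, 5, 6, 7},
    "track_c": {8, 9},
}

best, scores = two_step_search((1.0, 0.5), {3, 4, 5, 6},
                               centroids, clusters, tracks)
print(best, scores)
```

Only the tracks in the selected cluster are scored, so the cost of the detailed match grows with the cluster size rather than with the size of the whole database. The trade-off is that a poor cluster assignment in the first step can exclude the correct track before it is ever compared.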

In our experiments, we compare the proposed approach with other approaches. Our approach is more accurate in many cases involving distortion, and it is also more efficient at finding the alignment time.

Jiunn-Lin Wu, PhD
