Publication Details
Title: Using Acoustic Diarization for Duplicate Detection
Author: M. Knox, G. Friedland, and R. P. Smith
Group: ICSI Technical Reports
Date: March 2012
PDF: http://www.icsi.berkeley.edu/pubs/techreports/TR-12-005.pdf
Overview:
This report describes the use of an acoustic diarization engine for duplicate detection in broadcast news. Diarization is typically used to partition audio into speaker-homogeneous regions, in other words, to determine “who spoke when.” In this setting, however, we use diarization to segment the recordings and group the segments into homogeneous clusters. Diarization is performed on both the full-length broadcast news recordings and the short clips, each of which we classify as a duplicate or not. We then compare the similarity of models trained on the clusters to determine whether the time allocated to a cluster from the short clip comes from the original broadcast news recording or from a duplicate. We tested our system under four audio conditions: unmodified, with reverberation, resampled, and lowpass filtered. On our test set, the areas under the receiver operating characteristic (ROC) curve for these conditions were 0.91, 0.89, 0.61, and 0.64, respectively.
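For readers unfamiliar with the evaluation metric, the area under the receiver operating characteristic curve can be computed directly from per-clip scores and ground-truth labels. The sketch below is illustrative only: the function name `roc_auc`, the convention that a higher score means "more likely a duplicate", and the sample scores are assumptions made for the example, not details taken from the report.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: one number per clip; higher is assumed to mean "more likely
            a duplicate" (a hypothetical convention for this sketch).
    labels: 1 for a duplicate clip, 0 for an original clip.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # The AUC equals the fraction of (positive, negative) pairs in which
    # the positive example receives the higher score; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for six clips (illustrative only, not data from the report).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

An area of 1.0 means every duplicate is scored above every original, while 0.5 corresponds to chance-level ranking, which gives some context for the gap between the 0.91 and 0.61 figures quoted above.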
Bibliographic Information:
ICSI Technical Report TR-12-005
Bibliographic Reference:
M. Knox, G. Friedland, and R. P. Smith. Using Acoustic Diarization for Duplicate Detection. ICSI Technical Report TR-12-005, March 2012.
