Publication Details

Title: Using a GPU, Online Diarization = Offline Diarization
Author: G. Friedland
Group: ICSI Technical Reports
Date: January 2012
PDF: http://www.icsi.berkeley.edu/pubs/techreports/TR-12-004.pdf

Overview:
This article presents a low-latency, online speaker diarization system ("who is speaking now?") based on the repeated execution of a GPU-optimized, highly efficient offline diarization system ("who spoke when?"). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In contrast to earlier attempts at online diarization, the system achieves accuracy similar to that of the underlying offline system and does not require explicit detection of new speakers. With GPUs, online diarization becomes a side effect of offline diarization, removing the need for specialized online diarization systems.
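
The idea of obtaining online output by repeatedly running an offline system can be illustrated with a minimal Python sketch. It is not the implementation from the report: the names `online_diarize` and `toy_offline_diarize` are hypothetical, the "offline system" is a trivial stand-in that groups frames by rounded value, and the sketch assumes that labels stay consistent between successive runs, which a real system would have to ensure.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

# Hypothetical interface: given every frame heard so far, return one
# speaker label per frame ("who spoke when").
OfflineDiarizer = Callable[[Sequence[float]], List[int]]


def online_diarize(
    chunks: Iterable[Sequence[float]],
    offline_diarize: OfflineDiarizer,
) -> Iterator[int]:
    """Yield the current speaker label after each incoming chunk.

    The full offline pass is re-run on all audio received so far; the
    label of the most recent frame answers "who is speaking now?".
    """
    buffer: List[float] = []
    for chunk in chunks:
        buffer.extend(chunk)
        if not buffer:
            continue  # nothing heard yet, so there is nothing to label
        labels = offline_diarize(buffer)  # complete offline run
        yield labels[-1]                  # label of the newest frame


def toy_offline_diarize(frames: Sequence[float]) -> List[int]:
    """Stand-in for a real offline system (assumption for illustration).

    Frames are grouped by rounded value, and labels are assigned in
    order of first appearance, so they stay stable across re-runs.
    """
    first_seen = {}
    labels = []
    for frame in frames:
        key = round(frame)
        if key not in first_seen:
            first_seen[key] = len(first_seen)
        labels.append(first_seen[key])
    return labels


if __name__ == "__main__":
    stream = [[0.1, 0.0], [5.0, 5.1], [0.05]]
    print(list(online_diarize(stream, toy_offline_diarize)))  # [0, 1, 0]
```

Each iteration reprocesses the whole buffer, so the cost of one offline pass bounds the latency of the online output. This is why the speed of the offline system matters in this scheme.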

Acknowledgements:
This research is partly supported by funding from Microsoft (Award #024263) and Intel (Award #024894), by matching funding from U.C. Discovery (Award #DIG07-10227), and by a Cisco URP grant. I want to thank the following people for their support in writing this article: Adam Janin, Luke Gottlieb, Carlos Vaquero, Henry Cook, Mary Knox, and Nelson Morgan.

Bibliographic Information:
ICSI Technical Report TR-12-004

Bibliographic Reference:
G. Friedland. Using a GPU, Online Diarization = Offline Diarization. ICSI Technical Report TR-12-004, January 2012.