Investigating Speaker Diarization Errors in Meetings

Mary Knox

ICSI

Tuesday, September 20, 2011
12:30pm - 1:30pm

Though many approaches have been introduced to solve the speaker diarization problem, there has been little analysis of speaker diarization performance other than reporting the Diarization Error Rate (DER). The focus of my current work is to perform error analysis in audio-only speaker diarization for the meeting domain. There are two main areas of interest. The first is to analyze speaker diarization performance on specific types of segments (e.g., speaker changes, interruptions, overlapped speech, short utterances, and long utterances). The second is to compare speaker diarization performance across systems. Typically, systems are compared using the relative DER change. However, the relative difference in DER does not reflect whether the systems make similar errors. By comparing where errors occur across multiple systems, the speaker diarization community can gain insight into the strengths and weaknesses of the various systems, which could lead to novel ways of combining systems to improve speaker diarization performance. In this talk I will discuss the preliminary results I have obtained while analyzing speaker diarization errors. This is still ongoing work, so feedback and suggestions are welcome.