Publication Details
Title: On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings
Author: J. Kolar, E. Shriberg, and Y. Liu
Group: Speech
Date: September 2006
PDF: http://www.icsi.berkeley.edu/pubs/speech/kolar06icslp.pdf
Overview:
We explore speaker-specific prosodic modeling for dialog act segmentation of speech from the ICSI Meeting Corpus. We ask whether features beyond pauses help individual speakers, and whether some speakers benefit from prosody models trained only on their own speech. We find positive results for both questions, although the second is more complex. Feature analysis reveals that duration is the most frequently used feature type, followed by pause and pitch features. Results also suggest that native and nonnative speakers differ in their feature usage patterns. We conclude that features beyond pauses are useful for dialog act segmentation in natural conversation, and that for some speakers, speaker-specific training yields further gains.
Bibliographic Information:
Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006), Pittsburgh, Pennsylvania, pp. 2014-2017
Bibliographic Reference:
J. Kolar, E. Shriberg, and Y. Liu. On Speaker-Specific Prosodic Models for Automatic Dialog Act Segmentation of Multi-Party Meetings. Proceedings of the 9th International Conference on Spoken Language Processing (ICSLP-Interspeech 2006), Pittsburgh, Pennsylvania, pp. 2014-2017, September 2006
