Publication Details

Title: Automatic Detection of Prosodic Stress in American English Discourse
Authors: R. Silipo and S. Greenberg
Group: ICSI Technical Reports
Date: March 2000
PDF: ftp://ftp.icsi.berkeley.edu/pub/techreports/2000/tr-00-001.pdf

Overview:
The goal of this study is twofold. First, it aims to implement an automatic detector of prosodic stress with sufficiently reliable performance. Second, it assesses the effectiveness of the acoustic features most commonly proposed in the literature, namely the duration, amplitude and fundamental frequency of syllabic nuclei. Several data-driven algorithms, such as Artificial Neural Networks (ANN), statistical decision trees and fuzzy classification techniques, as well as a knowledge-based heuristic algorithm, are implemented for the automatic transcription of prosodic stress. As a reference, two different subsets of the OGI English stories database were hand-labeled for prosodic stress by two individuals trained in linguistics. While the ANN-based approach achieves the highest performance (77% primarily stressed vocalic nuclei vs. 79% unstressed vocalic nuclei, on average over the two transcribers' data sets), the other methods show that both transcribers grant a major role to duration and (to a slightly lesser degree) to amplitude. Pitch-related features of the syllabic nuclei appear to play a much less important role than amplitude and duration.

Bibliographic Information:
ICSI Technical Report TR-00-001

Bibliographic Reference:
R. Silipo and S. Greenberg. Automatic Detection of Prosodic Stress in American English Discourse. ICSI Technical Report TR-00-001, March 2000.