Stream Combination Before and/or After the Acoustic Model

Publication Type: Technical Report
Year of Publication: 2000
Authors: Ellis, D. P. W.
Other Numbers: 1182

Combining a number of diverse feature streams has proven to be a very flexible and beneficial technique in speech recognition. In the context of hybrid connectionist-HMM recognition, feature streams can be combined at several points. In this work, we compare two forms of combination: at the input to the acoustic model, by concatenating the feature streams into a single vector (feature combination, or FC), and at the output of the acoustic model, by averaging the logs of the estimated posterior probabilities of each subword unit (posterior combination, or PC). Based on four feature streams with varying degrees of mutual dependence, we find that the best strategy uses both feature and posterior combination. Streams that are more independent, as measured by an approximation to conditional mutual information, show more benefit from posterior combination.
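
As an illustration only, the two combination points described in the abstract can be sketched in Python with NumPy. This is a minimal sketch and not the report's implementation: the function names `feature_combination` and `posterior_combination`, the array shapes (frames by dimensions), the small `eps` floor, and the per-frame renormalization after averaging are all assumptions made for the example.

```python
import numpy as np


def feature_combination(streams):
    """FC: concatenate per-frame feature vectors from several streams.

    Each stream is assumed to be an array of shape (frames, dims_i).
    The result has shape (frames, sum of dims_i) and would be fed to
    a single acoustic model.
    """
    return np.concatenate(streams, axis=-1)


def posterior_combination(posteriors, eps=1e-12):
    """PC: average the log posteriors produced by separate acoustic models.

    Each entry is assumed to be an array of shape (frames, units) whose
    rows are posterior distributions over subword units. Renormalizing
    each frame to sum to one is an assumption of this sketch.
    """
    # eps (hypothetical floor) avoids log(0) for zero-valued posteriors.
    log_p = np.log(np.stack(posteriors, axis=0) + eps)
    avg_log = log_p.mean(axis=0)
    # Subtract the per-frame maximum before exponentiating for stability.
    avg_log = avg_log - avg_log.max(axis=-1, keepdims=True)
    combined = np.exp(avg_log)
    return combined / combined.sum(axis=-1, keepdims=True)
```

Averaging log posteriors is equivalent to taking a geometric mean of the per-stream posteriors, so a unit that any one stream considers very unlikely is strongly penalized in the combined estimate.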

Bibliographic Notes

ICSI Technical Report TR-00-007

Abbreviated Authors

D. P. W. Ellis

ICSI Research Group


ICSI Publication Type

Technical Report