Modeling Dynamics in Connectionist Speech Recognition - The Time Index Model
Title | Modeling Dynamics in Connectionist Speech Recognition - The Time Index Model |
Publication Type | Technical Report |
Year of Publication | 1994 |
Authors | Konig, Y., & Morgan, N. |
Other Numbers | 882 |
Abstract | Here, we introduce an alternative to the Hidden Markov Model (HMM) as the underlying representation of speech production. HMMs suffer from well-known limitations, such as the unrealistic assumption that the observations generated in a given state are independent and identically distributed (i.i.d.). We propose a time index model that explicitly conditions the emission probability of a state on the time index, i.e., on the number of "visits" in the current state of the Markov chain in a sequence. Thus, the proposed model does not require an i.i.d. assumption. The connectionist framework enables us to represent the dependence on the time index as a non-parametric distribution and to share parameters between different speech unit models. Furthermore, we discuss an extension to the basic time index model by incorporating information about the duration of the phone segments. Our initial results show that given the position of the boundaries between basic speech units, e.g., phones, we can improve our current connectionist system performance significantly by using this model. However, we still do not know whether these boundaries can be estimated reliably, nor do we know how much benefit we can obtain from this method given less accurate boundary information. Currently, we are experimenting with two possible approaches: trying to learn smooth probability densities for the boundaries, and getting a set of reasonable segmentations from an N-Best search. In both cases we will need to consider the effect of incorrect boundaries, since they will undoubtedly occur. |
URL | http://www.icsi.berkeley.edu/ftp/global/pub/techreports/1994/tr-94-012.pdf |
Bibliographic Notes | ICSI Technical Report TR-94-012 |
Abbreviated Authors | Y. Konig and N. Morgan |
ICSI Publication Type | Technical Report |
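
Illustrative note (not part of the report): the abstract defines the time index as the number of "visits" in the current state. The Python sketch below shows one possible reading of that definition, under two assumptions that the abstract does not confirm: visits are counted as consecutive frames spent in the same state, and counting starts at 1. The function name `time_indices` and the example alignment are hypothetical.

```python
from typing import Hashable, List, Sequence


def time_indices(states: Sequence[Hashable]) -> List[int]:
    """Return, for each frame, how many consecutive frames (including
    this one) have been spent in the current state.

    Assumption: a "visit" is one frame, and the count restarts at 1
    whenever the state changes. This is an illustrative reading of the
    abstract, not the report's implementation.
    """
    indices: List[int] = []
    count = 0
    previous = None
    for position, state in enumerate(states):
        if position > 0 and state == previous:
            count += 1  # still in the same state: one more visit
        else:
            count = 1   # first frame, or a new state was entered
        indices.append(count)
        previous = state
    return indices


# Hypothetical frame-level alignment of phone labels.
alignment = ["ae", "ae", "ae", "t", "t", "ae"]
print(list(zip(alignment, time_indices(alignment))))
```

Under this reading, an emission model would receive the pair (state, time index) for each frame instead of the state alone, which is why frames within one state no longer need to be treated as identically distributed.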