3-state phone model in Hidden Markov Model (HMM)

Posted by 落花浮王杯 on 2019-12-21 06:18:39

Question


I would like to ask about the meaning of a 3-state phone model in an HMM. The context is HMM theory as used in speech recognition systems, so the example concerns the acoustic modeling of speech sounds with HMMs.

I took this example picture from a journal paper: http://www.intechopen.com/source/html/41188/media/image8_w.jpg

Figure 1: 3-State HMM for the sound /s/

So, my questions are:

  1. What is meant by "3 states"?
  2. What do S1, S2 and S3 actually mean? (I know they are states, but what do they represent?)
  3. How is the /s/ sound represented in these HMM states?
  4. Why 3? What happens if we have 4, 5 or more states?
  5. If /s/ is just a single simple consonant sound, what is the purpose of the states, and what do the transitions represent?

Does anyone have a simple explanation of this theory with an example (a graphic analogy)?

Thank you

Nick


Answer 1:


What is meant by "3 states"?

The model that describes the phone S consists of three states: S1, S2 and S3.

What do S1, S2 and S3 actually mean? (I know they are states, but what do they represent?)

S1 represents the probability distribution of the feature vector at the beginning of the phone S, S2 in the middle, and S3 at the end. A probability distribution essentially captures the most probable value of the feature vector (how this part of the phone sounds) and its variation (the range over which it varies).
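To make the idea of one distribution per state concrete, here is a minimal sketch. It assumes each state is a single diagonal-covariance Gaussian over a 2-dimensional feature vector; the state parameters and the helper names `log_likelihood` and `best_state` are made up for illustration. Real systems typically use higher-dimensional features and richer models such as Gaussian mixtures.

```python
import math

# Hypothetical emission parameters: (mean vector, variance vector) per state.
# The numbers are invented for illustration only.
STATES = {
    "S1": ([1.0, 0.0], [0.5, 0.5]),   # beginning of the phone
    "S2": ([3.0, 1.0], [0.2, 0.2]),   # stable middle
    "S3": ([2.0, -1.0], [0.5, 0.5]),  # end of the phone
}


def log_likelihood(x, mean, var):
    """Log density of feature vector x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def best_state(x):
    """Return the state whose distribution explains x best (ignoring transitions)."""
    return max(STATES, key=lambda s: log_likelihood(x, *STATES[s]))


if __name__ == "__main__":
    for frame in ([1.0, 0.0], [3.0, 1.0], [2.1, -0.9]):
        print(frame, "->", best_state(frame))
```

The mean plays the role of the "most probable value" and the variance plays the role of the "variation" mentioned above.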

How is the /s/ sound represented in these HMM states?

The /s/ sound is represented by the whole HMM, not by a single state.

Why 3? What happens if we have 4, 5 or more states?

In continuous speech recognition, the acoustics of a phone are affected by the preceding and following phonemes. For that reason it is more precise to split each phone into 3 parts: the transition from the previous phone at the beginning, a stable middle, and the transition to the next phone at the end. If the phone were isolated and stable, 1 state would be enough. It is also possible to use 5 states for a single phone in continuous speech, but that does not greatly improve accuracy.
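The beginning/middle/end split can be sketched as a left-to-right HMM in which each state may only loop on itself or move to the next state. The example below is a rough illustration under stated assumptions: the transition probabilities, the 1-D emission means, the shared variance, and the function names `log_emit` and `align` are all hypothetical, not taken from the figure. It runs a simple Viterbi alignment that assigns each frame to S1, S2 or S3.

```python
import math

# Left-to-right topology: self-loop or step to the next state only.
# All numbers are illustrative assumptions.
TRANS = [
    [0.6, 0.4, 0.0],  # S1 -> S1 or S2
    [0.0, 0.7, 0.3],  # S2 -> S2 or S3
    [0.0, 0.0, 0.5],  # S3 -> S3 (the remaining 0.5 leaves the phone)
]
MEANS = [1.0, 3.0, 2.0]  # hypothetical 1-D emission mean per state
VAR = 0.25               # shared variance, also an assumption
NEG_INF = float("-inf")


def log_emit(x, state):
    """Log density of a 1-D frame under the Gaussian of the given state."""
    return -0.5 * (math.log(2 * math.pi * VAR) + (x - MEANS[state]) ** 2 / VAR)


def align(frames):
    """Viterbi alignment that must start in S1 and end in S3."""
    n = len(TRANS)
    if len(frames) < n:
        raise ValueError("need at least one frame per state")
    score = [log_emit(frames[0], 0)] + [NEG_INF] * (n - 1)
    backpointers = []
    for x in frames[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best, arg = NEG_INF, 0
            for i in range(n):
                if TRANS[i][j] > 0 and score[i] > NEG_INF:
                    cand = score[i] + math.log(TRANS[i][j])
                    if cand > best:
                        best, arg = cand, i
            new_score.append(best + log_emit(x, j) if best > NEG_INF else NEG_INF)
            ptr.append(arg)
        score = new_score
        backpointers.append(ptr)
    # Trace back from the final state S3 to recover the state sequence.
    state = n - 1
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return ["S%d" % (s + 1) for s in reversed(path)]


if __name__ == "__main__":
    print(align([1.0, 1.1, 3.0, 2.9, 3.1, 2.0]))
```

Adding more states would only mean a larger `TRANS` matrix and more emission parameters; the alignment logic stays the same.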

If /s/ is just a single simple consonant sound, what is the purpose of the states, and what do the transitions represent?

See above. A transition represents the probability of moving from one state to another; essentially, the transitions model the length of the phone.
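One way to see how transitions model length: if a state has self-loop probability a, the number of frames spent in it follows a geometric distribution with mean 1 / (1 - a). The sketch below applies that formula; the self-loop values, the 10 ms frame shift, and the function names are assumptions chosen for illustration.

```python
def expected_frames(self_loop):
    """Mean number of frames spent in a state with the given self-loop probability."""
    if not 0.0 <= self_loop < 1.0:
        raise ValueError("self-loop probability must be in [0, 1)")
    return 1.0 / (1.0 - self_loop)


def expected_phone_ms(self_loops, frame_shift_ms=10.0):
    """Expected phone duration for a left-to-right HMM with no skip transitions.

    frame_shift_ms=10.0 is a common choice, used here as an assumption.
    """
    return frame_shift_ms * sum(expected_frames(a) for a in self_loops)


if __name__ == "__main__":
    loops = [0.6, 0.7, 0.5]  # hypothetical self-loop probabilities for S1, S2, S3
    for name, a in zip(("S1", "S2", "S3"), loops):
        print(name, "expected frames:", round(expected_frames(a), 2))
    print("expected phone duration (ms):", round(expected_phone_ms(loops), 1))
```

Higher self-loop probabilities mean the model expects to stay longer in that part of the phone, so a long, steady sound would be modeled with a large self-loop probability on its middle state.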



Source: https://stackoverflow.com/questions/28112608/3-state-phone-model-in-hidden-markov-model-hmm
