Using Tensorflow's Connectionist Temporal Classification (CTC) implementation

牧云@^-^@ 提交于 2019-12-02 17:19:31
Jihyeon Seo

I'm trying to do the same thing. Here's what I found you may be interested in.

It was really hard to find the tutorial for CTC, but this example was helpful.

And for the blank label, CTC layer assumes that the blank index is num_classes - 1, so you need to provide an additional class for the blank label.

Also, CTC network performs softmax layer. In your code, RNN layer is connected to CTC loss layer. Output of RNN layer is internally activated, so you need to add one more hidden layer (it could be output layer) without activation function, then add CTC loss layer.

See here for an example with bidirectional LSTM, CTC, and edit distance implementations, training a phoneme recognition model on the TIMIT corpus. If you train on that corpus's training set, you should be able to get phoneme error rates down to 20-25% after 120 epochs or so.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!