Open source tools for recognizing untranscribed speech without a dictionary
Just doing some general research. Are there any open source (or even paid?) tools / programs that do the following:
INPUT: an audio file of some unlabeled speech, maybe a few sentences long (with no indication of what the phonetic transcription of the audio is)
OUTPUT: an audio file with phonetic transcriptions (in the IPA alphabet) aligned and labeled on the audio
Can this be done with just a phonetic dictionary and without a word dictionary?
Sphinx has an allphone mode that produces this kind of output hypothesis. However, most speech recognition gets considerably more accurate when it uses a phonetic dictionary and an n-gram language model. It's possible to use those when creating the hypothesis and then convert the result into labeled, aligned phonemes with Sphinx.
Here is an example of phoneme-only recognition:
http://cmusphinx.sourceforge.net/wiki/phonemerecognition
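As far as I know, the English Sphinx models label phones with the CMUdict (ARPAbet) names rather than IPA, so getting the output described in the question needs one more relabeling step. Below is a minimal sketch of that step, not anything Sphinx ships. It assumes you have already pulled the phoneme segments out of the decoder as `(phone, start_frame, end_frame)` tuples (a hypothetical format), that the decoder runs at 100 frames per second, and that the end frame is inclusive. It writes tab-separated `start end label` lines, which is the layout Audacity accepts when importing a label track.

```python
# ARPAbet (CMUdict phone set) to IPA. Stress digits are stripped before lookup.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}

FRAMES_PER_SECOND = 100  # assumption: 10 ms frames; check your decoder config


def to_ipa(phone):
    """Map one ARPAbet phone (optionally with a stress digit) to IPA.

    Unknown symbols are returned unchanged so nothing is silently dropped.
    """
    base = phone.rstrip("012").upper()
    return ARPABET_TO_IPA.get(base, phone)


def segments_to_labels(segments, frame_rate=FRAMES_PER_SECOND, skip=("SIL",)):
    """Turn (phone, start_frame, end_frame) tuples into label-track text.

    The tuple layout is hypothetical; adapt it to whatever your decoder emits.
    """
    lines = []
    for phone, start_frame, end_frame in segments:
        if phone.upper() in skip:
            continue  # leave silence unlabeled
        start = start_frame / frame_rate
        end = (end_frame + 1) / frame_rate  # assumption: end frame is inclusive
        lines.append(f"{start:.2f}\t{end:.2f}\t{to_ipa(phone)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up segments for illustration only, not real decoder output.
    example = [("SIL", 0, 9), ("HH", 10, 17), ("AH", 18, 24),
               ("L", 25, 33), ("OW", 34, 52)]
    print(segments_to_labels(example))
```

Save the printed text to a `.txt` file and import it as a label track next to the audio. The mapping is one common ARPAbet-to-IPA convention; adjust individual symbols (for example the vowel used for `AH` or `ER`) if you need a different transcription style.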
But I have been out of the speech rec game for a long time. I believe most people are pursuing neural nets for this kind of task now, and I don't know of any open source neural nets in that space.