Accuracy is poor in speech recognition using PocketSphinx

I ran the PocketSphinx demo example from http://ucla.jamesyxu.com/?p=118, but I found that the word recognition accuracy is very poor. I copied the acoustic model from the pocketsphinix8.0 ...pocketsphinxmodelhmm and ..lm folders to the phone's SD card. It recognizes only a few words, not sentences. My questions are as follows:

1) How can I improve the accuracy?

2) Do I need to change the acoustic model and dictionary (in the hmm and lm folders)? If yes, how can I change them? Is there any other procedure I need to follow to add a model and dictionary? I also replaced the dictionary with one from the following link (US English HUB4 Language Model; I just copied the dictionary file into the lm folder and changed nothing in the hmm folder):

http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/

3) How can I give an audio file as input instead of recorded voice?

I have also tried to use an audio file as input. I read the audio file as shown below. (The .wav file is "10001-90210-01803.wav" from the Sphinx4 library's transcribe demo, which contains spoken digits. I chose it to check accuracy, but recognition fails: not a single word is recognized, and the output text is incorrect.)

    int readAudioFile() {
        this.done = true;
        AssetManager mngr = context.getAssets();
        int current = 0;
        try {
            InputStream io = mngr.open("10001-90210-01803.wav");

            // Create a DataInputStream to read the audio data from the saved file
            DataInputStream dis = new DataInputStream(io);
            int noOfByteToRead = io.available();
            int noOfShortToRead = noOfByteToRead / 2;
            short[] music = new short[noOfShortToRead];
            int i = 0;

            // Read the file into the "music" array
            try {
                while (dis.available() > 0) {
                    music[i] = dis.readShort();
                    i++;
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            Log.i("123", Arrays.toString(music));
            this.q.add(music);
            try {
                dis.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e1) {
            e1.printStackTrace();
        }
        return current;
    }
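
One thing worth checking in the code above, offered as an assumption rather than a confirmed diagnosis: DataInputStream.readShort() reads big-endian values, while PCM samples in a WAV file are stored little-endian and are preceded by a RIFF header, so the samples handed to the decoder may be scrambled. A minimal Java sketch of decoding 16-bit PCM with the right byte order follows. The class name PcmWavReader, its method names, and the fixed 44-byte header size are illustrative assumptions; real files can carry extra chunks, so a robust reader would parse the header rather than skip a fixed count.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of PocketSphinx or the demo project.
final class PcmWavReader {
    // Assumption: canonical RIFF/WAVE header with no extra chunks.
    static final int ASSUMED_HEADER_BYTES = 44;

    // Reads the whole stream; avoids relying on InputStream.available().
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Skips headerBytes, then decodes the rest as 16-bit little-endian samples.
    static short[] decodePcm16LE(byte[] wav, int headerBytes) {
        int payload = Math.max(0, wav.length - headerBytes);
        short[] samples = new short[payload / 2];
        if (samples.length == 0) {
            return samples;
        }
        ByteBuffer.wrap(wav, headerBytes, samples.length * 2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }
}
```

On Android, the stream could be the one returned by AssetManager.open(...), and the resulting short[] could be queued in place of the music array. It is also worth confirming that the file's sample rate matches what the acoustic model expects.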

1). What kind of accuracy do you mean by poor? First, what word error rate (as a percentage) are you seeing? Second, can you give us some example words you said and the output you got? As Praful said, a transcript of a sound file would be very useful.

Also, have you tried running the default application to see whether speaking to it produces similarly bad results?

2). You can build your own dictionary by following this: http://ucla.jamesyxu.com/?p=121

I also have a few notes from using the library (we performed several studies using it, ~20 subjects each):

  • From experience, the default dictionary and library do an OK job at recognizing words and sentences. With an American accent, we generally observed that simple sentences such as "I am walking upstairs" produced no errors, while more complex sentences may produce a few word errors.
  • You generally cannot expect names or abbreviations to be recognized correctly.
  • If your application is only looking for certain phrases, then I recommend building a dictionary and model based only on those phrases. With fewer phrases, the classifier is forced to choose among a small set of candidates, which gives higher accuracy for your use case.
  • For long sentences involving many keywords, consider doing a distance computation against the sentences you are expecting and selecting the closest one.
  • Accent is important.
  • I got a notification because Google reminded me that your link matched my domain name.
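
The distance computation suggested in the notes above could look something like the following Java sketch. It uses a word-level Levenshtein (edit) distance, which is one reasonable choice and not necessarily the measure used in the studies mentioned. The class PhraseMatcher and its method names are hypothetical, and lowercasing plus whitespace splitting is an assumed normalization.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper for matching recognizer output to expected phrases.
final class PhraseMatcher {
    // Assumed normalization: lowercase, split on whitespace.
    static String[] words(String s) {
        String t = s.trim().toLowerCase(Locale.ROOT);
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // Word-level edit distance: substitutions, insertions, deletions cost 1.
    static int wordDistance(String a, String b) {
        String[] x = words(a);
        String[] y = words(b);
        int[] prev = new int[y.length + 1];
        for (int j = 0; j <= y.length; j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= x.length; i++) {
            int[] cur = new int[y.length + 1];
            cur[0] = i;
            for (int j = 1; j <= y.length; j++) {
                int sub = prev[j - 1] + (x[i - 1].equals(y[j - 1]) ? 0 : 1);
                cur[j] = Math.min(sub, Math.min(prev[j] + 1, cur[j - 1] + 1));
            }
            prev = cur;
        }
        return prev[y.length];
    }

    // Returns the expected phrase with the smallest distance (first wins ties),
    // or null if the list is empty.
    static String closest(String hypothesis, List<String> expected) {
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String e : expected) {
            int d = wordDistance(hypothesis, e);
            if (d < bestDist) {
                bestDist = d;
                best = e;
            }
        }
        return best;
    }
}
```

For example, closest("i am talking upstairs", ...) against the phrases "i am walking upstairs" and "i am sitting down" selects the first, since it differs by one word. The same distance divided by the number of words in the reference transcript gives the word error rate asked about at the start of this answer.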
