Native Speech to Text

2018-05-31 01:08:04

I am trying to record audio in a React Native app and then convert it to text with the Watson Speech to Text API.

I am having trouble figuring this out and any help would really be appreciated.

I can record the audio, but I cannot work out how to send the file to the backend, or how to send it directly to the Watson API from the frontend.

The Watson Developer Cloud library for Node has this example:

    var SpeechToTextV1 = require('watson-developer-cloud/speech-to-text/v1');
    var fs = require('fs');

    var speech_to_text = new SpeechToTextV1({
      username: '<username>',
      password: '<password>'
    });

    var params = {
      // From file
      audio: fs.createReadStream('./resources/speech.wav'),
      content_type: 'audio/l16; rate=44100'
    };

    speech_to_text.recognize(params, function(err, res) {
      if (err) {
        console.log(err);
      } else {
        console.log(JSON.stringify(res, null, 2));
      }
    });

Unfortunately I cannot access 'fs' on the frontend to create streams. The file gets saved in a hidden folder on the client device (I have the path to it).

Eventually I would like to create a stream somehow, so I can send the audio as it comes in and have it converted to text automatically with less delay.

Like this:

    fs.createReadStream('./resources/speech.wav')
      .pipe(speech_to_text.createRecognizeStream({ content_type: 'audio/l16; rate=44100' }))
      .pipe(fs.createWriteStream('./transcription.txt'));

Any idea how to do all this on the frontend, given the path of the recorded audio? Are there any workarounds? Thank you!

React Native supports WebSockets out of the box: https://facebook.github.io/react-native/docs/network.html

The Watson Speech to Text API also supports WebSockets: https://www.ibm.com/watson/developercloud/doc/speech-to-text/websockets.shtml (see the section "Sending audio and receiving recognition results", which shows websocket.send(blob)).

Combining the two seems to be a reasonable solution.
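
A minimal sketch of what that could look like, not a tested integration. It follows the message flow described on the Watson WebSocket page: send a JSON "start" message, then the binary audio, then a JSON "stop" message, and read transcripts from the JSON responses. The names buildStartMessage, extractTranscript and transcribe are hypothetical helpers made up for this example, and wsUrl is an assumption: you have to supply the recognize URL for your service, including whatever authentication token it requires.

```javascript
// Hypothetical helper: builds the JSON "start" message that opens a
// recognition request on the Watson WebSocket interface.
function buildStartMessage(contentType) {
  return JSON.stringify({
    action: 'start',
    'content-type': contentType,
    interim_results: true,
  });
}

// Hypothetical helper: pulls the transcript text out of one server message.
// Returns null for messages without results, e.g. {"state": "listening"}.
function extractTranscript(message) {
  const data = JSON.parse(message);
  if (!data.results) {
    return null;
  }
  return data.results
    .map((result) => result.alternatives[0].transcript)
    .join('');
}

// Sends one chunk of recorded audio and reports transcripts through onText.
// wsUrl is an assumption: the full recognize URL with your auth token.
// audio is a binary payload that the WebSocket can send (ArrayBuffer or Blob).
// WebSocketImpl defaults to the global WebSocket that React Native provides.
function transcribe(wsUrl, audio, contentType, onText, WebSocketImpl = WebSocket) {
  const ws = new WebSocketImpl(wsUrl);

  ws.onopen = () => {
    ws.send(buildStartMessage(contentType));
    ws.send(audio);
    ws.send(JSON.stringify({ action: 'stop' }));
  };

  ws.onmessage = (event) => {
    const text = extractTranscript(event.data);
    if (text !== null) {
      onText(text);
    }
  };

  ws.onerror = (event) => console.log('WebSocket error', event);

  return ws;
}
```

To stream while recording, the same idea should apply: send the "start" message once, call ws.send for each audio chunk as it arrives, and send "stop" when the recording ends. How you read the recorded file into an ArrayBuffer or Blob depends on the recording library you use, so that part is left out here.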

I have put together a native module that uses the watson-developer-cloud/swift-sdk, and speech to text is implemented.

https://github.com/pwcremin/react-native-watson

You can refer to my code for an example of how to implement it, or just use the module.

The react-native-watson module uses the mic and handles streaming for you:

    import {SpeechToText} from 'react-native-watson';

    SpeechToText.initialize("username", "password");

    // will transcribe microphone audio
    SpeechToText.startStreaming((error, text) => {
        console.log(text);
    });

    SpeechToText.stopStreaming();
