How to identify speaker from voice pattern using Microsoft Speech?

2018-06-03 05:07:43

I'm using the Microsoft Speech C# API for home automation commands.

I'd like to know if there is a way, or a built-in C# method, to hash the voice input and recognize who is speaking, so that the system can tell whether it is Alice or Bob and reply "Hello Alice" or "Hello Bob".

EDIT:

The Microsoft Speech API can provide a .wav of the recording. It might be possible to hash or otherwise process that audio to work out who's speaking:

Loud voice, slow modulation, ... => Bob

High voice, fast modulation, ... => Alice
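To make the idea above concrete, here is a minimal sketch in Python (chosen for brevity; the arithmetic ports directly to C#). It is not a real speaker recognizer. It assumes RMS amplitude as a stand-in for "loud", and zero-crossing rate as a very crude stand-in for "high voice", then picks the enrolled speaker with the nearest feature pair. All function names and the `synth_tone` helper are hypothetical, and `read_wav_mono16` assumes a 16-bit mono PCM .wav file.

```python
import math
import struct
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM .wav file; return (samples, sample_rate)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        raw = w.readframes(w.getnframes())
        samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
        return list(samples), w.getframerate()


def rms(samples):
    """Root-mean-square amplitude: a rough proxy for loudness."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples, sample_rate):
    """Sign changes per second: a very rough proxy for pitch."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * sample_rate / len(samples)


def features(samples, sample_rate):
    """Feature pair (loudness, pitch proxy) for one recording."""
    return (rms(samples), zero_crossing_rate(samples, sample_rate))


def closest_speaker(sample_features, profiles):
    """Return the enrolled name whose features are nearest.

    Differences are taken relative to the enrolled value so that
    both features contribute on a comparable scale.
    """
    def distance(a, b):
        return sum(
            ((x - y) / max(abs(y), 1e-9)) ** 2 for x, y in zip(a, b)
        )
    return min(profiles, key=lambda name: distance(sample_features, profiles[name]))


def synth_tone(freq_hz, amplitude, sample_rate=8000, seconds=0.5):
    """Hypothetical test signal: a sine wave standing in for a voice."""
    n = int(sample_rate * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate + 0.1)
        for i in range(n)
    ]
```

In use, one would enroll each person once, for example `profiles = {"Bob": features(*read_wav_mono16("bob.wav")), "Alice": features(*read_wav_mono16("alice.wav"))}` (file names are placeholders), and then call `closest_speaker` on each new recording. Loudness depends heavily on microphone distance and zero crossings are sensitive to noise, so this only illustrates the approach.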

Speaker recognition is a hard problem and is still an active research area. I don't think the Microsoft Speech API has any speaker recognition support, but I'm not 100% sure.

I found the following article really helpful while researching the topic. It introduces the subject and also provides a very crude implementation. Probably a good place to start.

http://www.ibm.com/developerworks/opensource/library/os-sndpeek/index.html

You can use Microsoft Speaker Recognition APIs for doing this task: https://www.microsoft.com/cognitive-services/en-us/speaker-recognition-api

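Microsoft provides two APIs for this purpose: Speaker Verification (does this voice match the person it claims to be?) and Speaker Identification (which of a set of enrolled speakers is talking?).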

You can find their C# & Python SDKs here: https://github.com/Microsoft/ProjectOxford-ClientSDK/tree/master/SpeakerRecognition

It looks like you are trying to solve the speaker diarization problem (finding who speaks when); there are many toolkits available on the Internet for that. I can recommend one that runs on Java, called LIUM: http://www-lium.univ-lemans.fr/diarization/doku.php.

If you are just interested in distinguishing Alice from Bob, you can have a look at the Gender Detection part of the Scripting page on the website above (or go directly to http://www-lium.univ-lemans.fr/diarization/doku.php/gender_detection).

Link: http://www.djcxy.com/p/11080.html