Good Speech Recognition API

Good speech recognition API

I think desktop recognition is starting because you are using the shared desktop recognizer. Use an in-process (inproc) recognizer that belongs to your application only. You do this by instantiating a SpeechRecognitionEngine in your application, rather than a SpeechRecognizer, which attaches to the shared recognizer.

Since you are using the dictation grammar and the Windows desktop recognizer, I believe the speaker can train it to improve its accuracy. Go through the Windows 7 speech recognition training and see whether the accuracy improves.

To get started with .NET speech, there is a very good article that was published a few years ago at http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. It is probably the best introductory article I’ve found so far. It is a little out of date, but very helpful. (The AppendResultKeyValue method was dropped after the beta.)

Here is a quick sample showing about the simplest .NET Windows Forms app I could think of that uses a dictation grammar. It should work on Windows Vista or Windows 7. I created a form, dropped a button on it, and made the button big. Then I added a reference to System.Speech and the line:

using System.Speech.Recognition;

Then I added the following event handler to button1:

private void button1_Click(object sender, EventArgs e)
{
    // In-process recognizer; disposed when the handler returns.
    using (SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine())
    {
        Grammar dictationGrammar = new DictationGrammar();
        recognizer.LoadGrammar(dictationGrammar);
        try
        {
            button1.Text = "Speak Now";
            recognizer.SetInputToDefaultAudioDevice();
            // Recognize() blocks, and returns null if nothing was recognized.
            RecognitionResult result = recognizer.Recognize();
            button1.Text = result != null ? result.Text : "Nothing recognized";
        }
        catch (InvalidOperationException exception)
        {
            button1.Text = String.Format("Could not recognize input from default audio device. Is a microphone or sound card available?\r\n{0} - {1}.", exception.Source, exception.Message);
        }
        finally
        {
            recognizer.UnloadAllGrammars();
        }
    }
}
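Recognize() blocks the calling thread until an utterance ends, so a non-blocking, event-driven variant may suit some applications better. The following is only a sketch of that idea as a console program (the class name ContinuousDictation is made up for illustration). It assumes a reference to System.Speech and a working default microphone, and it uses the engine's SpeechHypothesized and SpeechRecognized events to show interim and final text.

```csharp
using System;
using System.Speech.Recognition;

class ContinuousDictation
{
    static void Main()
    {
        // An in-process engine: recognition stays private to this application.
        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());
            recognizer.SetInputToDefaultAudioDevice();

            // Interim guesses raised while the speaker is still talking.
            recognizer.SpeechHypothesized += (sender, e) =>
                Console.WriteLine("partial: " + e.Result.Text);

            // Final result raised once per recognized utterance.
            recognizer.SpeechRecognized += (sender, e) =>
                Console.WriteLine("final:   " + e.Result.Text);

            // RecognizeMode.Multiple keeps listening until explicitly stopped.
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Speak now. Press Enter to stop.");
            Console.ReadLine();

            recognizer.RecognizeAsyncStop();
        }
    }
}
```

In a Windows Forms app the same event handlers could update a label instead of writing to the console.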

A little more information comparing the various flavors of speech engines and APIs shipped by Microsoft can be found in the question "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?"

Is there any speech recognition API besides Google that returns interim results?

Microsoft's Project Oxford Speech Recognition API, used by Cortana and Skype Translator, meets both of your criteria: it supports French (and 6 other languages) and returns partial/interim/online hypotheses as you stream audio to it.

(As an aside, the usual cause of terrible accuracy when doing online recognition with pocketsphinx is bad CMN (cepstral mean normalization). When you give pocketsphinx a complete piece of audio to process, it computes the CMN over the entire utterance, but when you stream audio to it, it does not compute the CMN by default. One solution is to give it a complete utterance, retrieve the CMN computed by pocketsphinx, and then use that CMN for the streaming audio. Note that the CMN is different for each audio channel/environment, and that the Python interface to pocketsphinx doesn't offer access to CMN data. I have a patch if this is a route you'd like to investigate.)
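To make the CMN idea concrete, here is a small generic illustration. It is not pocketsphinx's code or API: the class and method names (Cmn, Mean, Normalize) and the feature values are made up. It computes a per-coefficient mean from a complete "calibration" utterance, then subtracts that stored mean from a frame that arrives later in a stream.

```csharp
using System;

static class Cmn
{
    // Mean of each cepstral coefficient over all frames of one utterance.
    public static double[] Mean(double[][] frames)
    {
        int dim = frames[0].Length;
        var mean = new double[dim];
        foreach (var frame in frames)
            for (int i = 0; i < dim; i++)
                mean[i] += frame[i] / frames.Length;
        return mean;
    }

    // Subtract a given mean from one frame. In batch mode the mean comes from
    // the whole utterance; in streaming mode it has to come from elsewhere,
    // for example an earlier utterance recorded in the same environment.
    public static double[] Normalize(double[] frame, double[] mean)
    {
        var result = new double[frame.Length];
        for (int i = 0; i < frame.Length; i++)
            result[i] = frame[i] - mean[i];
        return result;
    }
}

class CmnDemo
{
    static void Main()
    {
        // Made-up two-dimensional "cepstra" for a calibration utterance.
        double[][] calibration = { new[] { 1.0, 10.0 }, new[] { 3.0, 14.0 } };
        double[] mean = Cmn.Mean(calibration);   // {2, 12}

        // Streaming: each incoming frame is normalized with the stored mean.
        double[] live = Cmn.Normalize(new[] { 2.5, 11.0 }, mean);
        Console.WriteLine(string.Join(", ", live));
    }
}
```

Because the mean depends on the channel and environment, a stored mean like this would need to be recomputed whenever the microphone or room changes.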

How to use Google speech recognition API in C#?

I just tested this myself. Below is a working solution, provided you have a valid API key.

using System;
using System.IO;
using System.Net;

namespace GoogleRequest
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Read the whole FLAC file into memory.
                byte[] audioBytes = File.ReadAllBytes("good-morning-google.flac");

                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
                    "https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=YOUR_API_KEY_HERE");
                request.Credentials = CredentialCache.DefaultCredentials;
                request.Method = "POST";
                // The rate must match the sample rate of the FLAC file.
                request.ContentType = "audio/x-flac; rate=44100";
                request.ContentLength = audioBytes.Length;

                // Send the audio as the request body.
                using (Stream stream = request.GetRequestStream())
                {
                    stream.Write(audioBytes, 0, audioBytes.Length);
                }

                // Print the raw JSON response.
                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    if (response.StatusCode == HttpStatusCode.OK)
                    {
                        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                        {
                            Console.WriteLine(reader.ReadToEnd());
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
            }

            Console.ReadLine();
        }
    }
}
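The program above prints the raw response. If you want just the recognized text, something like the following sketch might help. It rests on an assumption about the response shape that I have not verified against the service: one JSON object per line, each shaped like {"result":[{"alternative":[{"transcript":"..."}]}]}, possibly preceded by an empty {"result":[]} line. The helper name FirstTranscript and the sample body are made up, and System.Text.Json requires .NET Core 3.0 or later.

```csharp
using System;
using System.Text.Json;

static class SpeechResponse
{
    // Returns the first transcript found in the response body, or null.
    // ASSUMPTION: one JSON object per line, shaped like
    // {"result":[{"alternative":[{"transcript":"..."}]}]}.
    public static string FirstTranscript(string body)
    {
        foreach (string line in body.Split('\n'))
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            using (JsonDocument doc = JsonDocument.Parse(line))
            {
                if (!doc.RootElement.TryGetProperty("result", out JsonElement results))
                    continue;
                foreach (JsonElement result in results.EnumerateArray())
                {
                    if (!result.TryGetProperty("alternative", out JsonElement alternatives))
                        continue;
                    foreach (JsonElement alt in alternatives.EnumerateArray())
                    {
                        if (alt.TryGetProperty("transcript", out JsonElement text))
                            return text.GetString();
                    }
                }
            }
        }
        return null;
    }
}

class ResponseDemo
{
    static void Main()
    {
        // Hypothetical sample body, not captured from a real request.
        string body = "{\"result\":[]}\n" +
            "{\"result\":[{\"alternative\":[{\"transcript\":\"good morning\"}]}]}";
        Console.WriteLine(SpeechResponse.FirstTranscript(body)); // prints "good morning"
    }
}
```

In the program above, you would pass the string returned by ReadToEnd() to FirstTranscript instead of writing it straight to the console.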

C# - Free Offline speech recognition library (SDK)

In the past I got quite good results using pocketsphinx, or Sphinx if you have more resources available. Check it out here:
https://cmusphinx.github.io/

Good tutorial for Google Cloud Speech API/Speech recognition on Mac/Python?

You probably copied only the one file. It's best to clone the whole repository so that the script can also find the file it needs. Otherwise, you will find that file in the resource folder.


