Save Google Cloud Speech API operation(job) object to retrieve results later
You can monkey-patch this functionality into the version you are using, but I would advise upgrading to google-cloud-speech 0.24.0 or later. With those more current versions you can use Operation#id and Project#operation to accomplish this.
require "google/cloud/speech"
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
                     encoding: :linear16,
                     language: "en-US",
                     sample_rate: 16000
op = audio.process
# get the operation's id
id = op.id #=> "1234567890"
# construct a new operation object from the id
op2 = speech.operation id
# verify the jobs are the same
op.id == op2.id #=> true
op2.done? #=> false
op2.wait_until_done!
op2.done? #=> true
results = op2.results
Update: Since you can't upgrade, you can monkey-patch this functionality into an older version using the workaround described in GoogleCloudPlatform/google-cloud-ruby#1214:
require "google/cloud/speech"
require "ostruct" # OpenStruct is used in Project#job below
# Add monkey-patches
module Google
  module Cloud
    module Speech
      class Job
        def id
          @grpc.name
        end
      end
      class Project
        def job id
          Job.from_grpc(OpenStruct.new(name: id), speech.service).refresh!
        end
      end
    end
  end
end
# Use the new monkey-patched methods
speech = Google::Cloud::Speech.new
audio = speech.audio "path/to/audio.raw",
                     encoding: :linear16,
                     language: "en-US",
                     sample_rate: 16000
job = audio.recognize_job
# get the job's id
id = job.id #=> "1234567890"
# construct a new job object from the id
job2 = speech.job id
# verify the jobs are the same
job.id == job2.id #=> true
job2.done? #=> false
job2.wait_until_done!
job2.done? #=> true
results = job2.results
How to get the result of a long-running Google Cloud Speech API operation later?
After reading the source, I found that gRPC has a 10-minute timeout. If you submit a large file, transcription can take longer than 10 minutes. The trick is to use the HTTP backend. Unlike gRPC, the HTTP backend doesn't maintain a connection; instead, it sends a new HTTP request every time you poll. To use HTTP, do:
speech_client = speech.Client(_use_grpc=False)
How to resume Google Cloud Speech API (longRunningRecognize) timeout on Cloud Functions
If your job is going to take more than 540 seconds, Cloud Functions is not really the best solution for this problem. Instead, you may want to use Cloud Functions as just a triggering mechanism, then offload the work to App Engine or Compute Engine, using Pub/Sub to send it the relevant data (e.g. the location of the file in Cloud Storage and any other metadata needed to make the speech recognition request).
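As a rough sketch of the hand-off described above, the triggering function only needs to package the file location and request metadata into a message body. The function name build_transcription_message and the payload keys gcs_uri and language_code are hypothetical, chosen for illustration; they are not part of any Google API.

```python
import json


def build_transcription_message(bucket: str, object_name: str,
                                language_code: str = "en-US") -> bytes:
    """Serialize the data a worker would need to start a recognition request.

    The payload keys are hypothetical; use whatever your worker expects.
    """
    payload = {
        "gcs_uri": f"gs://{bucket}/{object_name}",
        "language_code": language_code,
    }
    # Pub/Sub message bodies are bytes, so encode the JSON text.
    return json.dumps(payload, sort_keys=True).encode("utf-8")


message = build_transcription_message("my-bucket", "audio/interview.flac")
print(message.decode("utf-8"))
# With google-cloud-pubsub installed, these bytes could be sent with
# pubsub_v1.PublisherClient().publish(topic_path, message).
```

The worker on App Engine or Compute Engine would decode the same JSON and start the long-running recognition itself, so the Cloud Function returns well within its time limit.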
Google Cloud Speech API returns null result
Solved it by using the ffmpeg library to encode the audio to FLAC with a mono channel.
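A minimal sketch of that conversion, assuming the ffmpeg command-line tool is installed and on the PATH. The helper name flac_mono_command and the file names are hypothetical; the optional sample_rate argument is an addition for illustration and is not something the answer above specifies.

```python
import shlex
from typing import List, Optional


def flac_mono_command(src: str, dst: str,
                      sample_rate: Optional[int] = None) -> List[str]:
    """Build an ffmpeg command that re-encodes src as single-channel FLAC."""
    cmd = ["ffmpeg", "-y", "-i", src, "-ac", "1"]  # -ac 1 downmixes to mono
    if sample_rate is not None:
        cmd += ["-ar", str(sample_rate)]  # optional resampling (an assumption)
    cmd += ["-c:a", "flac", dst]  # select the FLAC audio encoder
    return cmd


cmd = flac_mono_command("input.wav", "output.flac")
print(shlex.join(cmd))
# To actually run the conversion: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.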
Why is the speech REST API response different from the go SDK API response?
The JSON-marshaled Golang structs are protobufs (the fields are snake_case'd and the times are google.protobuf.Timestamp).
Can you try using the Golang protobuf protojson package instead of encoding/json? It should map bijectively between JSON and Golang protobuf structs.
How to work with result from google speech to text API
MessageToJson converts the RecognizeResponse from a protobuf message to JSON, but it returns that JSON as a string.
You can work directly with the RecognizeResponse instead:
response: RecognizeResponse = client.recognize(config=your_config, audio=your_audio)
final_transcripts = []
final_transcripts_confidence = []
for result in response.results:
    alternative = result.alternatives[0]
    final_transcripts_confidence.append(alternative.confidence)
    final_transcripts.append(alternative.transcript)
If you would like to use MessageToJson anyway and convert its output to a dictionary, you can do the following:
import json
from google.protobuf.json_format import MessageToJson
response: RecognizeResponse = client.recognize(config=your_config, audio=your_audio)
response_json_str = MessageToJson(response, indent=0)
response_dict = json.loads(response_json_str)
Alternatively, use MessageToDict to convert directly to a dictionary.
NOTE:
Starting with some version of the library, the proto conversion changed, and the approach above fails with the error AttributeError: 'DESCRIPTOR'.
To solve this, use:
RecognizeResponse.to_json(response)
or alternatively:
RecognizeResponse.to_dict(response)
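Once the response is a plain dictionary, the same loop as above can be written without any protobuf types. The sketch below assumes the dictionary uses the keys results, alternatives, transcript and confidence, mirroring the field names used earlier; the helper name top_alternatives and the sample data are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def top_alternatives(response_dict: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Return (transcript, confidence) for the first alternative of each result."""
    pairs = []
    for result in response_dict.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # skip results that carry no alternatives
        best = alternatives[0]
        pairs.append((best.get("transcript", ""), best.get("confidence", 0.0)))
    return pairs


# Hand-written sample shaped like a converted response (not real API output).
sample = {
    "results": [
        {"alternatives": [{"transcript": "hello world", "confidence": 0.92}]},
        {"alternatives": []},
    ]
}
print(top_alternatives(sample))  # [('hello world', 0.92)]
```

Using .get with defaults keeps the helper from raising on an empty response, which may help when a recognition request returns no results.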