No language code from asynchronous speech recognition response with alternative language code

Posted by 谁都会走 on 2021-01-04 04:22:05

Question


I am trying to use the new beta alternative languages feature, which lets you pass a set of candidate languages when creating a transcription job and returns the detected language along with the transcription results in that language.

When I run the code example from the documentation page (synchronous), everything works and the detected language code is returned in the results:

from google.cloud import speech_v1p1beta1 as speech
client = speech.SpeechClient()

speech_file = 'resources/multi.wav'
first_lang = 'en-US'
second_lang = 'es'

with open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio = speech.types.RecognitionAudio(content=content)

config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=44100,
    audio_channel_count=2,
    language_code=first_lang,
    alternative_language_codes=[second_lang])

print('Waiting for operation to complete...')
response = client.recognize(config, audio)

for i, result in enumerate(response.results):
    alternative = result.alternatives[0]
    print(result.language_code)  # this prints 'en-US'
    print('-' * 20)
    print('First alternative of result {}: {}'.format(i, alternative))
    print(u'Transcript: {}'.format(alternative.transcript))

But when I try the asynchronous mode, the language code is not returned along with the results:

from google.cloud import speech_v1p1beta1 as speech
client = speech.SpeechClient()

gs_url = 'gs://my-bucket-name/multi.wav'
first_lang = 'en-US'
second_lang = 'es'

audio = speech.types.RecognitionAudio(uri=gs_url)

config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=44100,
    audio_channel_count=2,
    language_code=first_lang,
    alternative_language_codes=[second_lang])

print('Waiting for operation to complete...')
operation = client.long_running_recognize(config, audio)
response = operation.result(timeout=40)

for i, result in enumerate(response.results):
    alternative = result.alternatives[0]
    print(result.language_code)  # prints nothing: result.language_code is an empty string
    print('-' * 20)
    print('First alternative of result {}: {}'.format(i, alternative))
    print(u'Transcript: {}'.format(alternative.transcript))

This behaviour happens despite the documentation stating explicitly:

Speech-to-Text supports alternative language codes for all speech recognition methods: speech:recognize, speech:longrunningrecognize, and Streaming.

Any idea how to get the detected language code for asynchronous transcription requests as well?
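
As a diagnostic aid, here is a minimal sketch (not taken from the documentation) that tallies the language_code of every result, so that empty values show up explicitly instead of as blank lines. The helper name summarize_language_codes and the stand-in fake_response object are hypothetical and exist only for illustration; only response.results and result.language_code come from the code above. It does not fix the missing code, it only makes the symptom easier to report.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_language_codes(response):
    """Count the language_code values found across response.results.

    Empty or missing codes are counted under the key '<empty>' so they
    are visible rather than printed as a blank line.
    """
    counts = Counter()
    for result in response.results:
        code = getattr(result, 'language_code', '') or '<empty>'
        counts[code] += 1
    return dict(counts)


# Hypothetical stand-in that mimics only the attributes used above.
# A real response from client.recognize(...) or operation.result(...)
# could be passed to the helper in the same way.
fake_response = SimpleNamespace(results=[
    SimpleNamespace(language_code='en-US'),
    SimpleNamespace(language_code=''),
])

print(summarize_language_codes(fake_response))
```

Running the helper on both the synchronous and the asynchronous response would show whether the code is missing on every asynchronous result or only on some of them.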

This is my version of the google libraries

Source: https://stackoverflow.com/questions/52852166/no-language-code-from-asynchronous-speech-recognition-response-with-alternative
