how to read mp3 data from google cloud using python

谁都会走 提交于 2019-12-25 02:44:36

问题


I am trying to read mp3/wav data from google cloud and trying to implement audio diarization technique. Issue is that I am not able to read the result which has passed by google api in variable response.

below is my python code

speech_file = r'gs://pp003231/a4a.wav'
config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='en-US',
    enable_speaker_diarization=True,
    diarization_speaker_count=2)
audio = speech.types.RecognitionAudio(uri=speech_file)
response = client.long_running_recognize(config, audio)
print response
result = response.results[-1]
print result

Output displayed on console is Traceback (most recent call last): File "a1.py", line 131, in print response.results AttributeError: 'Operation' object has no attribute 'results'

Can you please share your expert advice about what I am doing wrong? Thanks for your help.


回答1:


Do you have access to the wav file in your bucket? also, this is the entire code? It seems that the sample_rate_hertz and the imports are missing. Here you have the code copy/pasted from the google docs samples, but I edited it to have just the diarization function.

#!/usr/bin/env python
"""Google Cloud Speech API sample that demonstrates enhanced models
and recognition metadata.
Example usage:
    python diarization.py
"""

import argparse
import io



def transcribe_file_with_diarization():
    """Transcribe the given audio file synchronously with diarization."""
    # [START speech_transcribe_diarization_beta]
    from google.cloud import speech_v1p1beta1 as speech
    client = speech.SpeechClient()



    audio = speech.types.RecognitionAudio(uri="gs://<YOUR_BUCKET/<YOUR_WAV_FILE>")

    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)

    print('Waiting for operation to complete...')
    response = client.recognize(config, audio)

    # The transcript within each result is separate and sequential per result.
    # However, the words list within an alternative includes all the words
    # from all the results thus far. Thus, to get all the words with speaker
    # tags, you only have to take the words list from the last result:
    result = response.results[-1]

    words_info = result.alternatives[0].words

    # Printing out the output:
    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word,
                                                   word_info.speaker_tag))
    # [END speech_transcribe_diarization_beta]



if __name__ == '__main__':

    transcribe_file_with_diarization()

To run the code just name it diarization.py and use the command:

python diarization.py

Also, you have to install the latest google-cloud-speech library:

pip install --upgrade google-cloud-speech

And you need to have the credentials of your service account in a json file, you can check more info here



来源:https://stackoverflow.com/questions/53490557/how-to-read-mp3-data-from-google-cloud-using-python

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!