Download data directly to Google Cloud Storage

Submitted by 痴心易碎 on 2020-01-24 14:05:25

Question


I want to download data from a Python application or command (e.g. youtube-dl or any other library that downloads from a third-party URL) directly to a Google Cloud Storage bucket.

I have used the gsutil streaming command to stream data directly from the process to GCS, but it saves only the console output to the bucket.

Also, I don't want to mount the storage, because I want to share it with a distributed system.

Is there any way to do this without first downloading to the file system and then copying to Google Cloud Storage?

Thanks,


Answer 1:


The situation you are describing doesn't seem possible. Looking at the documentation and source code for the Cloud Storage library in Python, you only have three options: upload from a file (already on your disk), upload by providing a filename (a path to a file already on your disk), and upload from a string (upload text as a .txt file).

You will need to download the file with whichever program you mention (as mentioned in the comments, you can download it to a temporary folder), upload the file to GCS, and then delete it from the temporary folder.
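The download, upload, delete sequence described above could be sketched as follows. This is only a minimal sketch: `download_to_path` is a hypothetical callable standing in for whatever tool fetches the data (it is not part of any library), and `blob` is assumed to be a `google.cloud.storage` Blob or any other object with an `upload_from_filename()` method. The bucket and object names in the trailing comment are placeholders.

```python
import os
import tempfile


def upload_via_temp_file(download_to_path, blob):
    """Download to a temporary file, upload it, then delete it.

    download_to_path: hypothetical callable that writes the data to the
        path it is given (for example, a wrapper around a downloader).
    blob: object with upload_from_filename(), e.g. a Cloud Storage Blob.
    Returns the temporary path that was used (already deleted).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)  # the downloader reopens the path on its own
    try:
        download_to_path(path)
        blob.upload_from_filename(path)
    finally:
        os.remove(path)  # clean up even if the download or upload fails
    return path


# Possible usage with the google-cloud-storage package (placeholder names):
#   from google.cloud import storage
#   blob = storage.Client().bucket("my-bucket").blob("video.mp4")
#   upload_via_temp_file(my_downloader, blob)
```

The `finally` clause keeps the temporary folder clean even when one of the steps raises.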




Answer 2:


From what I understand, you are looking for a technique other than gsutil streaming to store data directly in your bucket. Given that you already have a Python command-line application, you have a couple of options to achieve your goal:

Option 1: Store your data in a Python variable, then push it to your bucket with the help of the boto client library plugin (which runs on Python 2.6.x and 2.7.x).

The Google documentation outlines how to use boto within Python (with usage examples).

Here is a copy of the relevant snippets from that documentation, with a brief description.

upload:

    dst_uri = boto.storage_uri(<bucket> + '/' + <object>, 'gs')
    dst_uri.new_key().set_contents_from_stream(<stream>)

download:

    import sys
    src_uri = boto.storage_uri(<bucket> + '/' + <object>, 'gs')
    src_uri.get_key().get_file(sys.stdout)

Where: bucket is the name of the bucket you have set up, and object is the name of the object you wish to store (you can find your bucket name in the GCP console). Also, the great thing about GCS buckets is that you can store anything you want in them (i.e., there is no need to specify what you are storing or to encode anything before storing it).
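For a command-line downloader, the stream passed to `set_contents_from_stream()` could be the command's standard output. The sketch below assumes the command writes the downloaded data to stdout (for youtube-dl this is usually requested with `-o -`) and that `key` is an object exposing `set_contents_from_stream()`, such as the one returned by `dst_uri.new_key()`. `stream_command_to_key` is a hypothetical helper name, not part of boto.

```python
import subprocess


def stream_command_to_key(cmd, key):
    """Run cmd and hand its stdout pipe to key.set_contents_from_stream().

    cmd: argument list for the downloader; it is assumed to write the
        data to stdout rather than to a file.
    key: object with set_contents_from_stream(fp), e.g. a boto key.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # The key reads from the pipe, so nothing is written to local disk.
        key.set_contents_from_stream(proc.stdout)
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)
```

Checking the exit code matters here: if the downloader fails midway, the object in the bucket may be incomplete.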

Option 2: Store your data using the Google Cloud Storage Client Libraries. To be more precise, the functionality you are looking for is uploading objects through a blob (you can store any form of data inside a blob as well).

Since you do not want to save locally but want to store directly in your bucket, I would recommend the following method:

upload_from_string(data, content_type='text/plain', client=None, predefined_acl=None)

(Google definition: Upload contents of this blob from the provided string)

The important thing to note about this method is that you can choose the type of data you pass in. Depending on what you are trying to store (for example, the output of libraries that download from third-party URLs), data can be either str or bytes. I would recommend trying bytes first: downloaded media is usually binary, and bytes can hold ASCII text as well.
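As a rough illustration of the bytes route, the sketch below fetches a URL into memory and passes the result to `upload_from_string()`. It assumes the payload is small enough to hold in memory. `upload_url_to_blob` is a hypothetical helper name, and `blob` is assumed to be a `google.cloud.storage` Blob (or anything with a compatible `upload_from_string()` method). The bucket name, object name, and URL in the trailing comment are placeholders.

```python
import urllib.request


def upload_url_to_blob(url, blob, content_type="application/octet-stream"):
    """Fetch url into memory and upload the bytes with upload_from_string().

    Returns the number of bytes uploaded. Nothing is written to local disk,
    but the whole payload is held in memory at once.
    """
    with urllib.request.urlopen(url) as response:
        data = response.read()  # bytes, not str
    blob.upload_from_string(data, content_type=content_type)
    return len(data)


# Possible usage with the google-cloud-storage package (placeholder names):
#   from google.cloud import storage
#   blob = storage.Client().bucket("my-bucket").blob("video.mp4")
#   upload_url_to_blob("https://example.com/video.mp4", blob, "video/mp4")
```

Passing an explicit `content_type` avoids storing binary data under the method's `text/plain` default.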



Source: https://stackoverflow.com/questions/52624858/download-data-directly-to-google-cloud-storage
