How to use gsutil compose in GoogleShell and skip first rows?

Submitted by China☆狼群 on 2021-01-28 03:04:49

Question


I am trying to use the gsutil "compose" command in the shell to merge the CSV files I receive in my GCP bucket. The problem is that the command merges the CSV files without skipping their headers.

What I end up with is a merge of 24 CSV files that also contains 24 header rows.

I tried to do this in Python as well, but did not find a solution there either.

Any help?


Answer 1:


gsutil has no flag for skipping CSV headers, but this Python script works around that.

The script downloads the CSV files from the bucket, appends them while skipping each file's header row, and then uploads the appended file back to the bucket.

import csv
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('YOUR.BUCKET.NAME')

# Download each source CSV from the bucket to a local file of the same name.
csvs = ["FILE1.NAME", "FILE2.NAME"]
for name in csvs:
    bucket.get_blob(name).download_to_filename(name)

# Append the data rows of every file, skipping each file's header row.
# Note: every header is skipped, so the output contains data rows only.
with open('appended_output.csv', 'wt', newline='') as out:
    writer = csv.writer(out)
    for name in csvs:
        with open(name, 'rt', newline='') as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row
            for row in reader:
                writer.writerow(row)

# Upload the merged file back to the bucket.
blob = bucket.blob("appended_output.csv")
blob.upload_from_filename("appended_output.csv")
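
The script above drops every header, so the merged file has no header row at all. If you want to keep a single header and merge many files (such as the 24 in the question) without listing each name, a variation along these lines might help. It is only a sketch: the function names `merge_csv_texts` and `merge_bucket_csvs`, the `prefix` argument, and the `.csv` suffix filter are my own assumptions, and it assumes a google-cloud-storage version recent enough to provide `Blob.download_as_text`. It also holds all files in memory, so it is only suitable for modestly sized CSVs.

```python
import csv
import io


def merge_csv_texts(texts, keep_first_header=True):
    """Concatenate CSV documents given as strings, dropping repeated headers.

    The first row of every document is treated as its header. Only the
    first header encountered is written, and only if keep_first_header is True.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for text in texts:
        reader = csv.reader(io.StringIO(text, newline=""))
        header = next(reader, None)  # consume this document's header row
        if header is not None and keep_first_header and not header_written:
            writer.writerow(header)
            header_written = True
        writer.writerows(reader)  # remaining rows are data
    return out.getvalue()


def merge_bucket_csvs(bucket_name, prefix, dest_name):
    """Hypothetical helper: merge all .csv objects under a prefix into dest_name."""
    # Imported here so merge_csv_texts can be used without the GCS client installed.
    from google.cloud import storage

    client = storage.Client()
    blobs = sorted(client.list_blobs(bucket_name, prefix=prefix),
                   key=lambda b: b.name)
    texts = [b.download_as_text() for b in blobs
             if b.name.endswith(".csv") and b.name != dest_name]
    merged = merge_csv_texts(texts)
    client.bucket(bucket_name).blob(dest_name).upload_from_string(
        merged, content_type="text/csv")
```

A call such as `merge_bucket_csvs('YOUR.BUCKET.NAME', 'SOME/PREFIX/', 'appended_output.csv')` would then replace the explicit file list; the prefix here is a placeholder, not something from the question.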


Source: https://stackoverflow.com/questions/57591243/how-to-use-gsutil-compose-in-googleshell-and-skip-first-rows
