How to extract the elements from csv to json in S3

六眼飞鱼酱① submitted on 2020-08-19 17:39:07

Question


  • I need to find the CSV files in the folder
  • List all the files inside the folder
  • Convert the files to JSON and save them in the same bucket

There are many CSV files like the one below:

emp_id,Name,Company
10,Aka,TCS
11,VeI,TCS

Code is below

import boto3
import pandas as pd
def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket('testfolder')
    for file in my_bucket.objects.all():
        print(file.key)
    for csv_f in file.key:
        with open(f'{csv_f.replace(".csv", ".json")}', "w") as f:
            pd.read_csv(csv_f).to_json(f, orient='index')

I am not able to save the output to the bucket. If I remove the bucket name, the files are saved in the folder instead. How can I save them back to the bucket?


Answer 1:


You can check the following code:

from io import StringIO

import boto3
import pandas as pd

s3 = boto3.resource('s3')

def lambda_handler(event, context):
    input_bucket = 'bucket-with-csv-file-44244'
    my_bucket = s3.Bucket(input_bucket)

    for file in my_bucket.objects.all():
        # Only process CSV objects
        if file.key.endswith(".csv"):
            # pandas reads s3:// URLs through s3fs/fsspec
            csv_f = f"s3://{input_bucket}/{file.key}"
            print(csv_f)

            json_file = file.key.replace(".csv", ".json")
            print(json_file)

            # Serialize to an in-memory buffer instead of a local file
            json_buffer = StringIO()
            df = pd.read_csv(csv_f)
            df.to_json(json_buffer, orient='index')

            # Upload the JSON next to the source CSV in the same bucket
            s3.Object(input_bucket, json_file).put(Body=json_buffer.getvalue())

Your lambda layer will need to have:

fsspec
pandas
s3fs
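
If adding those packages to a layer is not an option, the same loop can be sketched with only the standard library for the conversion. This is a rough alternative under stated assumptions, not the answer's method: `csv_text_to_json`, `json_key_for` and `convert_bucket` are hypothetical helper names, `bucket` is assumed to be a boto3 `Bucket` resource, and the files are assumed to be UTF-8. Unlike pandas, the `csv` module does no type inference, so every value (including `emp_id`) stays a string in the JSON.

```python
import csv
import io
import json


def csv_text_to_json(csv_text):
    """Convert CSV text to a JSON string keyed by row position.

    The shape mirrors pandas' orient='index', but values remain strings
    because the csv module does not infer types.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = {str(i): row for i, row in enumerate(reader)}
    return json.dumps(rows)


def json_key_for(csv_key):
    """Swap only the trailing '.csv' for '.json' (hypothetical helper).

    str.replace would also rewrite a '.csv' in the middle of the key.
    """
    return csv_key[: -len(".csv")] + ".json"


def convert_bucket(bucket):
    """Write a .json object next to every .csv object in `bucket`.

    `bucket` is assumed to be a boto3 Bucket resource, for example
    boto3.resource('s3').Bucket('bucket-with-csv-file-44244').
    Returns the list of keys that were written.
    """
    written = []
    for obj in bucket.objects.all():
        if not obj.key.endswith(".csv"):
            continue
        # Read the object body straight from S3 (assumes UTF-8 text)
        csv_text = obj.get()["Body"].read().decode("utf-8")
        new_key = json_key_for(obj.key)
        bucket.put_object(Key=new_key, Body=csv_text_to_json(csv_text).encode("utf-8"))
        written.append(new_key)
    return written
```

Inside `lambda_handler` this would be called as `convert_bucket(s3.Bucket(input_bucket))`. Passing the bucket in as an argument keeps the conversion logic testable without AWS access.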


Source: https://stackoverflow.com/questions/63351210/how-to-extract-the-elements-from-csv-to-json-in-s3
