Python: How to read and load an excel file from AWS S3?

前端 未结 3 1915
故里飘歌
故里飘歌 2020-12-17 03:57

I have uploaded an excel file to AWS S3 bucket and now I want to read it in python. Any help would be appreciated. Here is what I have achieved so far,

impo         


        
相关标签:
3条回答
  • 2020-12-17 04:31

    You can directly read xls file from S3 without having to download or save it locally. xlrd module has a provision to provide raw data to create workbook object. Following is the code snippet.

    from boto3 import Session  
    from xlrd.book import open_workbook_xls  
    
    aws_id = ''    
    aws_secret = ''
    bucket_name = ''
    object_key = ''
    
    s3_session = Session(aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)
    bucket_object = s3_session.resource('s3').Bucket(bucket_name).Object(object_key)
    content = bucket_object.get()['Body'].read()
    workbook = open_workbook_xls(file_contents=content)
    
    0 讨论(0)
  • 2020-12-17 04:37

    Python doesn't support excel files natively. You could use the pandas library pandas library read_excel functionality

    0 讨论(0)
  • 2020-12-17 04:51

    Spent quite some time on it and here's how I got it working,

    import boto3
    import io
    import pandas as pd
    import json
    
    aws_id = ''
    aws_secret = ''
    bucket_name = ''
    object_key = ''
    
    s3 = boto3.client('s3', aws_access_key_id=aws_id, aws_secret_access_key=aws_secret)
    obj = s3.get_object(Bucket=bucket_name, Key=object_key)
    data = obj['Body'].read()
    df = pd.read_excel(io.BytesIO(data), encoding='utf-8')
    
    0 讨论(0)
提交回复
热议问题