How to save sklearn model on s3 using joblib.dump?

Question

I have a sklearn model and I want to save the pickle file to my S3 bucket using joblib.dump.

I used joblib.dump(model, 'model.pkl') to save the model locally, but I do not know how to save it to an S3 bucket. This is what I tried:

s3_resource = boto3.resource('s3')
s3_resource.Bucket('my-bucket').Object("model.pkl").put(Body=joblib.dump(model, 'model.pkl'))

I expect the pickled file to be in my S3 bucket.
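
For context, a small check of what joblib.dump returns helps explain why the attempt above does not upload the pickle. When given a file name, joblib.dump writes the file and returns a list of file names, not the pickled bytes, so Body receives a list rather than bytes or a file-like object. The sketch below uses a plain dict as a stand-in for the model and a hypothetical temporary path; it only illustrates the return value.

```python
import os
import tempfile

import joblib

# A plain dict stands in for the sklearn model (assumption for illustration).
stand_in_model = {"weights": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")

    # joblib.dump writes the file and returns the list of file names it created.
    result = joblib.dump(stand_in_model, path)

    # The pickled bytes have to be read back from the file separately.
    with open(path, "rb") as f:
        payload = f.read()

print(type(result), type(payload))
```

The bytes in payload (or an open file object) are what an S3 upload call needs as its body.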

Answer 1:

Here's a way that worked for me; it's pretty straightforward. I'm using joblib (it tends to be more efficient for large sklearn models, which often carry big NumPy arrays), but you could use pickle too.
Also, I'm using temporary files for transferring to/from S3, but you could store the file in a more permanent location if you prefer.

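import tempfile
import boto3
import joblib

# put_object, download_fileobj and delete_object are S3 client methods
s3_client = boto3.client("s3")
bucket_name = "my-bucket"
key = "model.pkl"

# WRITE: dump the model to a temporary file, then upload its bytes
with tempfile.TemporaryFile() as fp:
    joblib.dump(model, fp)
    fp.seek(0)
    s3_client.put_object(Body=fp.read(), Bucket=bucket_name, Key=key)

# READ: download into a temporary file, then load the model from it
with tempfile.TemporaryFile() as fp:
    s3_client.download_fileobj(Fileobj=fp, Bucket=bucket_name, Key=key)
    fp.seek(0)
    model = joblib.load(fp)

# DELETE: remove the object from the bucket
s3_client.delete_object(Bucket=bucket_name, Key=key)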
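
If you would rather not touch the disk at all, the same round trip can be sketched with an in-memory buffer instead of a temporary file. This is only a minimal sketch: the helper names (dump_to_bytes, load_from_bytes, save_model_to_s3, load_model_from_s3) are made up for illustration, and s3_client is assumed to be a boto3 S3 client created elsewhere, e.g. with boto3.client("s3").

```python
import io

import joblib


def dump_to_bytes(obj):
    """Serialize obj with joblib into an in-memory bytes object."""
    buffer = io.BytesIO()
    joblib.dump(obj, buffer)
    return buffer.getvalue()


def load_from_bytes(data):
    """Rebuild an object from bytes produced by dump_to_bytes."""
    return joblib.load(io.BytesIO(data))


def save_model_to_s3(s3_client, model, bucket, key):
    """Upload the serialized model (hypothetical helper, not a boto3 API)."""
    s3_client.put_object(Body=dump_to_bytes(model), Bucket=bucket, Key=key)


def load_model_from_s3(s3_client, bucket, key):
    """Download the object and deserialize it (hypothetical helper)."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return load_from_bytes(response["Body"].read())


# Usage, assuming AWS credentials are configured and `model` is already fitted:
#   import boto3
#   s3_client = boto3.client("s3")
#   save_model_to_s3(s3_client, model, "my-bucket", "model.pkl")
#   model = load_model_from_s3(s3_client, "my-bucket", "model.pkl")
```

This keeps the whole serialized model in memory, so for very large models the temporary-file approach may be the better fit.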

Source: https://stackoverflow.com/questions/56571731/how-to-save-sklearn-model-on-s3-using-joblib-dump

Tags: python, amazon-web-services, amazon-s3, scikit-learn, joblib
