Question
What is the best operator to copy a file from one S3 location to another in Airflow? I already tried S3FileTransformOperator, but it requires either a transform_script or a select_expression. My requirement is to copy the exact file from source to destination.
Answer 1:
You have 2 options (even when I disregard Airflow):
- Use the AWS CLI: cp command
  aws s3 cp <source> <destination>
  - In Airflow this command can be run using BashOperator (local machine) or SSHOperator (remote machine)
- Use the AWS SDK, aka boto3
  - Here you'll be using boto3's S3Client
  - Airflow already provides a wrapper over it in the form of S3Hook
  - Even the copy_object(..) method of S3Client is available in S3Hook as (again) copy_object(..)
  - You can use S3Hook inside any suitable custom operator or just PythonOperator
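The two options above can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: the helper names build_s3_cp_command and copy_s3_object, the default connection id "aws_default", and the optional hook parameter are all hypothetical, and the S3Hook import path assumes a recent apache-airflow-providers-amazon release (older Airflow versions expose the hook from a different module).

```python
import shlex


def build_s3_cp_command(source_uri, dest_uri):
    """Return an `aws s3 cp` command line, e.g. for BashOperator's bash_command."""
    # shlex.quote protects keys that contain spaces or shell metacharacters.
    return "aws s3 cp {} {}".format(shlex.quote(source_uri), shlex.quote(dest_uri))


def copy_s3_object(source_bucket, source_key, dest_bucket, dest_key,
                   aws_conn_id="aws_default", hook=None):
    """Copy one object unchanged using S3Hook.copy_object."""
    if hook is None:
        # Imported lazily so this module loads even where Airflow is absent.
        # Assumed import path; adjust it to the installed Airflow version.
        from airflow.providers.amazon.aws.hooks.s3 import S3Hook
        hook = S3Hook(aws_conn_id=aws_conn_id)
    hook.copy_object(
        source_bucket_key=source_key,
        dest_bucket_key=dest_key,
        source_bucket_name=source_bucket,
        dest_bucket_name=dest_bucket,
    )
```

In a DAG, the string returned by build_s3_cp_command could be passed as bash_command to a BashOperator, and copy_s3_object could be passed as python_callable to a PythonOperator with the bucket and key values supplied through op_kwargs. The hook parameter exists only so the function can be exercised without a live AWS connection.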
Answer 2:
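Use S3CopyObjectOperator:
# Import path assumes a recent apache-airflow-providers-amazon release;
# older Airflow versions expose the operator from a different module.
from airflow.providers.amazon.aws.operators.s3 import S3CopyObjectOperator

copy_step = S3CopyObjectOperator(
    task_id='copy_s3_object',  # every task needs a task_id; this name is illustrative
    source_bucket_key='source_file',
    dest_bucket_key='dest_file',
    aws_conn_id='aws_connection_id',
    source_bucket_name='source-bucket',
    dest_bucket_name='dest-bucket'
)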
Source: https://stackoverflow.com/questions/55135735/apache-airflow-operator-to-copy-s3-to-s3