I have an EMR Spark job that needs to read data from S3 in one account and write to S3 in another.
I split my job into two steps.
read data from the S3 (no
Access to AWS resources is normally controlled through IAM roles, and to reach resources in a different account you assume a role in that account. If you or your organisation follow this practice, see https://aws.amazon.com/blogs/big-data/securely-analyze-data-from-another-aws-account-with-emrfs/. The basic idea is to give EMRFS a custom credentials provider, which it uses to obtain the credentials for accessing objects in the S3 buckets. You can go one step further and parameterize the STS role ARN and the bucket names in the JAR created in that blog post.
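As a rough sketch of the parameterization idea, the snippet below builds a role ARN from an account ID and role name, and generates an `emrfs-site` configuration that registers a custom credentials provider through the `fs.s3.customAWSCredentialsProvider` property. The account ID, the role name, and the class name `com.example.CrossAccountCredentialsProvider` are hypothetical placeholders, not values from the blog post. How the provider in your JAR actually receives the ARN depends on how you build it.

```python
import json
import re


def build_role_arn(account_id: str, role_name: str) -> str:
    """Return the IAM role ARN for a role in the given account."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def emrfs_site_config(provider_class: str) -> list:
    """Build an EMR configuration list that registers a custom
    EMRFS credentials provider class."""
    return [
        {
            "Classification": "emrfs-site",
            "Properties": {
                "fs.s3.customAWSCredentialsProvider": provider_class,
            },
        }
    ]


if __name__ == "__main__":
    # Hypothetical values; replace with your own account, role and class.
    role_arn = build_role_arn("123456789012", "cross-account-s3-access")
    config = emrfs_site_config("com.example.CrossAccountCredentialsProvider")
    print(role_arn)
    print(json.dumps(config, indent=2))
```

The printed JSON can be supplied as the cluster configuration when you create the EMR cluster, with the JAR containing the provider class made available on the cluster as the blog post describes.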