Question
After creating the Amazon S3 bucket my_bucket, I created an Elastic MapReduce (EMR) cluster via the AWS CLI:
aws emr create-cluster --name "Hive testing" --ami-version 3.3 --applications Name=Hive --use-default-roles --instance-type m3.xlarge --instance-count 3 --steps Type=Hive,Name="Hive Program",Args=[-d,INPUT=s3://my_bucket/input,-d,OUTPUT=s3://my_bucket/input,-d,LIBS=s3://my_bucket/serde_libs]
Note that I did not specify a Hive *.q file. Once the S3 bucket and EMR cluster exist, I plan to log on to the EMR box and run Hive interactively.
Note: I'm assuming there's an EMR box that I can log on to.
However, when I ran aws emr describe-cluster --cluster-id XYZ, I saw this error in the output:
"State": "TERMINATED_WITH_ERRORS",
"StateChangeReason": {
"Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
"Code": "VALIDATION_ERROR"
}
What would cause this error? Do I need to open permissions on the S3 bucket for the EMR cluster to access it?
Answer 1:
The issue is not with the bucket. The IAM service role that EMR expects (EMR_DefaultRole) is missing.
See http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-iam-roles-creatingroles.html#emr-iam-roles-createdefaultwithcli
Issue the AWS CLI command:
aws emr create-default-roles
Then create the cluster again. This is a one-time step needed to create the default roles.
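The one-time step above could be wrapped in a small guard so that it only runs when the role is actually missing. This is a minimal sketch, not an official procedure: `ensure_emr_default_roles` is a hypothetical helper name, and it assumes the AWS CLI is installed and configured with credentials allowed to read and create IAM roles.

```shell
#!/bin/sh
# Sketch: create the EMR default roles only if the service role is missing.
# ensure_emr_default_roles is a hypothetical helper name, not an AWS CLI command.
ensure_emr_default_roles() {
  # `aws iam get-role` exits non-zero when the role does not exist
  # (or when the caller is not allowed to read it).
  if aws iam get-role --role-name EMR_DefaultRole >/dev/null 2>&1; then
    echo "EMR_DefaultRole already exists"
  else
    echo "EMR_DefaultRole not found, creating default roles"
    aws emr create-default-roles
  fi
}
```

Calling `ensure_emr_default_roles` once before re-running the `aws emr create-cluster` command should be enough. Note that a permissions failure on `get-role` also takes the "not found" branch in this sketch.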
Note: make sure you are using a recent version of the AWS CLI. I had problems with 1.4 (the Debian jessie package).
Note 2: taken from the mrjob code and Amazon announcements:
an instance profile and a service role are required for accounts created after April 6, 2015, and will eventually be required for all accounts.
Answer 2:
I've seen this issue crop up when you create a custom service role whose trust policy names the wrong service principal.
This trust policy will generate that error:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow",
"Sid": "Invalid"
}
]
}
This one will not, because it lets the EMR service assume the role:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "elasticmapreduce.amazonaws.com"
},
"Effect": "Allow",
"Sid": "Valid"
}
]
}
For more info see here: http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-mgmt.pdf#emr-plan-access-iam
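To check which service principal an existing role trusts, something like the following could be used. It is only a sketch: `role_trusts_emr` is a hypothetical helper name, the role name is passed in by the caller, and it assumes a configured AWS CLI with permission to call `iam get-role`.

```shell
#!/bin/sh
# Sketch: report whether a role's trust policy names the EMR service principal.
# role_trusts_emr is a hypothetical helper name, not an AWS CLI command.
# Returns 0 if elasticmapreduce.amazonaws.com is trusted, 1 if it is not,
# and 2 if the role could not be read at all.
role_trusts_emr() {
  # Pull only the Service principals out of the trust policy document.
  services=$(aws iam get-role --role-name "$1" \
    --query 'Role.AssumeRolePolicyDocument.Statement[].Principal.Service' \
    --output text) || return 2
  case "$services" in
    *elasticmapreduce.amazonaws.com*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `role_trusts_emr EMR_DefaultRole && echo "trust policy looks valid"` prints the message only when the EMR principal is present. The check is a plain substring match on the principal names and does not evaluate the statement's Effect or Action.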
Source: https://stackoverflow.com/questions/27953582/emr-service-role-is-invalid-when-creating-emr-cluster