SageMaker client create_endpoint() error 'does not have BatchGetImage permission for image: '763104351884…/tensorflow-inference:1.15.2-gpu'

Submitted by 喜欢而已 on 2021-02-11 15:12:28

Question


I have a pre-trained TensorFlow model, and I'm trying to use the SageMaker client.create_endpoint() call to create an endpoint so that I can call the API to get predictions. The doc is here

After creating the model with client.create_model(), I have a model stored on SageMaker. The base image I'm using is 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.2-gpu, and this is my code:

import logging
from datetime import datetime

import boto3

model_name = 'xxx'
role = 'xxx'
model_base_image = '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.2-gpu'

def create_endpoint(s3_path):

    datetime_obj = datetime.now()
    timestamp_str = datetime_obj.strftime("%Y-%m-%d-%H-%M-%S")

    # create_sagemaker_model() is defined elsewhere (not shown here)
    create_sagemaker_model()

    endpoint_config_name = model_name + '-' + timestamp_str
    endpoint_name = model_name + '-' + timestamp_str

    client = boto3.client('sagemaker')
    response_endpoint_config = client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[
            {
                'VariantName': 'VariantName',
                'ModelName': model_name,
                'InitialInstanceCount': 1,
                'InstanceType': 'ml.m4.2xlarge'
            },
        ],
        DataCaptureConfig={
            'InitialSamplingPercentage': 100,
            'DestinationS3Uri': s3_path + '/input',
            'CaptureOptions': [
                {
                    'CaptureMode': 'Input'
                },
            ],
            'CaptureContentTypeHeader': {
                'JsonContentTypes': [
                    'application/jsonlines',
                ]
            }
        },
        Tags=[
            {
                'Key': 'string',
                'Value': 'string'
            },
        ]
    )
    logging.info(response_endpoint_config)

    response_endpoint = client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=endpoint_config_name,
        Tags=[
            {
                'Key': 'string',
                'Value': 'string'
            },
        ]
    )
    logging.info(response_endpoint)

if __name__ == "__main__":
    create_endpoint("s3://xxx")

After running this, the endpoint configuration is created, but creating the endpoint fails with this reason:

Failure reason
The role 'arn:aws:iam::9111xxxxxxxx:role/test-role' does not have BatchGetImage permission for the image: '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.2-gpu'.

In the policy of this role, I have:

        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "ecr:BatchGetImage"
            ],
            "Resource": [
                "arn:aws:ecr:us-east-1:763104351884:repository/*sagemaker*"
            ]
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "ecr:BatchDeleteImage",
                "ecr:UploadLayerPart",
                "ecr:DeleteRepository",
                "ecr:PutImage",
                "ecr:SetRepositoryPolicy",
                "ecr:BatchGetImage",
                "ecr:CompleteLayerUpload",
                "ecr:DeleteRepositoryPolicy",
                "ecr:InitiateLayerUpload"
            ],
            "Resource": [
                "arn:aws:ecr:*:*:repository/*sagemaker*"
            ]
        }
....

I don't know what I'm missing here. I'm new to SageMaker; can someone take a look and give me some guidance, please? Many thanks!
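
One thing that can be checked locally is whether the Resource patterns in the policy above actually cover the repository named in the failing image URI. The following is a minimal sketch, not an IAM policy evaluator: the helper names ecr_repository_arn and resource_covers are hypothetical, and fnmatch is used only as an approximation of IAM's * and ? wildcards. It ignores conditions, explicit denies, and any other actions the role may need.

```python
import fnmatch
import re


def ecr_repository_arn(image_uri):
    """Derive the ECR repository ARN from an image URI of the form
    <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>."""
    match = re.match(
        r"^(\d{12})\.dkr\.ecr\.([a-z0-9-]+)\.amazonaws\.com/([^:@]+)",
        image_uri,
    )
    if match is None:
        raise ValueError("not an ECR image URI: " + image_uri)
    account, region, repository = match.groups()
    return "arn:aws:ecr:{}:{}:repository/{}".format(region, account, repository)


def resource_covers(pattern, arn):
    # Approximation only: shell-style matching stands in for IAM wildcards.
    return fnmatch.fnmatchcase(arn, pattern)


# Image URI and Resource patterns copied from the question.
image = "763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:1.15.2-gpu"
patterns = [
    "arn:aws:ecr:us-east-1:763104351884:repository/*sagemaker*",
    "arn:aws:ecr:*:*:repository/*sagemaker*",
]

arn = ecr_repository_arn(image)
results = {pattern: resource_covers(pattern, arn) for pattern in patterns}

print(arn)
for pattern, covered in results.items():
    print(covered, pattern)
```

If a pattern reports False here, the repository name tensorflow-inference does not contain the substring that pattern requires, which would be consistent with the BatchGetImage error for that image.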

Source: https://stackoverflow.com/questions/65004515/sagemaker-client-create-endpoint-error-does-not-have-batchgetimage-permission
