AML - Web service TimeoutError

Posted by 限于喜欢 on 2020-08-08 05:24:12

Question


We created a webservice endpoint and tested it with the following code, and also with POSTMAN.

We deployed the service to an AKS in the same resource group and subscription as the AML resource.

UPDATE: the attached AKS had a custom networking configuration and rejected external connections.

import json
import sys

from azureml.core import Workspace
from azureml.core.authentication import AzureCliAuthentication
from azureml.core.webservice import Webservice

# Authenticate with the Azure CLI credentials and load the workspace
cli_auth = AzureCliAuthentication()
ws = Workspace.from_config(auth=cli_auth)

# Get the AKS details
try:
    with open("../aml_config/aks_webservice.json") as f:
        config = json.load(f)
except FileNotFoundError:
    print("No new model, thus no deployment on AKS")
    sys.exit(0)

service_name = config["aks_service_name"]
# Get the hosted web service
service = Webservice(workspace=ws, name=service_name)

# Input for the model with all features
input_j = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]]
print(input_j)
test_sample = json.dumps({"data": input_j})
test_sample = bytes(test_sample, encoding="utf8")
try:
    prediction = service.run(input_data=test_sample)
    print(prediction)
except Exception as e:
    # Print the underlying error, then re-raise with the original cause attached
    print(str(e))
    raise Exception("AKS service is not working as expected") from e

In AML Studio, the deployment state is "Healthy".

We get the following error when testing:

Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond

Log just after deploying the AKS Webservice here.

Log after running the test script here.

How can we know what is causing this problem and fix it?


Answer 1:


Did you try service.get_logs()? Please also try a local deployment first: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-local-container-notebook-vm
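A minimal sketch of that first check, assuming `service` is the `Webservice` object from the question's script. The helper name `collect_diagnostics` and its `tail` parameter are hypothetical; `state`, `scoring_uri`, and `get_logs()` are members of the azureml-core `Webservice` class.

```python
def collect_diagnostics(service, tail=20):
    """Gather the state, scoring URI, and last log lines of a deployed service.

    `service` is expected to behave like azureml.core.webservice.Webservice.
    The helper name and the `tail` parameter are illustrative only.
    """
    logs = service.get_logs() or ""
    return {
        "state": service.state,
        "scoring_uri": service.scoring_uri,
        "log_tail": logs.splitlines()[-tail:],
    }
```

Calling `collect_diagnostics(service)` and printing the result shows the reported state next to the most recent container log lines. If the state is "Healthy" and the logs show no trace of the test request, that may suggest the request never reached the container, which points toward networking rather than the scoring script.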




Answer 2:


I'm not sure what the difference is between Webservice and AksWebservice, but give the AKS variant a try (link). I would also try to isolate whether this is an AKS issue by deploying through ACI and validating your dependencies and scoring script.




Answer 3:


We checked the AKS networking configuration and found that the cluster uses an Azure CNI profile.

To test the web service, we had to call it from inside the virtual network created for the cluster. Once we did, it worked well!
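A quick way to see whether a given machine can reach the endpoint at all is a plain TCP connection test against the scoring URI's host and port. The sketch below uses only the standard library; the function name `can_reach` is hypothetical, and the URI would come from `service.scoring_uri` in the question's script.

```python
import socket
from urllib.parse import urlparse


def can_reach(scoring_uri, timeout=5.0):
    """Return True if a TCP connection to the URI's host and port succeeds."""
    parsed = urlparse(scoring_uri)
    # Fall back to the scheme's default port when the URI does not name one
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (such as WinError 10060), refusals, and DNS failures
        return False
```

Running `can_reach(service.scoring_uri)` from a workstation outside the virtual network and again from a machine inside it should show the difference described above: a `False` result outside and `True` inside would be consistent with the cluster rejecting external connections.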



Source: https://stackoverflow.com/questions/62457880/aml-web-service-timeouterror
