I'm following the Serving Inception Model with TensorFlow Serving and Kubernetes workflow and everything works well up to the point of the final serving of the Inception model.
I figured it out with the help of several TensorFlow experts. Things started to work after I introduced the following changes:
First, I changed the inception_k8s.yaml file in the following way:
Source:
args:
- /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
--port=9000 --model_name=inception --model_base_path=/serving/inception-export
Modification (the leading / is removed from the model server binary path and from --model_base_path):
args:
- serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
--port=9000 --model_name=inception --model_base_path=serving/inception-export
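For context, here is a rough sketch of where the modified args block sits inside the container spec of inception_k8s.yaml. The container name, image, and the /bin/sh -c wrapper are assumptions based on typical versions of the tutorial's file; only the args lines come from the change above:

spec:
  containers:
  - name: inception-container            # assumed container name
    image: your-inception-serving-image  # placeholder: the serving image you built and pushed
    command:                             # assumed: the args string is executed through a shell
    - /bin/sh
    - -c
    args:
    # relative paths (no leading /), matching the modification above
    - serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server
      --port=9000 --model_name=inception --model_base_path=serving/inception-export
    ports:
    - containerPort: 9000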
Second, I exposed the deployment:
kubectl expose deployments inception-deployment --type="LoadBalancer"
and I used the external IP generated by exposing the deployment, not the inception-service IP.
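Once the deployment is exposed, the external IP can be read from the service that kubectl expose creates (it takes the deployment's name, inception-deployment here). Something like the following works; the EXTERNAL-IP column shows <pending> until the cloud load balancer is provisioned:

# Wait for the LoadBalancer to receive an external IP (EXTERNAL-IP column)
kubectl get services inception-deployment --watch

# Full details of the service, including the LoadBalancer ingress IP
kubectl describe service inception-deployment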
From this point on, I am able to run inference from an external host where the client is installed, using the command from the Serving Inception Model with TensorFlow Serving and Kubernetes tutorial.
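For reference, the client invocation then looks roughly like this (the flag names follow the tutorial's inception_client example; the IP and image path are placeholders):

# Run from the host where the client is built; replace <EXTERNAL_IP> with the
# external IP of the exposed deployment and point --image at any local JPEG.
bazel-bin/tensorflow_serving/example/inception_client \
  --server=<EXTERNAL_IP>:9000 \
  --image=/path/to/local_image.jpg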