There are multiple questions (examples: 1, 2, 3, 4, 5, 6, etc.) asking how to handle image data when serving predictions for TensorFlow/Keras models.
The answer by @rhaertel above is the best treatment of this subject I've seen. For anyone working on deploying TensorFlow image-based models on Google Cloud ML, I'd recommend also having a look at the following repo:
https://github.com/mhwilder/tf-keras-gcloud-deployment
I spent a while trying to get all of this working for several use cases and did my best to document the whole process in this repo. The repo covers the following topics: