I have a trained model that takes a fairly large input, which I typically represent as a NumPy array of shape (1, 473, 473, 3). Serializing that to JSON produces a payload of several megabytes, which is too large for an online prediction request.
What I typically do instead is have the JSON refer to a file in Google Cloud Storage. See, for example, the serving input function here:
https://github.com/GoogleCloudPlatform/training-data-analyst/blob/61ab2e175a629a968024a5d09e9f4666126f4894/courses/machine_learning/deepdive/08_image/flowersmodel/trainer/model.py#L119
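Below is a minimal sketch of what such a serving input function might look like, using the TF 1.x Estimator API to match the era of that example. The input key `imageurl`, the output key `image`, and the 473x473 resize are illustrative assumptions, not the exact code from the linked file:

```python
import tensorflow as tf

HEIGHT, WIDTH, CHANNELS = 473, 473, 3

def serving_input_fn():
    # The client sends a tiny JSON payload such as
    # {"instances": [{"imageurl": "gs://my-bucket/img.jpg"}]}
    inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}

    def decode(filename):
        # tf.read_file understands gs:// paths when running on Google Cloud
        image_bytes = tf.read_file(filename)
        image = tf.image.decode_jpeg(image_bytes, channels=CHANNELS)
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)
        return tf.image.resize_images(image, [HEIGHT, WIDTH])

    # Decode each referenced file server-side into a (batch, 473, 473, 3) tensor
    images = tf.map_fn(decode, inputs['imageurl'], dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver({'image': images}, inputs)
```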
Users first upload their file to GCS and then invoke prediction. This approach has other advantages as well, since the Cloud Storage utilities allow for parallel and multithreaded uploads.
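For completeness, here is a hedged sketch of the client side of that flow, assuming a Cloud ML Engine model and using the `google-cloud-storage` and `google-api-python-client` libraries; the bucket, project, model, and object names are placeholders:

```python
from google.cloud import storage
from googleapiclient import discovery

def predict_from_file(local_path, bucket_name='my-bucket',
                      project='my-project', model='my-model'):
    # Upload the image first; for many or very large files, `gsutil -m cp`
    # gives you parallel, multithreaded uploads instead.
    blob = storage.Client().bucket(bucket_name).blob('inputs/image.jpg')
    blob.upload_from_filename(local_path)

    # The prediction request stays a few hundred bytes regardless of
    # image size, since it only carries the GCS path.
    service = discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)
    body = {'instances': [
        {'imageurl': 'gs://{}/inputs/image.jpg'.format(bucket_name)}
    ]}
    return service.projects().predict(name=name, body=body).execute()
```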