I am interested in using Google Cloud Dataflow to process videos in parallel. My job uses both OpenCV and TensorFlow. Is it possible to just run the workers inside a docker ins
If you have a large number of videos, you will incur the large startup cost regardless. Such is the nature of grid computing in general.
The other side of this is that you could run the job on machines larger than n1-standard-1. That amortizes the cost of the download across fewer machines, each of which could potentially process more videos at once, provided the processing is coded to take advantage of the extra cores.
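To make the amortization concrete, here is a rough back-of-the-envelope sketch. It assumes every worker downloads the same dependency bundle once at startup and that the job needs a fixed number of parallel processing slots. The function name, the 2 GB bundle size, and the slot counts are all hypothetical and only there to illustrate the arithmetic.

```python
import math


def total_download_gb(total_slots: int, slots_per_worker: int, bundle_gb: float) -> float:
    """Total data downloaded at startup, summed over all workers.

    Each worker pays the download once, no matter how many videos
    it goes on to process in parallel.
    """
    workers = math.ceil(total_slots / slots_per_worker)
    return workers * bundle_gb


# Hypothetical job: 64 videos processed at once, 2 GB of dependencies per worker.
small_machines = total_download_gb(total_slots=64, slots_per_worker=1, bundle_gb=2.0)
large_machines = total_download_gb(total_slots=64, slots_per_worker=8, bundle_gb=2.0)

print(small_machines)  # 64 single-slot workers
print(large_machines)  # 8 eight-slot workers
```

Under these assumptions, the larger machines cut the number of downloads by the same factor as the slots per worker. The per-worker startup time itself does not shrink, and the saving only materializes if each worker really does keep all of its slots busy.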