How can I maximize throughput for an embarrassingly-parallel task in Python on Google Cloud Platform?
I am trying to use Apache Beam / Google Cloud Dataflow to speed up an existing Python application. The bottleneck of the application occurs after it randomly permutes an input matrix N times (N defaults to 125 but could be more): the system then runs a clustering algorithm on each permuted matrix, and the runs are fully independent of one another. I've captured the top of the pipeline below:

[Dataflow pipeline graph screenshot]

This processes the default 125 permutations. As you can see, only the RunClustering step takes an appreciable amount of time.
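For context, here is a minimal sketch of what a pipeline with this shape might look like, assuming the top-level flow is: create N seeds, permute the matrix, then cluster. The functions `permute_matrix` and `run_clustering` are hypothetical stand-ins for my real application code, not the actual implementation:

```python
import numpy as np
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

N_PERMUTATIONS = 125  # the default from the description above

def permute_matrix(seed, matrix):
    """Placeholder: return a row-permuted copy of the input matrix."""
    rng = np.random.default_rng(seed)
    return matrix[rng.permutation(matrix.shape[0])]

def run_clustering(matrix):
    """Stand-in for the expensive clustering step; each call is independent."""
    return float(matrix.mean())  # dummy result

matrix = np.random.rand(100, 10)  # stand-in for the real input matrix

with beam.Pipeline(options=PipelineOptions()) as p:
    results = (
        p
        | 'CreateSeeds' >> beam.Create(range(N_PERMUTATIONS))
        | 'PermuteMatrix' >> beam.Map(permute_matrix, matrix)
        | 'RunClustering' >> beam.Map(run_clustering)
    )
```

Because each element flows through `RunClustering` independently, Dataflow should in principle be able to fan the 125 (or more) clustering runs out across workers.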