Question

We are using a private GCP account and would like to process 30 GB of data with NLP processing in SpaCy on Dataflow. We wanted to use more workers, so we decided to start with a maximum of 80 workers. We submitted our job and ran into one of the standard GCP user quotas:

QUOTA_EXCEEDED: Quota 'IN_USE_ADDRESSES' exceeded.  Limit: 8.0 in region XXX

So I requested a new quota of 50 for IN_USE_ADDRESSES in one region (it took me a few iterations to find a region that would accept the request). We submitted a new job and hit further quota issues:

QUOTA_EXCEEDED: Quota 'CPUS' exceeded.  Limit: 24.0 in region XXX

QUOTA_EXCEEDED: Quota 'CPUS_ALL_REGIONS' exceeded.  Limit: 32.0 globally

My question is: if I want to use, for example, 50 workers in one region, which quotas do I need to change? The doc https://cloud.google.com/dataflow/quotas doesn't seem to be up to date, since it only says "To use 10 Compute Engine instances, you'll need 10 in-use IP addresses." As you can see above, this is not enough and other quotas need to be changed as well. Is there a doc, blog, or other post where this is documented and explained? For a single region alone there are 49 Compute Engine quotas that can be changed!

Answer 1:

I would suggest that you start using private IPs instead of public IP addresses. This helps you in two ways:

You can bypass some of the IP-address quotas, since they apply to public IP addresses.
You can reduce costs significantly by eliminating network egress costs, since the VMs no longer communicate with each other over the public internet. You can find more details in this excellent article [1].

To start using private IPs, follow the instructions in [2].

Apart from this, you need to take care of the following quotas:
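As a rough illustration of the instructions in [2], the pipeline arguments for a Beam Python job with internal-only workers might be assembled as below. This is a minimal sketch: the helper `dataflow_private_ip_args` and every project, bucket, region, and subnetwork name are hypothetical placeholders, while the flags themselves are the Beam Python SDK's Dataflow options.

```python
def dataflow_private_ip_args(project, region, subnetwork, temp_location, max_workers):
    """Build Beam Python pipeline arguments for Dataflow workers without external IPs.

    Hypothetical helper for illustration; pass the result to PipelineOptions.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--max_num_workers={max_workers}",
        # Workers get internal addresses only, so they do not consume
        # external in-use IP address quota.
        "--no_use_public_ips",
        # The subnetwork must have Private Google Access enabled so that
        # workers can still reach Google APIs.
        f"--subnetwork={subnetwork}",
    ]


# All values below are placeholders, not taken from the question.
args = dataflow_private_ip_args(
    project="my-project",
    region="europe-west1",
    subnetwork="regions/europe-west1/subnetworks/my-subnet",
    temp_location="gs://my-bucket/tmp",
    max_workers=50,
)
print(" ".join(args))
```

Note that workers without external IPs cannot reach the public internet unless something like Cloud NAT is configured, so dependencies such as SpaCy models may need to be staged with the job or baked into a custom container.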

CPUs

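You can increase the quota for a given region by setting the CPUs quota under Compute Engine appropriately. As the errors in the question show, both the regional CPUS quota and the global CPUS_ALL_REGIONS quota have to cover the total number of worker vCPUs.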

Persistent Disk

By default each worker VM needs 250 GB of storage, so 100 instances need around 25 TB. Check the disk size of the workers you are using and set the Persistent Disk quota under Compute Engine accordingly.

The default disk size is 25 GB for Cloud Dataflow Shuffle batch pipelines.
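The disk arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the two per-worker sizes are the defaults quoted in this answer, so substitute your own value if you set a custom worker disk size.

```python
# Per-worker disk sizes quoted in the answer above.
DEFAULT_BATCH_DISK_GB = 250   # batch worker default
SHUFFLE_BATCH_DISK_GB = 25    # batch pipelines using Cloud Dataflow Shuffle


def persistent_disk_quota_gb(num_workers, disk_gb_per_worker):
    """Total Persistent Disk (GB) needed when every worker attaches one disk."""
    return num_workers * disk_gb_per_worker


for workers in (50, 100):
    default_gb = persistent_disk_quota_gb(workers, DEFAULT_BATCH_DISK_GB)
    shuffle_gb = persistent_disk_quota_gb(workers, SHUFFLE_BATCH_DISK_GB)
    print(f"{workers} workers: {default_gb} GB default, {shuffle_gb} GB with Shuffle")
```

For 100 workers this gives 25000 GB (about 25 TB) with the default disk size and 2500 GB with Dataflow Shuffle, which is why enabling Shuffle can reduce the disk quota needed considerably.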

Managed Instance Groups

Make sure you have enough quota in the region, since Dataflow needs the following for each job:

One Instance Group per Cloud Dataflow job
One Managed Instance Group per Cloud Dataflow job
One Instance Template per Cloud Dataflow job

Once you review these quotas you should be all set for running the job.
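To review the quotas before submitting, one option is to compare what the job needs against the current limits. The sketch below assumes the quota entries have the `metric` / `limit` / `usage` shape that `gcloud compute regions describe REGION --format=json` reports under `quotas`. The CPUS and IN_USE_ADDRESSES limits are the ones from the error messages in the question; the usage values, the disk limit, and the assumption of 1 vCPU and one public IP per worker are illustrative only.

```python
import json


def quota_shortfalls(quotas, needed):
    """Return {metric: missing amount} where remaining quota is below the need."""
    by_metric = {q["metric"]: q for q in quotas}
    shortfalls = {}
    for metric, amount in needed.items():
        quota = by_metric.get(metric)
        if quota is None:
            continue  # metric not reported in this scope (e.g. a global quota)
        available = quota["limit"] - quota["usage"]
        if available < amount:
            shortfalls[metric] = amount - available
    return shortfalls


# Sample regional quotas; usage values and the disk limit are hypothetical.
sample_quotas = json.loads("""
[
  {"metric": "CPUS", "limit": 24.0, "usage": 0.0},
  {"metric": "IN_USE_ADDRESSES", "limit": 8.0, "usage": 0.0},
  {"metric": "DISKS_TOTAL_GB", "limit": 4096.0, "usage": 0.0}
]
""")

workers = 50
needed = {
    "CPUS": workers * 1,              # assumes 1 vCPU per worker
    "IN_USE_ADDRESSES": workers,      # assumes public IPs are still in use
    "DISKS_TOTAL_GB": workers * 250,  # default batch disk size from the answer
}

shortfalls = quota_shortfalls(sample_quotas, needed)
for metric, missing in sorted(shortfalls.items()):
    print(f"{metric}: request at least {missing:.0f} more")
```

CPUS_ALL_REGIONS is a project-level quota, so it would not appear in the regional list and has to be checked separately.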

[1] https://medium.com/@harshithdwivedi/how-disabling-external-ips-helped-us-cut-down-over-80-of-our-cloud-dataflow-costs-259d25aebe74

[2] https://cloud.google.com/dataflow/docs/guides/specifying-networks

Source: https://stackoverflow.com/questions/58893082/which-compute-engine-quotas-need-to-be-updated-to-run-dataflow-with-50-workers

Tags

google-compute-engine

google-cloud-dataflow

quota

Which Compute Engine quotas need to be updated to run Dataflow with 50 workers (IN_USE_ADDRESSES, CPUS, CPUS_ALL_REGIONS ..)?
