Question
We are using a private GCP account and would like to process 30 GB of data with NLP processing using SpaCy. We wanted to use more workers, so we decided to start with a maximum of 80 workers. We submitted our job and ran into some of the standard GCP user quotas:
QUOTA_EXCEEDED: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region XXX
So I decided to request a new quota of 50 for IN_USE_ADDRESSES in some region (it took me a few iterations to find a region that would accept this request). We submitted a new job and got new quota issues:
QUOTA_EXCEEDED: Quota 'CPUS' exceeded. Limit: 24.0 in region XXX
QUOTA_EXCEEDED: Quota 'CPUS_ALL_REGIONS' exceeded. Limit: 32.0 globally
My question is: if I want to use 50 workers, for example, in one region, which quotas do I need to change? The doc https://cloud.google.com/dataflow/quotas doesn't seem to be up to date, since it only says "To use 10 Compute Engine instances, you'll need 10 in-use IP addresses." As you can see above, this is not enough, and other quotas need to be changed as well. Is there some doc, blog, or other post where this is documented and explained? For a single region alone there are 49 Compute Engine quotas that can be changed!
Answer 1:
I would suggest that you start using private IPs instead of public IP addresses. This would help you in two ways:
- You can bypass some of the IP-address-related quotas, as they apply to public IP addresses.
- You can reduce costs significantly by eliminating network egress costs, as the VMs would not be communicating with each other over the public internet. You can find more details in this excellent article [1].
To start using private IPs, follow the instructions here [2].
Apart from this, you need to take care of the following quotas:
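As a rough sketch of what the private IP setup can look like from the Python SDK side, the helper below assembles the worker-related flags as plain strings. The function name `build_worker_args`, the region, and the subnetwork path are hypothetical placeholders; the flag names are the ones I know from the Beam Python SDK, so please check them against the version you run. The subnetwork will most likely also need Private Google Access enabled so that workers without public IPs can still reach Google APIs.

```python
def build_worker_args(region, subnetwork, max_num_workers, disk_size_gb=None):
    """Return Dataflow worker flags as a list of command-line strings.

    Hypothetical helper for illustration; it only builds strings and
    does not contact GCP.
    """
    args = [
        "--runner=DataflowRunner",
        f"--region={region}",
        f"--max_num_workers={max_num_workers}",
        f"--subnetwork={subnetwork}",
        # Workers get private IPs only, so they do not count against
        # the IN_USE_ADDRESSES quota for external addresses.
        "--no_use_public_ips",
    ]
    if disk_size_gb is not None:
        args.append(f"--disk_size_gb={disk_size_gb}")
    return args


args = build_worker_args(
    region="us-central1",  # placeholder region
    subnetwork="regions/us-central1/subnetworks/my-subnet",  # placeholder subnet
    max_num_workers=50,
)
print(args)
```

A list like this can be handed to `PipelineOptions(args)` from `apache_beam.options.pipeline_options`, or the same flags can be passed on the command line when launching the pipeline.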
CPUs
You can increase the quota for a given region by setting the CPUs quota under Compute Engine appropriately.
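To get a feel for how far the CPU quotas have to be raised, here is a minimal sketch that compares the vCPUs a job would need against the limits from the error messages in the question (24 for CPUS in the region, 32 for CPUS_ALL_REGIONS). The function name is hypothetical, and 1 vCPU per worker is an assumption; substitute the vCPU count of the machine type you actually use. It also ignores any other VMs in the project that draw on the same quota.

```python
def cpu_quota_shortfall(num_workers, vcpus_per_worker, regional_limit, global_limit):
    """Return how many extra vCPUs each exceeded quota would need.

    Hypothetical helper: quotas that are already large enough are omitted.
    """
    needed = num_workers * vcpus_per_worker
    limits = {"CPUS": regional_limit, "CPUS_ALL_REGIONS": global_limit}
    return {
        name: needed - limit
        for name, limit in limits.items()
        if needed > limit
    }


# 50 workers, assuming 1 vCPU each, against the limits quoted in the question.
shortfall = cpu_quota_shortfall(50, 1, regional_limit=24, global_limit=32)
print(shortfall)  # {'CPUS': 26, 'CPUS_ALL_REGIONS': 18}
```

Both quotas have to cover the job, which is why the question saw two separate QUOTA_EXCEEDED errors for CPUs.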
Persistent Disk
By default each VM needs 250 GB of storage, so 100 instances would need around 25 TB. Check the disk size of the workers you are using and set the Persistent Disk quota under Compute Engine appropriately.
The default disk size is 25 GB for Cloud Dataflow Shuffle batch pipelines.
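The disk arithmetic above can be written out as a small sketch. The function name is hypothetical; the 250 GB and 25 GB figures are the per-worker defaults mentioned in this answer, so use your own value if you override the worker disk size.

```python
def persistent_disk_quota_gb(num_workers, disk_gb_per_worker=250):
    """Total persistent disk, in GB, that a job's workers would claim."""
    return num_workers * disk_gb_per_worker


# 100 workers at the 250 GB default: 25,000 GB, i.e. about 25 TB.
default_gb = persistent_disk_quota_gb(100)

# 50 workers at 25 GB each, the Dataflow Shuffle batch default.
shuffle_gb = persistent_disk_quota_gb(50, disk_gb_per_worker=25)

print(default_gb, shuffle_gb)  # 25000 1250
```

For the 50-worker case in the question, the difference between the two defaults is 12,500 GB versus 1,250 GB, so it is worth checking which one applies before requesting a quota increase.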
Managed Instance Groups
You need to make sure that you have enough quota in the region, as Dataflow needs the following:
- One Instance Group per Cloud Dataflow job
- One Managed Instance Group per Cloud Dataflow job
- One Instance Template per Cloud Dataflow job
Once you have reviewed these quotas, you should be all set to run the job.
[1] https://medium.com/@harshithdwivedi/how-disabling-external-ips-helped-us-cut-down-over-80-of-our-cloud-dataflow-costs-259d25aebe74
[2] https://cloud.google.com/dataflow/docs/guides/specifying-networks
Source: https://stackoverflow.com/questions/58893082/which-compute-engine-quotas-need-to-be-updated-to-run-dataflow-with-50-workers