Question
I've just deployed my app on App Engine (www.mibar.es). When it has not been used for a long while, it takes more than 40 seconds to wake up and serve the first request. After that, each subsequent request takes about a second. How can I reduce that time? Why isn't it always ready? How can I make it always ready?
It doesn't show any error in the GCP console. Searching for this problem, I found the documentation on warmup requests: https://cloud.google.com/appengine/docs/standard/java11/configuring-warmup-requests#enabling_warmup_requests
So this should be added to the YAML file:
inbound_services:
- warmup
I wonder whether there is anything else I should do, or whether anyone has had the same problem. This is my current config:
runtime: java11
env: standard
instance_class: F4
handlers:
- url: /(.*)
  script: auto
  secure: always
- url: .*
  script: auto
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
  max_instances: 1
network: {}
UPDATE: new YAML config with min_instances: 1. The first request still takes more than 30 seconds:
runtime: java11
env: standard
instance_class: F4
handlers:
- url: /(.*)
  script: auto
  secure: always
- url: .*
  script: auto
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
  min_instances: 1
  max_instances: 1
network: {}
Thanks for your help and let me know if you need any other config file.
Answer 1:
With your current settings, this is expected to happen after a period of inactivity. If your app does not receive any requests, it scales down to 0 instances, so when a request arrives, App Engine has to create a new instance to serve it, and that startup latency is added to the request.
Warmup requests help speed up instance creation when the load on your application increases. But as the documentation states, sometimes a loading request is issued instead of a warmup request, especially when the application has 0 instances serving and has to create a new one:
In some situations loading requests are sent instead: for example, if the instance is the first one being started up, or if there is a steep ramp-up in traffic
To overcome that, you can keep a minimum of 1 instance running or even make use of min_idle_instances.
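As a rough sketch, the two suggestions above could be combined with the warmup setting from the question in a single app.yaml. The value 1 for min_idle_instances is an assumption for illustration, not something the documentation or the question prescribes, and the handlers section is omitted for brevity. The application also needs to respond to the warmup request path (/_ah/warmup) itself for warmup requests to be useful.

```yaml
runtime: java11
instance_class: F4

# Ask App Engine to send warmup requests (GET /_ah/warmup) to new instances.
inbound_services:
- warmup

automatic_scaling:
  # Keep at least one instance running so the app does not scale down to 0.
  min_instances: 1
  # Keep one instance idle and ready for traffic (illustrative value, an assumption).
  min_idle_instances: 1
  max_instances: 1
```

Keeping instances running all the time trades cost for latency, since those instances are billed even when no requests arrive.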
Source: https://stackoverflow.com/questions/62211259/long-response-time-for-first-request-40-seconds-on-my-appengine-standard-jav