How to restart Google App Engine Standard Service

Posted by 给你一囗甜甜゛ on 2021-02-17 03:53:51

Question


Context: I have an app that serves interactive graphs and data analysis. To calculate plots and data summaries, it uses a dataset that is loaded at app initialization by querying Google BigQuery. The data is then kept in memory as a global variable and is used in all calculations and plots run by different users (each user saves their own filters/mask in their session).

This dataset changes in BigQuery once per day during the night (I know the exact datetime of refresh). Once the data is refreshed in BigQuery, I want the global variable of the dataset to be refreshed.

I know that the proper solution would be to query a database on each user request, but BigQuery's high request latency makes that impractical, and I can't use another DB.

The only solution I've come across so far is to restart the Google App Engine service (all instances) after the BigQuery data refresh. Please note that this should be a scheduled action, done programmatically.

My questions:

  • In case restarting the service is the best possible solution, how should I be restarting the service?
  • In case there is another way to accomplish what I want, please let me know.

Answer 1:


It's likely good practice to cache your dataset as you're doing; if you know the data hasn't changed, then there's no need to requery BigQuery for it.

However, your dataset does change, just once per day.

So, I think your approach should be to revise your app so that it refreshes the cached copy of your BigQuery dataset every day and blocks your users from querying the dataset while it changes.

You only need to refresh the dataset when a user requests it (there's no need to refresh it on days when no users need it). So, depending on how long the refresh takes and on your users' latency expectations, you could trigger the refresh from a user request: has the dataset changed? If so, block this request, refresh the data and then respond to the user.
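The request-triggered refresh described above could look roughly like the following. This is a minimal sketch, not the app's actual code: `RefreshingCache`, `load_dataset` and `get_last_modified` are hypothetical names, where `load_dataset` stands for the slow BigQuery query and `get_last_modified` for some cheap check of when the dataset last changed. A lock makes concurrent requests wait while one of them reloads.

```python
import threading


class RefreshingCache:
    """Keeps a dataset in memory and reloads it when the source reports a newer version."""

    def __init__(self, load_dataset, get_last_modified):
        # Both callables are assumptions: one performs the slow BigQuery load,
        # the other returns a comparable "last modified" marker (e.g. a timestamp).
        self._load_dataset = load_dataset
        self._get_last_modified = get_last_modified
        self._lock = threading.Lock()
        self._data = None
        self._loaded_version = None

    def get(self):
        """Return the cached dataset, reloading it first if the source is newer."""
        current_version = self._get_last_modified()
        with self._lock:  # other requests block here while a reload is in progress
            if self._loaded_version is None or current_version > self._loaded_version:
                self._data = self._load_dataset()
                self._loaded_version = current_version
            return self._data
```

Each request handler would call `cache.get()` instead of reading the global variable directly, so the check for a changed dataset has to be much cheaper than the BigQuery query itself.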

I assume you've already solved the problem that your users' data plots and calculations will differ for different datasets.




Answer 2:


One possible approach would be to trigger the running instances to exit by themselves once the BQ dataset is updated and let GAE start new/replacement instances, which will load the updated dataset.

The trigger can be based on memcache, datastore or cloud storage/GCS (all faster than BQ - less penalty for checking them in every request). You want to be certain that the trigger doesn't also affect the freshly started instances:

  • make the trigger be, for example, the timestamp of the most recent BQ dataset update
  • add a global variable with the timestamp of the dataset loading in memory
  • the trigger would fire when memcache/datastore timestamp is ~24h (or just "a lot") newer than the one in memory
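The trigger condition in the list above can be sketched as a single comparison. This is an illustration under assumptions: `should_exit` and `CLOCK_SKEW_MARGIN` are hypothetical names, and the margin value is arbitrary. Reading the trigger timestamp from memcache, datastore or GCS is left out.

```python
from datetime import datetime, timedelta, timezone

# Assumed safety margin; a freshly started instance loads its data after the
# update, so its load timestamp is newer than the trigger and it never fires.
CLOCK_SKEW_MARGIN = timedelta(minutes=5)


def should_exit(trigger_timestamp, loaded_timestamp, margin=CLOCK_SKEW_MARGIN):
    """Return True when the dataset update is clearly newer than the in-memory copy."""
    return trigger_timestamp - loaded_timestamp > margin
```

An instance would call this in each request with the stored update timestamp and its own global load timestamp, and take the exit action only when it returns `True`.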

For the action causing the exit I'd try:

  • a regular sys.exit(0) call (not quite sure if/how this works on GAE)
  • raising an exception (not so nice, it'll leave nasty traces in the logs). If you use it try to make it as clear as possible to minimize the chances of being accidentally interpreted as a real failure. Maybe something like:

    assert False, "Intentional crash to force an instance restart"
    

Another possible approach would be to force an instance restart from outside - by re-deploying the application using the same version string. The outage associated with the instances' restarts caused by re-deploying the same version is actually why I dislike using the service version based environment implementations, see Continuous integration/deployment/delivery on Google App Engine, too risky?

But for this to work you need some other environment(s) to trigger and execute the deployment. It could be some other GAE service or even a Cloud Function (in which case using a Storage event trigger would eliminate the need for explicitly polling for the dataset updated condition).




Answer 3:


I finally found a way to restart all instances programmatically, by using the Python API discovery client and a service account. It first gets the list of active instances and deletes all of them. Then it performs a simple request to start a new one.

    import requests
    from googleapiclient.discovery import build
    from google.oauth2 import service_account

    # Authenticate with a service account key file and request the admin scopes.
    credentials = service_account.Credentials.from_service_account_file('credentials.json')
    scoped_credentials = credentials.with_scopes([
        'https://www.googleapis.com/auth/appengine.admin',
        'https://www.googleapis.com/auth/cloud-platform',
    ])
    appengine = build(serviceName="appengine", version="v1", credentials=scoped_credentials)

    VERSION_ID = "version_id"
    PROJECT_ID = "project_id"
    SERVICE_ID = "appengine_service_name"
    APP_URL = "http://some_url.com"

    instances_api = appengine.apps().services().versions().instances()

    # List the active instances; the "instances" key may be absent when none are running.
    active_instances_dict = instances_api.list(
        appsId=PROJECT_ID, servicesId=SERVICE_ID, versionsId=VERSION_ID).execute()
    list_of_instances = active_instances_dict.get("instances", [])

    # Delete every listed instance.
    for instance in list_of_instances:
        instances_api.delete(appsId=PROJECT_ID, servicesId=SERVICE_ID,
                             versionsId=VERSION_ID, instancesId=instance["id"]).execute()

    # Send one request so that App Engine starts a fresh instance.
    requests.get(url=APP_URL)
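The listing call in the script reads a single page of results. If the version runs many instances, the remaining pages could be collected with the `list` / `list_next` pattern that the Python API client provides for paginated collections. This is a hedged sketch: `list_all_instances` is a hypothetical helper, not part of the original answer, and it takes the `instances()` resource as an argument so it can be exercised without credentials.

```python
def list_all_instances(instances_resource, project_id, service_id, version_id):
    """Collect the instances from every result page, not just the first one."""
    found = []
    request = instances_resource.list(
        appsId=project_id, servicesId=service_id, versionsId=version_id)
    while request is not None:
        response = request.execute()
        # The "instances" key may be missing on an empty page.
        found.extend(response.get("instances", []))
        # list_next returns None once there are no more pages.
        request = instances_resource.list_next(request, response)
    return found
```

With the names from the script, it would be called as `list_all_instances(appengine.apps().services().versions().instances(), PROJECT_ID, SERVICE_ID, VERSION_ID)`.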


Source: https://stackoverflow.com/questions/55203095/how-to-restart-google-app-engine-standard-service
