Solved: Re: Graceful shutdown of GAE service with manual scaling

rubenvann · 07-26-2024 04:26 AM

I have a Go application that I deploy as an App Engine standard version with manual scaling. When I deploy a new version, the application receives a GET request to the `/_ah/stop` endpoint. The documentation mentions requests to this endpoint only briefly:

> When instances are stopped, an /_ah/stop request appears in the logs. If there is an /_ah/stop handler or a registered shutdown hook, it has 30 seconds to complete before shutdown occurs.

Regardless of what I do, my handler returns a 500, but only when deploying. The error contains a "line" field with a "logMessage" that says "Process terminated because the backend was stopped." (I don't think this is actually true, since I see log output from the service after this message.)

I have tried many variations of the stop handler, but even when it does nothing but reply with a 200 OK, it still returns a 500 when the previous version is stopped during a deploy.

Minimal reproducible example:

main.go:

package main

import (
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/_ah/start", func(w http.ResponseWriter, r *http.Request) {})
	mux.HandleFunc("/_ah/stop", stopHandler)

	server := &http.Server{
		Addr:    ":8080",
		Handler: mux,
	}

	// Log the error instead of silently discarding it if the server fails to start.
	log.Fatal(server.ListenAndServe())
}

func stopHandler(w http.ResponseWriter, r *http.Request) {
	log.Println("Returning from stop handler")
}

app.yaml:

runtime: go122

service: my-service

handlers:
- url: /.*
  script: auto

manual_scaling:
  instances: 1

instance_class: B1

First deploy it with "gcloud app deploy". Then run "gcloud app deploy" again so that the previous instance gets shut down. I see a 500 response from "/_ah/stop" in the request logs, and the line "Returning from stop handler" 4 ms after that.

Note that the 500 response takes 5 ms, so it's not the case that the server fails to respond within 30 seconds. An example of how to do a graceful shutdown (i.e. one that doesn't cause errors) would be appreciated.

jaydubu

Hi @rubenvann,

As explained by @domdomegg on this Stack Overflow post,

“This error occurs when the App Engine instance is being shut down. 
If a version that is serving traffic (requests) is updated, the requests that are in the middle of getting processed will be dropped and an error will be seen.”

@domdomegg suggested creating the new version without promoting it and then migrating the traffic to it gradually, like this:

VERSION_ID=$(date +%Y%m%dt%H%M%S)
gcloud app deploy --version $VERSION_ID --no-promote
gcloud app services set-traffic --splits $VERSION_ID=1 --migrate

Also, add an /_ah/warmup handler and enable warmup requests in app.yaml to support this:

inbound_services:
- warmup
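
On the Go side, the matching handler could look roughly like the sketch below. It assumes the runtime sends a GET request to /_ah/warmup before routing traffic to a new instance; the `warm` function and the `settings` map are hypothetical placeholders for whatever one-time setup the application actually needs.

```go
package main

import (
	"log"
	"net/http"
	"os"
	"sync"
)

// Hypothetical state that is expensive to build on the first request.
var (
	warmOnce sync.Once
	settings map[string]string
)

// warm performs the one-time initialization; it is safe to call repeatedly.
func warm() {
	warmOnce.Do(func() {
		// Placeholder for real setup work (opening connections, loading caches, ...).
		settings = map[string]string{"ready": "true"}
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/_ah/warmup", func(w http.ResponseWriter, r *http.Request) {
		warm()
		log.Println("Warmup done")
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Initialize lazily as well, in case no warmup request was received.
		warm()
		w.Write([]byte("ok\n"))
	})

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080" // same port as the example in the question
	}
	log.Fatal(http.ListenAndServe(":"+port, mux))
}
```

Calling `warm` from the regular handler too means the application does not depend on the warmup request actually arriving.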

Hope this helps.

NoCommandLine

What about deploying without shutting down the currently running version (see docs)? You can then migrate traffic to the new version and shut down the old version afterwards.
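
Once the old version no longer receives traffic, its /_ah/stop handler only has to drain whatever is left. A minimal sketch of that, assuming the handler just signals and the main goroutine calls `server.Shutdown`: the `stopSignal` helper and the 25-second budget are my own choices for illustration (the budget is picked to stay under the 30 seconds quoted in the question), and the SIGTERM handling is mainly there for local runs. Nothing in this thread confirms that this removes the 500 logged for /_ah/stop itself.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// shutdownBudget is an assumed value, chosen to stay under the 30 seconds
// mentioned in the documentation quoted in the question.
const shutdownBudget = 25 * time.Second

// stopSignal is a hypothetical helper: a channel that can be closed safely
// more than once, in case /_ah/stop is requested repeatedly.
type stopSignal struct {
	once sync.Once
	ch   chan struct{}
}

func newStopSignal() *stopSignal { return &stopSignal{ch: make(chan struct{})} }

// Trigger closes the channel the first time it is called; later calls are no-ops.
func (s *stopSignal) Trigger() { s.once.Do(func() { close(s.ch) }) }

// Done returns a channel that is closed once Trigger has been called.
func (s *stopSignal) Done() <-chan struct{} { return s.ch }

func main() {
	stop := newStopSignal()

	mux := http.NewServeMux()
	mux.HandleFunc("/_ah/start", func(w http.ResponseWriter, r *http.Request) {})
	mux.HandleFunc("/_ah/stop", func(w http.ResponseWriter, r *http.Request) {
		// Only signal here and return right away; draining happens in main.
		log.Println("Received /_ah/stop")
		stop.Trigger()
	})

	port := os.Getenv("PORT")
	if port == "" {
		port = "8080" // same port as the example in the question
	}
	server := &http.Server{Addr: ":" + port, Handler: mux}

	// Also react to SIGTERM and Ctrl+C, mainly for local runs.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, os.Interrupt)

	serveErr := make(chan error, 1)
	go func() { serveErr <- server.ListenAndServe() }()

	select {
	case err := <-serveErr:
		log.Fatalf("Server failed: %v", err)
	case <-stop.Done():
	case sig := <-sigs:
		log.Printf("Received signal %v", sig)
	}

	// Stop accepting new connections and wait for in-flight requests.
	ctx, cancel := context.WithTimeout(context.Background(), shutdownBudget)
	defer cancel()
	if err := server.Shutdown(ctx); err != nil {
		log.Printf("Shutdown did not finish cleanly: %v", err)
	}
	// ListenAndServe returns http.ErrServerClosed after Shutdown is called.
	if err := <-serveErr; err != nil && !errors.Is(err, http.ErrServerClosed) {
		log.Printf("Server error: %v", err)
	}
}
```

Keeping the handler itself trivial means the response to /_ah/stop is written before the server starts shutting down, and `Shutdown` then waits for any other requests still in flight.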

rubenvann

Thanks both. I will need to investigate how to solve this in Terraform...