The other day I was working on migrating a platform for a client from virtual-machine-based deployments using Ansible to container-based deployments using Kubernetes.
The whole migration went smoothly until we started noticing 502 errors.
Once we reconfigured the readiness probes we saw a drop in these 502 errors, but they were not eliminated entirely. It took a colleague of mine quite some time and effort to trace the problem: NGINX performs a ‘fast shutdown’ when it receives the SIGTERM signal, and SIGTERM is the default signal Docker sends when it asks a container to shut down gracefully.
So what happens? On SIGTERM, NGINX doesn’t shut down gracefully at all: it aborts any requests still in flight, and the proxies calling it return HTTP status code 502.
To avoid this issue, we want NGINX to finish its in-flight requests and then exit, instead of aborting them as soon as the container is stopped. We accomplished that by adding the following line to the NGINX Dockerfile:
STOPSIGNAL SIGQUIT
This instructs Docker to send SIGQUIT instead of SIGTERM when stopping the container, and SIGQUIT is the signal NGINX treats as a request for a graceful shutdown. More information on which signals NGINX handles, and how it handles them, can be found in the NGINX documentation.
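To show where that line sits, here is a minimal sketch of what such a Dockerfile might look like. The base image tag and the nginx.conf file are assumptions for illustration, not the actual setup from this migration; only the STOPSIGNAL line is the fix described above.

```dockerfile
# Assumed base image and tag; use whatever your build already starts from.
FROM nginx:stable

# Hypothetical config file, copied to the default NGINX config location.
COPY nginx.conf /etc/nginx/nginx.conf

# Send SIGQUIT instead of the default SIGTERM when the container is stopped,
# so NGINX can finish in-flight requests before it exits.
STOPSIGNAL SIGQUIT
```

Keep in mind that the container is still killed forcibly once the stop timeout runs out (10 seconds by default for docker stop, and terminationGracePeriodSeconds in Kubernetes, 30 seconds by default), so requests that take longer than that can still be cut off.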