How do you know your container is running? That it is healthy and can accept traffic? Health checks answer these questions for you. They are run periodically by your hosting environment and evaluate if your container works correctly.
Most importantly, orchestrators such as Docker Swarm and Kubernetes automatically restart or replace containers whose health checks fail, while plain Docker only marks the container as unhealthy. This reduces the likelihood that a container stays in a broken state.
All health checks work by running a command or sending an HTTP request that you define, at a specified interval. After a particular number of consecutive failed checks, the container is considered unhealthy.
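The counting logic can be sketched in a few lines of JavaScript. This only illustrates the idea and is not how Docker or Kubernetes implement it; the helper name createHealthTracker and its retries parameter are made up for this example. A passing check resets the counter, so only consecutive failures count.

```javascript
// Hypothetical helper: tracks consecutive failed checks.
function createHealthTracker(retries) {
  let consecutiveFailures = 0
  return {
    // Record one check result and return the resulting status.
    record(passed) {
      consecutiveFailures = passed ? 0 : consecutiveFailures + 1
      return consecutiveFailures >= retries ? 'unhealthy' : 'healthy'
    },
  }
}

// Simulate a sequence of check results with a limit of 3 failures.
const tracker = createHealthTracker(3)
const results = [true, false, false, true, false, false, false]
const statuses = results.map((passed) => tracker.record(passed))
console.log(statuses.join(', '))
```

In this run, the status only turns unhealthy on the last check, because the passing check in the middle resets the counter.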
To enable health checks, you need to set them up first. The exact steps depend on whether you use Docker (or Docker Swarm) or Kubernetes. You can find setup instructions for both environments in this article.
When your application is running on bare Docker or on Docker Swarm, Docker Healthchecks are the way to go.
To use them, add HEALTHCHECK <options> CMD <command> to your Dockerfile.
The <options> are:
--interval=DURATION (default: 30s)
--timeout=DURATION (default: 30s)
--start-period=DURATION (default: 0s; probe failures during the start period are not counted against the retry limit)
--retries=N (default: 3)
The <command> can be any command you can run inside your container.
Often, curl is used for this, like HEALTHCHECK CMD curl --fail http://localhost:8080/health || exit 1.
However, it is beneficial to implement a custom health check using the same runtime as your app. You then don't have to install curl in the image, which reduces compatibility issues, image size, and attack surface. For Node.js, a simple health check script can look as follows:
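const http = require('http')

// Target of the health check: the app itself, inside the container.
const options = {
  host: 'localhost',
  port: 8080,
  timeout: 2000,
  path: '/',
}

const request = http.request(options, (res) => {
  console.log(`STATUS: ${res.statusCode}`)
  // Exit code 0 marks the check as passed, 1 as failed.
  if (res.statusCode === 200) {
    process.exit(0)
  } else {
    process.exit(1)
  }
})

// The timeout option only emits an event, so abort the request explicitly.
request.on('timeout', () => {
  request.destroy(new Error('health check timed out'))
})

request.on('error', (err) => {
  console.log('ERROR', err)
  process.exit(1)
})

request.end()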
Now you can set up health checks in your Dockerfile like:
HEALTHCHECK --interval=12s --timeout=12s --start-period=30s CMD node ./healthcheck.js
Kubernetes doesn't use Docker Healthchecks. They are explicitly disabled within Kubernetes. Instead, Kubernetes has its own tools to determine whether a pod is healthy.
Liveness probes determine whether a pod is running.
If the number of consecutive failed liveness probes reaches the failureThreshold, the container is killed and restarted according to the pod's restart policy.
Example for command probe:
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 2
Example for HTTP probe:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
    httpHeaders:
      - name: Custom-Header
        value: Awesome
  initialDelaySeconds: 3
  periodSeconds: 3
For options and default values, see the official documentation.
Readiness probes determine whether a pod can receive traffic. A Kubernetes pod may be alive but not ready, e.g., when it is processing large amounts of data. In such a situation, you can let the liveness probe pass and the readiness probe fail, e.g., by using different HTTP endpoints for the two. A pod that is not ready stops receiving traffic from services but is not restarted.
Readiness probes have the same configuration options as liveness probes.
Just use readinessProbe instead of livenessProbe in your YAML file.
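On the application side, separate endpoints for the two probes could look like the following sketch. The route names /live and /ready and the state.busy flag are assumptions made for this example; use whatever fits your service.

```javascript
const http = require('http')

// Hypothetical application state: set busy to true while the app
// cannot take on more work.
const state = { busy: false }

// Map a request path to the status code a probe should see.
function probeStatus(path) {
  if (path === '/live') return 200 // the process is up
  if (path === '/ready') return state.busy ? 503 : 200
  return 404
}

const server = http.createServer((req, res) => {
  res.writeHead(probeStatus(req.url))
  res.end()
})

// Uncomment to serve on the port used in the probe examples:
// server.listen(8080)
```

With this split, the liveness probe would point at /live and the readiness probe at /ready, so a busy pod stops receiving traffic without being restarted.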
Startup probes can help with legacy applications that take a long time to start.
You usually want a startup probe that performs the same check as your liveness probe, with failureThreshold * periodSeconds long enough to cover your worst-case startup time.
When you define a startup probe, the liveness and readiness probes start only after the startup probe has succeeded.
Example:
ports:
  - name: liveness-port
    containerPort: 8080
    hostPort: 8080

livenessProbe:
  httpGet:
    path: /health
    port: liveness-port
  failureThreshold: 1
  periodSeconds: 10

startupProbe:
  httpGet:
    path: /health
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
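To pick values for your own service, the failureThreshold * periodSeconds arithmetic can be worked through as below. The two function names are made up for this sketch, and the calculation ignores initialDelaySeconds and probe timeouts, so treat the results as a rough budget.

```javascript
// Longest startup time (in seconds) a startup probe tolerates
// before the container is killed.
function startupBudgetSeconds({ failureThreshold, periodSeconds }) {
  return failureThreshold * periodSeconds
}

// Smallest failureThreshold that covers a given worst-case startup time.
function minFailureThreshold(worstCaseStartupSeconds, periodSeconds) {
  return Math.ceil(worstCaseStartupSeconds / periodSeconds)
}

// The startup probe from the example above allows 30 * 10 = 300 seconds.
console.log(startupBudgetSeconds({ failureThreshold: 30, periodSeconds: 10 })) // 300

// An app that may need up to 95 seconds, probed every 10 seconds.
console.log(minFailureThreshold(95, 10)) // 10
```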
For services that offer an HTTP interface, you can use any GET route (e.g., index.html) for health checks.
You can also support advanced health checks in your service by implementing a custom route like /health.
That route could, e.g., check that the database connection is available.
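A minimal sketch of such a route using only Node's built-in http module might look like this. The checkDatabase function is a placeholder and an assumption of this example; in a real service it would run a cheap query against your actual database client.

```javascript
const http = require('http')

// Hypothetical dependency check: replace the body with a cheap query
// against your database client. It should reject when the database
// is unavailable.
async function checkDatabase() {
  return true
}

// Build the /health result: 200 when the check passes, 503 otherwise.
async function healthStatus(check = checkDatabase) {
  try {
    await check()
    return { code: 200, body: { status: 'ok' } }
  } catch (err) {
    return { code: 503, body: { status: 'error', reason: err.message } }
  }
}

const server = http.createServer(async (req, res) => {
  if (req.url !== '/health') {
    res.writeHead(404)
    res.end()
    return
  }
  const { code, body } = await healthStatus()
  res.writeHead(code, { 'Content-Type': 'application/json' })
  res.end(JSON.stringify(body))
})

// Uncomment to serve on the port used in the probe examples:
// server.listen(8080)
```

Returning a non-2xx status code on failure is what matters here, since both curl --fail and Kubernetes HTTP probes treat such a response as a failed check.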
Health checks help to keep your services online by restarting them when they enter a failure state.
Do you use health checks in your applications? If not, open the last app you worked on and go to your Dockerfile or Kubernetes YAML file. Check that you use one of the health checks from above. If your service has no custom health endpoint yet, evaluate whether adding one would be a good option.