Kubernetes for Developers #12: Effective way of using K8 Liveness Probe

In general, Kubernetes keeps the specified number of pods running (through a ReplicationController) and restarts a container whose application process crashes abruptly. However, there are situations where the application has failed internally or deadlocked without terminating the process. This is where Kubernetes health probes come into the picture.

Liveness Probe

Kubernetes checks whether a container is still alive through a liveness probe, which tells the kubelet when to restart the container. We can specify a liveness probe for each container in the pod specification. Kubernetes periodically executes the probe and restarts the container if the probe fails.

Kubernetes can probe the container in three different ways.

1. HTTP GET probe

It performs an HTTP GET request on a path served by the container (for example, a REST API resource) on the port you specify. If the probe receives a 2xx or 3xx response code, it is considered successful. If the HTTP server returns any other response code, or doesn’t respond at all, the probe is considered a failure, and after repeated failures the kubelet restarts the container.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-http
spec:
  containers:
    - name: liveness
      image: luksa/kubia-unhealthy
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /
          port: 8080
        initialDelaySeconds: 3
        periodSeconds: 3
        failureThreshold: 3

As per the code written in the specified Node.js app, an HTTP GET request returns a 500 response code after the fifth request. So, the kubelet restarts the container automatically after 3 consecutive probe failures.
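To make the restart rule concrete, here is a small simulation of the probe loop. It is only a sketch under stated assumptions: the function names are made up for illustration, the sample app is assumed to answer 200 to its first five requests and 500 to every later one, and the probe is assumed to be the only client calling it.

```python
def probe_succeeds(status_code: int) -> bool:
    """An HTTP probe treats any 2xx or 3xx status code as a success."""
    return 200 <= status_code < 400


def app_status(request_number: int) -> int:
    """Assumed app behaviour: 200 for the first five requests, 500 afterwards."""
    return 200 if request_number <= 5 else 500


def probe_that_triggers_restart(failure_threshold: int = 3, max_probes: int = 100):
    """Return the 1-based probe number at which a restart is triggered, or None."""
    consecutive_failures = 0
    for probe_number in range(1, max_probes + 1):
        if probe_succeeds(app_status(probe_number)):
            consecutive_failures = 0  # any success resets the counter
        else:
            consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            return probe_number
    return None


if __name__ == "__main__":
    print(probe_that_triggers_restart())
```

Under these assumptions, probes 1 to 5 succeed and probes 6, 7 and 8 fail, so the eighth probe is the one that triggers the restart.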

// create the pod from the YAML file
> kubectl apply -f ./liveness-http-test.yaml

// view the running pod
> kubectl get po liveness-http

// view complete pod details
> kubectl describe po liveness-http

// view logs of the previously terminated container
> kubectl logs liveness-http --previous

Always set initialDelaySeconds based on your application's startup time. If you do not set an initial delay, the liveness probe starts checking as soon as the container starts, before the application is able to accept requests. The probe then fails, and the container can end up in an endless restart loop.
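The effect of the initial delay can be illustrated with a deliberately simplified timing model. This is a sketch, not how the kubelet actually schedules probes: it assumes probes fire exactly at initialDelaySeconds and then every periodSeconds, that every probe sent before the app has finished starting fails, and the function name is hypothetical.

```python
def killed_before_ready(startup_seconds, initial_delay, period, failure_threshold=3):
    """Return True if the probe fails failure_threshold times before the app is up.

    Simplified model: probes fire at initial_delay, initial_delay + period, ...
    and every probe sent before startup_seconds has elapsed is a failure.
    """
    failures = 0
    probe_time = initial_delay
    while probe_time < startup_seconds:
        failures += 1
        if failures >= failure_threshold:
            return True
        probe_time += period
    return False


if __name__ == "__main__":
    # An app that needs 20 seconds to start, probed every 3 seconds.
    print(killed_before_ready(startup_seconds=20, initial_delay=3, period=3))
    print(killed_before_ready(startup_seconds=20, initial_delay=20, period=3))
```

In this model, an app that needs 20 seconds to start is killed when the delay is 3 seconds (probes at 3, 6 and 9 seconds all fail), but survives when the delay matches its startup time.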

2. TCP Socket probe

It opens a TCP connection to the specified port of the container. If the connection is established successfully, the probe is successful. Otherwise, the container will be restarted by kubelet.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp
spec:
  containers:
    - name: goproxy
      image: k8s.gcr.io/goproxy:0.1
      ports:
        - containerPort: 8080
      livenessProbe:
        tcpSocket:
          port: 8080
        initialDelaySeconds: 3
        periodSeconds: 3
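What a TCP socket probe checks can be approximated in a few lines of Python. This is a rough sketch of the idea rather than the kubelet's implementation; the function name and the one-second timeout are assumptions chosen for illustration.

```python
import socket


def tcp_probe(host: str, port: int, timeout_seconds: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Demo: probe a local listener, then probe the same port after closing it.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(tcp_probe("127.0.0.1", port))
    listener.close()
    print(tcp_probe("127.0.0.1", port))
```

Note that the check only proves that something accepts connections on the port. It says nothing about whether the application behind it is able to do useful work.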

3. Exec probe

It executes a specified command inside the container and checks the command’s exit status code. If the status code is 0, the probe is successful. Otherwise, the container will be restarted by kubelet.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
    - name: liveness
      image: k8s.gcr.io/busybox
      args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
      livenessProbe:
        exec:
          command:
            - cat
            - /tmp/healthy
        initialDelaySeconds: 5
        periodSeconds: 5

When the container starts, it executes this command:

/bin/sh -c "touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600"

For the first 30 seconds of the container's life, there is a /tmp/healthy file. So, during the first 30 seconds, the command cat /tmp/healthy returns a success code. After 30 seconds, cat /tmp/healthy returns a failure code.
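The same exit-code rule can be reproduced outside the cluster. The sketch below assumes a Unix-like system with cat available; the helper name exec_probe and the one-second timeout are illustrative choices, not part of Kubernetes.

```python
import os
import subprocess
import tempfile


def exec_probe(command, timeout_seconds: float = 1.0) -> bool:
    """Run the command and report success only when it exits with status 0."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # command missing or too slow: treat it as a failed probe
    return result.returncode == 0


if __name__ == "__main__":
    # Mimic the pod: the file exists at first, then it is removed.
    fd, healthy_file = tempfile.mkstemp()
    os.close(fd)
    print(exec_probe(["cat", healthy_file]))
    os.remove(healthy_file)
    print(exec_probe(["cat", healthy_file]))
```

While the file exists, cat exits with 0 and the probe passes. Once the file is removed, cat exits with a non-zero status and the probe fails.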

Effective liveness probe check

You should always define a liveness probe. Without one, Kubernetes has no way of knowing whether your app is still responding to HTTP requests; it only knows whether the process is running. The liveness probe lets Kubernetes restart the container when the application is in an unresponsive state.

Configure the liveness probe to perform requests on a specific URL path (e.g. /health).
Make sure HTTP GET /health does not require validation or authentication.
Check only the internals of the app; the probe should not depend on external services or dependencies.
The liveness probe should not return a failure when the server cannot connect to the database. If the underlying cause is in the database itself, restarting the web server container will not fix the problem.
The HTTP GET /health probe should not use many computational resources and should not take long to complete. It should respond within one second, which is the default probe timeout.
The default failureThreshold for a liveness probe is 3, so it is not mandatory to specify it again in the pod YAML file.
Do not set the same specification for the liveness and readiness probes.
Try to avoid "exec" probes, as there are known problems with them.
Exit the process when an uncaught exception occurs (the kubelet will restart the container automatically) instead of signaling failure through the liveness probe.
Use the liveness probe as a recovery mechanism only when the process is not responsive.
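Several of the points above can be seen in a minimal /health endpoint. The following is only a sketch using Python's standard library; the handler and function names are made up for illustration, and a real service would expose the same idea through its own web framework.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            # No authentication, no database call, no heavy computation:
            # the handler only proves that the process can still serve requests.
            body = b"ok"
            self.send_response(200)
        else:
            body = b"not found"
            self.send_response(404)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the logs


def start_server(port: int = 0) -> HTTPServer:
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server()
    url = f"http://127.0.0.1:{server.server_address[1]}/health"
    with urllib.request.urlopen(url, timeout=2) as response:
        print(response.status)
    server.shutdown()
    server.server_close()
```

The demo binds to 127.0.0.1 for local testing. Inside a container, the server would need to listen on an address the kubelet can reach, such as 0.0.0.0.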

Kubernetes for Developers Journey.

Happy Coding :)
