Tuesday, December 14, 2021

Kubernetes for Developers #26: Managing Container CPU, Memory Requests and Limits

By default, Kubernetes doesn’t place any restrictions on CPU and memory usage for a Pod. This means a single container in a pod can consume the entire node’s resources. When that happens, other CPU-intensive containers slow down, Kubernetes services may become unresponsive, and in the worst case the worker node may go into a NotReady state.

Setting up CPU and memory requests and limits for the containers in a Pod ensures that each container gets only its fair share of cluster resources and does not affect the performance of other Pods on the node.

Kubernetes uses the following YAML structure of requests and limits to control container CPU and memory resources:
      resources:
        requests:
          cpu: 100m
          memory: 50Mi
        limits:
          cpu: 150m
          memory: 100Mi  

requests:
  • This is the place to specify how much CPU and memory a container requires. Kubernetes will only schedule the Pod on a node that can provide the requested resources.
limits:
  • This is the place to specify the maximum CPU and memory a single container is allowed to use. A running container is not allowed to exceed the specified limits.
  • Limits can never be lower than the requests. If you try this, Kubernetes will throw an error and won’t let you run the container.
  • CPU is a “compressible” resource. When a container hits its CPU limit, it is not terminated; instead it is throttled, which degrades its performance.
  • Memory is a “non-compressible” resource. When a container exceeds its memory limit, it is terminated (Out of Memory killed); see the sketch after this list.
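
To see the memory behavior in action, we can run a container that deliberately allocates more memory than its limit. A minimal sketch, assuming the polinux/stress image and stress flags borrowed from the official Kubernetes docs (the pod name is illustrative):
// allocate ~100M inside a container limited to 50Mi
$ kubectl run oom-demo --image=polinux/stress --restart Never \
--limits='memory=50Mi' \
--command -- stress --vm 1 --vm-bytes 100M --vm-hang 1

// the container is killed as soon as it crosses the limit
$ kubectl get po oom-demo
NAME       READY   STATUS      RESTARTS
oom-demo   0/1     OOMKilled   0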

Requests and limits are specified per container, so we must set them for each container in the Pod. The Pod’s overall resource requests and limits are the sum of the requests and limits of all its containers.


Kubernetes CPU Resource Units

Limits and requests for CPU resources are measured in CPU units. One CPU in Kubernetes is equivalent to
1 AWS vCPU, 1 GCP core, 1 Azure vCore, or 1 hyperthread on a bare-metal processor.

CPU resources can be specified in fractional cores or in millicores, where 1 core = 1000 millicores.

Ex: 0.2 is equivalent to 200m.
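
For illustration, the following two fragments request the same amount of CPU; only one form would appear in a real manifest:
      resources:
        requests:
          cpu: 0.2     # fractional core form

      resources:
        requests:
          cpu: 200m    # equivalent millicore form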

Kubernetes Memory Resource Units

Limits and requests for memory are measured in bytes. Memory can be specified as a plain integer (bytes) or with one of the power-of-two suffixes Ki, Mi, Gi, Ti, Pi (decimal suffixes such as M and G also exist).
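
For example, the following values all describe roughly the same quantity; note that the power-of-two suffix Mi and the decimal suffix M are close but not identical:
          memory: 134217728   # plain integer, in bytes (128 * 1024 * 1024)
          memory: 128Mi       # power-of-two suffix, exactly the same quantity
          memory: 128M        # decimal suffix: 128 * 1000 * 1000 bytes, slightly smaller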

As a fuller example, check the configuration below, where the container has a request of 0.5 CPU and 200Mi of memory and a limit of 1 CPU and 400Mi of memory:

      resources:
        requests:
          cpu: 500m
          memory: 200Mi
        limits:
          cpu: 1000m
          memory: 400Mi  

Create the following YAML content and save it as "pod-cpu-memory-limit.yaml":
apiVersion: v1
kind: Pod
metadata:
  name: pod-cpu-memory-limit
spec:
  containers:
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
          protocol: TCP
      resources:
        requests:
          cpu: 100m
          memory: 50Mi
        limits:
          cpu: 150m
          memory: 100Mi  
    - name: alpine
      image: alpine
      command:
        [
        "sh",
        "-c",
        "while true; do echo date;sleep 10;done"
        ]
      resources:
        requests:
          cpu: 50m
          memory: 30Mi
        limits:
          cpu: 60m
          memory: 50Mi  



The above pod has two containers.
  • The first container (nginx) has a request of 100m (0.1) CPU and 50Mi of memory, and a limit of 150m CPU and 100Mi of memory.
  • The second container (alpine) has a request of 50m CPU and 30Mi of memory, and a limit of 60m CPU and 50Mi of memory.

So, the Pod has a total request of 150m CPU and 80Mi of memory, and a total limit of 210m CPU and 150Mi of memory.

Run the following commands to get the node capacity (the total CPU and memory of the node) and allocatable resources (the total resources the scheduler can allocate to pods):
// check all worker nodes capacity
$ kubectl describe nodes
Capacity:
  cpu:                4
  memory:             7118488Ki
Allocatable:
  cpu:                4
  memory:             7016088Ki

// create pod
$ kubectl apply -f pod-cpu-memory-limit.yaml
pod/pod-cpu-memory-limit created

// display pods
$ kubectl get po
NAME                    READY   STATUS      RESTARTS  
pod-cpu-memory-limit    2/2     Running     0  

// view CPU and Memory limits for all containers in the Pod
$ kubectl describe pod/pod-cpu-memory-limit
Name:         pod-cpu-memory-limit
Namespace:    default
Containers:
  nginx:
    Image:          nginx:alpine
    Limits:
      cpu:     150m
      memory:  100Mi
    Requests:
      cpu:        100m
      memory:     50Mi
  alpine:
    Image:         alpine
    Limits:
      cpu:     60m
      memory:  50Mi
    Requests:
      cpu:        50m
      memory:     30Mi
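
If you prefer a scriptable view instead of describe, a JSONPath query works too; a sketch (the map-style output formatting varies slightly between kubectl versions):
// print each container name with its configured limits
$ kubectl get pod pod-cpu-memory-limit \
-o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources.limits}{"\n"}{end}'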


// The Pod will stay in Pending state when the requests are bigger than the node capacity

// Create Pod using imperative style
$ kubectl run requests-bigger-pod --image=busybox --restart Never \
--requests='cpu=8000m,memory=200Mi'

// display pods
$ kubectl get po
NAME                    READY    STATUS      RESTARTS  
requests-bigger-pod      0/1     Pending       0  

// The Pod is in Pending status due to insufficient CPU, so check the Pod details
$ kubectl describe pod/requests-bigger-pod
Name:         requests-bigger-pod
Namespace:    default
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  28s   default-scheduler  0/1 nodes are available: 1 Insufficient cpu.



// The Pod is created successfully even when the limits are bigger than the node capacity

// Create Pod using imperative style
$ kubectl run limits-bigger-pod --image=busybox --restart Never \
--requests='cpu=100m,memory=50Mi' \
--limits='cpu=8000m,memory=200Mi'

// If you specify limits but do not specify requests, Kubernetes sets the requests equal to the limits
$ kubectl run no-requests-pod --image=busybox --restart Never \
--limits='cpu=100m,memory=50Mi'

// view CPU and Memory limits
$ kubectl describe pod/no-requests-pod
Name:         no-requests-pod
Namespace:    default
Containers:
  busybox:
    Image:          busybox
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        100m
      memory:     50Mi


Kubernetes for Developers Journey.
Happy Coding :)

Wednesday, November 17, 2021

Kubernetes for Developers #25: PersistentVolume and PersistentVolumeClaim in-detail

In the previous article (Kubernetes for Developers #24: Kubernetes Volume hostPath in-detail), we discussed the hostPath volume for persisting container data on the worker node filesystem. However, that data is available only to pods scheduled on the same worker node, which is not a feasible solution for a multi-node cluster.

This problem can be solved by using external storage volumes like awsElasticBlockStore, azureDisk, GCE PD, nfs, etc. However, the developer must then know the network storage infrastructure details to use them in the pod definition.

For example, when a developer wants to use an awsElasticBlockStore volume in a Pod, the developer should know the EBS volume ID and filesystem type. If the storage details change, the developer must update every pod definition.

Kubernetes solves this problem with PersistentVolume and PersistentVolumeClaim, which decouple the underlying storage details from the application pod definitions. Developers don’t have to know which underlying storage infrastructure is being used; that is the cluster administrator’s responsibility.

As per the diagram:
  • PersistentVolumes (PV) are cluster-level resources, like worker nodes. They do not belong to any namespace.
  • PersistentVolumeClaims (PVC) are created in a specific namespace and can be used only by pods within that same namespace.
  • The Cluster Administrator sets up the cloud storage infrastructure, e.g., AWS Elastic Block Store or GCE Persistent Disk, as needed.
  • The Cluster Administrator creates Kubernetes PersistentVolumes (PV) with different sizes and access modes, referring to the AWS EBS/GCE PD volumes, as per application requirements.
  • Whenever a pod requires persistent storage, the developer creates a PersistentVolumeClaim (PVC) with a minimum size and an access mode, and Kubernetes finds an adequate PersistentVolume with at least that size and access mode and binds the volume (PV) to the claim (PVC).
  • The Pod refers to the PersistentVolumeClaim (PVC) as a volume wherever it is required.
  • Once a PersistentVolume is bound to a PVC, it cannot be used by other claims until it is released (i.e., the PVC must be deleted before the PV can be reused).
  • Developers don’t have to know the underlying storage details. They just create a PersistentVolumeClaim (PVC) whenever a pod requires persistent storage.

Access Modes

The following access modes are supported by a PersistentVolume (PV):
  • ReadWriteOnce (RWO): Only a single worker node can mount the volume for reading and writing at the same time.
  • ReadOnlyMany (ROX): Multiple worker nodes can mount the volume for reading at the same time.
  • ReadWriteMany (RWX): Multiple worker nodes can mount the volume for reading and writing at the same time.

Reclaim Policy

The reclaim policy tells us what happens to a PersistentVolume (PV) when its PersistentVolumeClaim (PVC) is deleted.
  • Delete: The volume contents are deleted and the volume becomes available to be claimed again as soon as the PVC is deleted.
  • Retain: The PersistentVolume (PV) contents are kept after the PVC is deleted, and the PV cannot be reused until the Cluster Administrator reclaims the volume manually.
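
The reclaim policy of an existing PV can also be changed with kubectl patch; a minimal sketch, using the pv-vol1 volume we create below:
// switch the PV's reclaim policy to Retain
$ kubectl patch pv pv-vol1 \
-p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
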
In general, the Cluster Administrator creates multiple PersistentVolumes (PV) backed by one of the cloud storage offerings, e.g., AWS EBS or GCE PD:
// Ex: creating an AWS EBS volume from the CLI
// (the VolumeId returned in the output is used below as ebs-data-id)
$ aws ec2 create-volume \
  --availability-zone=us-east-1a \
  --size=10 --volume-type=gp2

The Cluster Administrator then creates the following PV using the EBS volume ID:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  awsElasticBlockStore:
    volumeID: ebs-data-id
    fsType: ext4


For local testing, let’s use a hostPath PersistentVolume instead. Create a directory called “/mydata” on the worker node and an “index.html” file under it.
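
A minimal sketch of that setup, assuming a single-node cluster where you have shell access to the worker node (the file content is just sample text):
// on the worker node
$ sudo mkdir -p /mydata
$ echo "text message text $(date)" | sudo tee /mydata/index.html

Then create the PV definition: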
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-vol1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  hostPath:
    path: "/mydata"

As per the above YAML, the volume is configured at the “/mydata” host directory with a size of 1Gi and an access mode of “ReadWriteOnce” (RWO).

save above yaml content as "pv-vol1.yaml" and run the following kubectl command
// create persistentvolume(pv)
$ kubectl apply -f pv-vol1.yaml
persistentvolume/pv-vol1 created

// display pv
$ kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      
pv-vol1   1Gi        RWO            Retain          Available
Here, the status shows "Available", which means the PV is not yet bound to a PersistentVolumeClaim (PVC).

The next step is to create a PersistentVolumeClaim (PVC) to request storage for the pod:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-vol-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: ""

save above yaml content as "pvc-vol-1.yaml" and run the following kubectl command
// create pvc
$ kubectl apply -f pvc-vol-1.yaml
persistentvolumeclaim/pvc-vol-1 created

// display pvc
$ kubectl get pvc
NAME        STATUS   VOLUME   CAPACITY   ACCESS MODES  
pvc-vol-1   Bound    pv-vol1   1Gi        RWO        
Here, the PersistentVolumeClaim is bound to the PersistentVolume pv-vol1. Note that although the claim requested only 500Mi, it is bound to the 1Gi volume, because Kubernetes binds a claim to an available PV that satisfies (meets or exceeds) the requested size and access mode.

The next step is to create a pod that uses the PersistentVolumeClaim as a volume:
apiVersion: v1
kind: Pod
metadata:
  name: pod-pv-pvc
spec:
  containers:
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
          protocol: TCP
      volumeMounts:
        - name: pod-pv-vol
          mountPath: /usr/share/nginx/html
  volumes:
    - name: pod-pv-vol
      persistentVolumeClaim:
        claimName: pvc-vol-1

save above yaml content as "pod-pv-pvc.yaml" and run the following kubectl command
// create pod
$ kubectl apply -f pod-pv-pvc.yaml
pod/pod-pv-pvc created

// display pods
$ kubectl get po
NAME             READY   STATUS      RESTARTS   AGE
pod-pv-pvc        1/1     Running     0          1m

Run the following kubectl command to forward a port from the local machine to the pod:
// syntax
// kubectl port-forward <pod-name> <local-port>:<container-port>
$ kubectl port-forward pod-pv-pvc 8081:80
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80

$ curl http://localhost:8081
text message text Tue Nov  16 12:01:10 UTC 2021

We have successfully configured a Pod to use PersistentVolumeClaim as physical storage. 

Run the following kubectl commands to delete the resources:
$ kubectl delete pod pod-pv-pvc
$ kubectl delete pvc pvc-vol-1
$ kubectl delete pv pv-vol1
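
As a side note: because this PV uses the Retain reclaim policy, deleting only the PVC does not make the PV available again. Listing PVs after deleting the PVC but before deleting the PV would show it as Released; a sketch of the expected output:
$ kubectl get pv
NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS
pv-vol1   1Gi        RWO            Retain           Released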


Kubernetes for Developers Journey.
Happy Coding :)

Saturday, November 6, 2021

Kubernetes for Developers #24: Kubernetes Volume hostPath in-detail

In the previous article (Kubernetes for Developers #23: Kubernetes Volume emptyDir in-detail), we discussed the emptyDir volume for storing and sharing data among the containers in a pod. However, an emptyDir volume and its contents are deleted automatically when the Pod is deleted from the worker node.

The Kubernetes hostPath volume helps us persist volume contents even after the pod is deleted from the worker node.

A hostPath volume mounts a file or directory from the worker node’s filesystem into the pod.

A pod can only mount files/directories of the worker node it is running on.
  • It is useful when the container wants to access Docker system files from the host (i.e., /var/lib/docker)
  • It is useful when the container needs to access the kubeconfig file, CA certificates, or /var/log from the host
  • It is useful when the container needs to access the host’s /sys files, e.g., for cAdvisor
  • It is useful when the container wants to check that a given path exists on the host before running
Kubernetes hostPath volume supports the following types while mounting:
  • Directory: A directory must exist at the specified path on the host
  • DirectoryOrCreate: An empty directory will be created if the specified path does not exist on the host
  • File: A file must exist at the specified path on the host
  • FileOrCreate: An empty file will be created if the specified path does not exist on the host
  • Socket: A UNIX socket must exist at the specified path
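
For instance, mounting the Docker socket from the host (one of the use cases mentioned above) would look something like the following fragment; the volume name is illustrative:
  volumes:
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
        type: Socket

The full example for this article follows: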


apiVersion: v1
kind: Pod
metadata:
  name: pod-vol-hostpath
spec:
  containers:
    - name: alpine
      image: alpine
      command:
        [
          "sh",
          "-c",
          'while true; do echo "random message text `date`" >> html/index.html;sleep 10;done',
        ]
      volumeMounts:
        - name: vol-hostpath
          mountPath: /html
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
          protocol: TCP
      volumeMounts:
        - name: vol-hostpath
          mountPath: /usr/share/nginx/html
  volumes:
    - name: vol-hostpath
      hostPath:
        path: /mydoc
        type: DirectoryOrCreate


As per the above YAML:

  1. A multi-container pod is created with a hostPath volume named “vol-hostpath”, backed by the “/mydoc” directory on the host filesystem.
  2. The “/mydoc” directory is created automatically on the host when the pod is assigned to a worker node, if it does not already exist, because we specified the volume type “DirectoryOrCreate”.
  3. The first container (“alpine”) creates a random text message every 10 seconds and appends it to the /html/index.html file.
  4. The first container mounts the volume at /html, so all new/modified files under this directory end up in the “/mydoc” host directory.
  5. The second container (“nginx”) mounts the same volume at /usr/share/nginx/html (the default directory from which nginx serves index.html). Since the same volume contains index.html, the nginx web server serves the file created by the first container.
  6. As the first container appends a new random message to index.html every 10 seconds, we see a different response each time we request index.html from the nginx web server.
  7. Volume contents are not deleted on pod termination, so a new pod scheduled on the same node with the same hostPath sees all the previous contents.

save above yaml content as "pod-vol-hostpath.yaml" and run the following kubectl command
// create pod
$ kubectl apply -f pod-vol-hostpath.yaml
pod/pod-vol-hostpath created

// display pods
$ kubectl get po
NAME                    READY   STATUS      RESTARTS   AGE
pod-vol-hostpath        2/2     Running     0          1m10s

Run the following kubectl command to forward a port from the local machine to the pod:
// syntax
// kubectl port-forward <pod-name> <local-port>:<container-port>
$ kubectl port-forward pod-vol-hostpath 8081:80
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80

Run the following curl commands to see the random messages that are appended every 10 seconds:
$ curl http://localhost:8081
random message text Tue Nov  7 12:01:10 UTC 2021

$ curl http://localhost:8081
random message text Tue Nov  7 12:01:10 UTC 2021
random message text Tue Nov  7 12:01:20 UTC 2021
random message text Tue Nov  7 12:01:30 UTC 2021
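
We can also check the file directly on the worker node; a sketch, assuming a minikube cluster (on a regular node you would cat the file over SSH):
// view the backing file on the host
$ minikube ssh "cat /mydoc/index.html"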

Volume contents are not deleted on pod termination, so a new pod scheduled on the same node with the same hostPath sees all the previous contents.

Delete the pod and repeat all the above steps to verify that the previously written data still appears in the curl output:
// delete pod
$ kubectl delete pod/pod-vol-hostpath
pod/pod-vol-hostpath deleted

// create pod
$ kubectl apply -f pod-vol-hostpath.yaml
pod/pod-vol-hostpath created

// display pods
$ kubectl get po
NAME                    READY   STATUS      RESTARTS   AGE
pod-vol-hostpath        2/2     Running     0          1m10s

// syntax
// kubectl port-forward <pod-name> <local-port>:<container-port>
$ kubectl port-forward pod-vol-hostpath 8081:80
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80


$ curl http://localhost:8081
random message text Tue Nov  7 12:01:10 UTC 2021
random message text Tue Nov  7 12:01:20 UTC 2021
random message text Tue Nov  7 12:01:30 UTC 2021
random message text Tue Nov  7 14:12:40 UTC 2021

// first 3 lines are generated by the previous pod

This confirms that the curl output shows both the contents generated by the previous pod and the contents added by the new pod.

Kubernetes for Developers Journey.
Happy Coding :)

Tuesday, November 2, 2021

Kubernetes for Developers #23: Kubernetes Volume emptyDir in-detail

Containers are ephemeral: any data a container generates is stored in its own filesystem and is deleted automatically when the container is deleted or restarted.

In the Docker world, Docker volumes provide a way to store container data on the host machine as permanent storage. However, they are less managed and of limited use in a multi-node cluster.

Kubernetes volumes provide a way for containers to access external disk storage or share storage among containers.

Kubernetes volumes are not top-level objects like Pod, Deployment, etc.; they are components of a pod and are defined as part of the pod’s YAML specification. Volumes are available to all containers in the pod but must be mounted in each container at a specific file location.

Kubernetes supports many types of volumes, such as:

  • emptyDir: Used for mounting a temporary empty directory from the worker node’s disk/RAM
  • awsElasticBlockStore: Used for mounting an AWS EBS volume into the pod
  • azureDisk: Used for mounting a Microsoft Azure data disk into the pod
  • azureFile: Used for mounting a Microsoft Azure File volume into the pod
  • gcePersistentDisk: Used for mounting a Google Persistent Disk into the pod
  • hostPath: Used for mounting the worker node’s filesystem into the pod
  • nfs: Used for mounting an existing NFS (network file system) share into the pod
  • configMap/secret: Used for mounting ConfigMap/Secret values into the pod
  • persistentVolumeClaim: Used for mounting dynamically provisioned storage into the pod

A Pod can use any number of volume types simultaneously to persist container data, as the fragment below illustrates.
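
For instance, a single pod spec can declare several volume types side by side; a minimal sketch (the names and the hostPath path are illustrative):
  volumes:
    - name: scratch            # temporary space, lives and dies with the pod
      emptyDir: {}
    - name: host-logs          # persisted on the worker node's filesystem
      hostPath:
        path: /var/log/myapp
        type: DirectoryOrCreate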

emptyDir volume


An emptyDir volume starts as an empty directory that is created when a Pod is assigned to a node, and it exists for as long as the pod runs on that node. All containers in the pod can read and write the contents of the emptyDir volume. The volume is erased automatically once the pod is terminated on the node.

A container crash does not remove a Pod from a node, so the data in an emptyDir volume is safe across container crashes. It is only erased when the Pod is deleted from the node.

  • It is useful for sharing files between containers running in the same pod
  • It is useful for a disk-based merge sort on a large dataset where memory is low
  • It is useful when the container filesystem is read-only and the application needs to write data temporarily

apiVersion: v1
kind: Pod
metadata:
  name: volume-emptydir
spec:
  containers:
    - name: alpine
      image: alpine
      command:
        [
          "sh",
          "-c",
          'while true; do echo "random message text `date`" >> /var/mydoc/index.html; sleep 10; done',
        ]
      volumeMounts:
        - name: vol-emptydir
          mountPath: /var/mydoc
    - name: nginx
      image: nginx:alpine
      ports:
        - containerPort: 80
          protocol: TCP
      volumeMounts:
        - name: vol-emptydir
          mountPath: /usr/share/nginx/html
  volumes:
    - name: vol-emptydir
      emptyDir: {}

As per the above YAML:
  1. A multi-container pod is created with an emptyDir volume named “vol-emptydir”.
  2. The “vol-emptydir” volume is created automatically when the pod is assigned to a worker node.
  3. As the name says, the volume starts out empty.
  4. The first container (“alpine”) creates a random text message every 10 seconds and appends it to the /var/mydoc/index.html file.
  5. The first container mounts the volume at /var/mydoc, so all files under this directory (i.e., the index.html file) live in the volume.
  6. The second container (“nginx”) mounts the same volume at /usr/share/nginx/html (the default directory from which nginx serves index.html). Since the same volume contains index.html, the nginx web server serves the file created by the first container.
  7. As the first container appends a new random message to index.html every 10 seconds, we see a different response each time we request index.html from the nginx web server.
  8. The volume and its contents are deleted automatically when the Pod is deleted.
  9. By default, the volume contents are stored on the worker node’s disk. However, emptyDir contents can instead be stored in memory (RAM) by setting the “medium” attribute, as shown at the end of this article.

save above yaml content as "pod-vol-emptydir.yaml" and run the following kubectl command

// create pod
$ kubectl apply -f pod-vol-emptydir.yaml
pod/pod-vol-emptydir created

// display pods
$ kubectl get po
NAME                    READY   STATUS      RESTARTS   AGE
pod-vol-emptydir        2/2     Running     0          2m39s

Run the following kubectl command to forward a port from the local machine to the pod:
// syntax
// kubectl port-forward <pod-name> <local-port>:<container-port>
$ kubectl port-forward pod-vol-emptydir 8081:80
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80

Run the following curl commands to see the random messages that are appended every 10 seconds:
$ curl http://localhost:8081
random message text Tue Nov  2 15:48:50 UTC 2021

$ curl http://localhost:8081
random message text Tue Nov  2 15:48:50 UTC 2021
random message text Tue Nov  2 15:49:00 UTC 2021
random message text Tue Nov  2 15:49:10 UTC 2021

An emptyDir volume does not persist data after pod termination. So, delete the pod and repeat all the above steps to check whether any of the existing data appears in the curl output:
// delete pod
$ kubectl delete pod/pod-vol-emptydir
pod/pod-vol-emptydir deleted

// create pod
$ kubectl apply -f pod-vol-emptydir.yaml
pod/pod-vol-emptydir created

// display pods
$ kubectl get po
NAME                    READY   STATUS      RESTARTS   AGE
pod-vol-emptydir        2/2     Running     0          1m10s

// syntax
// kubectl port-forward <pod-name> <local-port>:<container-port>
$ kubectl port-forward pod-vol-emptydir 8081:80
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80

$ curl http://localhost:8081
random message text Tue Nov  2 16:01:50 UTC 2021
This confirms that only new data appears after pod recreation when using an emptyDir volume.

By default, volume contents are stored on the worker node’s disk. However, emptyDir volume contents can be stored in memory (RAM) by setting the “medium” attribute:

volumes:
  - name: vol-emptydir
    emptyDir:
      medium: Memory
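
emptyDir also accepts an optional sizeLimit field to cap how large the volume may grow; a sketch (note that a memory-backed emptyDir consumes the node's RAM, and its usage counts toward the pod's memory accounting):
volumes:
  - name: vol-emptydir
    emptyDir:
      medium: Memory
      sizeLimit: 64Mi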

Kubernetes for Developers Journey.
Happy Coding :)