Saturday, July 24, 2021

Kubernetes for Developers #20: Create Automated Tasks using Jobs and CronJobs

In general, we use a Kubernetes Deployment object to create Pods that run continuously without stopping. However, there are cases where we want to run a task and terminate once it is completed, such as data backups, log exports, batch processing, or sending emails.

A Kubernetes Job object is best suited to these kinds of tasks. It creates a Pod for the given task, and the Pod terminates once the task completes successfully.

  • A Kubernetes Job creates a Pod to run the task
  • The Pod should stop once the task is completed rather than run indefinitely like a Pod in a Deployment. Hence, the Pod restartPolicy must be set to “Never” or “OnFailure”
  • A Kubernetes Job can run multiple Pods at the same time by configuring parallelism
  • If the current worker node fails, the Job automatically creates a replacement Pod on another worker node
  • Set the Job “completions” attribute to the total number of Pods that must complete successfully in each run. Default value is 1
  • Set the Job “parallelism” attribute to the maximum number of Pods that run in parallel in each run. Default value is 1
  • Set the Job “activeDeadlineSeconds” attribute to limit how long the Job may run. Once the Job runs beyond the specified time, its running Pods are terminated automatically.
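
The attributes listed above each sit at a fixed place in the Job manifest. As a rough sketch of that layout, the Python helper below assembles a manifest as a dictionary and prints it as JSON. The function name build_job_manifest and its parameters are made up for illustration and do not belong to any Kubernetes client library; the container name, image and command are borrowed from the examples in this article.

```python
import json


def build_job_manifest(name, command, completions=1, parallelism=1,
                       active_deadline_seconds=None, restart_policy="Never"):
    """Assemble a batch/v1 Job manifest as a plain dictionary (illustrative helper)."""
    # A Job's Pod template only allows these two restart policies.
    if restart_policy not in ("Never", "OnFailure"):
        raise ValueError("restartPolicy must be 'Never' or 'OnFailure' for a Job")

    spec = {
        "completions": completions,
        "parallelism": parallelism,
        "template": {
            "spec": {
                "restartPolicy": restart_policy,
                "containers": [
                    {
                        "name": "busybox",
                        "image": "k8s.gcr.io/busybox",
                        "command": list(command),
                    }
                ],
            }
        },
    }
    # activeDeadlineSeconds is optional, so only emit it when requested.
    if active_deadline_seconds is not None:
        spec["activeDeadlineSeconds"] = active_deadline_seconds

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        "job-example", ["echo", "Hello K8 Job"], active_deadline_seconds=10
    )
    print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file such as job.json and passed to kubectl apply -f.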

Create Job with Single Pod

As per the YAML below, a single Pod is triggered because both the completions and parallelism attributes are set to 1.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-example
spec:
  completions: 1
  parallelism: 1
  activeDeadlineSeconds: 10
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: busybox
          image: k8s.gcr.io/busybox
          command:
            - echo
            - "Hello K8 Job"


Save the above YAML content as "job.yaml" and run the following kubectl commands
// create job object from yaml file
$ kubectl apply -f job.yaml
job.batch/job-example created

// display all job objects
$ kubectl get jobs
NAME          COMPLETIONS   DURATION   AGE
job-example   1/1           6s         1m

// check the pod which is created by job
$ kubectl get po
NAME                    READY   STATUS              RESTARTS   AGE
job-example-pdchw       0/1     Completed           0          1m

// view logs from the pod
$ kubectl logs job-example-pdchw
Hello K8 Job

// delete the job
$ kubectl delete job/job-example
job.batch "job-example" deleted

Create Job with Multiple Pods Sequentially

As per the YAML below, Pods are triggered sequentially, one after another, until three have completed, because completions is set to 3 and parallelism is set to 1.

The Job first creates one Pod; when that Pod completes, it creates the second Pod, and so on, until all 3 Pods have completed.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-example-sequential
spec:
  completions: 3
  parallelism: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: busybox
          image: k8s.gcr.io/busybox
          command:
            - echo
            - "Hello K8 Job"


Save the above YAML content as "job-sequential.yaml" and run the following kubectl commands
// create job object from yaml file
$ kubectl apply -f job-sequential.yaml
job.batch/job-example-sequential created

// display all job objects
$ kubectl get job
NAME                     COMPLETIONS   DURATION   AGE
job-example-sequential   1/3           18s        25s

// 2nd Pod is created after the 1st Pod completes
$ kubectl get pod
NAME                                   READY   STATUS              RESTARTS   AGE
pod/job-example-sequential-hl5zn       0/1     ContainerCreating   0          1s
pod/job-example-sequential-srf62       0/1     Completed           0          7s

// 3rd Pod is created after the 2nd Pod completes
$ kubectl get pod
NAME                                   READY   STATUS              RESTARTS   AGE
pod/job-example-sequential-671zn       0/1     ContainerCreating   0          1s
pod/job-example-sequential-hl5zn       0/1     Completed           0          11s
pod/job-example-sequential-srf62       0/1     Completed           0          17s

// All three Pods have completed successfully
$ kubectl get pod
NAME                                   READY   STATUS              RESTARTS   AGE
pod/job-example-sequential-671zn       0/1     Completed           0          5s
pod/job-example-sequential-hl5zn       0/1     Completed           0          13s
pod/job-example-sequential-srf62       0/1     Completed           0          19s

Create Job with Multiple Pods in Parallel

As per the YAML below, two Pods are triggered in parallel because parallelism is set to 2.

The Job first creates two Pods in parallel; when either Pod completes, it creates the third Pod, and so on, until all 3 Pods have completed.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-example-parallel
spec:
  completions: 3
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: busybox
          image: k8s.gcr.io/busybox
          command:
            - echo
            - "Hello K8 Job"


Save the above YAML content as "job-parallel.yaml" and run the following kubectl commands
// create job object from yaml file
$ kubectl apply -f job-parallel.yaml
job.batch/job-example-parallel created

// display all job objects
$ kubectl get job
NAME                     COMPLETIONS   DURATION   AGE
job-example-parallel     2/3           18s        25s

// Two Pods are triggered in parallel at the same time
$ kubectl get pod
NAME                             READY   STATUS              RESTARTS   AGE
pod/job-example-parallel-g7jgx   0/1     ContainerCreating   0          1s
pod/job-example-parallel-vjl9w   0/1     ContainerCreating   0          1s

// 3rd Pod is created after one of the Pods completes
$ kubectl get pod
NAME                             READY   STATUS              RESTARTS   AGE
pod/job-example-parallel-vjl9w   0/1     Completed           0          21s
pod/job-example-parallel-g7jgx   0/1     Completed           0          21s
pod/job-example-parallel-58v8j   0/1     ContainerCreating   0          11s
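
The way completions and parallelism interact in the sequential and parallel examples above can be approximated with a small simulation. The sketch below is only a model, not how the Job controller is implemented: the function simulate_job and the per-Pod durations are invented for illustration, and every Pod is assumed to succeed on its first attempt.

```python
import heapq


def simulate_job(completions, parallelism, durations):
    """Return a (start, finish) pair per Pod for a simplified Job model.

    Assumes every Pod succeeds, so exactly `completions` Pods are created,
    and that a new Pod starts as soon as a running one finishes.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    if len(durations) != completions:
        raise ValueError("need exactly one duration per completion")

    running = []   # min-heap holding finish times of Pods still running
    schedule = []  # (start, finish) for each Pod, in creation order
    now = 0
    for duration in durations:
        # All parallel slots are busy: wait for the earliest Pod to finish.
        if len(running) >= parallelism:
            now = heapq.heappop(running)
        schedule.append((now, now + duration))
        heapq.heappush(running, now + duration)
    return schedule


if __name__ == "__main__":
    # completions: 3, parallelism: 1 -> Pods run one after another
    print(simulate_job(3, 1, [5, 5, 5]))  # [(0, 5), (5, 10), (10, 15)]
    # completions: 3, parallelism: 2 -> third Pod starts when the first one ends
    print(simulate_job(3, 2, [5, 8, 5]))  # [(0, 5), (0, 8), (5, 10)]
```

With parallelism set to 1 each Pod starts only when the previous one finishes, while with parallelism set to 2 the third Pod starts at the moment the quicker of the first two Pods completes.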

Create CronJob

A CronJob is used for periodic, recurring tasks. It runs a Job repeatedly on a given schedule, written in Cron format.

The CronJob creates each Job object from the jobTemplate property configured in the CronJob YAML.

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │                                   7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * *
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob-example
spec:
  schedule: "*/2 * * * *"
  jobTemplate:
    spec:
      completions: 1
      parallelism: 1
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: busybox
              image: k8s.gcr.io/busybox
              command:
                - echo
                - "Hello K8 Job"

As per the above YAML, the Job is scheduled to run every 2 minutes. Save the above YAML content as "cronjob.yaml" and run the following kubectl commands

// create cronjob object from yaml file
$ kubectl apply -f cronjob.yaml
cronjob.batch/cronjob-example created

// display all cronjob objects
$ kubectl get cronjob
NAME              SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob-example   */2 * * * *   False     0        11s             3m6s

// display pods created by cronjob
$ kubectl get pod
NAME                                 READY   STATUS              RESTARTS   AGE
pod/cronjob-example-27119250-khqs8   0/1     Completed           0          3m4s
pod/cronjob-example-27119251-4c5kc   0/1     Completed           0          2m4s
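
To see which run times a schedule such as "*/2 * * * *" produces, the five Cron fields can be checked against a timestamp one minute at a time. The sketch below is a deliberately small matcher and not the parser Kubernetes uses: the helper names are invented, only "*", "*/n", single numbers and comma-separated lists are handled (no ranges or names, and 7 is not treated as Sunday), and all five fields are required to match.

```python
from datetime import datetime, timedelta

# Lowest legal value of each field: minute, hour, day of month, month, day of week.
FIELD_MINIMUMS = (0, 0, 1, 1, 0)


def field_matches(field, value, minimum):
    """Check one Cron field against a value (supports '*', '*/n', numbers, lists)."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Steps are counted from the lowest legal value of the field.
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Return True if the five-field Cron expression matches the given datetime."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    # Python counts Monday as 0, while Cron counts Sunday as 0.
    day_of_week = (when.weekday() + 1) % 7
    values = (when.minute, when.hour, when.day, when.month, day_of_week)
    return all(
        field_matches(f, v, m) for f, v, m in zip(fields, values, FIELD_MINIMUMS)
    )


def next_runs(expr, start, count, search_limit_minutes=366 * 24 * 60):
    """List up to `count` matching minutes strictly after `start`."""
    runs = []
    candidate = start.replace(second=0, microsecond=0)
    for _ in range(search_limit_minutes):
        if len(runs) == count:
            break
        candidate += timedelta(minutes=1)
        if cron_matches(expr, candidate):
            runs.append(candidate)
    return runs


if __name__ == "__main__":
    start = datetime(2021, 7, 24, 10, 1, 30)
    for run in next_runs("*/2 * * * *", start, 3):
        print(run.strftime("%H:%M"))  # prints 10:02, 10:04, 10:06 on separate lines
```

The search limit simply stops the loop for expressions that never match within a year, so the function returns fewer entries instead of running forever.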

Kubernetes for Developers Journey.
Happy Coding :)
