Schedule MongoDB Backup to S3 using Kubernetes CronJob

Introduction

A Kubernetes CronJob runs Jobs on a time-based schedule, much like cron tasks on a Linux or UNIX system. This post uses one to back up MongoDB to S3 automatically.
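
The schedule itself is written in standard five-field cron syntax. As a quick illustration, the sketch below splits `0 */1 * * *` (the schedule that appears in the `kubectl get cronjob` output later in this post) into its fields. The variable names are only for illustration.

```shell
# Split a cron expression into its five fields.
schedule='0 */1 * * *'

# read splits on whitespace and, unlike an unquoted $schedule,
# does not glob-expand the asterisks.
read -r minute hour day_of_month month day_of_week <<EOF
$schedule
EOF

echo "minute=$minute hour=$hour day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
```

Here the minute field is `0` and the hour field is `*/1`, so the Job fires at the top of every hour.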

Get Started

Let’s first create a MongoDB user dedicated to backups, with only the privileges it needs. The built-in backup role on the admin database is enough.

mongo admin --host <hostname> --authenticationDatabase admin -u root
db.createUser({
  user: 'backup_user',
  pwd: 'oO9eV5cG6cF2oM1r',
  roles: [{ role: 'backup', db: 'admin' }]
})

Kubernetes Namespace

Create a dedicated namespace in Kubernetes to deploy the cronjob.

kubectl apply -f https://raw.githubusercontent.com/tuladhar/k8s-backup-mongodb/main/kubernetes/namespace.yaml
namespace/backup-mongodb created
kubectl config set-context --current --namespace=backup-mongodb

Kubernetes Secrets

Kubernetes Secrets let us store and manage sensitive information. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or a container image. Store the MongoDB connection URI first:

export MONGODB_URI=mongodb://backup_user:oO9eV5cG6cF2oM1r@<mongodb-hostname>:27017
kubectl create secret generic mongodb-uri --from-literal=MONGODB_URI="$MONGODB_URI"
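
Keep in mind that a Secret's values are stored base64-encoded, which is an encoding rather than encryption, so anyone who can read the Secret can recover the URI. The sketch below reproduces that round trip locally with a placeholder URI (the hostname and password are made up) so it runs without a cluster.

```shell
# Placeholder connection string; not a real credential.
uri='mongodb://backup_user:example-password@mongodb.example.internal:27017'

# Encode the way the value appears under .data in a Secret manifest.
encoded=$(printf '%s' "$uri" | base64 | tr -d '\n')

# Decoding needs no key, only read access to the Secret.
decoded=$(printf '%s' "$encoded" | base64 --decode)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Against a real cluster, the same decoding applies to the output of `kubectl get secret mongodb-uri -o jsonpath='{.data.MONGODB_URI}'`.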

Store AWS credentials and S3 bucket URI

export AWS_ACCESS_KEY_ID=***
export AWS_SECRET_ACCESS_KEY=***
export BUCKET_URI=s3://bucket-name
export AWS_DEFAULT_REGION=us-east-1
kubectl create secret generic aws \
  --from-literal=AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  --from-literal=AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  --from-literal=BUCKET_URI="$BUCKET_URI" \
  --from-literal=AWS_DEFAULT_REGION="$AWS_DEFAULT_REGION"
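
These values, together with `MONGODB_URI`, are what the backup container needs at run time. The script the CronJob actually runs is not shown in this post, so the following is only a sketch of what such a script might do, assuming it uses `mongodump` and the AWS CLI. The function names and the archive naming scheme are hypothetical.

```shell
# Build a timestamped archive name of the form
# backup-<UTC timestamp>.archive.gz (hypothetical naming scheme).
backup_name() {
  printf 'backup-%s.archive.gz' "$(date -u +%Y-%m-%dT%H-%M-%SZ)"
}

# Dump MongoDB to a compressed archive, upload it to S3, then clean up.
# Expects MONGODB_URI, BUCKET_URI and the AWS_* variables in the environment.
run_backup() {
  name=$(backup_name)
  mongodump --uri="$MONGODB_URI" --gzip --archive="/tmp/$name" &&
    aws s3 cp "/tmp/$name" "$BUCKET_URI/$name"
  status=$?
  rm -f "/tmp/$name"   # remove the local copy whether or not the upload worked
  return "$status"
}

# Only the name helper is exercised here; run_backup needs a
# reachable database and bucket.
echo "next archive would be: $(backup_name)"
```

Writing the dump to a single gzip-compressed archive keeps the upload to one object per run, and the UTC timestamp in the name keeps successive backups from overwriting each other.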

Deploy CronJob

Now deploy the MongoDB backup CronJob by running the following command:

kubectl apply -f https://raw.githubusercontent.com/tuladhar/k8s-backup-mongodb/main/kubernetes/cronjob.yaml
cronjob.batch/backup-mongodb created
kubectl edit cronjob backup-mongodb
Fig: Adjust the schedule
kubectl get cronjob
NAME             SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE
backup-mongodb   0 */1 * * *   False     0        <none>
kubectl get jobs
pods=$(kubectl get pods --selector=job-name=<job-name> --output=jsonpath='{.items[*].metadata.name}')
kubectl logs $pods
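
The manifest applied above lives in the linked repository and is not reproduced here. As a rough guide to how the pieces fit together, the sketch below writes a hypothetical CronJob manifest that pulls both Secrets into the container's environment. The image name and file name are placeholders, and the real manifest may differ.

```shell
# Write a hypothetical CronJob manifest to a local file
# (nothing is applied to a cluster).
cat > cronjob-sketch.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-mongodb
  namespace: backup-mongodb
spec:
  schedule: "0 */1 * * *"        # every hour, on the hour
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: registry.example.com/backup-mongodb:latest  # placeholder image
              envFrom:
                - secretRef:
                    name: mongodb-uri   # provides MONGODB_URI
                - secretRef:
                    name: aws           # provides AWS_* and BUCKET_URI
EOF

cat cronjob-sketch.yaml
```

With cluster access, `kubectl apply --dry-run=client -f cronjob-sketch.yaml` checks a file like this without creating anything.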

Conclusion

Make Complex Simple.