Schedule MongoDB Backup to S3 using Kubernetes CronJob
Kubernetes CronJob makes it very easy to run Jobs on a time-based schedule. These automated jobs run like Cron tasks on a Linux or UNIX system.
In this post, we’ll use a Kubernetes CronJob to schedule a recurring backup of a MongoDB database and upload the backup archive to AWS S3. All the source code is available in the GitHub repository.
Get Started
Let’s first create a MongoDB user dedicated to performing the backup with minimum privileges.
Log in to the MongoDB shell as the root user.
mongo admin --host <hostname> --authenticationDatabase admin -u root
Run the following command to create the backup user.
db.createUser({
  user: 'backup_user',
  pwd: 'oO9eV5cG6cF2oM1r',
  roles: [{ role: 'backup', db: 'admin' }]
})
Kubernetes Namespace
Create a dedicated namespace in Kubernetes to deploy the cronjob.
kubectl apply -f
The output is similar to this:
namespace/backup-mongodb created
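The manifest passed to kubectl apply -f is not reproduced above. As a rough sketch, a Namespace manifest that would produce this output might look like the following. The file name namespace.yaml is an assumption and is not taken from the repository.

```shell
# Write a minimal Namespace manifest to namespace.yaml
# (the file name is an assumption, chosen for illustration).
cat > namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: backup-mongodb
EOF
```

It could then be applied with kubectl apply -f namespace.yaml.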
Let’s save the namespace for all subsequent kubectl commands to run in that context.
kubectl config set-context --current --namespace=backup-mongodb
Kubernetes Secrets
Kubernetes Secrets allow us to store and manage sensitive information. Storing confidential information in a Secret is safer and more flexible than putting it verbatim in a Pod definition or in a container image.
Store MongoDB URI
export MONGODB_URI=mongodb://backup_user:oO9eV5cG6cF2oM1r@<mongodb-hostname>:27017
kubectl create secret generic mongodb-uri --from-literal=MONGODB_URI=$MONGODB_URI
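To confirm what was stored, the value can be read back. Secret data is base64-encoded rather than encrypted, so it has to be decoded before it is readable. A small sketch, where decode_secret is a hypothetical helper name and not part of kubectl:

```shell
# decode_secret NAME KEY
# Prints the decoded value of KEY from the Secret NAME in the current namespace.
# (decode_secret is a hypothetical helper defined here for illustration.)
decode_secret() {
  kubectl get secret "$1" --output=jsonpath="{.data.$2}" | base64 --decode
}

# Example: decode_secret mongodb-uri MONGODB_URI
```

The same helper would work for the keys of any other Secret in the namespace.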
Store AWS credentials and S3 bucket URI
export AWS_ACCESS_KEY_ID=***
export AWS_SECRET_ACCESS_KEY=***
export BUCKET_URI=s3://bucket-name
export AWS_DEFAULT_REGION=us-east-1
kubectl create secret generic aws \
  --from-literal=AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  --from-literal=AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  --from-literal=BUCKET_URI=$BUCKET_URI \
  --from-literal=AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION
Deploy CronJob
Now we can go ahead and deploy the MongoDB backup cronjob by running the following command:
kubectl apply -f
The output is similar to this:
cronjob.batch/backup-mongodb created
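The manifest itself lives in the repository and is not shown here. Purely as an illustration of how the pieces fit together, a CronJob that reads both Secrets and uploads a dump might look roughly like the sketch below. The file name cronjob.yaml, the container name, the image (a placeholder for any image that ships both mongodump and the AWS CLI) and the object key pattern are all assumptions; the actual manifest may differ.

```shell
# Write an illustrative CronJob manifest to cronjob.yaml.
# The quoted 'EOF' keeps the local shell from expanding $VARS inside the manifest.
cat > cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-mongodb
spec:
  schedule: "0 */1 * * *"        # every hour, on the hour
  concurrencyPolicy: Forbid      # assumption: skip a run if the previous one is still active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup                                        # hypothetical name
              image: registry.example.com/mongodb-backup:latest   # placeholder image
              envFrom:
                # Each key of the Secret becomes an environment variable.
                - secretRef:
                    name: mongodb-uri
                - secretRef:
                    name: aws
              command: ["/bin/sh", "-c"]
              args:
                # Dump to a local gzip archive, then upload only if the dump succeeded.
                - >-
                  mongodump --uri="$MONGODB_URI" --gzip --archive=/tmp/backup.gz &&
                  aws s3 cp /tmp/backup.gz "$BUCKET_URI/backup-$(date -u +%Y%m%dT%H%M%SZ).gz"
EOF
```

Writing the archive to a file first and chaining with && means a failed dump does not result in an upload, at the cost of needing scratch space in the container.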
By default, the cronjob runs every hour. To adjust this, run the following command and edit the schedule field:
kubectl edit cronjob backup-mongodb
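If an interactive editor is inconvenient, for example in a script, the schedule field can also be changed with a merge patch. A minimal sketch, where set_backup_schedule is a hypothetical helper name:

```shell
# set_backup_schedule CRON_EXPRESSION
# Sets .spec.schedule on the backup-mongodb CronJob using a merge patch.
# (set_backup_schedule is a hypothetical helper defined here for illustration.)
set_backup_schedule() {
  kubectl patch cronjob backup-mongodb --type=merge \
    --patch "{\"spec\":{\"schedule\":\"$1\"}}"
}

# Example: run once a day at 02:00 instead of every hour
# set_backup_schedule "0 2 * * *"
```

The five fields of the expression are minute, hour, day of month, month and day of week, the same as in a crontab entry.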
After creating the cronjob, you can get its status by running the following command:
kubectl get cronjob
The output is similar to this:
NAME             SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE
backup-mongodb   0 */1 * * *   False     0        <none>
As you can see from the results of the command, the cronjob has not scheduled or run any jobs yet. You can list the jobs by running the following command:
kubectl get jobs
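The list stays empty until the first scheduled run. Rather than waiting for the next hour, a one-off job can be created from the cronjob's template to test the backup. A sketch, where run_backup_now and the default job name are hypothetical:

```shell
# run_backup_now [JOB_NAME]
# Creates a one-off Job from the backup-mongodb CronJob's job template.
# Without an argument, a name with a Unix-timestamp suffix is generated.
# (run_backup_now is a hypothetical helper defined here for illustration.)
run_backup_now() {
  job_name="${1:-backup-mongodb-manual-$(date +%s)}"
  kubectl create job "$job_name" --from=cronjob/backup-mongodb
}

# Example: run_backup_now test-run
```

The resulting job then shows up in kubectl get jobs alongside the scheduled ones.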
To view the Pod logs for a job, run the following command:
pods=$(kubectl get pods --selector=job-name=<job-name> --output=jsonpath='{.items[*].metadata.name}')
kubectl logs $pods