Deploying Sidecar Containers: Automating Docker Volume Backups to S3 and Google Drive

Docker tutorial - IT technology blog

The “Data Loss” Nightmare on Docker

Docker professionals are no strangers to the “cold sweat” feeling when a server dies at 2 AM. Important Docker Volumes containing critical data suddenly vanish without a trace. While Docker Volumes provide persistent storage, leaving them on the host without an external backup plan is a high-stakes gamble.

Previously, I used to write Bash scripts, install zip and aws-cli directly on the host machine, and set up a Cronjob. This worked fine for one or two containers. However, as the system scaled, managing dozens of scripts across multiple servers became a maintenance nightmare. The system grew “cluttered” with too many unrelated auxiliary tools.

Hard-Learned Lessons from a 30-Microservice System

While running over 30 microservices for an e-commerce project, I once spent two whole days just finding the cause of a memory leak. It turned out that the backup scripts I had stuffed into the application container consumed up to 500MB of RAM per run and never released it, so the main app kept crashing. After that setback, I switched entirely to the Sidecar Container pattern. Six months of “battle-testing” in production has proven this to be the cleanest and safest method.

What is a Sidecar Container? Why Use It for Backups?

Imagine a motorcycle with a small sidecar attached. In the Docker world, a Sidecar is a container that runs alongside the main container, supporting it without interfering with the core logic. It handles auxiliary tasks like logging, monitoring, or data backup.

Why is a Sidecar superior to host-based scripts?

  • Resource Isolation: If the Sidecar fails, it doesn’t bring down the main application.
  • Portability: The configuration is self-contained within docker-compose.yml. You just need to copy the file to a new server, and it works immediately without installing tools on the host.
  • Centralized Management: Each service has its own Sidecar, making it extremely easy to push data to different S3 Buckets.

Hands-on: Configuring Sidecar Backup to Amazon S3

I’ll be using the offen/docker-volume-backup image. This is a lightweight tool that supports everything from S3 and GCS to Dropbox.

Step 1: Prepare S3 Credentials

You’ll need: Access Key ID, Secret Access Key, Region, and the Bucket name. Ensure the IAM User has at least s3:PutObject to upload files; if you enable retention cleanup, it also needs permission to list and delete objects in the bucket so old backups can be pruned.
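For reference, a minimal IAM policy covering those actions might look like the sketch below. The bucket name my-app-backups is just an example, and the exact action list is an assumption to verify against your own setup:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBackupUploads",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-app-backups/*"
    },
    {
      "Sid": "AllowListingForRetention",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-app-backups"
    }
  ]
}
```

Attach this to the dedicated backup user rather than reusing an admin key: if the Sidecar's credentials ever leak, the blast radius stays limited to one bucket.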

Step 2: Set up Docker Compose

Suppose you have a MariaDB service whose data volume needs backing up. The docker-compose.yml file would be configured as follows:

version: '3.8'

services:
  db:
    image: mariadb:10.6
    environment:
      MYSQL_ROOT_PASSWORD: secret_password
    volumes:
      - db_data:/var/lib/mysql
    restart: always

  backup:
    image: offen/docker-volume-backup:latest
    volumes:
      - db_data:/backup/db_data:ro
    environment:
      AWS_ACCESS_KEY_ID: YOUR_ACCESS_KEY
      AWS_SECRET_ACCESS_KEY: YOUR_SECRET_KEY
      AWS_ENDPOINT: s3.ap-southeast-1.amazonaws.com
      AWS_S3_BUCKET_NAME: my-app-backups
      AWS_S3_PATH: db-backups
      BACKUP_CRON_EXPRESSION: "0 2 * * *"
      BACKUP_FILENAME: "db-backup-%Y-%m-%dT%H-%M-%S.tar.gz"
      BACKUP_RETENTION_DAYS: "30"
    restart: always

volumes:
  db_data:

Key parameters to note:

  • Volumes mounted under /backup: offen/docker-volume-backup archives everything it finds inside /backup. Compose file format v3 dropped the old volumes_from mechanism, so each volume is mounted explicitly, and the :ro flag guarantees the Sidecar can never write to the database’s live files.
  • BACKUP_RETENTION_DAYS: Keeps storage costs in check by automatically deleting backups older than 30 days instead of letting them pile up forever.
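The retention setting does essentially the same job as a find-and-delete cron job. To see the effect locally, here is a small self-contained sketch (an illustration, not the tool's actual code) that prunes archives older than 30 days:

```shell
# Illustration only: simulate what BACKUP_RETENTION_DAYS=30 does,
# using plain find(1) against a throwaway directory.
mkdir -p /tmp/backup-demo
touch -d "45 days ago" /tmp/backup-demo/db-backup-old.tar.gz   # stale: should be pruned
touch /tmp/backup-demo/db-backup-new.tar.gz                    # fresh: should survive
find /tmp/backup-demo -name 'db-backup-*.tar.gz' -mtime +30 -delete
ls /tmp/backup-demo
# → db-backup-new.tar.gz
```

The Sidecar applies this pruning on the remote bucket after each successful upload, so local and cloud copies stay in sync.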

Alternative Solution: Backing up to Google Drive with Rclone

If you want to leverage Google Drive’s free 15GB, rclone is a more flexible choice. You can build a minimal Sidecar like this:

services:
  app:
    image: wordpress
    volumes:
      - wp_data:/var/www/html

  backup-gdrive:
    image: rclone/rclone:latest
    volumes:
      - wp_data:/data:ro
      - ./rclone.conf:/config/rclone/rclone.conf:ro
    entrypoint: /bin/sh
    command:
      - -c
      - |
        while true; do
          filename="backup-$$(date +%Y%m%d).tar.gz"
          tar -czf "/tmp/$$filename" -C /data .
          rclone copy "/tmp/$$filename" gdrive:MyBackups
          rm "/tmp/$$filename"
          echo "Backup completed at $$(date)"
          sleep 86400
        done

volumes:
  wp_data:

Quick tip: Run rclone config on your local machine first to generate the configuration file before deploying it to the server.
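For reference, the generated rclone.conf for a Google Drive remote named gdrive (the name the loop above assumes) typically looks like this, with the OAuth token elided:

```
[gdrive]
type = drive
scope = drive.file
token = {"access_token":"...","refresh_token":"...","expiry":"..."}
```

Treat this file like a password: it grants write access to your Drive, so keep it out of version control along with your .env file.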

Lessons Learned from Real-World Operations

After handling numerous incidents, I’ve derived 4 golden rules:

  1. Always compress data: Using tar with gzip compression (tar -czf) is mandatory. It not only saves space but also significantly reduces egress bandwidth costs when pushing to the Cloud.
  2. Prioritize Read-Only: Always add the :ro tag when mounting volumes to the Sidecar. This prevents the risk of the Sidecar accidentally overwriting or corrupting the app’s original data.
  3. Periodically verify files: Don’t blindly trust “Success” logs. Once a month, try downloading a backup file and extracting it. I’ve encountered cases where the backup was 0KB because the database was locked during copying.
  4. Never hardcode passwords: Use a .env file. Never commit Access Keys directly to GitHub unless you want to hand your AWS account over to hackers.
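Rule 3 can be scripted. The sketch below builds a sample archive so it is self-contained, then runs the three checks worth applying to every downloaded backup: non-zero size, an intact gzip stream, and a readable tar index. In real use, point f at a file you just pulled from S3 or Google Drive:

```shell
# Build a sample archive so the check is self-contained; in practice,
# set f to a backup you downloaded from the cloud.
mkdir -p /tmp/verify-demo/data
echo "hello" > /tmp/verify-demo/data/file.txt
tar -czf /tmp/verify-demo/backup.tar.gz -C /tmp/verify-demo/data .

f=/tmp/verify-demo/backup.tar.gz
[ -s "$f" ]              || { echo "FAIL: file is empty"; exit 1; }   # catches the 0KB case
gzip -t "$f"             || { echo "FAIL: corrupt gzip stream"; exit 1; }
tar -tzf "$f" >/dev/null || { echo "FAIL: unreadable tar index"; exit 1; }
echo "OK: $f passed all checks"
```

Note that these checks prove the archive is readable, not that the database inside it is consistent; for databases, a periodic restore into a scratch container remains the only real proof.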

Conclusion

Sidecar Containers not only make your system architecture more professional but also serve as high-value “insurance” for your data. Spending 15 minutes on setup today will allow you to sleep soundly, knowing that unexpected hardware failures won’t disrupt your work.
