Python Automation for Daily Tasks: File Cleanup, Backups, and Automated Notifications

Python tutorial - IT technology blog

The Problem: How Much Time Are You Wasting Every Day?

There are certain things I have to do almost every day when working with servers: checking old logs, cleaning up temp files, backing up the database, then sending a report to the team. Each task only takes 5–10 minutes. But add it all up over a week and that’s nearly an hour — spent on operations that are completely scriptable.

It’s not just about time. Doing things manually means it’s easy to forget, easy to skip a step — and the day your server crashes is exactly the day you forgot to run the backup. That’s why I gradually started scripting all of these repetitive tasks.

This article goes straight to the scripts I’m actually running in crontab every day — not academic examples. If you’ve read the posts about APScheduler or REST APIs on this blog, this part shows Python applied at the system level: file system, processes, and push notifications.

Core Concepts: Python Working with the System

To write automation scripts, you need to understand four key modules (all in the standard library except requests, which is third-party):

  • os / pathlib — Browse directories, check files, retrieve metadata
  • shutil — Copy, move, delete files and directories
  • subprocess — Run shell commands from Python (tar, mysqldump, rsync…)
  • smtplib / requests — Send emails or call notification webhooks

Python doesn’t replace bash for everything, but for more complex tasks — filtering files by multiple conditions, parsing JSON responses, calling HTTP APIs — the code ends up shorter and far easier to debug.
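As a quick taste of that, here is the kind of multi-condition file filter that gets awkward in bash but stays readable in Python. The function name and thresholds here are just illustrative, not part of any script below:

```python
import time
from pathlib import Path

def stale_large_files(directory, days=7, min_mb=10.0):
    """Files older than `days` days AND larger than `min_mb` megabytes."""
    cutoff = time.time() - days * 86400
    matches = []
    for f in Path(directory).iterdir():
        if not f.is_file():
            continue
        st = f.stat()
        # Both conditions must hold: old mtime and large size
        if st.st_mtime < cutoff and st.st_size > min_mb * 1024 * 1024:
            matches.append(f)
    return matches
```

Try expressing "older than 7 days and bigger than 10MB and a regular file" in a single find invocation, then add a third condition. The Python version just grows by one if clause.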

Practice: 3 Real-World Automation Scripts

Script 1 — Automatically Organize Your Downloads Folder

The ~/Downloads folder is where everything lands and nobody cleans up. This script categorizes files by extension and moves them into the appropriate subdirectory:

from pathlib import Path
import shutil

DOWNLOADS = Path.home() / "Downloads"

CATEGORIES = {
    "images":    [".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg"],
    "documents": [".pdf", ".docx", ".xlsx", ".pptx", ".txt", ".md"],
    "archives":  [".zip", ".tar", ".gz", ".rar", ".7z"],
    "scripts":   [".py", ".sh", ".js", ".ts"],
    "videos":    [".mp4", ".mkv", ".mov", ".avi"],
}

def organize_downloads():
    moved = 0
    for file in DOWNLOADS.iterdir():
        if not file.is_file():
            continue
        ext = file.suffix.lower()
        for category, extensions in CATEGORIES.items():
            if ext in extensions:
                dest_dir = DOWNLOADS / category
                dest_dir.mkdir(exist_ok=True)
                dest = dest_dir / file.name
                # Avoid overwriting if a file with the same name already exists
                if dest.exists():
                    dest = dest_dir / f"{file.stem}_{file.stat().st_mtime_ns}{ext}"
                shutil.move(str(file), str(dest))
                moved += 1
                break
    print(f"Moved {moved} file(s).")

if __name__ == "__main__":
    organize_downloads()

The first time I ran this on my machine, the script rounded up 340 files that had piled up over several months. Add it to crontab and you’re done — every morning the folder is clean.
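One detail worth revisiting in that collision branch: the mtime-based suffix guarantees uniqueness but produces ugly names. If you would rather end up with report_1.pdf, report_2.pdf, and so on, a small helper like this works (unique_dest is my own name for it, not a stdlib function):

```python
from pathlib import Path

def unique_dest(dest: Path) -> Path:
    """Return `dest` if it's free, otherwise append _1, _2, ... before the suffix."""
    if not dest.exists():
        return dest
    n = 1
    while True:
        candidate = dest.with_name(f"{dest.stem}_{n}{dest.suffix}")
        if not candidate.exists():
            return candidate
        n += 1
```

Swap it into the script by replacing the timestamp line with `dest = unique_dest(dest)`.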

Script 2 — Automated MySQL Database Backup with Rotation

This is the script I actually use in production on my VPS. The nice part is that it automatically deletes backups older than N days so they don’t eat up disk space:

import subprocess
import os
from pathlib import Path
from datetime import datetime, timedelta

DB_HOST = "localhost"
DB_USER = "backup_user"
DB_PASS = os.environ["MYSQL_BACKUP_PASS"]  # Read from env, NEVER hardcode
DB_NAME = "myapp_db"
BACKUP_DIR = Path("/var/backups/mysql")
RETENTION_DAYS = 7

def backup_mysql():
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    filename = BACKUP_DIR / f"{DB_NAME}_{timestamp}.sql.gz"

    cmd = [
        "mysqldump",
        f"--host={DB_HOST}",
        f"--user={DB_USER}",
        f"--password={DB_PASS}",
        "--single-transaction",
        "--quick",
        DB_NAME,
    ]

    with open(filename, "wb") as f:
        dump = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        gzip = subprocess.Popen(["gzip"], stdin=dump.stdout, stdout=f)
        dump.stdout.close()  # Let mysqldump receive SIGPIPE if gzip exits early
        err = dump.stderr.read()  # Drain stderr first so a full pipe can't block mysqldump
        gzip.communicate()
        dump.wait()  # Wait for mysqldump to finish so we get the correct returncode

    if dump.returncode != 0:
        raise RuntimeError(f"mysqldump failed: {err.decode()}")

    print(f"Backup complete: {filename} ({filename.stat().st_size / 1024:.1f} KB)")
    return filename

def cleanup_old_backups():
    cutoff = datetime.now() - timedelta(days=RETENTION_DAYS)
    removed = 0
    for f in BACKUP_DIR.glob(f"{DB_NAME}_*.sql.gz"):
        mtime = datetime.fromtimestamp(f.stat().st_mtime)
        if mtime < cutoff:
            f.unlink()
            removed += 1
    if removed:
        print(f"Removed {removed} backup(s) older than {RETENTION_DAYS} days.")

if __name__ == "__main__":
    backup_mysql()
    cleanup_old_backups()

Important note: the password is read from the environment variable MYSQL_BACKUP_PASS and is never hardcoded in the script. I made that mistake once and ended up committing the password to Git — a costly lesson learned.
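A related habit I picked up after that incident: fail loudly and early when a required secret is missing, instead of letting the script die halfway through with a bare KeyError. A tiny helper makes the intent explicit (require_env is my own convention, not a standard API):

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable, or exit with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise SystemExit(f"Missing required environment variable: {name}")
    return value

# Usage at the top of the script, before any work starts:
# DB_PASS = require_env("MYSQL_BACKUP_PASS")
```

With cron, where the environment is minimal, a clear "missing variable" line in the log saves a lot of 2am head-scratching.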

Script 3 — Monitor a Directory and Send Telegram Notifications

This script polls a directory every 30 seconds. When a new file is detected, it immediately sends a ping via Telegram Bot. Works great for client upload folders, auto-generated nightly report directories, or anywhere you need to know the moment a new file arrives:

import time
import requests
import os
from pathlib import Path

WATCH_DIR = Path("/var/www/uploads")
TG_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
TG_CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]
POLL_INTERVAL = 30  # seconds

def send_telegram(message: str):
    url = f"https://api.telegram.org/bot{TG_TOKEN}/sendMessage"
    requests.post(url, json={"chat_id": TG_CHAT_ID, "text": message}, timeout=10)

def monitor_directory():
    print(f"Monitoring: {WATCH_DIR}")
    known_files = set(WATCH_DIR.iterdir())

    while True:
        time.sleep(POLL_INTERVAL)
        current_files = set(WATCH_DIR.iterdir())
        new_files = current_files - known_files

        for f in new_files:
            size_kb = f.stat().st_size / 1024
            msg = f"📁 New file: {f.name}\nSize: {size_kb:.1f} KB\nDirectory: {WATCH_DIR}"
            send_telegram(msg)
            print(msg)

        known_files = current_files

if __name__ == "__main__":
    monitor_directory()
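
One robustness note: a single failed requests.post (a network blip, a Telegram hiccup) shouldn't kill the whole monitoring loop. If that matters for your setup, a small retry wrapper with backoff is one way to harden it (send_with_retry is a hypothetical helper name, and the backoff schedule is just a reasonable default):

```python
import time
import requests

def send_with_retry(url: str, payload: dict, attempts: int = 3) -> bool:
    """POST with exponential backoff; True on success, False if every attempt fails."""
    for i in range(attempts):
        try:
            resp = requests.post(url, json=payload, timeout=10)
            resp.raise_for_status()
            return True
        except requests.RequestException:
            if i < attempts - 1:
                time.sleep(2 ** i)  # 1s, then 2s, between retries
    return False
```

In monitor_directory you would then check the return value and, at minimum, print a warning instead of crashing.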

Running Automatically with crontab

To schedule these scripts to run periodically, add them to crontab:

# Open crontab
crontab -e

# Clean Downloads every morning at 8am
0 8 * * * /usr/bin/python3 /home/user/scripts/organize_downloads.py >> /var/log/organize.log 2>&1

# Backup MySQL every day at 2am
0 2 * * * /usr/bin/python3 /home/user/scripts/backup_mysql.py >> /var/log/backup.log 2>&1

Redirect both stdout and stderr to a log file (2>&1) so it’s easy to debug when something goes wrong in the middle of the night.
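Cron-level redirection catches crashes, but inside the scripts themselves I also want every log line timestamped, or the log file is useless for reconstructing what happened when. Python's logging module handles that in a few lines (a minimal sketch; get_logger is just my naming convention):

```python
import logging

def get_logger(name: str, log_file: str) -> logging.Logger:
    """A logger that prefixes every line with a timestamp and level."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger

# log = get_logger("backup", "/var/log/backup.log")
# log.info("Backup complete")
```

Replacing the print() calls in the scripts above with log.info() / log.error() is a mechanical change.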

Lessons from Production: Performance When Handling Large Files

When I was processing an nginx access log around 500MB, the script crashed with a MemoryError because it was reading the entire file into RAM at once. The fix was actually very simple: drop readlines() and let Python stream the file line by line. Here are two approaches — same result, completely different memory usage:

# Naive approach — OOM if file is large
with open("big_file.csv") as f:
    lines = f.readlines()  # Loads everything into RAM!
    for line in lines:
        process(line)

# Correct approach — stream line by line
with open("big_file.csv") as f:
    for line in f:  # Generator, only loads 1 line at a time
        process(line)

The difference is stark: a 500MB file consumes 1–2GB of RAM with the first approach, while the second stays nearly flat no matter how large the file gets. My nightly log-processing script went from crashing to running under 50MB of constant memory after making this switch.
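The same streaming pattern extends naturally to aggregation. For example, tallying HTTP status codes across an entire access log without ever holding the file in memory might look like this (assuming the default nginx combined log format, where the status code is the ninth whitespace-separated token):

```python
from collections import Counter

def count_status_codes(log_path: str) -> Counter:
    """Tally status codes from an nginx combined-format access log, streamed line by line."""
    counts = Counter()
    with open(log_path) as f:
        for line in f:  # one line in memory at a time
            parts = line.split()
            if len(parts) > 8:
                counts[parts[8]] += 1
    return counts
```

The Counter stays tiny (a handful of distinct status codes) regardless of whether the log is 5MB or 5GB.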

Conclusion

Three scripts, three different problems — but the same philosophy: write it once, run it forever. The syntax and modules aren’t the hard part. The hard part is building the habit of recognizing: “I’m going to have to do this again next week — I should just script it.”

The rule I follow: if you do something the same way more than 3 times, write a script. Spend 30 minutes today and the script will pay it back within a few weeks — and more importantly, you no longer have to keep a mental “manual checklist” in your head.
