pg_cron: Schedule Automated Jobs Inside PostgreSQL Without System Cron

Database tutorial - IT technology blog
Database tutorial - IT technology blog

A Familiar Problem: Who’s Going to Clean Up the Database?

I’ve run into this situation more times than I can count: PostgreSQL is humming along just fine, but after a few months the audit_logs table has ballooned to tens of gigabytes because nobody cleaned it up. Or that MATERIALIZED VIEW aggregating daily revenue has to be refreshed manually every morning — forget once and the reports show wrong numbers.

The go-to fix was writing a .sh script and dropping it into Linux’s crontab. Seemed fine at first. Then the complexity started creeping in: database passwords had to live somewhere — a .pgpass file or an environment variable — scripts were scattered across the server, and where exactly do you check the logs? Whenever we moved to a new server, the cron setup got forgotten. And if you had multiple databases across multiple servers, you were on your own.

PostgreSQL has something I keep coming back to whenever I hit problems like this: an extension ecosystem broad enough that you rarely need to reach outside the database to get things done. And pg_cron is exactly the extension that cleans up that messy script situation.

Why System Cron + psql Is a Headache

On the surface, a traditional cron script looks simple enough:

# /etc/cron.d/cleanup-db
0 2 * * * postgres psql -U myuser -d mydb -c "DELETE FROM logs WHERE created_at < NOW() - INTERVAL '30 days';"

But in real production environments, it creates a whole chain of problems:

  • Credential management: Passwords have to be stored somewhere — a .pgpass file, environment variables, or peer authentication. Each approach has trade-offs in security or flexibility.
  • Hard to track history: Did the job succeed or fail? You have to dig through system logs and grep line by line. There’s no centralized view.
  • Logic is scattered: The database schema lives inside PostgreSQL, but the job schedule lives on the OS. During handovers or disaster recovery, it’s easy for things to fall through the cracks.
  • Not portable: Dump and restore the database to another server — the jobs don’t follow. You have to set up cron all over again.

The Alternatives

pgAgent

pgAgent is the job scheduler that ships with pgAdmin. It’s more powerful than cron, has a management UI, and supports multi-step jobs and retry logic. But in return, you have to run a separate pgAgent daemon on the server and need pgAdmin to manage it — the setup is fairly heavy. For a job that just deletes logs every night, spinning up an entire separate daemon is overkill.

pg_cron — An Extension Built Right Into PostgreSQL

pg_cron is an extension developed by Citus Data (now Microsoft). It runs directly inside the PostgreSQL process, uses familiar cron syntax, and stores job schedules right in the database. No external daemon, no pgAdmin, no script files on the OS.

The best part: when you back up the database with pg_dump, all of pg_cron’s job metadata is included. Migrate to a new server, restore the dump, and everything runs immediately — no need to remember to set up cron from scratch.

Installing and Configuring pg_cron

Installing on Ubuntu/Debian

# Add the PostgreSQL repo if you haven't already
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg
echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list

sudo apt update
sudo apt install postgresql-16-cron  # Replace 16 with your PostgreSQL version

Installing on Amazon RDS / Aurora PostgreSQL

The good news is pg_cron is available on both RDS PostgreSQL and Aurora PostgreSQL — you just need to enable it via a Parameter Group:

# In the RDS Parameter Group, add:
shared_preload_libraries = pg_cron
cron.database_name = postgres  # The database that stores pg_cron metadata

Configuring postgresql.conf

For self-hosted PostgreSQL, open postgresql.conf and add:

shared_preload_libraries = 'pg_cron'
cron.database_name = 'postgres'

Then restart PostgreSQL:

sudo systemctl restart postgresql

Enable the Extension in the Database

-- Connect to the 'postgres' database (or whichever database you set in cron.database_name)
CREATE EXTENSION pg_cron;

-- Grant access to users who need to schedule jobs (non-superusers)
GRANT USAGE ON SCHEMA cron TO myuser;

Using pg_cron in Practice

Basic Syntax

SELECT cron.schedule(
  'job-name',          -- Job name (must be unique)
  '0 2 * * *',         -- Cron expression (minute hour day month weekday)
  $$SQL query here$$   -- The SQL statement to run
);

The cron expression follows standard Unix cron syntax. A few common examples:

  • '0 2 * * *' — runs at 2:00 AM every day
  • '*/15 * * * *' — runs every 15 minutes
  • '0 0 * * 0' — runs at midnight every Sunday
  • '0 9 1 * *' — runs at 9:00 AM on the 1st of every month

Example 1: Delete Logs Older Than 30 Days

SELECT cron.schedule(
  'delete-old-logs',
  '0 3 * * *',
  $$DELETE FROM audit_logs WHERE created_at < NOW() - INTERVAL '30 days'$$
);

Example 2: Refresh a Materialized View Every Hour

SELECT cron.schedule(
  'refresh-daily-revenue',
  '0 * * * *',
  $$REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_revenue$$
);
-- Note: CONCURRENTLY requires the materialized view to have a UNIQUE INDEX

Example 3: Call a Stored Procedure in a Different Database

By default, pg_cron runs jobs on the database configured in cron.database_name. But starting with pg_cron v1.4+, you can specify a target database:

SELECT cron.schedule_in_database(
  'cleanup-app-db',
  '0 4 * * *',
  $$CALL cleanup_expired_sessions()$$,
  'myapp_production'  -- Target database name
);

Example 4: Periodic Vacuum and Analyze on a Large Table

SELECT cron.schedule(
  'vacuum-orders-table',
  '30 1 * * 6',  -- 1:30 AM every Saturday
  $$VACUUM ANALYZE orders$$
);

Managing and Monitoring Jobs

View All Active Jobs

SELECT jobid, jobname, schedule, command, active, database
FROM cron.job
ORDER BY jobid;

View Run History — This Is the Killer Feature

SELECT
  jrd.jobid,
  j.jobname,
  jrd.start_time,
  jrd.end_time,
  jrd.return_message,
  jrd.status
FROM cron.job_run_details jrd
JOIN cron.job j ON j.jobid = jrd.jobid
ORDER BY jrd.start_time DESC
LIMIT 20;

The status column will be either succeeded or failed. The return_message column contains the number of rows affected, or an error message if the job crashed. No more grepping through logs.

Temporarily Disable a Job

-- Disable
UPDATE cron.job SET active = false WHERE jobname = 'delete-old-logs';

-- Re-enable
UPDATE cron.job SET active = true WHERE jobname = 'delete-old-logs';

Delete a Job

SELECT cron.unschedule('delete-old-logs');
-- Or by jobid:
SELECT cron.unschedule(1);

Clean Up Old Job History (Prevent job_run_details From Growing Too Large)

A lesson learned after a few months in production: the cron.job_run_details table accumulates faster than you’d expect. Just 5 jobs running every hour means over 10,000 records after 90 days — and that’s before counting any job that runs every 15 minutes. The solution? Use pg_cron to clean itself up:

SELECT cron.schedule(
  'cleanup-job-history',
  '0 0 * * 0',  -- Every Sunday at midnight
  $$DELETE FROM cron.job_run_details WHERE end_time < NOW() - INTERVAL '7 days'$$
);

Tips and Best Practices From the Field

1. Use meaningful job names. Job names must be unique and clearly describe what the job does. job-1 or cleanup are bad names. delete-audit-logs-30d or refresh-mv-daily-revenue are good names — you know exactly what they do at a glance.

2. Watch out for timezones. pg_cron runs on the PostgreSQL server’s timezone, which is usually UTC by default. If you want a job to run at 2 AM local time (UTC+7), set it to 19:00 UTC:

-- 2:00 AM ICT (UTC+7) = 19:00 UTC the previous day
SELECT cron.schedule('nightly-cleanup', '0 19 * * *', $$...$$);

-- Or check your PostgreSQL timezone:
SHOW timezone;

3. Use transactions when you need atomicity. If a job involves multiple steps that need to roll back on failure, wrap them in a stored procedure with exception handling — don’t cram a chain of SQL statements directly into pg_cron.

4. Monitor job failures regularly. Drop this query into your monitoring script or a Grafana alert:

-- Find jobs that failed in the last 24 hours
SELECT jobname, start_time, return_message
FROM cron.job_run_details jrd
JOIN cron.job j ON j.jobid = jrd.jobid
WHERE status = 'failed'
  AND start_time > NOW() - INTERVAL '24 hours';

5. Don’t use pg_cron for heavy ETL. pg_cron is a great fit for lightweight maintenance tasks: cleanup, refreshing views, updating statistics. Complex ETL that processes millions of rows with many transformations — that’s a job for Airflow or dbt, not pg_cron.

6. Test your SQL manually before scheduling it. Don’t schedule untested SQL. Run it manually in psql first, confirm the results are correct, then add it to pg_cron. Don’t use pg_cron as your testing ground.

Since switching from cron scripts to pg_cron, managing database jobs has become significantly less stressful. No more SSH-ing into the server to check cron, no more grepping syslog for errors. Everything is queryable right inside psql and backed up with pg_dump — when you spin up a new server, restoring the database is all it takes.

Share: