Switching from MySQL to PostgreSQL usually starts with a very specific reason: you need advanced JSON support, better full-text search, window functions, or simply your team is more familiar with PostgreSQL. Whatever the reason, the hardest part isn’t installing PostgreSQL — it’s moving all your existing data over without data loss or extended downtime.
I’ve done this with a ~3GB database, around 40 tables, tangled foreign key relationships, and a few stored procedures. This article documents what actually worked — not a smooth theoretical tutorial.
Why Is MySQL → PostgreSQL Migration So Error-Prone?
On the surface the two systems look similar, but underneath there are numerous differences that can catch you off guard:
- Data types don’t map 1-to-1: `TINYINT(1)` in MySQL typically acts as a boolean, but PostgreSQL has its own `BOOLEAN` type. MySQL’s `DATETIME` maps to PostgreSQL’s `TIMESTAMP`, but timezone handling works completely differently.
- AUTO_INCREMENT vs SERIAL/SEQUENCE: MySQL uses `AUTO_INCREMENT`, PostgreSQL uses `SERIAL` or `GENERATED ALWAYS AS IDENTITY`. Forgetting to reset sequences after migration means new inserts immediately throw duplicate key errors.
- Case sensitivity: MySQL table names are case-insensitive by default on Windows/macOS (it depends on the filesystem), while PostgreSQL folds unquoted identifiers to lowercase and treats quoted ones as case-sensitive. Apps that use inconsistent table name casing will break.
- Backtick vs double quote: MySQL uses backticks to escape table/column names, PostgreSQL uses double quotes. Pasting a MySQL SQL dump directly into PostgreSQL will fail immediately.
- Strict mode by default: PostgreSQL refuses to insert a string into an integer column — it won’t auto-cast like MySQL does. This is the source of many silent data errors during migration.
Overlooking these differences means your migrated data may have type errors, lost NULL values, or sequences reset to 1 while your data already has IDs up to 50,000.
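To make the type mismatch concrete, here is a minimal sketch of the kind of type-mapping table a hand-rolled migration script starts from. The mappings mirror what tools like pgloader apply by default, but the exact choices here are my own assumptions, not an official standard:

```python
import re

# Rough MySQL -> PostgreSQL type mappings; adjust to your schema.
# These are illustrative choices, not an official mapping table.
TYPE_MAP = {
    "tinyint(1)": "boolean",
    "datetime": "timestamp",
    "int unsigned": "bigint",   # PostgreSQL has no unsigned integers
    "double": "double precision",
    "longtext": "text",
    "blob": "bytea",
}

def map_type(mysql_type: str) -> str:
    """Translate a MySQL column type to a PostgreSQL one."""
    t = mysql_type.strip().lower()
    if t in TYPE_MAP:
        return TYPE_MAP[t]
    # int(11), bigint(20), ... -> drop the display width MySQL adds
    m = re.match(r"(\w+)\(\d+\)$", t)
    if m and m.group(1) in ("int", "bigint", "smallint"):
        return m.group(1)
    return t  # pass everything else through unchanged

print(map_type("TINYINT(1)"))  # -> boolean
print(map_type("INT(11)"))     # -> int
```

Anything the table doesn’t cover falls through unchanged, which is deliberate: it surfaces unexpected types during a dry run instead of silently guessing.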
Comparing 3 Common Migration Approaches
Option 1: pgloader — Highest Level of Automation
pgloader is a dedicated database migration tool with native MySQL → PostgreSQL support. It automatically maps data types, handles encoding, and runs multiple parallel workers.
Pros:
- Single command handles most of the conversion work automatically
- Migrates schema + data simultaneously, no need to split into separate steps
- Supports live migration — reads from a running MySQL instance without shutting it down
Cons:
- Limited fine-grained control per table when you have special requirements
- Stored procedures and triggers must be handled manually — pgloader doesn’t touch them
- Installation on Ubuntu can be tricky; use Docker to be safe
Option 2: mysqldump + Manual SQL Conversion
Export schema and data from MySQL, run it through sed/awk or a Python script to convert the SQL, then import into PostgreSQL. A classic approach that’s still worth using in certain situations.
Pros:
- Full control over every step — you know exactly what’s happening
- Easier to debug: errors point to a specific line and file
- Ideal for small databases (under 500MB) or when you need to change the schema significantly
Cons:
- Time-consuming, easy to miss edge cases when writing the conversion script
- Large databases produce heavy SQL dump files that are slow to process
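In practice the conversion script for this approach boils down to a handful of regex substitutions over the dump. A minimal sketch, covering only the most common cases (real dumps will need more rules):

```python
import re

def convert_dump_line(line: str) -> str:
    """Convert one line of a MySQL dump toward PostgreSQL syntax.
    Covers only the most common cases; real dumps need more rules."""
    # Backticks -> double quotes for identifiers
    line = re.sub(r"`([^`]+)`", r'"\1"', line)
    # tinyint(1) -> boolean
    line = re.sub(r"tinyint\(1\)", "boolean", line, flags=re.IGNORECASE)
    # Drop AUTO_INCREMENT (sequences are handled separately)
    line = re.sub(r"\s*AUTO_INCREMENT(=\d+)?", "", line, flags=re.IGNORECASE)
    # Drop MySQL-only table options like ENGINE=InnoDB
    line = re.sub(r"\s*ENGINE=\w+", "", line, flags=re.IGNORECASE)
    return line

print(convert_dump_line("`id` int(11) NOT NULL AUTO_INCREMENT,"))
# -> "id" int(11) NOT NULL,
```

The edge cases this misses (backticks inside string literals, multi-line `INSERT` statements, charset clauses) are exactly why this approach is time-consuming.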
Option 3: Python ETL Script
Write a Python script that connects to both databases simultaneously, reads from MySQL, and writes to PostgreSQL in batches. The most flexible option, but also the most work.
Pros:
- Transform data freely before writing — change types, normalize values, merge tables
- Handle custom business logic that no tool can do for you
- Supports incremental runs — sync new data without re-migrating everything from scratch
Cons:
- Requires writing code, more effort than the other two approaches
- Inserting row by row is extremely slow — batch inserts are essential, otherwise a table with 1 million rows can take over an hour
Which Approach Should You Choose?
Rules of thumb I follow:
- Pure data database with a clean schema → pgloader
- Small database or heavy schema changes needed → mysqldump + convert
- Data transformation required or incremental migration → Python ETL
For my ~3GB migration: I used pgloader for 37 standard tables, then wrote a separate Python ETL for 3 tables with complex logic. It took about 2 days total — most of that time was verifying and handling edge cases, not running tools.
Migration Guide with pgloader
Step 1: Install pgloader
```bash
# Ubuntu/Debian
sudo apt-get install pgloader

# Or via Docker, recommended to avoid dependency issues
docker pull dimitri/pgloader
```
Step 2: Create the pgloader configuration file
```bash
cat > migrate.load << 'EOF'
LOAD DATABASE
     FROM mysql://mysql_user:mysql_pass@localhost:3306/mydb
     INTO postgresql://pg_user:pg_pass@localhost:5432/mydb

WITH include drop, create tables,
     create indexes, reset sequences,
     workers = 4, concurrency = 2

SET work_mem to '128MB',
    maintenance_work_mem to '512MB'

CAST type tinyint to boolean
          using tinyint-to-boolean,
     type datetime to timestamptz
          using zero-dates-to-null;
EOF
```
Step 3: Run the migration

```bash
pgloader migrate.load

# Or via Docker
docker run --rm --network host \
  -v $(pwd)/migrate.load:/migrate.load \
  dimitri/pgloader pgloader /migrate.load
```
pgloader prints detailed logs per table: rows migrated, time taken, and any failed rows. The nice thing is it doesn’t stop when a row fails — it writes errors to a separate file for later review. For my ~3GB database, pgloader took about 8 minutes with 4 workers.
Manual Migration Guide with Python (for Complex Tables)
For tables with special logic — for example, a status column storing integers 0/1/2 that PostgreSQL needs as strings — I write a separate ETL:
```python
import mysql.connector
import psycopg2
from psycopg2.extras import execute_batch

# Connect to both databases
mysql_conn = mysql.connector.connect(
    host='localhost', database='mydb',
    user='root', password='mysql_pass'
)
pg_conn = psycopg2.connect(
    host='localhost', dbname='mydb',
    user='pg_user', password='pg_pass'
)

mysql_cur = mysql_conn.cursor(dictionary=True)
pg_cur = pg_conn.cursor()

STATUS_MAP = {0: 'pending', 1: 'completed', 2: 'cancelled'}

# Read from MySQL in batches
BATCH_SIZE = 1000
mysql_cur.execute("SELECT * FROM orders")

while True:
    rows = mysql_cur.fetchmany(BATCH_SIZE)
    if not rows:
        break

    # Transform: convert status from int to string
    transformed = [
        (
            row['id'],
            row['user_id'],
            STATUS_MAP.get(row['status'], 'pending'),
            row['created_at'],
            row['total_amount'],
        )
        for row in rows
    ]

    execute_batch(
        pg_cur,
        """INSERT INTO orders (id, user_id, status, created_at, total_amount)
           VALUES (%s, %s, %s, %s, %s)
           ON CONFLICT (id) DO NOTHING""",
        transformed
    )
    pg_conn.commit()
    print(f"Migrated {len(transformed)} rows")

mysql_cur.close()
pg_cur.close()
mysql_conn.close()
pg_conn.close()
```
Using psycopg2’s execute_batch is roughly 10–50x faster than inserting row by row depending on the dataset. For a 500k-row table, the difference between 40 minutes and 2 minutes comes down to this.
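The batching pattern itself is generic, so I usually pull it into a small helper (the name `batched` is my own, not part of psycopg2 or mysql.connector). A sketch with a stand-in cursor so it runs without a database:

```python
def batched(cursor_like, size):
    """Yield lists of up to `size` rows from anything with fetchmany().
    Helper name is my own; not part of psycopg2 or mysql.connector."""
    while True:
        rows = cursor_like.fetchmany(size)
        if not rows:
            break
        yield rows

# Stand-in cursor so the sketch runs without a database
class FakeCursor:
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        out, self._rows = self._rows[:size], self._rows[size:]
        return out

chunks = list(batched(FakeCursor(range(5)), 2))
print(chunks)  # -> [[0, 1], [2, 3], [4]]
```

Both DB-API cursors expose `fetchmany()`, so the same helper works on the MySQL side and for any verification reads on the PostgreSQL side.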
What to Verify After Migration
1. Check Row Counts
```sql
-- MySQL only (note: table_rows is just an estimate for InnoDB)
SELECT table_name, table_rows
FROM information_schema.tables
WHERE table_schema = 'mydb';

-- PostgreSQL has no table_rows column in information_schema.tables;
-- use the statistics view, or COUNT(*) per table for exact numbers
SELECT relname, n_live_tup
FROM pg_stat_user_tables;
```
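Once you have per-table counts from both sides (ideally exact `COUNT(*)` numbers), comparing them is trivial to script. A sketch with hypothetical count dictionaries:

```python
def diff_counts(mysql_counts, pg_counts):
    """Return {table: (mysql, postgres)} for every mismatch.
    Tables missing on either side count as 0."""
    tables = set(mysql_counts) | set(pg_counts)
    return {
        t: (mysql_counts.get(t, 0), pg_counts.get(t, 0))
        for t in tables
        if mysql_counts.get(t, 0) != pg_counts.get(t, 0)
    }

# Hypothetical counts collected via COUNT(*) on each side
mismatches = diff_counts(
    {"orders": 50000, "users": 1200},
    {"orders": 49998, "users": 1200},
)
print(mismatches)  # -> {'orders': (50000, 49998)}
```

An empty result means the counts line up; anything else tells you exactly which tables to dig into (and in my case, those two missing rows were pgloader reject-file entries).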
2. Reset Sequences After Import
The most common mistake I see: after migration, sequences are still at 1, and inserting a new row throws a duplicate key error. Reset them to the current max ID:
```sql
-- Reset a specific table
SELECT setval('orders_id_seq', (SELECT MAX(id) FROM orders));

-- Script to automatically reset all tables that have an "id" column
-- (looping over every table in pg_tables would error on tables
-- without one, since pg_get_serial_sequence requires the column)
DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN
        SELECT table_name
        FROM information_schema.columns
        WHERE table_schema = 'public' AND column_name = 'id'
    LOOP
        EXECUTE format(
            'SELECT setval(pg_get_serial_sequence(%L, ''id''), COALESCE(MAX(id), 1)) FROM %I',
            r.table_name, r.table_name
        );
    END LOOP;
END $$;
```
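If you would rather drive the reset from your migration script than a DO block, generating the `setval` statements in Python is straightforward (table and column names below are illustrative):

```python
def reset_sequence_sql(table: str, pk: str = "id") -> str:
    """Build the setval statement for one table's serial/identity column.
    Assumes the default sequence naming; adjust if you renamed sequences."""
    return (
        f"SELECT setval(pg_get_serial_sequence('{table}', '{pk}'), "
        f"COALESCE((SELECT MAX({pk}) FROM {table}), 1));"
    )

print(reset_sequence_sql("orders"))
```

Running the generated statement per migrated table right after each table's import keeps the reset from being forgotten at the end.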
3. Check Foreign Key Constraints
```sql
-- Find foreign key constraints that have not been validated yet
SELECT conname, conrelid::regclass
FROM pg_constraint
WHERE contype = 'f'
  AND NOT convalidated;
```
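If that query does return rows, you can generate the corresponding `VALIDATE CONSTRAINT` statements instead of typing them one by one. A small sketch (constraint and table names are illustrative):

```python
def validate_constraint_sql(rows):
    """Build ALTER TABLE ... VALIDATE CONSTRAINT statements from
    (constraint_name, table_name) pairs, i.e. the query's output."""
    return [
        f'ALTER TABLE {table} VALIDATE CONSTRAINT "{conname}";'
        for conname, table in rows
    ]

stmts = validate_constraint_sql([("orders_user_id_fkey", "orders")])
print(stmts[0])
# -> ALTER TABLE orders VALIDATE CONSTRAINT "orders_user_id_fkey";
```

Validation will fail loudly on any orphaned rows, which is exactly what you want to find before cutover rather than after.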
A Quick Tip for Preparing Your Data
Before migrating, I usually export some lookup tables to CSV for quick verification. When I need to convert CSV to JSON to compare with API responses, using the tool at toolcraft.app/en/tools/data/csv-to-json is faster than writing a one-off Python script — it runs in the browser so there’s no risk of exposing data.
Pre-Cutover Checklist for Production
- Row counts match between MySQL and PostgreSQL
- Sequences have been reset to the correct values (most commonly forgotten)
- All indexes have been created — pgloader usually handles this, but verify anyway
- Foreign key constraints are valid
- Stored procedures/triggers have been rewritten in PostgreSQL syntax
- Connection strings in the application have been updated
- Run a selection of critical business queries against the real data
Stored procedures are the most labor-intensive part with no shortcut. MySQL uses DELIMITER with its own syntax, PostgreSQL uses PL/pgSQL — they’re completely different and must be rewritten by hand.
If your database isn’t too large and has no complex stored procedures, pgloader handles 90% of the work in a few minutes. The rest is thorough verification, especially of sequences and indexes: these are the two things most often overlooked, and the bugs they cause tend to surface only once you’re already in production.

