Automating Excel with Python and Openpyxl: From Formatting to Workbook Security

Python tutorial - IT technology blog
Python tutorial - IT technology blog

Context: The 2 AM Nightmare

The coffee had gone cold, but Slack kept pinging. My boss sent an urgent request: “Need revenue reports for 50 branches, including comparison charts and KPI formatting, and the file must be locked and sent to the partner by this morning.” Looking at the messy CSVs from the database, I knew if I did this manually for 50 sheets, I’d have to call in sick tomorrow. I was exhausted and prone to data errors.

At this point, the Python and Openpyxl combo was a lifesaver. In reality, a professional report isn’t just about dumping data into cells; it requires aesthetics and consistency. Using VBA can sometimes lead to compatibility issues or macros being blocked by antivirus software. Openpyxl handles .xlsx files directly without needing Excel installed on the server, making it perfect for cron jobs or backend APIs.

Setting Up Your Tools

First, create a virtual environment (venv). I’ve learned the hard way how library conflicts on production can happen just because I was too lazy to create one. Don’t make the same mistake.

# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate

# Install libraries
pip install openpyxl

If you plan to insert images or draw complex charts, install pillow as well. For basic reporting needs, openpyxl is more than enough to handle the job.

Practical Implementation

Here is the scenario I often use: Read data, format automatically, draw charts, and finish off with password protection.

1. Initialization and Styling

Never send a plain file with the default Calibri font. Make an impression with a professional header featuring a background color and clear white text. Openpyxl uses the styles module to manage this smoothly.

from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side

wb = Workbook()
ws = wb.active
ws.title = "Revenue Report"

# Define styles for the Header
header_font = Font(bold=True, color="FFFFFF", size=12)
header_fill = PatternFill(start_color="4F81BD", end_color="4F81BD", fill_type="solid")
center_aligned = Alignment(horizontal="center", vertical="center")
thin_border = Border(left=Side(style='thin'), right=Side(style='thin'), 
                    top=Side(style='thin'), bottom=Side(style='thin'))

headers = ["Month", "Revenue (VND)", "Cost (VND)", "Profit (VND)"]
ws.append(headers)

# Apply style to the first row
for cell in ws[1]:
    cell.font = header_font
    cell.fill = header_fill
    cell.alignment = center_aligned
    cell.border = thin_border

While processing raw data, I often use Regex to filter phone numbers or format strings. If you need to quickly test complex patterns without re-running Python code, you can use the tool at toolcraft.app/en/tools/developer/regex-tester. It runs right in your browser, which is very convenient when you’re in a hurry.

2. Populating Data and Conditional Formatting

Instead of calculating in Excel, I calculate profit directly in Python and highlight it in red if the number is negative.

data = [
    ["January", 150000000, 120000000],
    ["February", 200000000, 180000000],
    ["March", 100000000, 115000000], # 15 million loss
]

for row_idx, row_data in enumerate(data, start=2):
    revenue, cost = row_data[1], row_data[2]
    profit = revenue - cost
    
    # Write data
    ws.cell(row=row_idx, column=1, value=row_data[0])
    ws.cell(row=row_idx, column=2, value=revenue).number_format = '#,##0'
    ws.cell(row=row_idx, column=3, value=cost).number_format = '#,##0'
    profit_cell = ws.cell(row=row_idx, column=4, value=profit)
    profit_cell.number_format = '#,##0'

    # Highlight if loss (Profit < 0)
    if profit < 0:
        profit_cell.font = Font(color="FF0000", bold=True)

3. Automated Charting

Managers usually prefer looking at charts rather than staring at dry data tables. A column chart (Bar Chart) comparing revenue will make your report much more valuable.

from openpyxl.chart import BarChart, Reference

chart = BarChart()
chart.type = "col"
chart.title = "Monthly Revenue Comparison"
chart.y_axis.title = "VND"

# Define data range (Column B)
data_ref = Reference(ws, min_col=2, min_row=1, max_row=4)
cats_ref = Reference(ws, min_col=1, min_row=2, max_row=4)

chart.add_data(data_ref, titles_from_data=True)
chart.set_categories(cats_ref)
ws.add_chart(chart, "F2")

4. Security and File Locking

Note: openpyxl supports locking sheet structures to prevent users from editing formulas. However, to set an “Open Password,” you’ll need to use msoffcrypto-python. At a basic level, locking the sheet is enough to protect the report’s integrity.

ws.protection.sheet = True
ws.protection.password = "secure2026"
wb.save("Final_Revenue_Report.xlsx")

Practical Experience: Don’t Let Your Script Crash on Production

Once the script finishes running, don’t rush off to sleep. With large datasets of around 100,000 rows, Openpyxl can be quite RAM-intensive (potentially consuming 500MB – 1GB). Consider using write_only=True mode to optimize memory.

Additionally, always wrap your save command in a try...except block. A very common error is the file being open by another user, which prevents Python from having overwrite permissions (PermissionError). Finally, check the file size. If your input data has millions of rows but the output file is only a few KB, there is definitely an issue with your logic.

Automation helps us escape being slaves to spreadsheets. Watching a script run in 15 seconds instead of spending 4 hours manually is well worth the time spent learning Openpyxl.

Share: