Using the OpenAI API with Python: From Zero to Results in 5 Minutes - ITFROMZERO

Do It in 5 Minutes: Quick Start

The first time I tried the OpenAI API, I lost a whole morning because I was reading old docs from the openai.ChatCompletion.create() era. That syntax was removed in version 1.0 of the openai Python package. In practice, three steps are all you need.

Step 1: Install the library

pip install openai

Step 2: Get an API key

Go to platform.openai.com → API Keys → Create new secret key. Copy the sk-proj-... string. It is the only thing you need to authenticate.

Step 3: Make your first API call

from openai import OpenAI

client = OpenAI(api_key="sk-...your-key-here...")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Explain what Docker is in 2 sentences"}
    ]
)

print(response.choices[0].message.content)

Run it and the result appears in about 2-3 seconds. That wraps up the quick start. Now for the things you actually need to understand.

Understand it properly to use it properly

What is the messages structure?

The messages field is the core of every request. OpenAI accepts an array of messages with 3 roles:

system: Sets the context or “persona” for the model. It is given once at the start and influences the whole conversation
user: A message from the user
assistant: A previous reply from the AI, needed when you want the model to remember the conversation history

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a senior Python developer. Answer concisely and practically."
        },
        {
            "role": "user",
            "content": "When should I use a list comprehension instead of a for loop?"
        }
    ]
)
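
The request above uses only the system and user roles. As a minimal sketch of where the assistant role fits, the helper below replays one earlier exchange before asking a follow-up. The name build_messages, the (user, assistant) pair format, and the sample turns are illustrative assumptions, not part of the OpenAI library.

```python
def build_messages(system_prompt, past_turns, new_question):
    """Build a messages list: system first, then earlier turns, then the new question.

    past_turns is a list of (user_text, assistant_text) pairs (hypothetical format).
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in past_turns:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": new_question})
    return messages


messages = build_messages(
    "You are a senior Python developer. Answer concisely and practically.",
    [("What is a list comprehension?",
      "A compact expression that builds a list from an iterable.")],
    "Can it replace every for loop?",
)
print([m["role"] for m in messages])
# ['system', 'user', 'assistant', 'user']
```

The resulting list can be passed as the messages argument of client.chat.completions.create.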

Which model should you pick?

This mostly comes down to cost versus quality. Here is how I sort them:

gpt-4o-mini: Cheap ($0.15/1M input tokens), fast, and good enough for roughly 80% of tasks: summarizing, classifying, simple Q&A
gpt-4o: Noticeably stronger, suited to complex reasoning, image input, or high-quality output
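gpt-3.5-turbo: Older. gpt-4o-mini launched at a lower per-token price than gpt-3.5-turbo, so this model mostly matters for legacy integrations; check the current pricing page before choosing it for large-scale simple tasks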

Practical strategy: always start with gpt-4o-mini. Upgrade to gpt-4o only when real-world tests show the output quality falls short.
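
One way to make "real-world tests" concrete is a tiny pass-rate check over a handful of prompts. This is only a sketch: the keyword-match scoring, the 0.9 threshold, and the function names are assumptions, and fake_mini stands in for a real call to gpt-4o-mini so the example runs offline.

```python
def pass_rate(generate, cases):
    """Fraction of test cases whose output contains the expected keyword.

    generate: callable taking a prompt and returning the model's text
              (hypothetical; in practice it would wrap client.chat.completions.create).
    cases: list of (prompt, expected_keyword) pairs.
    """
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(
        1 for prompt, expected in cases
        if expected.lower() in generate(prompt).lower()
    )
    return hits / len(cases)


def should_upgrade(generate_cheap, cases, threshold=0.9):
    """True when the cheaper model misses the (assumed) quality bar."""
    return pass_rate(generate_cheap, cases) < threshold


# Stand-in for a real model call so the sketch runs without network access.
def fake_mini(prompt):
    return "Docker packages apps into containers." if "Docker" in prompt else "Not sure."


cases = [("What is Docker?", "container"), ("What is Kubernetes?", "orchestrat")]
print(pass_rate(fake_mini, cases))       # 0.5
print(should_upgrade(fake_mini, cases))  # True
```

Keyword matching is a crude proxy for quality; for real projects the scoring function is the part worth investing in.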

Reading the response correctly

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

# Main content
text = response.choices[0].message.content

# Tokens used: track these to calculate cost accurately
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total: {response.usage.total_tokens}")
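
To turn those token counts into money, a rough per-request estimate could look like the sketch below. The prices are the gpt-4o-mini figures quoted in this article and may change; request_cost is a hypothetical helper, not an SDK function.

```python
# USD per 1M tokens for gpt-4o-mini, as quoted in this article; verify before relying on them.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def request_cost(prompt_tokens, completion_tokens):
    """Estimated cost in USD of one request, from its token counts."""
    total = prompt_tokens * INPUT_PRICE_PER_M + completion_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# With a real response you would pass response.usage.prompt_tokens
# and response.usage.completion_tokens instead of these sample numbers.
cost = request_cost(prompt_tokens=200, completion_tokens=300)
print(f"${cost:.6f}")  # $0.000210
```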

Advanced: things that are actually useful

Streaming: show text as it is generated, like ChatGPT

Without streaming, the user stares at a blank screen for 3-5 seconds and then the text appears all at once. With streaming, text starts flowing within the first second. Total time is unchanged, but the app feels much more responsive, which makes this a fairly effective UX trick.

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a code snippet that reads a CSV file in Python"}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no choices or no text, so guard both before printing
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

print()  # Newline once the stream ends
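
Printing is enough for a demo, but most apps also need the full reply afterwards (to store it, or to append it to the history). A minimal sketch that accumulates the pieces is shown below. collect_stream and fake_chunk are hypothetical helpers; the fake chunks imitate only the chunk.choices[0].delta.content attributes used above so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(stream, on_text=None):
    """Join the text deltas of a stream into one string, optionally echoing each piece."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # a chunk without choices carries no text
        piece = chunk.choices[0].delta.content
        if piece is None:
            continue
        parts.append(piece)
        if on_text is not None:
            on_text(piece)
    return "".join(parts)


def fake_chunk(text):
    """Mimic only the attributes read above (chunk.choices[0].delta.content)."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


fake_stream = [
    fake_chunk("import "),
    fake_chunk("csv"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
]
full_text = collect_stream(fake_stream)
print(full_text)  # import csv
```

With a real stream, passing on_text=lambda t: print(t, end="", flush=True) keeps the live typing effect while still returning the complete text.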

Managing conversation history (a simple chatbot)

The OpenAI API does not remember earlier turns on its own; every request is fully independent. If you want the model to keep context, you have to resend the history on each call:

from openai import OpenAI

client = OpenAI(api_key="sk-...")

history = [
    {"role": "system", "content": "You are an IT assistant for Vietnamese developers."}
]

while True:
    user_input = input("You: ")
    if user_input.lower() in ["quit", "exit"]:
        break

    history.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history
    )

    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})

    print(f"AI: {reply}\n")
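
Because the whole history is resent on every call, a long chat keeps growing in tokens and cost. One simple mitigation, sketched here under the assumption that dropping the oldest turns is acceptable, is to keep the system message plus only the most recent messages. trim_history and the max_messages cutoff are illustrative; counting tokens instead of messages would be more precise.

```python
def trim_history(history, max_messages=10):
    """Keep every system message (placed first) plus the most recent max_messages others."""
    if max_messages < 1:
        raise ValueError("max_messages must be at least 1")
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + rest[-max_messages:]


history = [
    {"role": "system", "content": "You are an IT assistant."},
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
trimmed = trim_history(history, max_messages=3)
print([m["content"] for m in trimmed])
# ['You are an IT assistant.', 'q2', 'a2', 'q3']
```

In the loop above, this would be called as messages=trim_history(history) when building the request.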

Handling errors properly

The three most common errors, and how to handle each one:

from openai import OpenAI, RateLimitError, APIConnectionError, AuthenticationError
import time

client = OpenAI(api_key="sk-...")

def safe_completion(messages, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages
            )
            return response.choices[0].message.content

        except AuthenticationError:
            print("API key is invalid or expired. Check it again")
            return None  # No retry: this error needs a manual fix

        except RateLimitError:
            wait = 2 ** attempt  # Exponential backoff: 1s → 2s → 4s
            print(f"Rate limit hit. Waiting {wait}s before retrying...")
            time.sleep(wait)

        except APIConnectionError:
            print(f"Connection error (attempt {attempt + 1}/{retries})")
            time.sleep(1)

    return None
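
When several workers hit the rate limit at the same moment, a fixed 1s → 2s → 4s schedule makes them all retry together. A common refinement is to cap the delay and add random jitter. The sketch below is one possible version; backoff_delay and its base and cap defaults are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, jitter=True, rng=random):
    """Delay in seconds before retry number `attempt` (0-based).

    The ceiling grows as base * 2**attempt and is capped at `cap`.
    With jitter, a random value between 0 and that ceiling is returned.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling) if jitter else ceiling


# Deterministic view of the schedule (jitter off) to show the cap kicking in.
print([backoff_delay(a, jitter=False) for a in range(6)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Inside safe_completion, time.sleep(backoff_delay(attempt)) could replace the fixed wait in the RateLimitError branch.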

Practical tips: learning from my own mistakes

1. Set the API key through an environment variable, don't hardcode it

# In the terminal or a .env file
export OPENAI_API_KEY="sk-...your-key..."

from openai import OpenAI

# OpenAI() reads OPENAI_API_KEY from the environment, so api_key does not need to be passed
client = OpenAI()

A hard-won lesson: I once accidentally committed an API key to a public GitHub repo. OpenAI revoked it in under 5 minutes and sent a warning email right away. Luckily it was caught before anyone could use it to run up charges on my account.
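
Relying on the environment variable also means a missing variable only surfaces when the client is created. A small startup check, sketched below with hypothetical helper names, fails early with a clear message and masks the key if it ever needs to appear in logs.

```python
import os


def require_api_key(env=None):
    """Return OPENAI_API_KEY, or raise a clear error if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or add it to .env")
    return key


def mask(key):
    """Show only the last 4 characters, for safe logging."""
    return "*" * max(len(key) - 4, 0) + key[-4:]


# A plain dict stands in for os.environ so the example does not need a real key.
print(mask(require_api_key({"OPENAI_API_KEY": "sk-example-1234"})))
```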

2. Control cost with max_tokens

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    max_tokens=500,      # Cap the length of the response
    temperature=0.7      # Lower = more consistent, higher = more varied (range 0 to 2)
)
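
A side effect of max_tokens is that a reply can be cut off mid-sentence. The response reports this through finish_reason, which is "length" when the limit was hit. A minimal check might look like the sketch below; was_truncated is a hypothetical helper, and the stand-in objects imitate only the one attribute it reads.

```python
from types import SimpleNamespace


def was_truncated(response):
    """True when the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in objects so the sketch runs offline; a real response has the same attribute path.
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])
print(was_truncated(cut_off), was_truncated(complete))  # True False
```

When it returns True, the usual options are raising max_tokens or asking the model for a shorter answer.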

3. Match temperature to the use case

temperature=0: Extraction, classification, code generation, where you need consistent, near-reproducible output
temperature=0.7: Content writing, brainstorming: creative enough without going off the rails
temperature=1.0+: Creative writing, slogans, where surprising output is acceptable: sometimes great, sometimes bad
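
If several parts of an app call the API, keeping these defaults in one place avoids scattering magic numbers. The lookup below is only a sketch: the values mirror the list above, while the task labels and the temperature_for helper are illustrative and not API values.

```python
# Starting points taken from the list above; the task labels are made up for this example.
TEMPERATURE_BY_TASK = {
    "extraction": 0.0,
    "classification": 0.0,
    "code": 0.0,
    "content": 0.7,
    "brainstorm": 0.7,
    "creative": 1.0,
    "slogan": 1.0,
}


def temperature_for(task, default=0.7):
    """Look up a starting temperature for a task label, falling back to a default."""
    return TEMPERATURE_BY_TASK.get(task, default)


print(temperature_for("classification"), temperature_for("translation"))  # 0.0 0.7
```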

4. Use python-dotenv to manage credentials

pip install python-dotenv

# .env file (must be added to .gitignore)
OPENAI_API_KEY=sk-...

# main.py
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # Loads the variables in .env into the process environment

client = OpenAI()  # Picks up OPENAI_API_KEY loaded from .env

5. Track cost before the bill grows out of control

gpt-4o-mini is priced at $0.15/1M input tokens and $0.60/1M output tokens. A typical request uses about 300-500 tokens, which is well under $0.001. That sounds cheap, but if your app handles 10,000 requests/day with long context, the month-end figure looks very different. Go to platform.openai.com → Settings → Billing, and set an alert at $10 and a hard limit at $50 before deploying anything to production.
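
To see how quickly "cheap per request" adds up, here is a rough monthly projection. It is a sketch under stated assumptions: the prices are the gpt-4o-mini figures above, the token counts per request are invented for the comparison, and a month is taken as 30 days.

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price=0.15, output_price=0.60, days=30):
    """Projected monthly cost in USD; prices are USD per 1M tokens."""
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_day * days


# Short prompts versus long context, both at 10,000 requests/day (token counts are assumed).
print(f"${monthly_cost(10_000, 300, 200):.2f}")    # $49.50
print(f"${monthly_cost(10_000, 8_000, 500):.2f}")  # $450.00
```

The second line is the same traffic with a long context attached to every request, which is where a $50 hard limit would be reached within a few days.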