OpenAI API with Python: From Zero to Working Results in 5 Minutes

Get It Running in 5 Minutes — Quick Start

The first time I tried the OpenAI API, I wasted an entire morning because I was reading outdated docs from the openai.ChatCompletion.create() era — that syntax has been deprecated since version 1.0. In reality, you only need 3 steps.

Step 1: Install the library

pip install openai

Step 2: Get your API Key

Go to platform.openai.com → API Keys → Create new secret key. Copy the sk-proj-... string — that’s the only thing you need for authentication.

Step 3: Make your first API call

from openai import OpenAI

client = OpenAI(api_key="sk-...your-key-here...")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Giải thích Docker là gì trong 2 câu"}
    ]
)

print(response.choices[0].message.content)

Run it, and you’ll see results in about 2–3 seconds. That’s the quick start covered — now let’s get into what you actually need to understand.

Understanding the Fundamentals

What is the messages structure?

The messages field is the core of every request. OpenAI accepts an array of messages with 3 role types:

  • system: Sets the context or “persona” for the model — runs once at the start and influences the entire conversation
  • user: Messages from the user’s side
  • assistant: Previous AI responses — necessary when you want the model to remember conversation history

A typical request combining the system and user roles:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Bạn là một senior Python developer. Trả lời ngắn gọn, thực tế."
        },
        {
            "role": "user",
            "content": "Khi nào nên dùng list comprehension thay vì for loop?"
        }
    ]
)

Which model should you choose?

This question mostly comes down to a cost vs. quality tradeoff. Here’s how I categorize them:

  • gpt-4o-mini: Cheap ($0.15/1M input tokens), fast, and good enough for 80% of tasks — summarization, classification, simple Q&A
  • gpt-4o: Significantly more capable, ideal when you need complex reasoning, image processing, or high-quality output
  • gpt-3.5-turbo: Older but extremely cheap, still viable when scaling up with simple tasks

Practical strategy: always start with gpt-4o-mini. Only upgrade to gpt-4o when real-world testing shows the output quality isn’t cutting it.

Reading the response correctly

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

# Main content
text = response.choices[0].message.content

# Token usage — track this to accurately calculate costs
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total: {response.usage.total_tokens}")
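Those token counts map directly to dollars. Here is a minimal sketch of the conversion, assuming gpt-4o-mini's prices of $0.15/1M input and $0.60/1M output tokens (check the current pricing page; estimate_cost is a hypothetical helper, not part of the SDK):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price: float = 0.15, output_price: float = 0.60) -> float:
    """Estimate the USD cost of one request. Prices are USD per 1M tokens."""
    return (prompt_tokens * input_price
            + completion_tokens * output_price) / 1_000_000

# e.g. a 350-token prompt with a 150-token answer:
print(f"${estimate_cost(350, 150):.6f}")
```

Feed it `response.usage.prompt_tokens` and `response.usage.completion_tokens` after each call to log real spend per request.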

Going Further — The Stuff That’s Actually Useful

Streaming: Display text character by character like ChatGPT

Without streaming, users stare at a blank screen for 3–5 seconds before text suddenly appears all at once. With streaming, characters start flowing from the very first second. The total time is the same, but the app feels dramatically more responsive — it’s a surprisingly effective UX trick.

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Viết một đoạn code đọc file CSV bằng Python"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

print()  # Newline after the stream finishes

Managing conversation history (Simple Chatbot)

The OpenAI API does not remember previous conversations — every request is completely independent. If you want the model to retain context, you have to pass the history back yourself each time:

from openai import OpenAI

client = OpenAI(api_key="sk-...")

history = [
    {"role": "system", "content": "Bạn là trợ lý IT cho lập trình viên Việt Nam."}
]

while True:
    user_input = input("Bạn: ")
    if user_input.lower() in ["quit", "exit"]:
        break

    history.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history
    )

    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})

    print(f"AI: {reply}\n")

Proper error handling

The three most common errors you’ll run into — and how to handle each one:

from openai import OpenAI, RateLimitError, APIConnectionError, AuthenticationError
import time

client = OpenAI(api_key="sk-...")

def safe_completion(messages, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages
            )
            return response.choices[0].message.content

        except AuthenticationError:
            print("API key sai hoặc hết hạn — kiểm tra lại")
            return None  # Không retry, lỗi này cần fix thủ công

        except RateLimitError:
            wait = 2 ** attempt  # Exponential backoff: 1s → 2s → 4s
            print(f"Rate limit — chờ {wait}s rồi thử lại...")
            time.sleep(wait)

        except APIConnectionError:
            print(f"Lỗi kết nối (lần {attempt + 1}/{retries})")
            time.sleep(1)

    return None

Practical Tips — Lessons Learned the Hard Way

1. Set your API key via environment variable, never hardcode it

# In your terminal or shell profile
export OPENAI_API_KEY="sk-...your-key..."

# main.py
from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment — no need to pass api_key
client = OpenAI()

A painful lesson learned: I once accidentally committed an API key to a public GitHub repo. OpenAI revoked it in under 5 minutes and sent a warning email immediately. Fortunately, I caught it before anyone could use it to rack up charges on my account.

2. Control costs with max_tokens

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    max_tokens=500,      # Cap the response length
    temperature=0.7      # 0 = deterministic, 1 = creative
)
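One thing worth checking when you cap output: if the limit cuts the answer short, the API sets `finish_reason` to `"length"` instead of `"stop"`. A small sketch (is_truncated is a hypothetical helper name):

```python
def is_truncated(choice) -> bool:
    """True if the response was cut off by max_tokens rather than
    finishing naturally (finish_reason values per the API docs)."""
    return choice.finish_reason == "length"

# After a call:
# if is_truncated(response.choices[0]):
#     print("Warning: response was cut off by max_tokens")
```

Catching this lets you either retry with a higher limit or warn the user instead of silently showing a half-finished answer.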

3. Match temperature to your use case

  • temperature=0: Extraction, classification, code generation — when you need consistent, reproducible output
  • temperature=0.7: Content writing, brainstorming — just enough creativity without going off the rails
  • temperature=1.0+: Creative writing, slogans — embrace unpredictable output that can be brilliant or a complete miss
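If you route different task types through one code path, the mapping above can be made explicit. A minimal sketch — the task categories and the default of 0.7 are assumptions from this article, not an API rule:

```python
# Temperature per task type, matching the categories listed above
TASK_TEMPERATURE = {
    "extraction": 0.0,
    "classification": 0.0,
    "code": 0.0,
    "content": 0.7,
    "brainstorm": 0.7,
    "creative": 1.0,
}

def request_kwargs(task: str, messages: list[dict]) -> dict:
    """Build kwargs for chat.completions.create with a task-appropriate temperature."""
    return {
        "model": "gpt-4o-mini",
        "messages": messages,
        "temperature": TASK_TEMPERATURE.get(task, 0.7),  # sensible default
    }
```

Usage: `client.chat.completions.create(**request_kwargs("classification", messages))`.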

4. Use python-dotenv to manage credentials

pip install python-dotenv

# .env file (make sure it's in .gitignore)
OPENAI_API_KEY=sk-...

# main.py
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI()  # reads the key loaded from .env

5. Monitor your spending before the bill spirals out of control

gpt-4o-mini pricing: $0.15/1M input tokens and $0.60/1M output tokens. A typical request consumes around 300–500 tokens, which is less than $0.001. Sounds cheap — but if your app handles 10,000 requests per day with long contexts, the end-of-month bill will look very different. Go to platform.openai.com → Settings → Billing, set a spending alert at $10 and a hard limit at $50 before deploying anything to production.
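A back-of-envelope check of that claim, assuming roughly 400 input and 400 output tokens per request (an assumption for illustration, not a measurement):

```python
# Project daily and monthly spend at gpt-4o-mini prices
requests_per_day = 10_000
input_tok, output_tok = 400, 400  # assumed per-request averages

daily = (requests_per_day * input_tok * 0.15
         + requests_per_day * output_tok * 0.60) / 1_000_000
print(f"~${daily:.2f}/day, ~${daily * 30:.0f}/month")  # → ~$3.00/day, ~$90/month
```

Still modest, but a longer context or a switch to gpt-4o multiplies that figure fast — which is exactly why the spending alert matters.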
