Get It Running in 5 Minutes — Quick Start
The first time I tried the OpenAI API, I wasted an entire morning because I was reading outdated docs from the openai.ChatCompletion.create() era — that syntax has been deprecated since version 1.0. In reality, you only need 3 steps.
Step 1: Install the library
pip install openai
Step 2: Get your API Key
Go to platform.openai.com → API Keys → Create new secret key. Copy the sk-proj-... string — that’s the only thing you need for authentication.
Step 3: Make your first API call
from openai import OpenAI

client = OpenAI(api_key="sk-...your-key-here...")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Explain what Docker is in 2 sentences"}
    ]
)

print(response.choices[0].message.content)
Run it, and you’ll see results in about 2–3 seconds. That’s the quick start covered — now let’s get into what you actually need to understand.
Understanding the Fundamentals
What is the messages structure?
The messages field is the core of every request. OpenAI accepts an array of messages with 3 role types:
- system: Sets the context or “persona” for the model — runs once at the start and influences the entire conversation
- user: Messages from the user’s side
- assistant: Previous AI responses — necessary when you want the model to remember conversation history
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a senior Python developer. Answer concisely and practically."
        },
        {
            "role": "user",
            "content": "When should you use a list comprehension instead of a for loop?"
        }
    ]
)
Which model should you choose?
This question mostly comes down to a cost vs. quality tradeoff. Here’s how I categorize them:
- gpt-4o-mini: Cheap ($0.15/1M input tokens), fast, and good enough for 80% of tasks — summarization, classification, simple Q&A
- gpt-4o: Significantly more capable, ideal when you need complex reasoning, image processing, or high-quality output
- gpt-3.5-turbo: Older but extremely cheap, still viable when scaling up with simple tasks
Practical strategy: always start with gpt-4o-mini. Only upgrade to gpt-4o when real-world testing shows the output quality isn’t cutting it.
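That strategy can be encoded in a few lines. This is a hypothetical helper of my own — the task categories and the model mapping reflect the guidance above, not anything the API enforces:

```python
# Hypothetical helper encoding the "start cheap, upgrade when needed" strategy.
# The task categories and model mapping are this article's rough guidance,
# not an official API feature.
SIMPLE_TASKS = {"summarization", "classification", "simple-qa"}

def pick_model(task: str, needs_vision: bool = False,
               needs_deep_reasoning: bool = False) -> str:
    """Return a reasonable default model for a task category."""
    if needs_vision or needs_deep_reasoning:
        return "gpt-4o"
    if task in SIMPLE_TASKS:
        return "gpt-4o-mini"
    # Unknown task: still start cheap, upgrade only after real testing
    return "gpt-4o-mini"

print(pick_model("classification"))                        # → gpt-4o-mini
print(pick_model("analysis", needs_deep_reasoning=True))   # → gpt-4o
```

The point is less the function itself than the default it bakes in: nothing gets gpt-4o until you have a concrete reason.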
Reading the response correctly
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

# Main content
text = response.choices[0].message.content

# Token usage — track this to accurately calculate costs
print(f"Prompt tokens: {response.usage.prompt_tokens}")
print(f"Completion tokens: {response.usage.completion_tokens}")
print(f"Total: {response.usage.total_tokens}")
Going Further — The Stuff That’s Actually Useful
Streaming: Display text token by token like ChatGPT
Without streaming, users stare at a blank screen for 3–5 seconds before text suddenly appears all at once. With streaming, tokens start flowing from the very first second. The total time is roughly the same, but the app feels dramatically more responsive — it’s a surprisingly effective UX trick.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a Python snippet that reads a CSV file"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # Newline once the stream is done
Managing conversation history (Simple Chatbot)
The OpenAI API does not remember previous conversations — every request is completely independent. If you want the model to retain context, you have to pass the history back yourself each time:
from openai import OpenAI

client = OpenAI(api_key="sk-...")

history = [
    {"role": "system", "content": "You are an IT assistant for Vietnamese developers."}
]

while True:
    user_input = input("You: ")
    if user_input.lower() in ["quit", "exit"]:
        break

    history.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=history
    )

    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    print(f"AI: {reply}\n")
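One thing that loop glosses over: history grows with every turn, and you pay for all of it as input tokens on every request. A minimal trimming sketch — the keep_turns cutoff is an arbitrary choice of mine, not an API feature:

```python
def trim_history(history, keep_turns=10):
    """Keep the system message plus only the most recent messages.

    keep_turns is an arbitrary cutoff — tune it to your token budget.
    """
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + rest[-keep_turns:]

# Example: 1 system message + 30 chat messages → 11 messages after trimming
history = [{"role": "system", "content": "..."}]
for i in range(15):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history)
print(len(trimmed))  # → 11
```

In the chatbot loop, you would call trim_history(history) right before passing it to messages=. Cutting by message count is crude; a real app would count tokens, but this keeps the bill from growing without bound.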
Proper error handling
The three most common errors you’ll run into — and how to handle each one:
from openai import OpenAI, RateLimitError, APIConnectionError, AuthenticationError
import time

client = OpenAI(api_key="sk-...")

def safe_completion(messages, retries=3):
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages
            )
            return response.choices[0].message.content
        except AuthenticationError:
            print("API key is wrong or expired — check it again")
            return None  # Don't retry; this error needs a manual fix
        except RateLimitError:
            wait = 2 ** attempt  # Exponential backoff: 1s → 2s → 4s
            print(f"Rate limit — waiting {wait}s before retrying...")
            time.sleep(wait)
        except APIConnectionError:
            print(f"Connection error (attempt {attempt + 1}/{retries})")
            time.sleep(1)
    return None
Practical Tips — Lessons Learned the Hard Way
1. Set your API key via environment variable, never hardcode it
# In your terminal or .env file
export OPENAI_API_KEY="sk-...your-key..."

from openai import OpenAI

# The client reads OPENAI_API_KEY from the environment automatically — no need to pass api_key
client = OpenAI()
A painful lesson learned: I once accidentally committed an API key to a public GitHub repo. OpenAI revoked it in under 5 minutes and sent a warning email immediately. Fortunately, I caught it before anyone could use it to rack up charges on my account.
2. Control costs with max_tokens
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    max_tokens=500,   # Limit response length
    temperature=0.7   # 0 = deterministic, 1 = creative
)
3. Match temperature to your use case
- temperature=0: Extraction, classification, code generation — when you need consistent, reproducible output
- temperature=0.7: Content writing, brainstorming — just enough creativity without going off the rails
- temperature=1.0+: Creative writing, slogans — embrace unpredictable output that can be brilliant or a complete miss
4. Use python-dotenv to manage credentials
pip install python-dotenv

# .env file (make sure it's added to .gitignore)
OPENAI_API_KEY=sk-...

# main.py
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()  # Picks up the key loaded from .env
5. Monitor your spending before the bill spirals out of control
gpt-4o-mini pricing: $0.15/1M input tokens and $0.60/1M output tokens. A typical request consumes around 300–500 tokens, which is less than $0.001. Sounds cheap — but if your app handles 10,000 requests per day with long contexts, the end-of-month bill will look very different. Go to platform.openai.com → Settings → Billing, set a spending alert at $10 and a hard limit at $50 before deploying anything to production.
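To make that arithmetic concrete, here is a back-of-envelope cost estimator using the gpt-4o-mini prices quoted above. Prices change, so verify against the official pricing page before relying on these constants:

```python
# Back-of-envelope cost estimate for gpt-4o-mini, using the prices quoted
# above: $0.15 per 1M input tokens, $0.60 per 1M output tokens.
# These numbers change — check the official pricing page before trusting them.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A typical ~400-token request: 300 input + 100 output tokens
per_request = estimate_cost(300, 100)
print(f"Per request: ${per_request:.6f}")  # well under $0.001

# The same app at 10,000 requests/day for 30 days
monthly = per_request * 10_000 * 30
print(f"Monthly: ${monthly:.2f}")
```

Feed it the response.usage numbers from the "Reading the response correctly" section and you can log real spend per request instead of guessing.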