Bad chatbots come from choosing the wrong approach, not from bad code
I once spent nearly two weeks building a customer support chatbot with pure if/else logic. The result? The bot understood messages typed exactly according to its templates — anything else got total silence. If a user typed “i forgot my pw” instead of “I forgot my password,” the bot was completely lost.
The problem wasn’t bad code — it was choosing the wrong architecture from the start. There are really only three ways to build a chatbot, and the boundaries between them are clearer than I initially thought.
3 Ways to Build a Chatbot: A Real Comparison
Approach 1: Rule-based (if/else, regex)
The oldest approach — and don’t underestimate it. The bot reads input, matches patterns, and returns a fixed response. This type still runs in production at many FAQ systems, phone IVRs, and banking chatbots.
```python
def simple_bot(user_input):
    msg = user_input.lower()
    if "password" in msg or "pass" in msg:
        return "Go to 'Forgot Password' to reset it."
    elif "opening hours" in msg or "hours" in msg:
        return "Open 8AM–5PM, Monday through Friday."
    else:
        return "Sorry, I don't understand that question."
```
Pros:
- No API needed, no cost, runs completely offline
- 100% controllable responses — critical for legal or medical use cases
- Simple deployment, zero external dependencies
Cons:
- Need to write rules for every question variation — not realistic with hundreds of questions
- Can’t handle slang, abbreviations, or typos
- A bloated rule base becomes a maintenance nightmare
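The "pw" failure from the intro is easy to reproduce. Here is a trimmed-down copy of the bot above (repeated so the snippet runs standalone), showing why substring matching breaks on abbreviations:

```python
def simple_bot(user_input):
    msg = user_input.lower()
    # "pass" also matches the full word "password", but nothing here
    # matches the abbreviation "pw" unless we add yet another rule.
    if "password" in msg or "pass" in msg:
        return "Go to 'Forgot Password' to reset it."
    return "Sorry, I don't understand that question."

print(simple_bot("I forgot my password"))  # matches a rule
print(simple_bot("i forgot my pw"))        # falls through to the fallback
```

Every abbreviation, typo, and slang variant needs its own rule, which is exactly the maintenance nightmare described above.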
Approach 2: Intent classification + traditional ML
Train a small model (Naive Bayes, SVM, or a small BERT) to classify intent from questions, then map intents to fixed responses. I tried this with scikit-learn + TF-IDF for a 12-intent chatbot — achieved ~80% accuracy with around 200 training sentences per intent. Not bad, but data preparation alone ate up a full week.
Pros:
- More flexible than rule-based — handles a certain degree of language variation
- Runs locally, no external API dependency once trained
Cons:
- Requires labeled training data — at least a few hundred sentences per intent
- Still doesn’t generate flexible responses, only picks from a fixed list
- Pipeline setup is significantly more complex than the other two approaches
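To make the classification idea concrete without pulling in scikit-learn, here is a toy version: bag-of-words vectors plus nearest-centroid matching by cosine similarity. A real pipeline would use `TfidfVectorizer` plus a proper classifier such as `LinearSVC`; the intent names and training sentences below are made up for illustration.

```python
from collections import Counter
import math

# Made-up training data: a few example sentences per intent.
TRAINING = {
    "reset_password": ["i forgot my password", "cannot log in password wrong",
                       "reset my pw please"],
    "opening_hours":  ["what are your opening hours", "when do you open",
                       "are you open on saturday"],
}
RESPONSES = {
    "reset_password": "Go to 'Forgot Password' to reset it.",
    "opening_hours":  "Open 8AM-5PM, Monday through Friday.",
}

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# One centroid per intent: the summed bag-of-words of its examples.
CENTROIDS = {intent: sum((vectorize(s) for s in sents), Counter())
             for intent, sents in TRAINING.items()}

def classify(message: str) -> str:
    best = max(CENTROIDS, key=lambda i: cosine(vectorize(message), CENTROIDS[i]))
    return RESPONSES[best]

print(classify("forgot password help"))
# → "Go to 'Forgot Password' to reset it."
```

Unlike the rule-based bot, this matches "forgot password help" without an exact rule for that phrasing, but it still only picks from a fixed response list, which is the core limitation of the approach.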
Approach 3: LLM-based (Claude, GPT, Gemini…)
Send the user’s message to an LLM and receive a naturally generated response. The bot “understands” context, abbreviations, and typos, and maintains conversation history across multiple turns.
Pros:
- Understands natural language — including “i forgot my pw lol”
- Very little code — focus on the system prompt instead of writing logic
- Easy to change behavior without retraining or rewriting rules
Cons:
- Costs money per token — need to calculate cost before scaling
- Depends on internet connectivity and an external service
- Responses are not fully deterministic — sometimes “creative” beyond what’s needed
Which Approach Should You Choose?
After trying all three on real projects, I’ve arrived at a fairly clear selection principle:
- Rule-based: Internal bots with fewer than 20 question types that need 100% predictability — menu navigation bots, hard-coded FAQs, or air-gapped environments.
- Traditional ML: You already have labeled data, need to run offline, budget is tight but requirements exceed what if/else can handle.
- LLM: Most other cases — especially when you need natural conversation, a short build time, or questions that can’t all be predicted in advance.
I’ve deployed LLM-based in production for an internal support chatbot — employees ask all kinds of questions with no consistent pattern. Results were more stable than expected, and there’s no time wasted maintaining rules.
Building an LLM Chatbot with Python: Step by Step
In this example I’m using the Anthropic Claude API. You can swap it for OpenAI or Gemini with similar syntax — the concept is exactly the same.
Step 1: Install the library
```shell
pip install anthropic
```
Step 2: Basic chatbot (single-turn)
```python
import anthropic

client = anthropic.Anthropic(api_key="your-api-key")

def chat(user_message: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system="You are a customer support assistant for XYZ fashion store. Keep responses brief and friendly.",
        messages=[
            {"role": "user", "content": user_message}
        ]
    )
    return response.content[0].text

# Quick test
print(chat("i forgot my password what do i do"))
# → "Click 'Forgot Password' on the login page, enter your email and you're good to go!"
```
Step 3: Add conversation history (multi-turn)
A real chatbot needs to remember context — users don’t want to re-explain everything from scratch each turn. The trick is to pass the entire history on every API call:
```python
import anthropic
import os

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

SYSTEM_PROMPT = """You are an internal IT assistant for the company.
Only answer questions about IT, software, and company processes.
If you don't know something, say so honestly rather than guessing."""

def run_chatbot():
    conversation_history = []
    print("Internal IT Chatbot. Type 'quit' to exit.\n")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() in ["quit", "exit"]:
            print("Bot: Goodbye!")
            break
        if not user_input:
            continue
        conversation_history.append({"role": "user", "content": user_input})
        try:
            response = client.messages.create(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                system=SYSTEM_PROMPT,
                messages=conversation_history
            )
            bot_reply = response.content[0].text
        except anthropic.APIConnectionError:
            bot_reply = "Connection issue. Please try again later."
        except anthropic.RateLimitError:
            bot_reply = "Bot is overloaded, please wait a few seconds and try again."
        conversation_history.append({"role": "assistant", "content": bot_reply})
        print(f"Bot: {bot_reply}\n")

if __name__ == "__main__":
    run_chatbot()
```
Step 4: Limit history to avoid token bloat
A long conversation history means more tokens, which means more cost. In practice, keeping only the last N turns provides sufficient context:
```python
MAX_TURNS = 10  # Keep the last 10 turns (each turn = 1 user + 1 assistant)

def trim_history(history: list) -> list:
    if len(history) <= MAX_TURNS * 2:
        return history
    trimmed = history[-(MAX_TURNS * 2):]
    # If we trim right after appending a user message, the window can
    # start with an assistant message, which the Messages API rejects:
    # drop leading entries until the history starts with a user turn.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed

# Use before calling the API:
conversation_history = trim_history(conversation_history)
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=SYSTEM_PROMPT,
    messages=conversation_history
)
```
Estimate Costs Before Going to Production
With Claude Sonnet, pricing is approximately $3/1M input tokens and $15/1M output tokens. A single turn with ~200 input tokens and ~150 output tokens costs roughly $0.003, so a 10-turn conversation comes to about $0.03, and closer to $0.08 once you account for the full history being resent as input on every call. Either way, quite affordable for an internal bot with a small user base.
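The arithmetic behind those figures can be sketched as follows. This is a back-of-the-envelope model using the per-turn token counts assumed above; for real numbers, read the token counts from the `usage` field the API returns instead of guessing:

```python
# Published Claude Sonnet rates at the time of writing:
INPUT_PRICE = 3 / 1_000_000    # USD per input token
OUTPUT_PRICE = 15 / 1_000_000  # USD per output token

def conversation_cost(turns: int, in_per_turn: int, out_per_turn: int) -> float:
    """Estimate USD cost, counting the resent history as input tokens."""
    total_input = 0
    history = 0  # tokens already accumulated in the conversation
    for _ in range(turns):
        total_input += history + in_per_turn   # prior history + new user message
        history += in_per_turn + out_per_turn  # this turn joins the history
    return total_input * INPUT_PRICE + turns * out_per_turn * OUTPUT_PRICE

print(f"1 turn:   ${conversation_cost(1, 200, 150):.4f}")
print(f"10 turns: ${conversation_cost(10, 200, 150):.4f}")
```

The gap between the naive estimate and the real one grows with conversation length, which is exactly why the history trimming in Step 4 matters for cost, not just for context limits.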
Want to cut costs further? Use Claude Haiku instead of Sonnet — roughly 10x cheaper, and well-suited for simple FAQ chatbots that don’t require complex reasoning. For a deeper breakdown of token optimization strategies, see Optimizing LLM API Costs: Prompt Caching, Batching, and Eliminating Unnecessary Tokens.
Quick Takeaway
For internal chatbots or prototypes that need to ship fast, LLM is the most practical choice — minimal code, great natural language handling, no training data required. Rule-based still has its place when you need 100% deterministic behavior or fully offline operation.
The code above runs out of the box — just add your API key as an environment variable. Next steps if you want to level up: integrate RAG so the bot answers based on internal documents. Or wrap it in FastAPI, expose it over HTTP, and embed it in a web app — that’s only about 50 more lines of code.
