Intermediate
Building AI Assistants
A practical guide to building AI assistants using LLM APIs. From choosing the model and designing system prompts to implementing function calling and managing conversations.
Choosing the LLM
| Requirement | Recommended | Why |
|---|---|---|
| Best quality, safety-critical | Claude Sonnet 4 | Strong instruction following, safety, 200K context |
| Multimodal (images + audio) | GPT-4o | Native multimodal, fast, good quality |
| Long documents / large context | Gemini 2.5 Pro | Up to 2M token context window |
| High volume, low cost | GPT-4o mini / Gemini Flash | Cheapest quality models |
| Self-hosted / privacy | Llama 3.3 70B | Best open-weight model for the size |
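In code, the table above often reduces to a small routing map. A minimal sketch, assuming illustrative model IDs (the providers' current names may differ):

```python
# Map a requirement category to a model ID; IDs here are illustrative.
MODEL_BY_REQUIREMENT = {
    "quality": "claude-sonnet-4-20250514",
    "multimodal": "gpt-4o",
    "long_context": "gemini-2.5-pro",
    "low_cost": "gpt-4o-mini",
    "self_hosted": "llama-3.3-70b",
}

def pick_model(requirement: str, default: str = "gpt-4o-mini") -> str:
    """Return the model ID for a requirement, falling back to a cheap default."""
    return MODEL_BY_REQUIREMENT.get(requirement, default)
```

Keeping this mapping in one place makes it easy to swap models later without touching the rest of the assistant.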
System Prompt Design
The system prompt is the most important component of your assistant. It defines the assistant's identity, capabilities, boundaries, and behavior.
System Prompt - Customer Support Assistant
You are a customer support assistant for TechStore,
an online electronics retailer.
## Your Role
- Help customers with orders, returns, products, and account questions
- Be friendly, professional, and efficient
- Always prioritize the customer's satisfaction
## Guidelines
- Use the customer's name when available
- For order issues, always look up the order first
- Never share other customers' information
- If you cannot resolve an issue, offer to escalate to a human agent
- Do not make promises about refunds or replacements without checking the policy tool first
## Boundaries
- Only discuss TechStore products and services
- Do not provide advice on competitors' products
- Do not engage in personal conversations
- If asked about topics outside your scope, politely redirect to the relevant resource
## Tone
- Warm but professional
- Clear and concise
- Empathetic when customers are frustrated
Building with Anthropic Messages API
Python - Anthropic Assistant
```python
import anthropic

client = anthropic.Anthropic()

class SupportAssistant:
    def __init__(self):
        self.model = "claude-sonnet-4-20250514"
        self.system = """You are a customer support assistant for TechStore..."""
        self.messages = []
        self.tools = [
            {
                "name": "lookup_order",
                "description": "Look up order details by order ID",
                "input_schema": {
                    "type": "object",
                    "properties": {"order_id": {"type": "string"}},
                    "required": ["order_id"],
                },
            },
            {
                "name": "check_return_policy",
                "description": "Check the return policy for a product category",
                "input_schema": {
                    "type": "object",
                    "properties": {"category": {"type": "string"}},
                    "required": ["category"],
                },
            },
        ]

    def _run_tool(self, block):
        # Dispatch to your real order and policy backends here;
        # these stubs just keep the example self-contained.
        if block.name == "lookup_order":
            return f"Order {block.input['order_id']}: shipped, arriving Thursday"
        if block.name == "check_return_policy":
            return f"{block.input['category']}: 30-day return window"
        return f"Unknown tool: {block.name}"

    def chat(self, user_message):
        self.messages.append({"role": "user", "content": user_message})

        # Agent loop: keep calling the model until it stops requesting tools
        while True:
            response = client.messages.create(
                model=self.model,
                max_tokens=1024,
                system=self.system,
                tools=self.tools,
                messages=self.messages,
            )
            self.messages.append({"role": "assistant", "content": response.content})

            if response.stop_reason != "tool_use":
                return response.content[0].text

            # Execute the requested tools and feed the results back to the model
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": self._run_tool(block),
                    })
            self.messages.append({"role": "user", "content": results})

# Usage
assistant = SupportAssistant()
reply = assistant.chat("Where is my order #12345?")
print(reply)
```
Threading and Conversation Management
- Session management: Create unique session IDs for each conversation. Store messages per session.
- Context window management: Monitor token count. When approaching limits, summarize older messages or use a sliding window.
- Conversation state: Track metadata like customer ID, issue type, resolution status alongside the message history.
- Persistence: Store conversations in a database (PostgreSQL, Redis) for continuity and analytics.
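A sketch of the sliding-window approach, assuming a rough 4-characters-per-token estimate; production code would use the provider's tokenizer or token-counting endpoint instead:

```python
def estimate_tokens(messages):
    """Very rough token estimate: ~4 characters per token."""
    return sum(len(str(m.get("content", ""))) for m in messages) // 4

def trim_history(messages, max_tokens=8000, keep_recent=6):
    """Drop the oldest messages until under budget, keeping recent turns."""
    while len(messages) > keep_recent and estimate_tokens(messages) > max_tokens:
        messages.pop(0)
    return messages
```

In practice you would trim on conversation-turn boundaries so that paired tool_use and tool_result messages are never separated, and summarize the dropped turns rather than discard them outright.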
File Handling
Modern assistants can process uploaded files:
- Documents: PDFs, Word docs, spreadsheets — extract text and pass to the LLM
- Images: Use multimodal models (GPT-4o, Claude, Gemini) to analyze images directly
- Code files: Parse and analyze source code for coding assistants
- Implementation: Extract content, chunk if needed, pass as context or use RAG
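For images, the usual pattern is to base64-encode the file and send it as an image content block alongside a text question. A minimal sketch of building such a message for the Anthropic Messages API (the helper name is ours):

```python
import base64

def image_message(image_bytes: bytes, question: str, media_type: str = "image/jpeg"):
    """Build a Messages-API user turn containing an image plus a question."""
    data = base64.standard_b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": media_type, "data": data},
            },
            {"type": "text", "text": question},
        ],
    }
```

The returned dict goes straight into the `messages` list of a `client.messages.create(...)` call with a multimodal model; GPT-4o and Gemini use the same idea with slightly different payload shapes.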
Start simple: Build the simplest version first — a system prompt, message handling, and one or two tools. Add complexity (RAG, file handling, multi-channel) only after the basic assistant works well. Premature complexity is the enemy of good assistant design.
Lilly Tech Systems