Intermediate

Building AI Assistants

A practical guide to building AI assistants with LLM APIs, covering everything from choosing a model and designing system prompts to implementing function calling and managing conversations.

Choosing the LLM

Requirement | Recommended | Why
Best quality, safety-critical | Claude Sonnet 4 | Strong instruction following, safety, 200K context
Multimodal (images + audio) | GPT-4o | Native multimodal, fast, good quality
Long documents / large context | Gemini 2.5 Pro | Up to 2M token context window
High volume, low cost | GPT-4o mini / Gemini Flash | Cheapest quality models
Self-hosted / privacy | Llama 3.3 70B | Best open-weight model for the size

System Prompt Design

The system prompt is the most important component of your assistant. It defines the assistant's identity, capabilities, boundaries, and behavior.

System Prompt - Customer Support Assistant
You are a customer support assistant for TechStore,
an online electronics retailer.

## Your Role
- Help customers with orders, returns, products, and
  account questions
- Be friendly, professional, and efficient
- Always prioritize the customer's satisfaction

## Guidelines
- Use the customer's name when available
- For order issues, always look up the order first
- Never share other customers' information
- If you cannot resolve an issue, offer to escalate
  to a human agent
- Do not make promises about refunds or replacements
  without checking the policy tool first

## Boundaries
- Only discuss TechStore products and services
- Do not provide advice on competitors' products
- Do not engage in personal conversations
- If asked about topics outside your scope, politely
  redirect to the relevant resource

## Tone
- Warm but professional
- Clear and concise
- Empathetic when customers are frustrated

Building with the Anthropic Messages API

Python - Anthropic Assistant
import anthropic

client = anthropic.Anthropic()

class SupportAssistant:
    def __init__(self):
        self.model = "claude-sonnet-4-20250514"
        self.system = """You are a customer support
assistant for TechStore..."""
        self.messages = []
        self.tools = [
            {
                "name": "lookup_order",
                "description": "Look up order details by order ID",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "order_id": {"type": "string"}
                    },
                    "required": ["order_id"]
                }
            },
            {
                "name": "check_return_policy",
                "description": "Check the return policy for a product category",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "category": {"type": "string"}
                    },
                    "required": ["category"]
                }
            }
        ]

    def chat(self, user_message):
        self.messages.append({
            "role": "user",
            "content": user_message
        })

        # Agent loop for tool use
        while True:
            response = client.messages.create(
                model=self.model,
                max_tokens=1024,
                system=self.system,
                tools=self.tools,
                messages=self.messages
            )

            self.messages.append({
                "role": "assistant",
                "content": response.content
            })

            if response.stop_reason != "tool_use":
                return response.content[0].text

            # Execute tools and return results
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = self._run_tool(block)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })

            self.messages.append({
                "role": "user",
                "content": results
            })

    def _run_tool(self, block):
        """Dispatch a tool_use block to its implementation.

        These handlers are placeholders -- wire them to your real
        order and policy systems.
        """
        if block.name == "lookup_order":
            return f"Order {block.input['order_id']}: shipped, arriving Friday"
        if block.name == "check_return_policy":
            return f"Returns for {block.input['category']}: 30 days, unopened"
        return f"Unknown tool: {block.name}"

# Usage
assistant = SupportAssistant()
reply = assistant.chat("Where is my order #12345?")
print(reply)
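
If you later switch providers (per the table above), the tool definitions translate mechanically: both Anthropic and OpenAI describe parameters with JSON Schema, only the surrounding envelope differs. A sketch of converting Anthropic-style tool definitions to OpenAI's function-calling format (the function name `to_openai_tools` is our own, not part of either SDK):

```python
def to_openai_tools(anthropic_tools):
    """Convert Anthropic-style tool definitions to the OpenAI
    chat-completions `tools` format. The JSON Schema under
    `input_schema` is reused verbatim as `parameters`."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool["name"],
                "description": tool["description"],
                "parameters": tool["input_schema"],  # same JSON Schema
            },
        }
        for tool in anthropic_tools
    ]
```

The reverse direction is just as mechanical, which keeps the assistant's tool layer portable across providers.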

Threading and Conversation Management

  • Session management: Create unique session IDs for each conversation. Store messages per session.
  • Context window management: Monitor token count. When approaching limits, summarize older messages or use a sliding window.
  • Conversation state: Track metadata like customer ID, issue type, resolution status alongside the message history.
  • Persistence: Store conversations in a database (PostgreSQL, Redis) for continuity and analytics.
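
The session and context-window points above can be sketched with a minimal in-memory store (illustrative only: the `SessionStore` class is our own, the ~4-characters-per-token estimate is a crude heuristic, and production systems should persist to Redis or PostgreSQL and count tokens with the provider's tokenizer):

```python
import uuid
from collections import defaultdict

class SessionStore:
    """In-memory session store with a sliding-window trim.

    Tracks per-session metadata (customer ID, issue type, ...) alongside
    the message history, and drops the oldest turns when the estimated
    token count exceeds the budget.
    """

    def __init__(self, max_tokens=8000):
        self.max_tokens = max_tokens
        self.sessions = defaultdict(list)   # session_id -> message list
        self.metadata = defaultdict(dict)   # session_id -> state/metadata

    def create(self, **meta):
        session_id = str(uuid.uuid4())
        self.metadata[session_id] = meta
        return session_id

    @staticmethod
    def _estimate_tokens(messages):
        # Crude heuristic: roughly 4 characters per token for English text
        return sum(len(str(m["content"])) for m in messages) // 4

    def append(self, session_id, role, content):
        messages = self.sessions[session_id]
        messages.append({"role": role, "content": content})
        # Sliding window: drop the oldest messages once over budget,
        # always keeping the most recent exchange
        while (self._estimate_tokens(messages) > self.max_tokens
               and len(messages) > 2):
            messages.pop(0)

    def history(self, session_id):
        return self.sessions[session_id]
```

Summarizing trimmed messages (instead of discarding them) is a common refinement: replace the dropped turns with one assistant-written summary message.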

File Handling

Modern assistants can process uploaded files:

  • Documents: PDFs, Word docs, spreadsheets — extract text and pass to the LLM
  • Images: Use multimodal models (GPT-4o, Claude, Gemini) to analyze images directly
  • Code files: Parse and analyze source code for coding assistants
  • Implementation: Extract content, chunk if needed, pass as context or use RAG
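
The extract-then-chunk step in the last bullet can be sketched with a plain-Python chunker (the sizes here are arbitrary defaults; tune them to your model's context window, and use a real tokenizer if you need exact limits):

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Split extracted file text into overlapping chunks so each fits
    comfortably in a prompt; the overlap preserves context that would
    otherwise be cut at a chunk boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back to overlap with previous chunk
    return chunks
```

For a short document, pass all chunks as context directly; once documents exceed the context window, index the chunks in a vector store and retrieve only the relevant ones (RAG).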

Start simple: Build the simplest version first — a system prompt, message handling, and one or two tools. Add complexity (RAG, file handling, multi-channel) only after the basic assistant works well. Premature complexity is the enemy of good assistant design.