Advanced Best Practices

Deploying function calling in production requires careful attention to security, reliability, and user experience. This lesson covers the essential patterns and common pitfalls for tool-using AI systems.

Security Checklist

Critical Security Rules:
  • Never execute arbitrary code from model output
  • Always validate and sanitize function arguments
  • Use allowlists for permitted operations, not blocklists
  • Implement rate limiting per tool and per user
  • Log all tool executions for audit trails
  • Require human confirmation for destructive actions (delete, send, publish)
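The allowlist rule above can be sketched as a small pre-execution check. This is a minimal illustration, not a complete validator; the `ALLOWED_TOOLS` table and `validate_call` helper are hypothetical names:

```python
# Hypothetical allowlist: only known tools with expected argument
# keys pass; everything else is rejected before any code runs.
ALLOWED_TOOLS = {
    "search_knowledge_base": {"query"},        # tool -> permitted argument keys
    "get_weather": {"city", "units"},
}

def validate_call(tool_name, arguments):
    """Return (ok, reason). Reject unknown tools and unexpected arguments."""
    if tool_name not in ALLOWED_TOOLS:
        return False, f"tool '{tool_name}' is not on the allowlist"
    extra = set(arguments) - ALLOWED_TOOLS[tool_name]
    if extra:
        return False, f"unexpected arguments: {sorted(extra)}"
    return True, "ok"
```

Because unknown tools are rejected by default, a newly added (or model-hallucinated) tool name fails closed rather than open, which is the key advantage of allowlists over blocklists.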

Error Handling Patterns

Python
import json

def safe_execute(tool_name, arguments, registry, max_retries=2):
    """Execute a tool with retries and structured error handling."""
    for attempt in range(max_retries + 1):
        try:
            return registry.execute(tool_name, arguments)
        except TimeoutError:
            if attempt == max_retries:
                return json.dumps({
                    "error": f"{tool_name} timed out after {max_retries} retries",
                    "suggestion": "Try a simpler query or different approach"
                })
        except PermissionError:
            return json.dumps({
                "error": "Permission denied for this operation",
                "suggestion": "This action requires elevated permissions"
            })
        except Exception as exc:
            # Catch-all: return a structured error to the model
            # instead of crashing the agent loop
            return json.dumps({
                "error": f"{tool_name} failed: {exc}",
                "suggestion": "Check the arguments and try again"
            })

Tool Description Best Practices

  • Name — Bad: do_stuff. Good: search_knowledge_base
  • Description — Bad: "Searches things". Good: "Search the company knowledge base for articles matching a query. Returns top 5 results with title, snippet, and URL."
  • Parameter description — Bad: "The query". Good: "Search query string. Supports natural language questions and keyword searches. Max 200 characters."
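Putting the "good" examples together, a full tool definition might look like the following. This uses the JSON Schema style accepted by most function-calling APIs; exact field names vary by provider, so treat this as a sketch:

```python
# A complete definition for the "good" example above, in the JSON Schema
# style most function-calling APIs accept (field names vary by provider).
search_tool = {
    "name": "search_knowledge_base",
    "description": (
        "Search the company knowledge base for articles matching a query. "
        "Returns top 5 results with title, snippet, and URL."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": (
                    "Search query string. Supports natural language questions "
                    "and keyword searches. Max 200 characters."
                ),
                "maxLength": 200,
            }
        },
        "required": ["query"],
    },
}
```

Note that the length limit appears both in the human-readable description (so the model respects it) and as a `maxLength` constraint (so your validation layer can enforce it).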

Production Patterns

🔒 Scoped Permissions

Give each user session a scoped set of tools based on their role. Admin users get write tools; regular users get read-only tools.

🕒 Timeout & Circuit Breakers

Set timeouts for every tool execution. If a tool fails repeatedly, use a circuit breaker to disable it temporarily.
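A circuit breaker can be as simple as a failure counter with a cooldown. The sketch below is one possible shape, with hypothetical defaults; production systems often add half-open states and per-tool metrics:

```python
import time

class CircuitBreaker:
    """Disable a tool after repeated failures; re-enable after a cooldown."""

    def __init__(self, max_failures=3, cooldown_seconds=60.0):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed (tool usable)

    def allow(self):
        """Return True if the tool may be called right now."""
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the breaker and let calls through again
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()  # open the breaker

    def record_success(self):
        self.failures = 0  # any success resets the count
```

When `allow()` returns False, return a structured error to the model ("this tool is temporarily unavailable") so it can try a different approach instead of hammering a broken dependency.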

📊 Observability

Log tool calls with timing, arguments (sanitized), results, and errors. Track tool usage patterns to optimize your tool set.
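One way to implement this is a thin wrapper that emits a structured audit record per call. The `REDACTED_KEYS` set, the `sanitize` helper, and the wrapper name are all assumptions for illustration:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tool_audit")

REDACTED_KEYS = {"password", "token", "api_key"}  # assumed sensitive fields

def sanitize(arguments):
    """Mask sensitive argument values before they reach the logs."""
    return {k: ("***" if k in REDACTED_KEYS else v) for k, v in arguments.items()}

def logged(tool_fn, tool_name, arguments):
    """Run a tool and emit one structured audit record with timing."""
    start = time.perf_counter()
    try:
        result = tool_fn(**arguments)
        status = "ok"
        return result
    except Exception:
        status = "error"
        raise
    finally:
        # One JSON line per call makes the audit trail easy to query later
        log.info(json.dumps({
            "tool": tool_name,
            "args": sanitize(arguments),
            "status": status,
            "ms": round((time.perf_counter() - start) * 1000, 1),
        }))
```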

🛠 Max Iterations

Limit the number of tool call rounds (e.g., max 10) to prevent infinite loops where the model keeps calling tools without converging.
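The iteration cap fits naturally around the agent loop. In this sketch, `call_model` and `run_tools` are hypothetical callables standing in for your client code and tool executor:

```python
MAX_ITERATIONS = 10  # hard cap on tool-call rounds

def run_agent(messages, call_model, run_tools):
    """Drive the tool-call loop, bailing out after MAX_ITERATIONS rounds.

    call_model(messages) -> response object with a .tool_calls list
    run_tools(tool_calls) -> list of tool-result messages to append
    (both are hypothetical callables standing in for your client code)
    """
    for _ in range(MAX_ITERATIONS):
        response = call_model(messages)
        if not response.tool_calls:
            return response  # model produced a final answer
        messages = messages + run_tools(response.tool_calls)
    raise RuntimeError(f"no final answer after {MAX_ITERATIONS} tool rounds")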

Testing Strategies

  • Unit test each tool: Test the handler function independently with valid, invalid, and edge-case inputs
  • Mock the model: Create test harnesses that simulate tool_calls to test your execution loop
  • Integration tests: Send real prompts and verify the model selects the right tools with the right arguments
  • Red team testing: Try adversarial prompts that attempt to misuse tools (SQL injection, path traversal, etc.)
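The "mock the model" strategy can be exercised without any API calls by fabricating a `tool_calls`-style payload and asserting your dispatch logic routes it correctly. The `dispatch` helper and test below are illustrative, not a fixed API:

```python
import json

def dispatch(tool_call, handlers):
    """Execute one simulated tool_call dict against a handler table.

    tool_call mimics the shape most APIs return: a name plus
    JSON-encoded arguments (exact field names vary by provider).
    """
    fn = handlers[tool_call["name"]]
    return fn(**json.loads(tool_call["arguments"]))

def test_dispatch_routes_to_handler():
    fake_call = {
        "name": "search_knowledge_base",
        "arguments": json.dumps({"query": "reset vpn"}),
    }
    handlers = {"search_knowledge_base": lambda query: f"hits for {query!r}"}
    assert dispatch(fake_call, handlers) == "hits for 'reset vpn'"
```

Because the payload is fabricated, tests like this run in milliseconds and can cover malformed inputs (unknown tool names, invalid JSON arguments) that are hard to reproduce through a live model.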

Production Readiness Checklist:
  • All tools have input validation and structured error responses
  • Rate limiting is configured per tool and per user
  • Destructive actions require human-in-the-loop confirmation
  • Tool execution is logged with timing and sanitized arguments
  • Maximum iteration count prevents infinite tool-call loops
  • Timeouts are set for all external API calls in tools