Pydantic Output Intermediate

Pydantic is a Python data validation library that lets you define output schemas as Python classes. Combined with AI APIs, it gives you type-safe, validated objects directly from model responses — not raw strings or dictionaries.

OpenAI Native Pydantic Support

Python
from openai import OpenAI
from pydantic import BaseModel
from typing import Optional

class Person(BaseModel):
    name: str
    age: int
    city: str
    email: Optional[str] = None

client = OpenAI()

# Newer openai-python versions also expose this as client.chat.completions.parse
completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Alice is 30, lives in NYC, email alice@example.com"}
    ],
    response_format=Person
)

person = completion.choices[0].message.parsed
print(person.name)   # "Alice"
print(person.age)    # 30 (int, not str)
print(person.city)   # "NYC"
print(person.email)  # "alice@example.com"
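The parsed result is an ordinary Pydantic model, so it round-trips to a dict or JSON with the standard Pydantic v2 methods. A minimal sketch using the Person class above (constructed directly here so it runs without an API call):

```python
from typing import Optional
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    city: str
    email: Optional[str] = None

# A parsed result behaves exactly like a hand-built instance
person = Person(name="Alice", age=30, city="NYC", email="alice@example.com")

data = person.model_dump()           # plain dict
json_str = person.model_dump_json()  # JSON string

print(data["age"])  # 30

# And back again: validate a dict into a typed object
same = Person.model_validate(data)
assert same == person
```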

The Instructor Library

Instructor is a popular library that adds Pydantic support to any AI provider:

Python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field
from typing import List

class ExtractedEntity(BaseModel):
    name: str = Field(description="Entity name")
    entity_type: str = Field(description="Type: person, org, location")
    confidence: float = Field(ge=0, le=1, description="Confidence 0-1")

class Entities(BaseModel):
    entities: List[ExtractedEntity]

# Patch the client with Instructor
client = instructor.from_openai(OpenAI())

result = client.chat.completions.create(
    model="gpt-4o",
    response_model=Entities,
    messages=[
        {"role": "user", "content": "Apple CEO Tim Cook announced the new iPhone in Cupertino."}
    ]
)

for entity in result.entities:
    print(f"{entity.name} ({entity.entity_type}): {entity.confidence}")

Instructor with Anthropic

Python
import instructor
from anthropic import Anthropic

# Works the same way with Claude — reuses the Person model defined earlier
client = instructor.from_anthropic(Anthropic())

person = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    response_model=Person,
    messages=[
        {"role": "user", "content": "Alice is 30 and lives in NYC."}
    ]
)

Advanced Pydantic Features

Python
from pydantic import BaseModel, Field, field_validator
from typing import Literal, List
from enum import Enum

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

class ReviewAnalysis(BaseModel):
    sentiment: Sentiment
    score: float = Field(ge=0, le=10, description="Rating 0-10")
    key_points: List[str] = Field(max_length=5)
    category: Literal["product", "service", "support"]

    @field_validator('key_points')
    @classmethod
    def validate_points(cls, v):
        if len(v) == 0:
            raise ValueError("At least one key point required")
        return v
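Validation fires at construction time, so bad payloads never become objects. A standalone check (redefining the model above so it runs without an API call) shows both a success and a rejection:

```python
from enum import Enum
from typing import List, Literal
from pydantic import BaseModel, Field, ValidationError, field_validator

class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"

class ReviewAnalysis(BaseModel):
    sentiment: Sentiment
    score: float = Field(ge=0, le=10)
    key_points: List[str] = Field(max_length=5)
    category: Literal["product", "service", "support"]

    @field_validator("key_points")
    @classmethod
    def validate_points(cls, v):
        if len(v) == 0:
            raise ValueError("At least one key point required")
        return v

# A valid payload parses into a typed object
ok = ReviewAnalysis.model_validate({
    "sentiment": "positive",
    "score": 8.5,
    "key_points": ["fast shipping"],
    "category": "product",
})
print(ok.sentiment is Sentiment.POSITIVE)  # True

# Out-of-range score and empty key_points are both rejected
try:
    ReviewAnalysis.model_validate({
        "sentiment": "positive",
        "score": 42,        # violates le=10
        "key_points": [],   # violates the custom validator
        "category": "product",
    })
except ValidationError as e:
    num_errors = len(e.errors())
    print(num_errors)  # 2 — one error per failed field
```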

Why Pydantic?
  • Type checking at parse time — wrong types raise errors immediately
  • Field constraints (min/max, regex, enums) enforce business rules
  • Custom validators add semantic checks beyond structural validation
  • IDE autocomplete works on the parsed result
  • Easy serialization back to JSON/dict when needed

Instructor Retries: When the model's output fails Pydantic validation, Instructor can automatically retry with the validation error message, giving the model a chance to self-correct. Set max_retries=3 for production use.
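Conceptually, the retry loop feeds the validation errors back to the model as context for the next attempt. A simplified sketch of that idea (not Instructor's actual implementation — the call_model stub below stands in for a real completion call):

```python
from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    name: str
    age: int

def retry_parse(call_model, model_cls, max_retries=3):
    """Re-ask the model until its output validates, echoing errors back."""
    feedback = None
    for _ in range(max_retries):
        raw = call_model(feedback)  # stand-in for an API call
        try:
            return model_cls.model_validate(raw)
        except ValidationError as e:
            # The next attempt sees what went wrong — the core of the pattern
            feedback = str(e)
    raise RuntimeError("validation failed after retries")

# Stub model: first answer has an uncoercible age, the retry validates
answers = iter([{"name": "Alice", "age": "thirty"},
                {"name": "Alice", "age": 30}])
person = retry_parse(lambda fb: next(answers), Person)
print(person.age)  # 30
```

With Instructor itself, none of this plumbing is needed: pass max_retries=3 to client.chat.completions.create and the library runs the equivalent loop for you.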