TL;DR

Structured output ensures AI models return data in predictable formats (like JSON) that your code can reliably parse. Function calling lets models request specific actions with structured parameters. Together, they transform LLMs from conversational interfaces into reliable components in production systems.

Why it matters

Free-form text responses are great for chat, but terrible for automation. When you need an AI to extract customer data, update databases, or trigger workflows, you need guarantees that the output will be valid JSON, not creative prose. Structured output makes AI production-ready.

The problem with free-form text

Ask an LLM to extract information, and you'll get creative but inconsistent responses:

Prompt: "Extract the customer name and email from this message."

Possible responses:

  • "The customer's name is John Doe and his email is john@example.com."
  • "Name: John Doe, Email: john@example.com"
  • "Sure! I found one customer: John Doe (reachable at john@example.com)."

All correct, but how do you parse them reliably? You can't. Every format requires different parsing logic, and edge cases will break your code.

The parsing nightmare

Imagine extracting structured data from thousands of customer messages:

# This is fragile and will break
response = llm.generate("Extract name and email from: " + message)

# Which parser do you use?
if "name is" in response:
    name = response.split("name is ")[1].split(" and")[0]
elif "Name:" in response:
    name = response.split("Name: ")[1].split(",")[0]
# ... endless edge cases

This approach fails the moment the LLM gets creative with formatting.

Enter structured output

Structured output (also called JSON mode or constrained generation) forces the model to return valid JSON that matches a specific schema. No more parsing guesswork.

Same prompt, with structured output:

{
  "customer_name": "John Doe",
  "email": "john@example.com"
}

Every time. Guaranteed. Your code can rely on response["customer_name"] existing.
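A minimal sketch of what that buys you downstream: parsing becomes one `json.loads` call, no string surgery required.

```python
import json

# The structured response from the model (guaranteed valid JSON)
raw = '{"customer_name": "John Doe", "email": "john@example.com"}'

data = json.loads(raw)
print(data["customer_name"])  # John Doe
print(data["email"])          # john@example.com
```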

How structured output works

Behind the scenes, the model's token generation is constrained to only produce valid JSON:

  1. You define a schema: "I want an object with customer_name and email fields"
  2. The model generates: While producing tokens, invalid JSON is prevented
  3. You receive: Guaranteed-valid JSON matching your schema

The model can't output "The customer's name is..." because that's not valid JSON according to your schema.

Jargon: "Constrained generation"
Limiting the model's outputs to only valid tokens for a specific format. For JSON, this means the next token must be valid JSON syntax—no free-form text allowed.

JSON schema basics

JSON Schema is a standard for describing JSON structure. It's how you tell the model what format you want.

Simple example

{
  "type": "object",
  "properties": {
    "customer_name": {
      "type": "string",
      "description": "Full name of the customer"
    },
    "email": {
      "type": "string",
      "description": "Customer's email address"
    },
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high"],
      "description": "Urgency level"
    }
  },
  "required": ["customer_name", "email"]
}

What this means:

  • Object with properties: The response must be a JSON object
  • customer_name: A string (required)
  • email: A string (required)
  • priority: One of "low", "medium", or "high" (optional)
  • Descriptions: Help the model understand what to put in each field
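To see the schema in action, here is a toy validator (a hand-rolled sketch for illustration; real code would use the jsonschema library, shown later) that accepts or rejects candidate responses:

```python
def check(data, schema) -> bool:
    """Toy validator: checks required fields, string types, and enum membership."""
    if not isinstance(data, dict):
        return False
    if any(field not in data for field in schema.get("required", [])):
        return False
    for key, spec in schema["properties"].items():
        if key not in data:
            continue  # optional field, absence is fine
        if spec["type"] == "string" and not isinstance(data[key], str):
            return False
        if "enum" in spec and data[key] not in spec["enum"]:
            return False
    return True

schema = {
    "type": "object",
    "properties": {
        "customer_name": {"type": "string"},
        "email": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["customer_name", "email"],
}

print(check({"customer_name": "John Doe", "email": "john@example.com"}, schema))               # True (priority is optional)
print(check({"customer_name": "John Doe", "email": "x@y.com", "priority": "urgent"}, schema))  # False ("urgent" not in enum)
```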

Nested structures

Schemas can be complex:

{
  "type": "object",
  "properties": {
    "customer": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "contact": {
          "type": "object",
          "properties": {
            "email": {"type": "string"},
            "phone": {"type": "string"}
          }
        }
      }
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "product": {"type": "string"},
          "quantity": {"type": "number"}
        }
      }
    }
  }
}

This describes nested objects and arrays—perfect for complex data extraction.

Function calling explained

Function calling (also called tool use) lets models request specific actions with structured parameters.

Instead of: "Please search for 'AI news' using the search tool"

The model outputs:

{
  "function": "web_search",
  "parameters": {
    "query": "AI news",
    "limit": 10
  }
}

Your code interprets this, runs the function, and returns results to the model.

How it works

  1. Define available functions with schemas
  2. Model decides when to call them based on the user's request
  3. Model outputs a function call (structured JSON)
  4. Your code executes the function and returns results
  5. Model uses results to continue or finish the task
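Steps 3 and 4 are the part your code owns. A minimal dispatcher (hypothetical function names; real implementations would add error handling and logging) looks like:

```python
import json

# Hypothetical registry of callable tools, keyed by the names the model sees
FUNCTIONS = {
    "web_search": lambda query, limit=10: [f"result {i} for {query!r}" for i in range(limit)],
}

def dispatch(call: dict) -> str:
    """Execute a model-emitted function call and return the result as a JSON string."""
    fn = FUNCTIONS[call["function"]]   # KeyError here means the model named an unknown tool
    result = fn(**call["parameters"])  # TypeError here means bad parameters
    return json.dumps(result)

call = {"function": "web_search", "parameters": {"query": "AI news", "limit": 2}}
print(dispatch(call))
```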

Example: Calculator function

Function definition:

{
  "name": "calculate",
  "description": "Perform mathematical calculations",
  "parameters": {
    "type": "object",
    "properties": {
      "expression": {
        "type": "string",
        "description": "Math expression to evaluate (e.g., '15 * 234 + 67')"
      }
    },
    "required": ["expression"]
  }
}

User: "What's 15 times 234 plus 67?"

Model outputs:

{
  "function": "calculate",
  "parameters": {
    "expression": "15 * 234 + 67"
  }
}

Your code: Evaluates "15 * 234 + 67" (never pass model output to a raw eval() in production) → returns 3577

Model: "The result is 3,577."
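Running eval() on model output is dangerous. A safer sketch parses the expression with the ast module and allows only basic arithmetic:

```python
import ast
import operator

# Whitelist of arithmetic operators; anything else is rejected
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a simple arithmetic expression without eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expr}")
    return _eval(ast.parse(expr, mode="eval").body)

print(safe_eval("15 * 234 + 67"))  # 3577
```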

Provider-specific implementations

Each AI provider implements structured output slightly differently.

OpenAI

JSON mode (basic):

import json

from openai import OpenAI
client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4-turbo",
  response_format={"type": "json_object"},
  messages=[
    {"role": "system", "content": "Extract customer info as JSON with name and email fields."},
    {"role": "user", "content": "Customer John Doe contacted us at john@example.com"}
  ]
)

data = json.loads(response.choices[0].message.content)

Structured outputs (with schema):

from pydantic import BaseModel

class CustomerInfo(BaseModel):
  customer_name: str
  email: str
  priority: str

response = client.beta.chat.completions.parse(
  model="gpt-4o",
  messages=[...],
  response_format=CustomerInfo
)

data = response.choices[0].message.parsed
# data.customer_name, data.email are guaranteed to exist

Function calling:

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {"type": "string"},
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
      }
    }
  }
]

response = client.chat.completions.create(
  model="gpt-4-turbo",
  messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
  tools=tools
)

# Check if model wants to call a function
if response.choices[0].message.tool_calls:
  function_call = response.choices[0].message.tool_calls[0]
  # function_call.function.name == "get_weather"
  # function_call.function.arguments == '{"location": "Tokyo"}'

Anthropic (Claude)

Claude uses tool use instead of function calling:

import anthropic
import json

client = anthropic.Anthropic()

response = client.messages.create(
  model="claude-3-5-sonnet-20241022",
  max_tokens=1024,
  tools=[
    {
      "name": "get_weather",
      "description": "Get weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {
          "location": {"type": "string"},
          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
      }
    }
  ],
  messages=[{"role": "user", "content": "Weather in Paris?"}]
)

# Check for tool use
for block in response.content:
  if block.type == "tool_use":
    print(f"Tool: {block.name}")
    print(f"Input: {block.input}")
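Returning the result to Claude works much the same way, except the result goes back as a tool_result content block inside a "user" message:

```python
import json

def tool_result_block(tool_use_id: str, result) -> dict:
    """Package a tool's return value as an Anthropic-style tool_result message."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,  # must match block.id from the tool_use block
                "content": json.dumps(result),
            }
        ],
    }

msg = tool_result_block("toolu_01A", {"location": "Paris", "temp_c": 14})
# Append this to `messages` and call client.messages.create(...) again.
```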

Structured output via prompt:

response = client.messages.create(
  model="claude-3-5-sonnet-20241022",
  max_tokens=1024,
  messages=[
    {
      "role": "user",
      "content": """Extract customer info from this message and return ONLY valid JSON with this exact structure:
      {
        "customer_name": "string",
        "email": "string",
        "priority": "low" | "medium" | "high"
      }

      Message: John Doe contacted us urgently at john@example.com"""
    }
  ]
)

data = json.loads(response.content[0].text)

Google (Gemini)

import google.generativeai as genai

model = genai.GenerativeModel('gemini-pro')

# Function calling
# Function calling (declarations are wrapped in a tool dict, per the
# google.generativeai format)
tools = [
  {
    "function_declarations": [
      {
        "name": "search_database",
        "description": "Search customer database",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer"}
          },
          "required": ["query"]
        }
      }
    ]
  }
]

chat = model.start_chat()
response = chat.send_message(
  "Find customers named John",
  tools=tools
)

# Check for function calls
for part in response.parts:
  if fn := part.function_call:
    print(f"Function: {fn.name}")
    print(f"Args: {fn.args}")

Validation and error handling

Structured output reduces errors, but doesn't eliminate them. Always validate.

Schema validation

import json

from jsonschema import validate, ValidationError

schema = {
  "type": "object",
  "properties": {
    "customer_name": {"type": "string", "minLength": 1},
    "email": {"type": "string", "pattern": "^[a-zA-Z0-9+_.-]+@[a-zA-Z0-9.-]+$"}
  },
  "required": ["customer_name", "email"]
}

try:
  response_json = json.loads(llm_response)
  validate(instance=response_json, schema=schema)
  # Safe to use
  process_customer(response_json)
except json.JSONDecodeError:
  # LLM didn't return valid JSON (rare with structured output)
  log_error("Invalid JSON from LLM")
except ValidationError as e:
  # JSON doesn't match schema
  log_error(f"Schema validation failed: {e.message}")

Pydantic for type safety

Pydantic provides runtime validation and type hints:

from pydantic import BaseModel, EmailStr, ValidationError, field_validator  # Pydantic v2

class CustomerInfo(BaseModel):
  customer_name: str
  email: EmailStr  # Validates email format (requires the pydantic[email] extra)
  priority: str

  @field_validator('priority')
  @classmethod
  def validate_priority(cls, v):
    if v not in ['low', 'medium', 'high']:
      raise ValueError('Priority must be low, medium, or high')
    return v

  @field_validator('customer_name')
  @classmethod
  def validate_name(cls, v):
    if not v.strip():
      raise ValueError('Name cannot be empty')
    return v.strip()

# Usage
try:
  customer = CustomerInfo(**response_json)
  # All fields validated, types guaranteed
except ValidationError as e:
  print(e.json())

Handling missing fields

Even with required fields, always have fallbacks:

customer_name = response_json.get("customer_name", "Unknown Customer")
email = response_json.get("email")

if not email:
  # Retry with more explicit prompt
  retry_response = llm.generate_with_schema(
    prompt="IMPORTANT: Extract the email address. If no email is present, return null.",
    schema=schema
  )

Common patterns and use cases

1. Data extraction from unstructured text

Use case: Extract structured data from customer emails, forms, or documents.

schema = {
  "type": "object",
  "properties": {
    "order_id": {"type": "string"},
    "products": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "quantity": {"type": "integer"},
          "price": {"type": "number"}
        }
      }
    },
    "total": {"type": "number"},
    "shipping_address": {"type": "string"}
  }
}

# LLM extracts structured order info from free-form text
order_data = extract_with_schema(customer_email, schema)
create_order_in_database(order_data)

2. Classification and routing

Use case: Categorize support tickets, emails, or content.

schema = {
  "type": "object",
  "properties": {
    "category": {
      "type": "string",
      "enum": ["billing", "technical", "sales", "general"]
    },
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high", "urgent"]
    },
    "requires_human": {"type": "boolean"},
    "suggested_response": {"type": "string"}
  },
  "required": ["category", "priority", "requires_human"]
}

ticket_info = classify_ticket(ticket_text, schema)

# Route based on structured output
if ticket_info["requires_human"]:
  assign_to_agent(ticket_info["category"])
else:
  send_auto_response(ticket_info["suggested_response"])

3. API parameter generation

Use case: Convert natural language to API calls.

# Function definition for a CRM API
create_contact_tool = {
  "name": "create_contact",
  "description": "Create a new contact in the CRM",
  "parameters": {
    "type": "object",
    "properties": {
      "first_name": {"type": "string"},
      "last_name": {"type": "string"},
      "email": {"type": "string"},
      "company": {"type": "string"},
      "phone": {"type": "string"},
      "tags": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["first_name", "last_name", "email"]
  }
}

# User: "Add Sarah Chen from Acme Corp, email sarah@acme.com, tag as VIP customer"
# Model outputs structured function call
function_call = {
  "function": "create_contact",
  "parameters": {
    "first_name": "Sarah",
    "last_name": "Chen",
    "email": "sarah@acme.com",
    "company": "Acme Corp",
    "tags": ["VIP", "customer"]
  }
}

# Execute API call with validated parameters
crm_api.create_contact(**function_call["parameters"])

4. Multi-step workflows

Use case: Break complex tasks into structured steps.

workflow_schema = {
  "type": "object",
  "properties": {
    "steps": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "action": {
            "type": "string",
            "enum": ["search", "analyze", "summarize", "notify"]
          },
          "parameters": {"type": "object"},
          "depends_on": {"type": "array", "items": {"type": "integer"}}
        }
      }
    }
  }
}

# Model plans a workflow
plan = generate_plan("Research AI safety and send summary to team", workflow_schema)

# Execute steps in order
for step in plan["steps"]:
  execute_step(step["action"], step["parameters"])
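The simple loop above ignores depends_on. A sketch that runs each step only after its dependencies complete (assuming depends_on holds step indices, and execute is supplied by the caller):

```python
def run_workflow(steps, execute):
    """Run steps in dependency order; raises if dependencies can't be satisfied."""
    done = set()
    while len(done) < len(steps):
        progressed = False
        for i, step in enumerate(steps):
            if i in done:
                continue
            if all(dep in done for dep in step.get("depends_on", [])):
                execute(step["action"], step.get("parameters", {}))
                done.add(i)
                progressed = True
        if not progressed:
            raise ValueError("Workflow has a circular or unsatisfiable dependency")

order = []
run_workflow(
    [
        {"action": "summarize", "depends_on": [1]},
        {"action": "search", "parameters": {"query": "AI safety"}},
    ],
    lambda action, params: order.append(action),
)
print(order)  # ['search', 'summarize']
```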

5. Database operations

Use case: Natural language to database queries.

db_query_tool = {
  "name": "query_database",
  "description": "Query the customer database",
  "parameters": {
    "type": "object",
    "properties": {
      "table": {
        "type": "string",
        "enum": ["customers", "orders", "products"]
      },
      "filters": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "field": {"type": "string"},
            "operator": {"type": "string", "enum": ["equals", "contains", "greater_than", "less_than"]},
            "value": {"type": "string"}
          }
        }
      },
      "limit": {"type": "integer", "maximum": 100}
    },
    "required": ["table"]
  }
}

# User: "Show me customers in California who spent over $1000"
# Model generates structured query
query = {
  "function": "query_database",
  "parameters": {
    "table": "customers",
    "filters": [
      {"field": "state", "operator": "equals", "value": "California"},
      {"field": "total_spent", "operator": "greater_than", "value": "1000"}
    ],
    "limit": 50
  }
}
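Turning the structured query into SQL is then mechanical. Because table and operator are enum-constrained by the schema, they are safe to interpolate; values still go through placeholders (a sketch; field names should also be allow-listed in real code):

```python
# Map schema operator names to SQL operators
OPS = {"equals": "=", "contains": "LIKE", "greater_than": ">", "less_than": "<"}

def to_sql(params: dict):
    """Build a parameterized SQL query from the model's structured output."""
    clauses, values = [], []
    for f in params.get("filters", []):
        # CAUTION: f["field"] is not enum-constrained; allow-list it in production
        clauses.append(f"{f['field']} {OPS[f['operator']]} %s")
        values.append(f["value"])
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = f"SELECT * FROM {params['table']}{where} LIMIT {params.get('limit', 100)}"
    return sql, values

sql, values = to_sql({
    "table": "customers",
    "filters": [
        {"field": "state", "operator": "equals", "value": "California"},
        {"field": "total_spent", "operator": "greater_than", "value": "1000"},
    ],
    "limit": 50,
})
print(sql)  # SELECT * FROM customers WHERE state = %s AND total_spent > %s LIMIT 50
```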

Troubleshooting

Problem: Model returns text instead of JSON

Symptoms: You expect JSON but get "Here's the information you requested..."

Solutions:

  • Use provider-specific JSON mode (OpenAI's response_format)
  • Add explicit instructions: "Return ONLY valid JSON, no other text"
  • Use structured output APIs, not basic completion
  • Check your system prompt clearly specifies JSON output

Problem: Schema validation fails

Symptoms: JSON is valid but doesn't match your schema

Solutions:

  • Simplify the schema (complex nested schemas confuse models)
  • Add detailed description fields to each property
  • Show examples in your prompt
  • Use enum for constrained choices instead of free text
  • Make fewer fields required initially

Problem: Inconsistent field names

Symptoms: Sometimes customerName, sometimes customer_name

Solutions:

  • Explicitly specify field names in the schema
  • Use additionalProperties: false to prevent extra fields
  • Provide an example in your prompt showing exact field names
  • Use TypeScript/Pydantic schemas that enforce consistency

Problem: Hallucinated data

Symptoms: Model invents information to fill required fields

Solutions:

  • Make fields optional unless truly required
  • Add validation that checks for placeholder values
  • Prompt the model to use null for missing information
  • Include instructions: "Only extract information explicitly present in the text"
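A simple guard for the placeholder check, comparing extracted values against strings models commonly invent (the list is illustrative; tune it to your data):

```python
# Values that usually signal invention rather than extraction
PLACEHOLDERS = {"john doe", "jane doe", "unknown", "n/a", "none",
                "john@example.com", "test@test.com"}

def looks_like_placeholder(value) -> bool:
    """Flag values the model may have invented to fill a required field."""
    return isinstance(value, str) and value.strip().lower() in PLACEHOLDERS

print(looks_like_placeholder("Jane Doe"))    # True: likely invented
print(looks_like_placeholder("Sarah Chen"))  # False
print(looks_like_placeholder(None))          # False: null is the honest answer
```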

Problem: Function calling fails

Symptoms: Model doesn't call functions or calls the wrong one

Solutions:

  • Improve function descriptions (be specific about when to use each)
  • Reduce the number of available functions (fewer choices = better accuracy)
  • Add examples of correct function usage in the prompt
  • Make parameter descriptions clearer
  • Check function names are descriptive (not func1, func2)

Problem: Performance is slow

Symptoms: Structured output takes longer than free text

Solutions:

  • This is normal (constrained generation is slower)
  • Use smaller models for simple schemas
  • Simplify your schema
  • Cache results for repeated queries
  • Use parallel function calls when possible

Best practices

Schema design

1. Start simple, add complexity gradually

// Start with this
{"type": "object", "properties": {"name": {"type": "string"}}}

// Not this
{"type": "object", "properties": {"person": {"type": "object", "properties": {...}}}}

2. Use enums for constrained values

// Good
{"type": "string", "enum": ["small", "medium", "large"]}

// Bad (model might return "M", "med", "MEDIUM")
{"type": "string", "description": "Size: small, medium, or large"}

3. Provide clear descriptions

{
  "properties": {
    "confidence": {
      "type": "number",
      "description": "Confidence score from 0.0 to 1.0, where 1.0 is highest confidence",
      "minimum": 0,
      "maximum": 1
    }
  }
}

4. Make fields optional when appropriate

{
  "required": ["customer_name"],  // Only truly required fields
  "properties": {
    "customer_name": {"type": "string"},
    "phone": {"type": "string"},  // Optional, not everyone has one
    "notes": {"type": "string"}   // Optional
  }
}

Prompt engineering for structure

1. Show examples

Extract customer info as JSON. Example output:
{
  "customer_name": "Jane Smith",
  "email": "jane@example.com",
  "priority": "high"
}

Now extract from this message: [message]

2. Be explicit about edge cases

Extract fields. If a field is not present in the text, use null.
Do not invent or guess information.

3. Use system prompts for consistency

system_prompt = """You are a data extraction assistant.
Always return valid JSON matching the provided schema.
Extract only information explicitly stated in the input.
Use null for missing fields. Never invent data."""

Error handling strategy

def extract_with_retry(text, schema, max_retries=3):
  for attempt in range(max_retries):
    try:
      response = llm.generate_structured(text, schema)
      validated = validate_schema(response, schema)
      return validated
    except ValidationError as e:
      if attempt == max_retries - 1:
        # Final attempt failed
        log_error(f"Schema validation failed after {max_retries} attempts")
        return get_safe_default()
      else:
        # Retry with more explicit instructions
        text = f"{text}\n\nPrevious attempt failed: {e.message}. Please ensure all required fields are present."
    except Exception as e:
      log_error(f"Unexpected error: {e}")
      return get_safe_default()

Testing and validation

1. Test with edge cases

  • Empty input
  • Missing fields
  • Unexpected formats
  • Very long input
  • Special characters
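These cases fit naturally into a small harness (a sketch; extract stands in for your extraction function, which should return a dict and never raise):

```python
EDGE_CASES = [
    "",                                    # empty input
    "please help, thanks",                 # no extractable fields
    "émail: jörg@exämple.de, name: Jörg",  # special characters
    "word " * 5000,                        # very long input
]

def run_edge_cases(extract):
    """Return a list of (input snippet, problem) pairs; empty means all passed."""
    failures = []
    for text in EDGE_CASES:
        try:
            result = extract(text)
            if not isinstance(result, dict):
                failures.append((text[:30], "non-dict result"))
        except Exception as e:
            failures.append((text[:30], repr(e)))
    return failures

# A stub extractor that always returns an empty record passes every case
print(run_edge_cases(lambda text: {"customer_name": None, "email": None}))  # []
```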

2. Monitor in production

def track_structured_output(response, schema_name):
  metrics.increment(f"structured_output.{schema_name}.total")

  if validation_failed(response):
    metrics.increment(f"structured_output.{schema_name}.validation_failed")
    log_sample(response, schema_name)

  if has_null_required_fields(response):
    metrics.increment(f"structured_output.{schema_name}.missing_required")

3. A/B test schemas

  • Try different field names
  • Test with/without descriptions
  • Compare simple vs. nested structures
  • Measure accuracy and latency

Use responsibly

  • Validate all outputs: Never trust structured output blindly
  • Handle failures gracefully: Have fallback behavior when parsing fails
  • Don't over-constrain: Overly strict schemas degrade output quality
  • Monitor costs: Function calling increases token usage
  • Test edge cases: Empty inputs, missing data, special characters
  • Log failures: Track validation errors to improve schemas
  • Privacy matters: Structured output can leak sensitive data if logged

What's next?

Now that you understand structured output, you might explore:

  • Agents & Tools: Building AI systems that take actions
  • Evaluating AI Answers: Measuring accuracy of extracted data
  • Orchestration Options: Frameworks like LangChain for managing function calls
  • APIs & Integration: Connecting AI to your existing systems
  • Guardrails & Policy: Setting boundaries on what AI can output