Structured Output and Function Calling: Getting Reliable JSON from AI
Learn how to get reliable, parseable JSON output from AI models using structured output, function calling, and JSON schema. Essential for production AI applications.
TL;DR
Structured output ensures AI models return data in predictable formats (like JSON) that your code can reliably parse. Function calling lets models request specific actions with structured parameters. Together, they transform LLMs from conversational interfaces into reliable components in production systems.
Why it matters
Free-form text responses are great for chat, but terrible for automation. When you need an AI to extract customer data, update databases, or trigger workflows, you need guarantees that the output will be valid JSON, not creative prose. Structured output makes AI production-ready.
The problem with free-form text
Ask an LLM to extract information, and you'll get creative but inconsistent responses:
Prompt: "Extract the customer name and email from this message."
Possible responses:
- "The customer's name is John Doe and their email is john@example.com"
- "Name: John Doe, Email: john@example.com"
- "Sure! The name is John Doe (email: john@example.com)"
- "John Doe john@example.com"
All correct, but how do you parse this reliably? You can't. Every format requires different parsing logic, and edge cases will break your code.
The parsing nightmare
Imagine extracting structured data from thousands of customer messages:
# This is fragile and will break
response = llm.generate("Extract name and email from: " + message)

# Which parser do you use?
if "name is" in response:
    name = response.split("name is ")[1].split(" and")[0]
elif "Name:" in response:
    name = response.split("Name: ")[1].split(",")[0]
# ... endless edge cases
This approach fails the moment the LLM gets creative with formatting.
Enter structured output
Structured output (also called JSON mode or constrained generation) forces the model to return valid JSON that matches a specific schema. No more parsing guesswork.
Same prompt, with structured output:
{
  "customer_name": "John Doe",
  "email": "john@example.com"
}
Every time. Guaranteed. Your code can rely on response["customer_name"] existing.
How structured output works
Behind the scenes, the model's token generation is constrained to only produce valid JSON:
- You define a schema: "I want an object with customer_name and email fields"
- The model generates: while producing tokens, invalid JSON is prevented
- You receive: Guaranteed-valid JSON matching your schema
The model can't output "The customer's name is..." because that's not valid JSON according to your schema.
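To build intuition, here is a toy sketch of the masking idea. This is not how production decoders work (they compile the schema into a grammar over the tokenizer's vocabulary); the brute-force `valid_prefix` heuristic below is purely illustrative.

```python
import json

def valid_prefix(text: str) -> bool:
    """Heuristic: could `text` still grow into valid JSON?

    Real constrained decoders use a grammar/state machine over tokens;
    this check only illustrates the idea of rejecting bad continuations.
    """
    try:
        json.loads(text)
        return True  # already complete, valid JSON
    except json.JSONDecodeError as e:
        # Errors at the very end of the input mean the prefix is merely
        # incomplete; errors earlier mean it is already broken.
        return e.pos >= len(text) - 1

# A constrained decoder keeps only the candidates that stay inside JSON.
candidates = ['{"customer_name": ', 'The customer name is ', '{"customer_name"']
allowed = [c for c in candidates if valid_prefix(c)]
```

In a real implementation this filtering happens per token, over the model's entire vocabulary, at every decoding step.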
Jargon: "Constrained generation"
Limiting the model's outputs to only valid tokens for a specific format. For JSON, this means the next token must be valid JSON syntax; no free-form text allowed.
JSON schema basics
JSON Schema is a standard for describing JSON structure. It's how you tell the model what format you want.
Simple example
{
  "type": "object",
  "properties": {
    "customer_name": {
      "type": "string",
      "description": "Full name of the customer"
    },
    "email": {
      "type": "string",
      "description": "Customer's email address"
    },
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high"],
      "description": "Urgency level"
    }
  },
  "required": ["customer_name", "email"]
}
What this means:
- Object with properties: The response must be a JSON object
- customer_name: A string (required)
- email: A string (required)
- priority: One of "low", "medium", or "high" (optional)
- Descriptions: Help the model understand what to put in each field
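To make these keywords concrete, here is a tiny hand-rolled checker covering just `type`, `properties`, `required`, and `enum`. It is a teaching aid only; real code should use a library such as jsonschema (shown later in this guide).

```python
def check(instance, schema):
    """Validate a small subset of JSON Schema: type, properties, required, enum.
    Returns a list of error strings (empty list means valid)."""
    errors = []
    if schema.get("type") == "object":
        if not isinstance(instance, dict):
            return [f"expected object, got {type(instance).__name__}"]
        for field in schema.get("required", []):
            if field not in instance:
                errors.append(f"missing required field: {field}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors += check(instance[key], subschema)
    elif schema.get("type") == "string":
        if not isinstance(instance, str):
            errors.append(f"expected string, got {type(instance).__name__}")
        elif "enum" in schema and instance not in schema["enum"]:
            errors.append(f"{instance!r} not in {schema['enum']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "customer_name": {"type": "string"},
        "email": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["customer_name", "email"],
}

good = {"customer_name": "John Doe", "email": "john@example.com"}
bad = {"customer_name": "John Doe", "priority": "critical"}  # no email, bad enum
```

Running `check(bad, schema)` reports both the missing required field and the out-of-enum value, which is exactly the feedback loop you want when validating model output.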
Nested structures
Schemas can be complex:
{
  "type": "object",
  "properties": {
    "customer": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "contact": {
          "type": "object",
          "properties": {
            "email": {"type": "string"},
            "phone": {"type": "string"}
          }
        }
      }
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "product": {"type": "string"},
          "quantity": {"type": "number"}
        }
      }
    }
  }
}
This describes nested objects and arrays, perfect for complex data extraction.
Function calling explained
Function calling (also called tool use) lets models request specific actions with structured parameters.
Instead of: "Please search for 'AI news' using the search tool"
The model outputs:
{
  "function": "web_search",
  "parameters": {
    "query": "AI news",
    "limit": 10
  }
}
Your code interprets this, runs the function, and returns results to the model.
How it works
- Define available functions with schemas
- Model decides when to call them based on the user's request
- Model outputs a function call (structured JSON)
- Your code executes the function and returns results
- Model uses results to continue or finish the task
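These five steps amount to a loop your code drives. A provider-agnostic sketch follows; `llm_step`, the message shapes, and the `calculate` tool are all stand-ins for whichever real API and tools you use.

```python
def fake_calculate(expression: str) -> str:
    # Placeholder tool; a real one would safely evaluate the expression.
    return "3577"

# Registry mapping tool names to callables (names are illustrative).
TOOLS = {"calculate": fake_calculate}

def run_agent_loop(llm_step, user_message, max_turns=5):
    """Drive the model until it returns a final answer instead of a call.

    `llm_step(messages)` is assumed to return either
    {"type": "text", "content": ...} or
    {"type": "function_call", "name": ..., "arguments": {...}}.
    """
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = llm_step(messages)
        if reply["type"] == "text":
            return reply["content"]  # step 5: model finishes
        fn = TOOLS[reply["name"]]                     # step 3: structured call
        result = fn(**reply["arguments"])             # step 4: your code executes
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    raise RuntimeError("agent did not finish")

def scripted_llm(messages):
    """Stand-in for a real model: first requests a tool, then answers."""
    if any(m["role"] == "tool" for m in messages):
        return {"type": "text", "content": "The result is 3,577."}
    return {"type": "function_call", "name": "calculate",
            "arguments": {"expression": "15 * 234 + 67"}}

print(run_agent_loop(scripted_llm, "What's 15 times 234 plus 67?"))
```

The cap on `max_turns` matters in practice: without it, a confused model can loop on tool calls indefinitely.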
Example: Calculator function
Function definition:
{
  "name": "calculate",
  "description": "Perform mathematical calculations",
  "parameters": {
    "type": "object",
    "properties": {
      "expression": {
        "type": "string",
        "description": "Math expression to evaluate (e.g., '15 * 234 + 67')"
      }
    },
    "required": ["expression"]
  }
}
User: "What's 15 times 234 plus 67?"
Model outputs:
{
  "function": "calculate",
  "parameters": {
    "expression": "15 * 234 + 67"
  }
}
Your code: Runs eval("15 * 234 + 67") → returns 3577
Model: "The result is 3,577."
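A caution on that last step: calling raw `eval` on model output is dangerous, since the model (or a prompt injection) could emit arbitrary code. A sketch of a safer arithmetic-only evaluator using the stdlib `ast` module:

```python
import ast
import operator

# Only these operators are permitted; everything else is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expression: str):
    """Evaluate +, -, *, / over numeric literals; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression: {ast.dump(node)}")
    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("15 * 234 + 67"))  # 3577
```

Attempts like `safe_eval("__import__('os')")` raise `ValueError` instead of executing code.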
Provider-specific implementations
Each AI provider implements structured output slightly differently.
OpenAI
JSON mode (basic):
import json

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract customer info as JSON with name and email fields."},
        {"role": "user", "content": "Customer John Doe contacted us at john@example.com"}
    ]
)
data = json.loads(response.choices[0].message.content)
Structured outputs (with schema):
from pydantic import BaseModel

class CustomerInfo(BaseModel):
    customer_name: str
    email: str
    priority: str

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[...],
    response_format=CustomerInfo
)
data = response.choices[0].message.parsed
# data.customer_name, data.email are guaranteed to exist
Function calling:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# Check if model wants to call a function
if response.choices[0].message.tool_calls:
    function_call = response.choices[0].message.tool_calls[0]
    # function_call.function.name == "get_weather"
    # function_call.function.arguments == '{"location": "Tokyo"}'
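To complete the round trip, your code runs the function and sends the result back as a `tool` message keyed by the call's id, then asks the model to continue. The helper below builds that message; the commented lines sketch the full flow (`get_weather` is a hypothetical implementation you would supply, and exact SDK details may vary by version, so check the OpenAI docs).

```python
import json

def tool_result_message(tool_call_id: str, result) -> dict:
    """Build the follow-up message that feeds a tool result back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": json.dumps(result)}

# Sketch of the full loop (using `client`, `response`, `tools` from above):
# call = response.choices[0].message.tool_calls[0]
# args = json.loads(call.function.arguments)
# result = get_weather(**args)                       # your implementation
# messages.append(response.choices[0].message)       # the assistant's tool call
# messages.append(tool_result_message(call.id, result))
# final = client.chat.completions.create(model="gpt-4-turbo",
#                                        messages=messages, tools=tools)

print(tool_result_message("call_123", {"temp_c": 18}))
```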
Anthropic (Claude)
Claude uses tool use instead of function calling:
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "Weather in Paris?"}]
)

# Check for tool use
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {block.input}")
Structured output via prompt:
import json

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """Extract customer info from this message and return ONLY valid JSON with this exact structure:
{
  "customer_name": "string",
  "email": "string",
  "priority": "low" | "medium" | "high"
}

Message: John Doe contacted us urgently at john@example.com"""
        }
    ]
)
data = json.loads(response.content[0].text)
Google (Gemini)
import google.generativeai as genai

model = genai.GenerativeModel('gemini-pro')

# Function calling: declarations are wrapped in a tool object
tools = [
    {
        "function_declarations": [
            {
                "name": "search_database",
                "description": "Search customer database",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"},
                        "limit": {"type": "integer"}
                    },
                    "required": ["query"]
                }
            }
        ]
    }
]

chat = model.start_chat()
response = chat.send_message(
    "Find customers named John",
    tools=tools
)

# Check for function calls
for part in response.parts:
    if fn := part.function_call:
        print(f"Function: {fn.name}")
        print(f"Args: {fn.args}")
Validation and error handling
Structured output reduces errors, but doesn't eliminate them. Always validate.
Schema validation
import json

from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "customer_name": {"type": "string", "minLength": 1},
        "email": {"type": "string", "pattern": "^[a-zA-Z0-9+_.-]+@[a-zA-Z0-9.-]+$"}
    },
    "required": ["customer_name", "email"]
}

try:
    response_json = json.loads(llm_response)
    validate(instance=response_json, schema=schema)
    # Safe to use
    process_customer(response_json)
except json.JSONDecodeError:
    # LLM didn't return valid JSON (rare with structured output)
    log_error("Invalid JSON from LLM")
except ValidationError as e:
    # JSON doesn't match schema
    log_error(f"Schema validation failed: {e.message}")
Pydantic for type safety
Pydantic provides runtime validation and type hints:
from pydantic import BaseModel, EmailStr, ValidationError, validator

class CustomerInfo(BaseModel):
    customer_name: str
    email: EmailStr  # Validates email format
    priority: str

    @validator('priority')
    def validate_priority(cls, v):
        if v not in ['low', 'medium', 'high']:
            raise ValueError('Priority must be low, medium, or high')
        return v

    @validator('customer_name')
    def validate_name(cls, v):
        if len(v.strip()) == 0:
            raise ValueError('Name cannot be empty')
        return v.strip()

# Usage
try:
    customer = CustomerInfo(**response_json)
    # All fields validated, types guaranteed
except ValidationError as e:
    print(e.json())
Handling missing fields
Even with required fields, always have fallbacks:
customer_name = response_json.get("customer_name", "Unknown Customer")
email = response_json.get("email")

if not email:
    # Retry with more explicit prompt
    retry_response = llm.generate_with_schema(
        prompt="IMPORTANT: Extract the email address. If no email is present, return null.",
        schema=schema
    )
Common patterns and use cases
1. Data extraction from unstructured text
Use case: Extract structured data from customer emails, forms, or documents.
schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "price": {"type": "number"}
                }
            }
        },
        "total": {"type": "number"},
        "shipping_address": {"type": "string"}
    }
}

# LLM extracts structured order info from free-form text
order_data = extract_with_schema(customer_email, schema)
create_order_in_database(order_data)
2. Classification and routing
Use case: Categorize support tickets, emails, or content.
schema = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "technical", "sales", "general"]
        },
        "priority": {
            "type": "string",
            "enum": ["low", "medium", "high", "urgent"]
        },
        "requires_human": {"type": "boolean"},
        "suggested_response": {"type": "string"}
    },
    "required": ["category", "priority", "requires_human"]
}

ticket_info = classify_ticket(ticket_text, schema)

# Route based on structured output
if ticket_info["requires_human"]:
    assign_to_agent(ticket_info["category"])
else:
    send_auto_response(ticket_info["suggested_response"])
3. API parameter generation
Use case: Convert natural language to API calls.
# Function definition for a CRM API
create_contact_tool = {
    "name": "create_contact",
    "description": "Create a new contact in the CRM",
    "parameters": {
        "type": "object",
        "properties": {
            "first_name": {"type": "string"},
            "last_name": {"type": "string"},
            "email": {"type": "string"},
            "company": {"type": "string"},
            "phone": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["first_name", "last_name", "email"]
    }
}

# User: "Add Sarah Chen from Acme Corp, email sarah@acme.com, tag as VIP customer"
# Model outputs structured function call
function_call = {
    "function": "create_contact",
    "parameters": {
        "first_name": "Sarah",
        "last_name": "Chen",
        "email": "sarah@acme.com",
        "company": "Acme Corp",
        "tags": ["VIP", "customer"]
    }
}

# Execute API call with validated parameters
crm_api.create_contact(**function_call["parameters"])
4. Multi-step workflows
Use case: Break complex tasks into structured steps.
workflow_schema = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["search", "analyze", "summarize", "notify"]
                    },
                    "parameters": {"type": "object"},
                    "depends_on": {"type": "array", "items": {"type": "integer"}}
                }
            }
        }
    }
}

# Model plans a workflow
plan = generate_plan("Research AI safety and send summary to team", workflow_schema)

# Execute steps in order
for step in plan["steps"]:
    execute_step(step["action"], step["parameters"])
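Running steps in listed order ignores the `depends_on` field the schema defines. A sketch of ordering steps so dependencies run first (a simple Kahn-style topological sort, assuming `depends_on` holds 0-based indices into the list):

```python
def execution_order(steps):
    """Return step indices ordered so each step's dependencies come first."""
    remaining = set(range(len(steps)))
    done, order = set(), []
    while remaining:
        # A step is ready once all of its dependencies have completed.
        ready = [i for i in remaining
                 if set(steps[i].get("depends_on", [])) <= done]
        if not ready:
            raise ValueError("cyclic or unsatisfiable depends_on")
        for i in sorted(ready):
            order.append(i)
            done.add(i)
            remaining.remove(i)
    return order

steps = [
    {"action": "summarize", "depends_on": [1]},
    {"action": "search", "depends_on": []},
    {"action": "notify", "depends_on": [0]},
]
print(execution_order(steps))  # [1, 0, 2]
```

The cycle check matters with model-generated plans: a confused model can easily emit steps that depend on each other.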
5. Database operations
Use case: Natural language to database queries.
db_query_tool = {
    "name": "query_database",
    "description": "Query the customer database",
    "parameters": {
        "type": "object",
        "properties": {
            "table": {
                "type": "string",
                "enum": ["customers", "orders", "products"]
            },
            "filters": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "field": {"type": "string"},
                        "operator": {"type": "string", "enum": ["equals", "contains", "greater_than", "less_than"]},
                        "value": {"type": "string"}
                    }
                }
            },
            "limit": {"type": "integer", "maximum": 100}
        },
        "required": ["table"]
    }
}

# User: "Show me customers in California who spent over $1000"
# Model generates structured query
query = {
    "function": "query_database",
    "parameters": {
        "table": "customers",
        "filters": [
            {"field": "state", "operator": "equals", "value": "California"},
            {"field": "total_spent", "operator": "greater_than", "value": "1000"}
        ],
        "limit": 50
    }
}
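Rather than interpolating model output into SQL strings, translate the structured query into a parameterized statement behind an allow-list of tables and fields. A sketch (table and field names here are illustrative, not from a real schema):

```python
# Allow-list: only these tables/fields may appear in generated queries.
ALLOWED = {"customers": {"state", "total_spent"},
           "orders": {"status", "total"}}
OPS = {"equals": "=", "contains": "LIKE",
       "greater_than": ">", "less_than": "<"}

def build_query(params):
    """Turn a model-generated query dict into (sql, args) for a DB driver."""
    table = params["table"]
    if table not in ALLOWED:
        raise ValueError(f"table not allowed: {table}")
    clauses, args = [], []
    for f in params.get("filters", []):
        if f["field"] not in ALLOWED[table]:
            raise ValueError(f"field not allowed: {f['field']}")
        clauses.append(f"{f['field']} {OPS[f['operator']]} ?")
        args.append(f"%{f['value']}%" if f["operator"] == "contains" else f["value"])
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " LIMIT ?"
    args.append(min(int(params.get("limit", 50)), 100))  # enforce schema maximum
    return sql, args

query_params = {
    "table": "customers",
    "filters": [
        {"field": "state", "operator": "equals", "value": "California"},
        {"field": "total_spent", "operator": "greater_than", "value": "1000"}
    ],
    "limit": 50
}
sql, args = build_query(query_params)
```

Only values travel as bound parameters; identifiers come from the allow-list, so a hallucinated field name fails fast instead of reaching the database.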
Troubleshooting
Problem: Model returns text instead of JSON
Symptoms: You expect JSON but get "Here's the information you requested..."
Solutions:
- Use provider-specific JSON mode (OpenAI's response_format)
- Add explicit instructions: "Return ONLY valid JSON, no other text"
- Use structured output APIs, not basic completion
- Check your system prompt clearly specifies JSON output
Problem: Schema validation fails
Symptoms: JSON is valid but doesn't match your schema
Solutions:
- Simplify the schema (complex nested schemas confuse models)
- Add detailed description fields to each property
- Show examples in your prompt
- Use enum for constrained choices instead of free text
- Make fewer fields required initially
Problem: Inconsistent field names
Symptoms: Sometimes customerName, sometimes customer_name
Solutions:
- Explicitly specify field names in the schema
- Use additionalProperties: false to prevent extra fields
- Provide an example in your prompt showing exact field names
- Use TypeScript/Pydantic schemas that enforce consistency
Problem: Hallucinated data
Symptoms: Model invents information to fill required fields
Solutions:
- Make fields optional unless truly required
- Add validation that checks for placeholder values
- Prompt the model to use null for missing information
- Include instructions: "Only extract information explicitly present in the text"
Problem: Function calling fails
Symptoms: Model doesn't call functions or calls the wrong one
Solutions:
- Improve function descriptions (be specific about when to use each)
- Reduce the number of available functions (fewer choices = better accuracy)
- Add examples of correct function usage in the prompt
- Make parameter descriptions clearer
- Check function names are descriptive (not func1, func2)
Problem: Performance is slow
Symptoms: Structured output takes longer than free text
Solutions:
- This is normal (constrained generation is slower)
- Use smaller models for simple schemas
- Simplify your schema
- Cache results for repeated queries
- Use parallel function calls when possible
Best practices
Schema design
1. Start simple, add complexity gradually
// Start with this
{"type": "object", "properties": {"name": {"type": "string"}}}
// Not this
{"type": "object", "properties": {"person": {"type": "object", "properties": {...}}}}
2. Use enums for constrained values
// Good
{"type": "string", "enum": ["small", "medium", "large"]}
// Bad (model might return "M", "med", "MEDIUM")
{"type": "string", "description": "Size: small, medium, or large"}
3. Provide clear descriptions
{
  "properties": {
    "confidence": {
      "type": "number",
      "description": "Confidence score from 0.0 to 1.0, where 1.0 is highest confidence",
      "minimum": 0,
      "maximum": 1
    }
  }
}
4. Make fields optional when appropriate
{
  "required": ["customer_name"],  // Only truly required fields
  "properties": {
    "customer_name": {"type": "string"},
    "phone": {"type": "string"},  // Optional, not everyone has one
    "notes": {"type": "string"}   // Optional
  }
}
Prompt engineering for structure
1. Show examples
Extract customer info as JSON. Example output:
{
  "customer_name": "Jane Smith",
  "email": "jane@example.com",
  "priority": "high"
}
Now extract from this message: [message]
2. Be explicit about edge cases
Extract fields. If a field is not present in the text, use null.
Do not invent or guess information.
3. Use system prompts for consistency
system_prompt = """You are a data extraction assistant.
Always return valid JSON matching the provided schema.
Extract only information explicitly stated in the input.
Use null for missing fields. Never invent data."""
Error handling strategy
def extract_with_retry(text, schema, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = llm.generate_structured(text, schema)
            validated = validate_schema(response, schema)
            return validated
        except ValidationError as e:
            if attempt == max_retries - 1:
                # Final attempt failed
                log_error(f"Schema validation failed after {max_retries} attempts")
                return get_safe_default()
            else:
                # Retry with more explicit instructions
                text = f"{text}\n\nPrevious attempt failed: {e.message}. Please ensure all required fields are present."
        except Exception as e:
            log_error(f"Unexpected error: {e}")
            return get_safe_default()
Testing and validation
1. Test with edge cases
- Empty input
- Missing fields
- Unexpected formats
- Very long input
- Special characters
2. Monitor in production
def track_structured_output(response, schema_name):
    metrics.increment(f"structured_output.{schema_name}.total")
    if validation_failed(response):
        metrics.increment(f"structured_output.{schema_name}.validation_failed")
        log_sample(response, schema_name)
    if has_null_required_fields(response):
        metrics.increment(f"structured_output.{schema_name}.missing_required")
3. A/B test schemas
- Try different field names
- Test with/without descriptions
- Compare simple vs. nested structures
- Measure accuracy and latency
Use responsibly
- Validate all outputs: Never trust structured output blindly
- Handle failures gracefully: Have fallback behavior when parsing fails
- Don't over-constrain: Overly strict schemas increase failure and hallucination rates
- Monitor costs: Function calling increases token usage
- Test edge cases: Empty inputs, missing data, special characters
- Log failures: Track validation errors to improve schemas
- Privacy matters: Structured output can leak sensitive data if logged
What's next?
Now that you understand structured output, you might explore:
- Agents & Tools: Building AI systems that take actions
- Evaluating AI Answers: Measuring accuracy of extracted data
- Orchestration Options: Frameworks like LangChain for managing function calls
- APIs & Integration: Connecting AI to your existing systems
- Guardrails & Policy: Setting boundaries on what AI can output
Frequently Asked Questions
Is structured output the same as function calling?
Not exactly. Structured output ensures responses match a JSON schema. Function calling is a specific use case where the model requests to call functions with structured parameters. Function calling uses structured output under the hood.
Can I use structured output with any LLM?
Most modern LLMs support it (GPT-4, Claude 3, Gemini Pro, etc.), but implementation varies. Some have native support (OpenAI's JSON mode), others need careful prompting. Check your provider's documentation.
Will structured output make responses slower?
Yes, slightly. Constraining generation to valid JSON takes more computation than free text. The difference is usually negligible (<500ms) but can add up with complex schemas or many requests.
What happens if the model can't fill all required fields?
Behavior varies by provider. Some will hallucinate data to satisfy the schema, others will return an error. Always validate outputs and make fields optional when appropriate.
Can I combine multiple function calls in one response?
Yes! Many providers support parallel function calling where the model requests multiple functions simultaneously. Useful for complex tasks requiring several API calls.
How do I debug when the model returns wrong data?
Check: (1) Is your schema description clear? (2) Are you providing examples? (3) Is the input ambiguous? (4) Try simplifying the schema. Log failures and iterate on prompts and schema definitions.
Key Terms Used in This Guide
Model
The trained AI system that contains all the patterns it learned from data. Think of it as the 'brain' that makes predictions or decisions.
Tool (Function Calling)
A capability that allows an AI to call external functions or APIs, like searching the web, querying databases, or running calculations.
AI (Artificial Intelligence)
Making machines perform tasks that typically require human intelligence, like understanding language, recognizing patterns, or making decisions.
Related Guides
Context Management: Handling Long Conversations and Documents
Intermediate: Master context window management for AI. Learn strategies for long conversations, document processing, memory systems, and context optimization.
Deployment Patterns: Serverless, Edge, and Containers
Intermediate: How to deploy AI systems in production. Compare serverless, edge, container, and self-hosted options.
Fine-Tuning vs RAG: Which Should You Use?
Intermediate: Compare fine-tuning and RAG to customize AI. Learn when each approach works best, how they differ, and how to combine them.