Structured Output and Function Calling: Getting Reliable JSON from AI
Learn how to get reliable, parseable JSON output from AI models using structured output, function calling, and JSON schema. Essential for production AI applications.
TL;DR
Structured output ensures AI models return data in predictable formats (like JSON) that your code can reliably parse. Function calling lets models request specific actions with structured parameters. Together, they transform LLMs from conversational interfaces into reliable components in production systems.
Why it matters
Free-form text responses are great for chat, but terrible for automation. When you need an AI to extract customer data, update databases, or trigger workflows, you need guarantees that the output will be valid JSON, not creative prose. Structured output makes AI production-ready.
The problem with free-form text
Ask an LLM to extract information, and you'll get creative but inconsistent responses:
Prompt: "Extract the customer name and email from this message."
Possible responses:
- "The customer's name is John Doe and their email is john@example.com"
- "Name: John Doe, Email: john@example.com"
- "Sure! The name is John Doe (email: john@example.com)"
- "John Doe john@example.com"
All correct, but how do you parse this reliably? You can't. Every format requires different parsing logic, and edge cases will break your code.
The parsing nightmare
Imagine extracting structured data from thousands of customer messages:
# This is fragile and will break
response = llm.generate("Extract name and email from: " + message)

# Which parser do you use?
if "name is" in response:
    name = response.split("name is ")[1].split(" and")[0]
elif "Name:" in response:
    name = response.split("Name: ")[1].split(",")[0]
# ... endless edge cases
This approach fails the moment the LLM gets creative with formatting.
Enter structured output
Structured output (also called JSON mode or constrained generation) forces the model to return valid JSON that matches a specific schema. No more parsing guesswork.
Same prompt, with structured output:
{
  "customer_name": "John Doe",
  "email": "john@example.com"
}
Every time. Guaranteed. Your code can rely on response["customer_name"] existing.
How structured output works
Behind the scenes, the model's token generation is constrained to only produce valid JSON:
- You define a schema: "I want an object with customer_name and email fields"
- The model generates: while producing tokens, invalid JSON is prevented
- You receive: Guaranteed-valid JSON matching your schema
The model can't output "The customer's name is..." because that's not valid JSON according to your schema.
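To build intuition, here is a toy sketch of the masking idea. This is not how production decoders work (they compile the schema into a grammar over the tokenizer's vocabulary); the brute-force `valid_prefix` heuristic below is purely illustrative.

```python
import json

def valid_prefix(text: str) -> bool:
    """Heuristic: could `text` still grow into valid JSON?

    Real constrained decoders use a grammar/state machine over tokens;
    this check only illustrates the idea of rejecting bad continuations.
    """
    try:
        json.loads(text)
        return True  # already complete, valid JSON
    except json.JSONDecodeError as e:
        # Errors at the very end of the input mean the prefix is merely
        # incomplete; errors earlier mean it is already broken.
        return e.pos >= len(text) - 1

# A constrained decoder keeps only the candidates that stay inside JSON.
candidates = ['{"customer_name": ', 'The customer name is ', '{"customer_name"']
allowed = [c for c in candidates if valid_prefix(c)]
```

In a real implementation this filtering happens per token, over the model's entire vocabulary, at every decoding step.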
Jargon: "Constrained generation"
Limiting the model's outputs to only valid tokens for a specific format. For JSON, this means the next token must be valid JSON syntax; no free-form text allowed.
JSON schema basics
JSON Schema is a standard for describing JSON structure. It's how you tell the model what format you want.
Simple example
{
  "type": "object",
  "properties": {
    "customer_name": {
      "type": "string",
      "description": "Full name of the customer"
    },
    "email": {
      "type": "string",
      "description": "Customer's email address"
    },
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high"],
      "description": "Urgency level"
    }
  },
  "required": ["customer_name", "email"]
}
What this means:
- Object with properties: The response must be a JSON object
- customer_name: A string (required)
- email: A string (required)
- priority: One of "low", "medium", or "high" (optional)
- Descriptions: Help the model understand what to put in each field
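To make these keywords concrete, here is a tiny hand-rolled checker covering just `type`, `properties`, `required`, and `enum`. It is a teaching aid only; real code should use a library such as jsonschema (shown later in this guide).

```python
def check(instance, schema):
    """Validate a small subset of JSON Schema: type, properties, required, enum.
    Returns a list of error strings (empty list means valid)."""
    errors = []
    if schema.get("type") == "object":
        if not isinstance(instance, dict):
            return [f"expected object, got {type(instance).__name__}"]
        for field in schema.get("required", []):
            if field not in instance:
                errors.append(f"missing required field: {field}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors += check(instance[key], subschema)
    elif schema.get("type") == "string":
        if not isinstance(instance, str):
            errors.append(f"expected string, got {type(instance).__name__}")
        elif "enum" in schema and instance not in schema["enum"]:
            errors.append(f"{instance!r} not in {schema['enum']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "customer_name": {"type": "string"},
        "email": {"type": "string"},
        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["customer_name", "email"],
}

good = {"customer_name": "John Doe", "email": "john@example.com"}
bad = {"customer_name": "John Doe", "priority": "critical"}  # no email, bad enum
```

Running `check(bad, schema)` reports both the missing required field and the out-of-enum value, which is exactly the feedback loop you want when validating model output.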
Nested structures
Schemas can be complex:
{
  "type": "object",
  "properties": {
    "customer": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "contact": {
          "type": "object",
          "properties": {
            "email": {"type": "string"},
            "phone": {"type": "string"}
          }
        }
      }
    },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "product": {"type": "string"},
          "quantity": {"type": "number"}
        }
      }
    }
  }
}
This describes nested objects and arrays, perfect for complex data extraction.
Function calling explained
Function calling (also called tool use) lets models request specific actions with structured parameters.
Instead of: "Please search for 'AI news' using the search tool"
The model outputs:
{
  "function": "web_search",
  "parameters": {
    "query": "AI news",
    "limit": 10
  }
}
Your code interprets this, runs the function, and returns results to the model.
How it works
- Define available functions with schemas
- Model decides when to call them based on the user's request
- Model outputs a function call (structured JSON)
- Your code executes the function and returns results
- Model uses results to continue or finish the task
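These five steps amount to a loop your code drives. A provider-agnostic sketch follows; `llm_step`, the message shapes, and the `calculate` tool are all stand-ins for whichever real API and tools you use.

```python
def fake_calculate(expression: str) -> str:
    # Placeholder tool; a real one would safely evaluate the expression.
    return "3577"

# Registry mapping tool names to callables (names are illustrative).
TOOLS = {"calculate": fake_calculate}

def run_agent_loop(llm_step, user_message, max_turns=5):
    """Drive the model until it returns a final answer instead of a call.

    `llm_step(messages)` is assumed to return either
    {"type": "text", "content": ...} or
    {"type": "function_call", "name": ..., "arguments": {...}}.
    """
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = llm_step(messages)
        if reply["type"] == "text":
            return reply["content"]  # step 5: model finishes
        fn = TOOLS[reply["name"]]                     # step 3: structured call
        result = fn(**reply["arguments"])             # step 4: your code executes
        messages.append({"role": "tool", "name": reply["name"], "content": result})
    raise RuntimeError("agent did not finish")

def scripted_llm(messages):
    """Stand-in for a real model: first requests a tool, then answers."""
    if any(m["role"] == "tool" for m in messages):
        return {"type": "text", "content": "The result is 3,577."}
    return {"type": "function_call", "name": "calculate",
            "arguments": {"expression": "15 * 234 + 67"}}

print(run_agent_loop(scripted_llm, "What's 15 times 234 plus 67?"))
```

The cap on `max_turns` matters in practice: without it, a confused model can loop on tool calls indefinitely.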
Example: Calculator function
Function definition:
{
  "name": "calculate",
  "description": "Perform mathematical calculations",
  "parameters": {
    "type": "object",
    "properties": {
      "expression": {
        "type": "string",
        "description": "Math expression to evaluate (e.g., '15 * 234 + 67')"
      }
    },
    "required": ["expression"]
  }
}
User: "What's 15 times 234 plus 67?"
Model outputs:
{
  "function": "calculate",
  "parameters": {
    "expression": "15 * 234 + 67"
  }
}
Your code: Runs eval("15 * 234 + 67") → returns 3577
Model: "The result is 3,577."
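A caution on that last step: calling raw `eval` on model output is dangerous, since the model (or a prompt injection) could emit arbitrary code. A sketch of a safer arithmetic-only evaluator using the stdlib `ast` module:

```python
import ast
import operator

# Only these operators are permitted; everything else is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_eval(expression: str):
    """Evaluate +, -, *, / over numeric literals; reject anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression: {ast.dump(node)}")
    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("15 * 234 + 67"))  # 3577
```

Attempts like `safe_eval("__import__('os')")` raise `ValueError` instead of executing code.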
Provider-specific implementations
Each AI provider implements structured output slightly differently.
OpenAI
JSON mode (basic):
import json

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract customer info as JSON with name and email fields."},
        {"role": "user", "content": "Customer John Doe contacted us at john@example.com"}
    ]
)
data = json.loads(response.choices[0].message.content)
Structured outputs (with schema):
from pydantic import BaseModel

class CustomerInfo(BaseModel):
    customer_name: str
    email: str
    priority: str

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[...],
    response_format=CustomerInfo
)
data = response.choices[0].message.parsed
# data.customer_name, data.email are guaranteed to exist
Function calling:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# Check if model wants to call a function
if response.choices[0].message.tool_calls:
    function_call = response.choices[0].message.tool_calls[0]
    # function_call.function.name == "get_weather"
    # function_call.function.arguments == '{"location": "Tokyo"}'
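To complete the round trip, your code runs the function and sends the result back as a `tool` message keyed by the call's id, then asks the model to continue. The helper below builds that message; the commented lines sketch the full flow (`get_weather` is a hypothetical implementation you would supply, and exact SDK details may vary by version, so check the OpenAI docs).

```python
import json

def tool_result_message(tool_call_id: str, result) -> dict:
    """Build the follow-up message that feeds a tool result back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": json.dumps(result)}

# Sketch of the full loop (using `client`, `response`, `tools` from above):
# call = response.choices[0].message.tool_calls[0]
# args = json.loads(call.function.arguments)
# result = get_weather(**args)                       # your implementation
# messages.append(response.choices[0].message)       # the assistant's tool call
# messages.append(tool_result_message(call.id, result))
# final = client.chat.completions.create(model="gpt-4-turbo",
#                                        messages=messages, tools=tools)

print(tool_result_message("call_123", {"temp_c": 18}))
```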
Anthropic (Claude)
Claude uses tool use instead of function calling:
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "name": "get_weather",
            "description": "Get weather for a location",
            "input_schema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    ],
    messages=[{"role": "user", "content": "Weather in Paris?"}]
)

# Check for tool use
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {block.input}")
Structured output via prompt:
import json

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": """Extract customer info from this message and return ONLY valid JSON with this exact structure:
{
  "customer_name": "string",
  "email": "string",
  "priority": "low" | "medium" | "high"
}

Message: John Doe contacted us urgently at john@example.com"""
        }
    ]
)
data = json.loads(response.content[0].text)
Google (Gemini)
import google.generativeai as genai

model = genai.GenerativeModel('gemini-pro')

# Function calling: declarations are wrapped in a tool object
tools = [
    {
        "function_declarations": [
            {
                "name": "search_database",
                "description": "Search customer database",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"},
                        "limit": {"type": "integer"}
                    },
                    "required": ["query"]
                }
            }
        ]
    }
]

chat = model.start_chat()
response = chat.send_message(
    "Find customers named John",
    tools=tools
)

# Check for function calls
for part in response.parts:
    if fn := part.function_call:
        print(f"Function: {fn.name}")
        print(f"Args: {fn.args}")
Validation and error handling
Structured output reduces errors, but doesn't eliminate them. Always validate.
Schema validation
import json

from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "customer_name": {"type": "string", "minLength": 1},
        "email": {"type": "string", "pattern": "^[a-zA-Z0-9+_.-]+@[a-zA-Z0-9.-]+$"}
    },
    "required": ["customer_name", "email"]
}

try:
    response_json = json.loads(llm_response)
    validate(instance=response_json, schema=schema)
    # Safe to use
    process_customer(response_json)
except json.JSONDecodeError:
    # LLM didn't return valid JSON (rare with structured output)
    log_error("Invalid JSON from LLM")
except ValidationError as e:
    # JSON doesn't match schema
    log_error(f"Schema validation failed: {e.message}")
Pydantic for type safety
Pydantic provides runtime validation and type hints:
from pydantic import BaseModel, EmailStr, ValidationError, validator

class CustomerInfo(BaseModel):
    customer_name: str
    email: EmailStr  # Validates email format
    priority: str

    @validator('priority')
    def validate_priority(cls, v):
        if v not in ['low', 'medium', 'high']:
            raise ValueError('Priority must be low, medium, or high')
        return v

    @validator('customer_name')
    def validate_name(cls, v):
        if len(v.strip()) == 0:
            raise ValueError('Name cannot be empty')
        return v.strip()

# Usage
try:
    customer = CustomerInfo(**response_json)
    # All fields validated, types guaranteed
except ValidationError as e:
    print(e.json())
Handling missing fields
Even with required fields, always have fallbacks:
customer_name = response_json.get("customer_name", "Unknown Customer")
email = response_json.get("email")

if not email:
    # Retry with more explicit prompt
    retry_response = llm.generate_with_schema(
        prompt="IMPORTANT: Extract the email address. If no email is present, return null.",
        schema=schema
    )
Common patterns and use cases
1. Data extraction from unstructured text
Use case: Extract structured data from customer emails, forms, or documents.
schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "products": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "price": {"type": "number"}
                }
            }
        },
        "total": {"type": "number"},
        "shipping_address": {"type": "string"}
    }
}

# LLM extracts structured order info from free-form text
order_data = extract_with_schema(customer_email, schema)
create_order_in_database(order_data)
2. Classification and routing
Use case: Categorize support tickets, emails, or content.
schema = {
    "type": "object",
    "properties": {
        "category": {
            "type": "string",
            "enum": ["billing", "technical", "sales", "general"]
        },
        "priority": {
            "type": "string",
            "enum": ["low", "medium", "high", "urgent"]
        },
        "requires_human": {"type": "boolean"},
        "suggested_response": {"type": "string"}
    },
    "required": ["category", "priority", "requires_human"]
}

ticket_info = classify_ticket(ticket_text, schema)

# Route based on structured output
if ticket_info["requires_human"]:
    assign_to_agent(ticket_info["category"])
else:
    send_auto_response(ticket_info["suggested_response"])
3. API parameter generation
Use case: Convert natural language to API calls.
# Function definition for a CRM API
create_contact_tool = {
    "name": "create_contact",
    "description": "Create a new contact in the CRM",
    "parameters": {
        "type": "object",
        "properties": {
            "first_name": {"type": "string"},
            "last_name": {"type": "string"},
            "email": {"type": "string"},
            "company": {"type": "string"},
            "phone": {"type": "string"},
            "tags": {"type": "array", "items": {"type": "string"}}
        },
        "required": ["first_name", "last_name", "email"]
    }
}

# User: "Add Sarah Chen from Acme Corp, email sarah@acme.com, tag as VIP customer"
# Model outputs structured function call
function_call = {
    "function": "create_contact",
    "parameters": {
        "first_name": "Sarah",
        "last_name": "Chen",
        "email": "sarah@acme.com",
        "company": "Acme Corp",
        "tags": ["VIP", "customer"]
    }
}

# Execute API call with validated parameters
crm_api.create_contact(**function_call["parameters"])
4. Multi-step workflows
Use case: Break complex tasks into structured steps.
workflow_schema = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["search", "analyze", "summarize", "notify"]
                    },
                    "parameters": {"type": "object"},
                    "depends_on": {"type": "array", "items": {"type": "integer"}}
                }
            }
        }
    }
}

# Model plans a workflow
plan = generate_plan("Research AI safety and send summary to team", workflow_schema)

# Execute steps in order
for step in plan["steps"]:
    execute_step(step["action"], step["parameters"])
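Running steps in listed order ignores the `depends_on` field the schema defines. A sketch of ordering steps so dependencies run first (a simple Kahn-style topological sort, assuming `depends_on` holds 0-based indices into the list):

```python
def execution_order(steps):
    """Return step indices ordered so each step's dependencies come first."""
    remaining = set(range(len(steps)))
    done, order = set(), []
    while remaining:
        # A step is ready once all of its dependencies have completed.
        ready = [i for i in remaining
                 if set(steps[i].get("depends_on", [])) <= done]
        if not ready:
            raise ValueError("cyclic or unsatisfiable depends_on")
        for i in sorted(ready):
            order.append(i)
            done.add(i)
            remaining.remove(i)
    return order

steps = [
    {"action": "summarize", "depends_on": [1]},
    {"action": "search", "depends_on": []},
    {"action": "notify", "depends_on": [0]},
]
print(execution_order(steps))  # [1, 0, 2]
```

The cycle check matters with model-generated plans: a confused model can easily emit steps that depend on each other.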
5. Database operations
Use case: Natural language to database queries.
db_query_tool = {
    "name": "query_database",
    "description": "Query the customer database",
    "parameters": {
        "type": "object",
        "properties": {
            "table": {
                "type": "string",
                "enum": ["customers", "orders", "products"]
            },
            "filters": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "field": {"type": "string"},
                        "operator": {"type": "string", "enum": ["equals", "contains", "greater_than", "less_than"]},
                        "value": {"type": "string"}
                    }
                }
            },
            "limit": {"type": "integer", "maximum": 100}
        },
        "required": ["table"]
    }
}

# User: "Show me customers in California who spent over $1000"
# Model generates structured query
query = {
    "function": "query_database",
    "parameters": {
        "table": "customers",
        "filters": [
            {"field": "state", "operator": "equals", "value": "California"},
            {"field": "total_spent", "operator": "greater_than", "value": "1000"}
        ],
        "limit": 50
    }
}
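Rather than interpolating model output into SQL strings, translate the structured query into a parameterized statement behind an allow-list of tables and fields. A sketch (table and field names here are illustrative, not from a real schema):

```python
# Allow-list: only these tables/fields may appear in generated queries.
ALLOWED = {"customers": {"state", "total_spent"},
           "orders": {"status", "total"}}
OPS = {"equals": "=", "contains": "LIKE",
       "greater_than": ">", "less_than": "<"}

def build_query(params):
    """Turn a model-generated query dict into (sql, args) for a DB driver."""
    table = params["table"]
    if table not in ALLOWED:
        raise ValueError(f"table not allowed: {table}")
    clauses, args = [], []
    for f in params.get("filters", []):
        if f["field"] not in ALLOWED[table]:
            raise ValueError(f"field not allowed: {f['field']}")
        clauses.append(f"{f['field']} {OPS[f['operator']]} ?")
        args.append(f"%{f['value']}%" if f["operator"] == "contains" else f["value"])
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " LIMIT ?"
    args.append(min(int(params.get("limit", 50)), 100))  # enforce schema maximum
    return sql, args

query_params = {
    "table": "customers",
    "filters": [
        {"field": "state", "operator": "equals", "value": "California"},
        {"field": "total_spent", "operator": "greater_than", "value": "1000"}
    ],
    "limit": 50
}
sql, args = build_query(query_params)
```

Only values travel as bound parameters; identifiers come from the allow-list, so a hallucinated field name fails fast instead of reaching the database.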
Troubleshooting
Problem: Model returns text instead of JSON
Symptoms: You expect JSON but get "Here's the information you requested..."
Solutions:
- Use provider-specific JSON mode (OpenAI's response_format)
- Add explicit instructions: "Return ONLY valid JSON, no other text"
- Use structured output APIs, not basic completion
- Check your system prompt clearly specifies JSON output
Problem: Schema validation fails
Symptoms: JSON is valid but doesn't match your schema
Solutions:
- Simplify the schema (complex nested schemas confuse models)
- Add detailed description fields to each property
- Show examples in your prompt
- Use enum for constrained choices instead of free text
- Make fewer fields required initially
Problem: Inconsistent field names
Symptoms: Sometimes customerName, sometimes customer_name
Solutions:
- Explicitly specify field names in the schema
- Use additionalProperties: false to prevent extra fields
- Provide an example in your prompt showing exact field names
- Use TypeScript/Pydantic schemas that enforce consistency
Problem: Hallucinated data
Symptoms: Model invents information to fill required fields
Solutions:
- Make fields optional unless truly required
- Add validation that checks for placeholder values
- Prompt the model to use null for missing information
- Include instructions: "Only extract information explicitly present in the text"
Problem: Function calling fails
Symptoms: Model doesn't call functions or calls the wrong one
Solutions:
- Improve function descriptions (be specific about when to use each)
- Reduce the number of available functions (fewer choices = better accuracy)
- Add examples of correct function usage in the prompt
- Make parameter descriptions clearer
- Check function names are descriptive (not func1, func2)
Problem: Performance is slow
Symptoms: Structured output takes longer than free text
Solutions:
- This is normal (constrained generation is slower)
- Use smaller models for simple schemas
- Simplify your schema
- Cache results for repeated queries
- Use parallel function calls when possible
Best practices
Schema design
1. Start simple, add complexity gradually
// Start with this
{"type": "object", "properties": {"name": {"type": "string"}}}
// Not this
{"type": "object", "properties": {"person": {"type": "object", "properties": {...}}}}
2. Use enums for constrained values
// Good
{"type": "string", "enum": ["small", "medium", "large"]}
// Bad (model might return "M", "med", "MEDIUM")
{"type": "string", "description": "Size: small, medium, or large"}
3. Provide clear descriptions
{
  "properties": {
    "confidence": {
      "type": "number",
      "description": "Confidence score from 0.0 to 1.0, where 1.0 is highest confidence",
      "minimum": 0,
      "maximum": 1
    }
  }
}
4. Make fields optional when appropriate
{
  "required": ["customer_name"],  // Only truly required fields
  "properties": {
    "customer_name": {"type": "string"},
    "phone": {"type": "string"},  // Optional, not everyone has one
    "notes": {"type": "string"}   // Optional
  }
}
Prompt engineering for structure
1. Show examples
Extract customer info as JSON. Example output:
{
  "customer_name": "Jane Smith",
  "email": "jane@example.com",
  "priority": "high"
}
Now extract from this message: [message]
2. Be explicit about edge cases
Extract fields. If a field is not present in the text, use null.
Do not invent or guess information.
3. Use system prompts for consistency
system_prompt = """You are a data extraction assistant.
Always return valid JSON matching the provided schema.
Extract only information explicitly stated in the input.
Use null for missing fields. Never invent data."""
Error handling strategy
def extract_with_retry(text, schema, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = llm.generate_structured(text, schema)
            validated = validate_schema(response, schema)
            return validated
        except ValidationError as e:
            if attempt == max_retries - 1:
                # Final attempt failed
                log_error(f"Schema validation failed after {max_retries} attempts")
                return get_safe_default()
            else:
                # Retry with more explicit instructions
                text = f"{text}\n\nPrevious attempt failed: {e.message}. Please ensure all required fields are present."
        except Exception as e:
            log_error(f"Unexpected error: {e}")
            return get_safe_default()
Testing and validation
1. Test with edge cases
- Empty input
- Missing fields
- Unexpected formats
- Very long input
- Special characters
2. Monitor in production
def track_structured_output(response, schema_name):
    metrics.increment(f"structured_output.{schema_name}.total")
    if validation_failed(response):
        metrics.increment(f"structured_output.{schema_name}.validation_failed")
        log_sample(response, schema_name)
    if has_null_required_fields(response):
        metrics.increment(f"structured_output.{schema_name}.missing_required")
3. A/B test schemas
- Try different field names
- Test with/without descriptions
- Compare simple vs. nested structures
- Measure accuracy and latency
Use responsibly
- Validate all outputs: Never trust structured output blindly
- Handle failures gracefully: Have fallback behavior when parsing fails
- Don't over-constrain: Overly strict schemas increase failure and hallucination rates
- Monitor costs: Function calling increases token usage
- Test edge cases: Empty inputs, missing data, special characters
- Log failures: Track validation errors to improve schemas
- Privacy matters: Structured output can leak sensitive data if logged
What's next?
Now that you understand structured output, you might explore:
- Agents & Tools: Building AI systems that take actions
- Evaluating AI Answers: Measuring accuracy of extracted data
- Orchestration Options: Frameworks like LangChain for managing function calls
- APIs & Integration: Connecting AI to your existing systems
- Guardrails & Policy: Setting boundaries on what AI can output
Frequently Asked Questions
Is structured output the same as function calling?
Not exactly. Structured output ensures responses match a JSON schema. Function calling is a specific use case where the model requests to call functions with structured parameters. Function calling uses structured output under the hood.
Can I use structured output with any LLM?
Most modern LLMs support it (GPT-4, Claude 3, Gemini Pro, etc.), but implementation varies. Some have native support (OpenAI's JSON mode), others need careful prompting. Check your provider's documentation.
Will structured output make responses slower?
Yes, slightly. Constraining generation to valid JSON takes more computation than free text. The difference is usually negligible (<500ms) but can add up with complex schemas or many requests.
What happens if the model can't fill all required fields?
Behavior varies by provider. Some will hallucinate data to satisfy the schema, others will return an error. Always validate outputs and make fields optional when appropriate.
Can I combine multiple function calls in one response?
Yes! Many providers support parallel function calling where the model requests multiple functions simultaneously. Useful for complex tasks requiring several API calls.
How do I debug when the model returns wrong data?
Check: (1) Is your schema description clear? (2) Are you providing examples? (3) Is the input ambiguous? (4) Try simplifying the schema. Log failures and iterate on prompts and schema definitions.
Key Terms Used in This Guide
Model
The trained AI system that contains all the patterns it learned from data. Think of it as the 'brain' that makes predictions or decisions.
Tool (Function Calling)
A capability that allows an AI to call external functions or APIs, like searching the web, querying databases, or running calculations.
AI (Artificial Intelligence)
Making machines perform tasks that typically require human intelligence, like understanding language, recognizing patterns, or making decisions.
Related Guides
Context Management: Handling Long Conversations and Documents
Intermediate: Master context window management for AI. Learn strategies for long conversations, document processing, memory systems, and context optimization.
Deployment Patterns: Serverless, Edge, and Containers
Intermediate: How to deploy AI systems in production. Compare serverless, edge, container, and self-hosted options.
Fine-Tuning vs RAG: Which Should You Use?
Intermediate: Compare fine-tuning and RAG to customize AI. Learn when each approach works best, how they differ, and how to combine them.