AI API Integration Basics
Learn how to integrate AI APIs into your applications. Authentication, requests, error handling, and best practices.
By Marcin Piekarski • Frontend Lead & AI Educator • builtweb.com.au
AI-Assisted by: Prism AI
Last Updated: 11 February 2026
TL;DR
AI APIs let you add powerful AI features to your apps without building or training your own models. You authenticate with API keys, send HTTP requests containing prompts or data, handle responses and errors gracefully, and optimise for cost and performance. Getting the basics right saves you headaches, money, and frustrated users down the line.
Why it matters
Most businesses and developers will never train their own AI models. Instead, they will call APIs provided by companies like OpenAI, Anthropic, or Google. Understanding how these APIs work is the single most practical AI skill you can learn right now. Whether you are building a chatbot, adding smart search to your product, or automating content workflows, API integration is how the work actually gets done.
A poorly integrated API leads to slow responses, unexpected bills, and outages that leave users staring at error screens. A well-integrated one feels seamless. The difference comes down to understanding a handful of core concepts.
What are AI APIs?
An AI API is a web service that lets you send data (usually text, images, or audio) over the internet and receive AI-generated results back. Think of it like ordering food through a delivery app. You do not need a kitchen (a GPU cluster), a chef (a trained model), or ingredients (terabytes of training data). You just place an order and get the meal.
Common AI APIs you will encounter include OpenAI (GPT-4o, DALL-E), Anthropic (Claude), Google (Gemini, Vertex AI), and Cohere (embeddings and text generation). Each has its own pricing, rate limits, and feature set, but they all follow a similar request-response pattern.
How the basic workflow works
Every AI API integration follows the same five-step cycle:
- Get an API key. Sign up for the provider, navigate to their developer dashboard, and generate a set of credentials.
- Send a request. Use an HTTP POST request to send your prompt or input data to the API endpoint.
- Receive a response. The API returns JSON containing the AI's output, along with metadata like token usage.
- Handle errors. Implement retry logic and fallbacks so your app does not crash when something goes wrong.
- Process the result. Extract the useful parts of the response and present them to your user or feed them into the next step of your workflow.
This cycle repeats for every single interaction your app has with the AI. Understanding it deeply means you can debug problems faster and build more resilient applications.
Making your first request
Here is a minimal example using the OpenAI Python library:
import os
from openai import OpenAI

# The client reads the key from an environment variable
# (see the security section below for why).
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain APIs simply."},
    ],
)

print(response.choices[0].message.content)
The messages array is where the magic happens. The "system" message sets the AI's behaviour. The "user" message is the actual question. You can add multiple messages to simulate a conversation history, and the AI will respond in context.
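To carry a conversation across turns, resend the earlier messages on each call. A minimal sketch, reusing the client from above (the assistant reply shown is invented for illustration):

# Each turn resends the history, so the model answers the follow-up in context.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain APIs simply."},
    {"role": "assistant", "content": "An API is a way for two programs to talk to each other."},
    {"role": "user", "content": "Give me a concrete example."},
]
response = client.chat.completions.create(model="gpt-4o", messages=messages)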
Most providers also offer SDKs for JavaScript, Go, Ruby, and other languages. The underlying concept is identical: send a structured request, get a structured response.
Authentication and security
Your API key is the password to your AI account. If someone else gets it, they can run up your bill or access your data.
API key best practices:
- Store keys in environment variables, never in source code.
- Never commit keys to Git. Use .env files and add them to .gitignore.
- Rotate keys periodically, especially after a team member leaves.
- Set spending limits in your provider's dashboard so a leaked key cannot bankrupt you.
Some providers also support OAuth for user-specific access. This is more complex to implement but essential if you are building a multi-user application where each user authenticates with their own account.
For production apps, always route API calls through your own backend server. Never expose API keys in client-side JavaScript because anyone can open browser developer tools and read them.
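As a rough sketch of that pattern, here is a minimal backend proxy using Flask (the framework, route name, and request shape are assumptions for illustration, not a prescribed stack). The browser calls your endpoint; the key stays on the server:

import os
from flask import Flask, request, jsonify
from openai import OpenAI

app = Flask(__name__)
# The key lives server-side in an environment variable; it never reaches the browser.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

@app.post("/api/chat")  # hypothetical route name
def chat():
    user_message = request.json.get("message", "")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
    )
    # Only the generated text is returned to the client.
    return jsonify({"reply": response.choices[0].message.content})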
Request parameters that matter
Beyond the required fields (model name and input), several optional parameters dramatically affect the output:
- Temperature controls randomness. A value of 0 gives near-deterministic output (good for factual tasks). A value of 1 gives more creative, varied responses. Most production apps use 0.3 to 0.7.
- Max tokens caps the response length. If you are generating short answers, set this low to save money and speed up responses.
- Top-p (nucleus sampling) is an alternative to temperature. Generally, adjust one or the other, not both.
- Stop sequences tell the model when to stop generating. Useful for structured outputs.
Getting these right means the difference between an AI that rambles and one that gives crisp, useful answers.
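Here is the earlier chat call again with these parameters set explicitly (the values are illustrative starting points, not recommendations for every task):

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarise our refund policy in one sentence."}],
    temperature=0.3,  # low randomness suits factual tasks
    max_tokens=200,   # cap response length to control cost and latency
    stop=["\n\n"],    # stop at the first blank line for a single-paragraph answer
)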
Response handling and streaming
The API returns JSON containing the generated text, token usage counts, and a finish reason (whether it stopped naturally or hit the token limit).
For short responses, parse the JSON and extract what you need. For longer outputs, use streaming. Streaming sends the response token by token as it is generated, so users see text appearing in real time instead of waiting for the entire response. This dramatically improves perceived performance and is how ChatGPT and Claude display their answers.
Most SDKs support streaming with a simple flag:
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a story."}],
    stream=True,
)

for chunk in stream:
    # Some chunks (such as the final one) carry no text, so guard against None.
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Error handling and retry logic
APIs fail. Networks drop. Servers overload. Your app needs to handle this gracefully.
Common HTTP errors you will see:
- 401 Unauthorized: Your API key is invalid or missing.
- 429 Too Many Requests: You have exceeded the rate limit. Back off and retry.
- 500 Internal Server Error: The provider's servers are having trouble. Retry after a delay.
- 503 Service Unavailable: The service is temporarily down. Retry with exponential backoff.
Exponential backoff means waiting progressively longer between retries: 1 second, then 2, then 4, then 8, up to a maximum. This prevents your app from hammering an already-stressed server.
Always set a maximum number of retries (typically 3-5) and a timeout for each request. Without these, a single failing request can block your entire application.
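A minimal backoff sketch using the official Python SDK's exception classes (the retry count, delays, and model are illustrative):

import random
import time
from openai import OpenAI, APIConnectionError, InternalServerError, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_with_retry(messages, max_retries=4):
    for attempt in range(max_retries + 1):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                timeout=30.0,  # per-request timeout so a hung call cannot block forever
            )
        except (RateLimitError, APIConnectionError, InternalServerError):
            if attempt == max_retries:
                raise  # give up after the final retry
            # Exponential backoff (1s, 2s, 4s, 8s...) plus jitter so
            # concurrent clients do not all retry in lockstep.
            time.sleep(2 ** attempt + random.random())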
Rate limits and how to work with them
Every AI API enforces rate limits to prevent abuse and ensure fair access. These typically include requests per minute (RPM), tokens per minute (TPM), and sometimes concurrent request limits.
Practical strategies:
- Queue requests and process them at a steady pace instead of sending bursts (see the pacing sketch after this list).
- Use batch endpoints when available (OpenAI offers a dedicated batch API at 50% lower cost).
- Upgrade your API tier if you consistently hit limits. Most providers offer higher limits for paying customers.
- Monitor your usage through the provider's dashboard and set alerts before you hit ceilings.
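As one way to pace requests client-side, here is a naive sketch that spaces out calls to stay under a requests-per-minute ceiling (the limit value is hypothetical; check your tier in the provider dashboard):

import time

class RequestPacer:
    """Spaces out calls so the average rate stays under a requests-per-minute limit."""

    def __init__(self, rpm_limit):
        self.min_interval = 60.0 / rpm_limit
        self.last_call = 0.0

    def wait(self):
        # Sleep just long enough to keep the average rate under the limit.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

pacer = RequestPacer(rpm_limit=60)  # hypothetical tier limit
# Call pacer.wait() before each API request to smooth out bursts.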
Cost optimisation
AI API calls cost real money, and costs add up fast at scale. Here are proven strategies to keep your bills manageable:
- Cache common responses. If many users ask the same question, store the answer and serve it from cache (sketched below).
- Use smaller models when possible. GPT-4o is powerful but expensive. For simple tasks, GPT-4o-mini or Claude Haiku may be 10-20 times cheaper and fast enough.
- Limit max tokens. Do not request 4,000 tokens when 200 will do.
- Batch requests to take advantage of bulk pricing.
- Monitor usage dashboards daily during development and weekly in production.
A common mistake is optimising too early. Start with the best model, get your app working correctly, then switch to cheaper models for tasks that do not need the full power.
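To illustrate the caching idea, here is a minimal in-memory sketch reusing the client from earlier (a real deployment would likely use Redis or similar; the dict just shows the shape):

import hashlib
import json

_cache = {}  # in-memory stand-in for a real cache

def cached_completion(messages, model="gpt-4o-mini"):
    # Key on the exact model and messages so identical requests reuse one answer.
    key = hashlib.sha256(
        json.dumps({"model": model, "messages": messages}, sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        _cache[key] = client.chat.completions.create(model=model, messages=messages)
    return _cache[key]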
Common mistakes
Exposing API keys in frontend code. This is the number one mistake beginners make. Always use a backend proxy.
Ignoring error handling. Your app will crash the first time the API returns an unexpected response. Build error handling from day one.
Not setting spending limits. A bug in your retry logic can generate thousands of requests in minutes. Set hard limits in your provider dashboard.
Sending too much context. Including unnecessary conversation history or system prompts wastes tokens and money. Be intentional about what you send.
Not testing with real-world inputs. Users will send typos, long paragraphs, and edge cases you never imagined. Test with messy, real data before launching.
What's next?
Now that you understand API integration basics, explore these related topics:
- Prompt Engineering Basics to craft better inputs for your API calls
- Token Economics to understand how pricing and limits work
- Batch Processing with AI for handling large-scale operations efficiently
- Context Windows to learn how much data you can send in a single request
Frequently Asked Questions
Do I need to know how to code to use AI APIs?
You need basic programming knowledge, yes. Most AI APIs require sending HTTP requests, which means writing code in Python, JavaScript, or another language. However, you do not need to be an expert. If you can follow a tutorial and copy-paste code examples, you can get a basic integration working in an afternoon.
How much does it cost to use AI APIs?
Costs vary widely. OpenAI's GPT-4o-mini costs around $0.15 per million input tokens, flagship models like GPT-4o cost a few dollars per million, and the original GPT-4 ran $30 per million. For a small app handling a few hundred requests per day, you might spend $5-50 per month. At scale with thousands of users, costs can reach hundreds or thousands per month. Always set spending limits.
Can I switch between different AI API providers easily?
Switching is possible but not always seamless. Most providers use similar request-response patterns, but the specific parameters, response formats, and capabilities differ. Libraries like LiteLLM provide a unified interface across providers, making switching easier. Design your code with provider abstraction from the start if flexibility matters.
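As a rough sketch of LiteLLM's unified interface (model identifiers are illustrative; check LiteLLM's docs for current names and the environment variables each provider needs):

from litellm import completion

messages = [{"role": "user", "content": "Explain APIs simply."}]

# The call shape stays the same; only the model string changes per provider.
openai_reply = completion(model="gpt-4o", messages=messages)
claude_reply = completion(model="anthropic/claude-3-5-sonnet-20240620", messages=messages)

print(openai_reply.choices[0].message.content)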
What happens if the AI API goes down while my app is running?
Without error handling, your app will crash or hang. With proper error handling, you can show users a friendly message, retry the request after a delay, or fall back to a cached response or alternative provider. Always plan for API downtime because it will happen.
About the Authors
Marcin Piekarski • Frontend Lead & AI Educator
Marcin is a Frontend Lead with 20+ years in tech. Currently building headless ecommerce at Harvey Norman (Next.js, Node.js, GraphQL). He created Field Guide to AI to help others understand AI tools practically—without the jargon.
Credentials & Experience:
- 20+ years web development experience
- Frontend Lead at Harvey Norman (10 years)
- Worked with: Gumtree, CommBank, Woolworths, Optus, M&C Saatchi
- Runs AI workshops for teams
- Founder of builtweb.com.au
- Daily AI tools user: ChatGPT, Claude, Gemini, AI coding assistants
- Specializes in React ecosystem: React, Next.js, Node.js
Prism AI • AI Research & Writing Assistant
Prism AI is the AI ghostwriter behind Field Guide to AI—a collaborative ensemble of frontier models (Claude, ChatGPT, Gemini, and others) that assist with research, drafting, and content synthesis. Like light through a prism, human expertise is refracted through multiple AI perspectives to create clear, comprehensive guides. All AI-generated content is reviewed, fact-checked, and refined by Marcin before publication.
Capabilities:
- Powered by frontier AI models: Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google)
- Specializes in research synthesis and content drafting
- All output reviewed and verified by human experts
- Trained on authoritative AI documentation and research papers
Transparency Note: All AI-assisted content is thoroughly reviewed, fact-checked, and refined by Marcin Piekarski before publication. AI helps with research and drafting, but human expertise ensures accuracy and quality.
Key Terms Used in This Guide
AI (Artificial Intelligence)
Making machines perform tasks that typically require human intelligence—like understanding language, recognizing patterns, or making decisions.
API (Application Programming Interface)
A way for different software programs to talk to each other—like a menu of requests you can make to get AI to do something.
Related Guides
AI for Data Analysis: From Questions to Insights
Intermediate: Use AI to analyze data, generate insights, create visualizations, and answer business questions from your datasets.
A/B Testing AI Outputs: Measure What Works
Intermediate: How do you know if your AI changes improved outcomes? Learn to A/B test prompts, models, and parameters scientifically.
Batch Processing with AI: Efficiency at Scale
Intermediate: Process thousands of items efficiently with batch AI operations. Learn strategies for large-scale AI tasks.