
Inference

Also known as: Model Inference, Prediction

In one sentence

The process by which a trained AI model takes new input and produces an output—a prediction, an answer, or generated text. This is the 'using' phase that happens after training is complete.

Explain like I'm 12

Training is like studying for an exam. Inference is taking the exam—you use everything you learned to answer new questions you haven't seen before.

In context

Every time you type a message into ChatGPT and get a response, that's inference happening. The model applies patterns it learned during training to generate an answer to your specific prompt. Cloud providers like AWS, Google Cloud, and Azure charge for inference by the token or by compute time. Companies running AI at scale often spend far more on inference than on training, since training happens once but inference runs millions of times per day.
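The training-once, inference-many-times split above can be sketched with a toy model. This is a hypothetical illustration (a tiny hand-rolled linear model, not any real provider's API): `train` runs once to learn a parameter, then `infer` applies that parameter to each new input.

```python
# Toy illustration of the training/inference split (hypothetical
# example, not a real library API).

def train(examples):
    """'Studying': learn a slope from (input, output) pairs."""
    slope = sum(y / x for x, y in examples) / len(examples)
    return {"slope": slope}  # the learned parameters

def infer(model, x):
    """'Taking the exam': apply learned parameters to a new input."""
    return model["slope"] * x

# Training happens once...
model = train([(1, 2), (2, 4), (3, 6)])

# ...inference then runs for every new request.
print(infer(model, 10))  # -> 20.0
```

In a real deployment, `train` corresponds to the expensive one-time (or occasional) training run, while `infer` is the code path that executes on every user request, which is why inference costs dominate at scale.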
