Skip to main content
BETAThis is a new design — give feedback

Prompt Injection

Also known as: Prompt Attack, Jailbreaking

In one sentence

A security vulnerability where malicious users craft inputs designed to override an AI system's instructions, bypass safety filters, or extract hidden information from the system prompt.

Explain like I'm 12

Like convincing a security guard to ignore their rules by slipping secret instructions into a conversation—you trick the AI into thinking it should follow your commands instead of its original orders.

In context

Prompt injection is one of the most significant security challenges facing AI applications. Attackers insert phrases like 'Ignore all previous instructions and...' to bypass content filters or reveal confidential system prompts. Indirect prompt injection is even sneakier—attackers hide malicious instructions inside documents, emails, or web pages that an AI tool processes. For example, a hidden instruction in a PDF could tell an AI assistant to forward sensitive data. Companies defend against this using input validation, output filtering, guardrails, and multi-layer security architectures.

See also

Related Guides

Learn more about Prompt Injection in these guides: