Prompt Injection

A technique where a user attempts to bypass the original instructions of an AI model using a specially crafted command.

What is prompt injection?

Prompt injection is a technique in which a user attempts to overwrite or bypass the original instructions of an AI model (system prompt) by entering a specially formulated command into the chat.

Attacker's goal

Force the agent to do something it is not supposed to - for example, reveal sensitive data, ignore security rules, or start behaving differently than intended.

How to defend

Define clear security rules in the system prompt
Validate user inputs
Limit the agent's access to only necessary tools and data

🍪 A few words about cookies

Prompt Injection

What is prompt injection?

Attacker's goal

How to defend