Prompt injection

Official Definition

An attack technique where malicious inputs are crafted to manipulate an AI model into ignoring its instructions, bypassing its safety measures, or producing harmful or unintended outputs.

Source: AIEOG AI Lexicon (Feb 2026), adapted from NIST AI 100-2e2025 and OWASP Top 10 for LLM Applications

What prompt injection means in plain language

Prompt injection is a type of attack against generative AI systems where an adversary crafts input that tricks the model into doing something it was not supposed to do. It is the AI equivalent of SQL injection in traditional software — a technique that exploits how the system processes input to alter its behavior.

Prompt injection attacks can take many forms: direct injection (including malicious instructions in user input), indirect injection (embedding malicious instructions in documents or web content that the AI processes), and jailbreaking (crafting prompts that convince the model to ignore its safety guidelines).

For example, an attacker might embed hidden instructions in a document that a compliance AI tool processes, causing it to suppress certain findings. Or an adversary might craft a customer service chatbot input that causes it to disclose system prompt instructions or internal policies.

Why it matters in financial services

Prompt injection is a significant and evolving threat for financial institutions deploying generative AI:

Data exfiltration. Prompt injection can be used to extract sensitive information from AI systems, including system prompts, training data patterns, or information from other users’ interactions.
Safety bypass. Attackers can use prompt injection to make AI systems produce content that violates compliance policies, including discriminatory language, misleading information, or unauthorized disclosures.
Decision manipulation. In AI-assisted decision-making, prompt injection could manipulate the AI’s analysis or recommendations.
Reputation risk. A successfully attacked customer-facing AI system could produce embarrassing, harmful, or offensive content attributed to the institution.

Key considerations for compliance teams

Assess prompt injection risk. For each generative AI deployment, evaluate the exposure to prompt injection based on who can provide input.
Implement input sanitization. Filter and validate inputs to detect and block common prompt injection patterns.
Use defense-in-depth. Layer multiple defenses: input filtering, output validation, system prompt hardening, and monitoring.
Test regularly. Conduct adversarial testing (red teaming) to identify prompt injection vulnerabilities.
Monitor for injection attempts. Log and analyze inputs for patterns consistent with prompt injection attacks.
Include in security assessments. Prompt injection should be part of the security assessment for any generative AI deployment.

Stay current on AI risk in financial services

Get practical guidance on AI governance, model risk, and regulatory developments delivered to your inbox. Stay up to date on the latest in financial compliance from our experts.