Why this topic matters
Most AI failures are not model-only failures. They happen because developers connect the model to data, tools and business workflows, then assume the model will preserve the trust rules they had in mind. That assumption is exactly what an operator needs to test.
A practical surface map separates what the model can read, what it can remember, what it can call, what it can influence and what happens when its output is trusted. Without that map, teams chase isolated jailbreaks and miss the actual control boundary.
Surface map
Look at hidden instructions, prompt templates, retrieval layers, memory stores, embeddings, tool definitions, plugin permissions, system actions, approval checkpoints and user-visible output channels as one joined graph. Attack surface grows at every step where untrusted content is mixed with privileged context or where model output is allowed to drive an action.
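One way to make the joined graph concrete is to label each node with a trust level, flag nodes that cause side effects, and scan edges for the two growth conditions above. The sketch below is illustrative only; the Node and SurfaceMap names, the two-level trust taxonomy and all node names are assumptions, not an established schema.

```python
from dataclasses import dataclass, field

# Trust labels for graph nodes; the two-level taxonomy is an
# illustrative assumption, not a standard.
UNTRUSTED, PRIVILEGED = "untrusted", "privileged"

@dataclass
class Node:
    name: str
    trust: str               # UNTRUSTED or PRIVILEGED
    is_action: bool = False  # True if reaching this node causes a side effect

@dataclass
class SurfaceMap:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, dst) data-flow pairs

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def flow(self, src: str, dst: str) -> None:
        self.edges.append((src, dst))

    def risky_edges(self):
        """Yield edges where untrusted content meets privileged context,
        or where a flow drives an action: the two growth conditions."""
        for src, dst in self.edges:
            s, d = self.nodes[src], self.nodes[dst]
            if s.trust == UNTRUSTED and d.trust == PRIVILEGED:
                yield (src, dst, "untrusted content reaches privileged context")
            if d.is_action:
                yield (src, dst, "flow drives a side effect")

# Hypothetical wiring: a retrieval layer feeding the planner,
# and the planner calling an email tool.
m = SurfaceMap()
m.add(Node("web_retrieval", UNTRUSTED))
m.add(Node("system_prompt", PRIVILEGED))
m.add(Node("planner_context", PRIVILEGED))
m.add(Node("email_tool", PRIVILEGED, is_action=True))
m.flow("web_retrieval", "planner_context")
m.flow("system_prompt", "planner_context")
m.flow("planner_context", "email_tool")

for edge in m.risky_edges():
    print(edge)
```

Even this toy graph surfaces both conditions: retrieved web content mixes into privileged planner context, and planner output drives an email side effect.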
Operator focus points
- Map every source of attacker-controlled text, files and URLs that can enter context.
- Trace whether retrieved content can outrank or distort system-level instructions.
- Identify all tools, connectors and side effects reachable from the planner.
- Check whether model output is consumed by code, analysts or automated actions without sanitisation.
- Separate surprising output from reachable impact and prove the business consequence (a reachability sketch follows this list).
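Several of these checks reduce to a reachability question: does any attacker-controlled source reach a side effect without crossing an approval or sanitisation checkpoint? A minimal breadth-first sketch, assuming edges are (source, destination) data-flow pairs; all node names here are hypothetical.

```python
from collections import deque

def unguarded_paths(edges, sources, actions, checkpoints):
    """Breadth-first search for data-flow paths from attacker-controlled
    sources to side-effecting actions that never pass a checkpoint.
    Each returned path is a reachable-impact candidate, not yet proof."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    found = []
    for start in sources:
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in checkpoints:
                continue  # guarded: stop exploring past a checkpoint
            if node in actions:
                found.append(path)
                continue
            for nxt in adjacency.get(node, []):
                if nxt not in path:  # avoid cycles
                    queue.append(path + [nxt])
    return found

# Hypothetical wiring: inbound email reaches the planner, which can
# send email either directly or via a human review checkpoint.
edges = [
    ("inbound_email", "planner_context"),
    ("planner_context", "human_review"),
    ("planner_context", "send_email"),
    ("human_review", "send_email"),
]
print(unguarded_paths(
    edges,
    sources={"inbound_email"},
    actions={"send_email"},
    checkpoints={"human_review"},
))
# -> [['inbound_email', 'planner_context', 'send_email']]
```

A path returned here is only a candidate. The last focus point still applies: walk the path in the live system and prove the business consequence rather than stopping at surprising output.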
Curated public references
- OWASP Gen AI Security Project: project home for current LLM and GenAI security guidance.
- MITRE ATLAS: technique and threat mapping for AI-enabled systems.
- NIST AI RMF 1.0: risk framing for AI systems and operational controls.
