AI Security // Retrieval, Agents & Tooling

RAG, Agents & Tool Abuse

RAG and agents turn language into reach. Once a planner can retrieve, rank, remember and invoke tools, every weak permission boundary becomes an offensive opportunity. The operator job is to prove which text can influence which action.

field briefoperator referencepublic sources

Why this topic matters

A poisoned document, malicious knowledge-base page, crafted email or hostile ticket can shift retrieval output, steer the planner and alter tool selection. That means the compromise point may live in content storage, not in the model API itself.

Agents are dangerous when they inherit broad tool scopes, weak confirmation logic or fake confidence from model output. A bad plan plus real authority is where AI risk stops being theoretical.

What to test

  • Poison retrieval corpora and test whether malicious instructions survive chunking and ranking.
  • Review connector permissions, delegated tokens and over-broad tool access.
  • Test whether agents can call actions from weak evidence or fabricated confidence.
  • Check confirmation prompts, approval bypass paths and replay of sensitive actions.
  • Trace whether outputs can exfiltrate data from memory, tools, connectors or hidden context.

Operator framing

The practical model is simple: what can the model see, what can it call, what can it change, and who believes the answer enough to act on it. Most agent abuse findings reduce to insecure orchestration with a language layer on top.

Curated public references