Everyone Selling You an AI Agent Isn't Telling You the Whole Truth

This is Part 1 of Agentic AI for Offensive Security, the foundations track. It runs alongside a hands-on build series where these concepts turn into running code. Read this track to understand the machine. Read that one to build it.

Everyone selling you an AI agent isn't telling you the whole truth. Either it isn't really an agent, or it isn't doing what they claim, or it's doing far more than they admit. All three should worry you.

I run red team operations against AI systems. In vendor decks, conference talks, and sales calls, I hear the word "agent" used for a chatbot with a system prompt, for a script that calls an API in a loop, and for a system that plans, acts, and corrects itself with no human in the chain. Those are three different animals. Only one of them can hurt you.

This article pins the words down. Not because definitions are fun, but because in offensive security the difference between those animals is the difference between a parlor trick and a loaded weapon. You can't attack, defend, or build what you can't name.

The three animals

Diagram of the agent loop showing perception, decision, action, and feedback cycling through a language model core

The chatbot in a costume. A language model, a system prompt, maybe some documents stuffed into the context window. You type, it answers. It has no tools, takes no actions, and touches nothing outside the conversation. Most "AI agents" sold today are this. The only agentic thing about them is the marketing page.

The workflow in a trench coat. A fixed pipeline where a developer decided every step in advance: fetch the data, send it to the model, parse the answer, call the API, post the result. The model fills in blanks inside a structure a human wrote. Useful and predictable, and sold as an agent because "AI workflow" doesn't raise venture money.

The actual agent. A model given a goal, a set of tools, and a loop. It decides what to do next, does it, looks at what happened, and decides again. Nobody scripted the path. The path emerges at runtime, which is exactly what makes it powerful and exactly what makes it dangerous.

The first two are software with a model inside. The third is something new: software where the control flow itself comes out of the model.

What an agent actually is

Strip the buzzwords and an AI agent is three parts.

A model. The brain. It reads the current state of the world and picks the next move. Everything the agent does well or badly starts here, which is why model selection gets its own article later in this series.

Tools. The hands. Function calls, shell commands, browsers, APIs, MCP servers. A model without tools can only talk. A model with tools can scan a network, write a file, send an email, move money. Tools are where words become actions, and actions have consequences.

A loop. The will. Something has to feed the model its goal, execute the tool it picked, show it the result, and ask "now what?" until the job is done. The loop is what turns one clever answer into a campaign.

Model, tools, loop. Every agent you will ever meet, from a coding assistant to an autonomous pentester, is those three parts in different proportions.

You've already used real ones, by the way. Claude Code is an agent: give it a bug, and it reads files, runs tests, edits code, and re-runs until green. Deep research features in the big chat products are agents: they search, read, follow leads, and search again. The pattern is already in your daily tools. What's new is pointing it at security work.

What "agentic" adds

The distinction that actually matters fits in one question: who closes the loop?

In a plain agent setup, a human closes it. The model proposes, you approve, it acts, you review the result, you decide what's next. The human is the loop. Slower, but contained.

An agentic system closes its own loop. It takes a goal, breaks it into steps nobody gave it, runs those steps, judges its own results, recovers from its own failures, and keeps going until it decides the job is done. Autonomy isn't a feature it has. Autonomy is what it is.

Notice that this is a spectrum, not a switch. An agent that asks permission before every action sits at one end. An agent that runs overnight, spawns copies of itself, and reports back in the morning sits at the other. Every system lives somewhere on that line, and where it lives is an engineering decision someone made. Or worse, didn't make. The build track's first post is entirely about engineering that decision on purpose.

The same task, three systems

Make it concrete. Give all three animals the same instruction: "find the admin panel on this web app."

The chatbot answers in prose. "Admin panels are commonly located at /admin, /wp-admin, or /login." It found nothing. It typed from memory.

The workflow runs exactly the probes its developer hard-coded: request /admin, request /wp-admin, grep the responses, return matches. If the panel lives at /manage, the workflow comes back empty and nobody learns anything.

The agentic agent requests the homepage, reads the HTML, notices a JavaScript bundle, pulls it, finds a route table inside, spots /internal/console, requests it, gets redirected to a login page, and reports the panel with evidence. Then it asks itself whether there are more. Nobody told it about the bundle. It found the path because the path was findable.

Same sentence in, three different events out. That gap is everything that follows in this series.

Why a red teamer cares

Two reasons, and they point in opposite directions.

The agent is a weapon. Reconnaissance that used to cost me four hours runs in about fifteen minutes, and the agent doesn't get bored on subdomain forty, doesn't skip the changelog, and works while I sleep. Offensive security is the sharpest test of what these systems can really do. Later in this series we build one and point it at a target designed to fight back.

The agent is an attack surface. Every part you just met can be turned. The model can be manipulated through prompt injection. The tools can be tricked into firing at the wrong target. The loop can be hijacked so the system pursues an attacker's goal with its owner's permissions. One poisoned document in the context, and the agent's autonomy belongs to someone else. The more agentic the system, the further a single successful injection travels.

That second point is the next article: why agentic AI for offensive security matters so much, and why it's genuinely dangerous. Not movie-dangerous. Mechanically dangerous, in ways you can demo.

The words this series will use

So we never argue about vocabulary again:

Model: the LLM itself. Brain only.
AI agent: model plus tools plus loop, able to take real actions.
Agentic agent: an agent that closes its own loop. It plans, acts, self-corrects, and pursues a goal without a human as the conveyor belt.
Fleet: multiple agents coordinated on one objective. Why one agent is never enough is its own article, and the answer surprised me.

Every claim in this series stands on those four definitions.

What's next

Part 2: what makes agentic agents so important for offensive security, and why the same property that makes them useful makes them dangerous. After the foundations, this track converges with the build series, where we construct an agent in Claude Code, hand it tools, and walk it through jailbreaking Gandalf level by level.

If you work in security and you've been told the agent wave is hype, stick around. I intend to change your mind with working code.