Risk management for agentic systems

Agent risk is not abstract. It is the set of side effects an agent can run. Map each action class to a default verdict, the evidence HELM records, and an approval path.

Name the side effect. Set the verdict. Keep the proof.

The frame

Risk lives in the side effect, not the prompt.

An agent that can only suggest carries little risk. An agent that can change a record, move money, or alter access carries the same blast radius as the person it acts for. Managing that risk means naming each side effect and deciding, in advance, what happens when an agent reaches for it.

Name the side effect

Risk lives in what an agent can change, not in what it might say.

Set a default verdict

Each action class carries a default of allow, escalate, or deny.

Require evidence

Each class names the receipt and EvidencePack recorded when it runs.

Route the unknown

Anything outside policy is denied or escalated, not run.

Action classes

A risk register written in side effects.

Each action class carries a default verdict, a risk tier, and the evidence HELM records when it runs. This is the unit of risk management for an agentic system.

Side effect: Data export
Examples: Export a customer list, download records, push data to a destination
Default verdict: ESCALATE
Risk: Critical
Required evidence: Data hash, principal, policy hash, destination, signed receipt

Side effect: Database / record write
Examples: Change a CRM, ticket, or policy-admin record
Default verdict: ALLOW
Risk: High
Required evidence: Before/after state hash, receipt, rollback semantics

Side effect: IAM / access change
Examples: Grant a role, revoke a token, reset a password
Default verdict: ESCALATE
Risk: Critical
Required evidence: Delegation-chain receipt, access-change EvidencePack

Side effect: Deployment / infra change
Examples: Deploy a service, update infrastructure, restart production
Default verdict: ESCALATE
Risk: Critical
Required evidence: Change receipt, CI evidence, rollback path

Side effect: Code merge / PR action
Examples: Open a PR, modify code, merge a dependency bump
Default verdict: ESCALATE
Risk: High
Required evidence: PR receipt, diff hash, reviewer disposition

Side effect: Refund / credit
Examples: Issue a refund, apply a credit, waive a fee
Default verdict: ESCALATE
Risk: High
Required evidence: Customer-action receipt, amount, policy, evidence

Side effect: Customer communication
Examples: Send a support reply, an outbound email, or a notice
Default verdict: ESCALATE
Risk: Medium
Required evidence: Message receipt, template version, approval where required

Side effect: Incident response
Examples: Quarantine a host, revoke a token, escalate a ticket
Default verdict: ESCALATE
Risk: Critical
Required evidence: Incident receipt, telemetry, disposition
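
As a rough illustration, the register above can be held as plain data and queried by action class. This is a minimal sketch, not HELM's actual schema: the `ActionClass` type, the key names such as `data_export`, and the `default_verdict` helper are all hypothetical, and only three rows are copied in to keep it short. It assumes unknown classes are denied, which is one of the two outcomes the page allows for actions outside policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class Verdict(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    ESCALATE = "ESCALATE"


@dataclass(frozen=True)
class ActionClass:
    name: str
    default_verdict: Verdict
    risk: str
    required_evidence: Tuple[str, ...]


# Three rows from the register; keys and field names are illustrative only.
REGISTER: Dict[str, ActionClass] = {
    "data_export": ActionClass(
        "Data export", Verdict.ESCALATE, "Critical",
        ("data hash", "principal", "policy hash", "destination", "signed receipt"),
    ),
    "record_write": ActionClass(
        "Database / record write", Verdict.ALLOW, "High",
        ("before/after state hash", "receipt", "rollback semantics"),
    ),
    "iam_change": ActionClass(
        "IAM / access change", Verdict.ESCALATE, "Critical",
        ("delegation-chain receipt", "access-change EvidencePack"),
    ),
}


def default_verdict(action_key: str) -> Verdict:
    """Return the class default; anything not in the register is denied."""
    entry = REGISTER.get(action_key)
    return entry.default_verdict if entry else Verdict.DENY
```

Keeping the register as data rather than prose means the same table can drive both the runtime check and the review of what evidence each class must produce.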

A consequential action

How a single action is managed.

The risk model is not a document. It is a verdict on the action and a receipt for the outcome.

Agent proposes

Agent proposes to grant an admin role to a new principal

HELM checks policy

Checks the IAM action against policy, principal, and scope

Verdict

ESCALATE

Proof

Access-change receipt + delegation-chain EvidencePack
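
The four steps above can be sketched in a few lines. Everything named here is an assumption for illustration: the `ProposedAction` fields, the `POLICY` table, and the receipt layout are hypothetical, and HMAC-SHA256 with a shared key stands in for whatever signing scheme a real deployment uses. The check keys only on the action class; principal and scope are carried into the receipt rather than evaluated.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProposedAction:
    action_class: str  # e.g. "iam_change"
    principal: str     # who the agent acts for
    scope: str         # what the action targets
    detail: str


# Hypothetical policy: action class -> default verdict.
POLICY = {"iam_change": "ESCALATE", "record_write": "ALLOW"}
POLICY_HASH = hashlib.sha256(json.dumps(POLICY, sort_keys=True).encode()).hexdigest()


def check(action: ProposedAction) -> str:
    """Look up the verdict; classes outside policy are denied."""
    return POLICY.get(action.action_class, "DENY")


def make_receipt(action: ProposedAction, verdict: str, key: bytes) -> dict:
    """Record the action, verdict, and policy hash, then sign the record."""
    body = {"action": asdict(action), "verdict": verdict, "policy_hash": POLICY_HASH}
    payload = json.dumps(body, sort_keys=True).encode()
    body["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body


def verify_receipt(receipt: dict, key: bytes) -> bool:
    """Recompute the signature over everything except the signature itself."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["signature"])


action = ProposedAction(
    "iam_change", "agent:onboarding-bot", "role:admin",
    "grant an admin role to a new principal",
)
receipt = make_receipt(action, check(action), key=b"demo-key")
```

Because the signature covers the verdict and the policy hash, a receipt edited after the fact no longer verifies.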

Questions

Risk management for agents, in plain terms.

What makes risk in agentic systems different?

An agentic system does not just answer. It can change records, move money, alter access, and deploy code. The risk is the side effect, so the unit of risk management is the action class, not the conversation.

How does HELM manage that risk?

HELM checks each proposed action against policy before the effect runs. It returns ALLOW, DENY, or ESCALATE, denies anything unknown or unapproved by default, and records a signed receipt and EvidencePack for what happened.
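
One way to picture "before the effect runs" is a gate that decides first and only then calls the side effect. This is a minimal sketch under stated assumptions: `gate`, `POLICY`, and `AUDIT_LOG` are hypothetical names, and the record written here is an unsigned stand-in for a receipt.

```python
from typing import Callable, Dict, List

# Hypothetical policy table: action class -> verdict.
POLICY: Dict[str, str] = {"record_write": "ALLOW", "refund": "ESCALATE"}

AUDIT_LOG: List[dict] = []


def gate(action_class: str, effect: Callable[[], object]) -> dict:
    """Decide first, then run the effect only on ALLOW."""
    verdict = POLICY.get(action_class, "DENY")  # unknown classes are denied
    record = {"action_class": action_class, "verdict": verdict, "ran": False}
    if verdict == "ALLOW":
        record["result"] = effect()
        record["ran"] = True
    AUDIT_LOG.append(record)  # every outcome is recorded, not just ALLOW
    return record


calls: List[str] = []
gate("record_write", lambda: calls.append("updated ticket"))
gate("refund", lambda: calls.append("issued refund"))
gate("wire_transfer", lambda: calls.append("sent wire"))
```

Only the first call reaches its side effect. The refund is held for approval and the unlisted wire transfer is denied, yet all three leave a record.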

How does this map to frameworks like NIST AI RMF or EU rules?

Frameworks like the NIST AI RMF and the EU AI Act describe outcomes such as identifying, measuring, and managing risk. HELM does not certify you against any framework. It produces the per-action verdicts and signed evidence a risk program can reference when it reports against those frameworks.

Where does an action-class model start?

Start with the side effects your agents can actually run today, assign each one a default verdict and required evidence, and deny the rest. The risk test inspects what your agents can change so your action-class table reflects reality.
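
A starting inventory can be checked against the register mechanically. The sketch below assumes you can list each tool an agent may call and tag it with an action class; the tool names, the class keys, and `coverage_report` itself are hypothetical and say nothing about how the risk test works internally.

```python
from typing import Dict, List, Optional


def coverage_report(
    tools: Dict[str, Optional[str]], register: Dict[str, str]
) -> Dict[str, List[str]]:
    """Split observed tools into covered and uncovered, and list unused classes."""
    covered: List[str] = []
    uncovered: List[str] = []
    for tool, action_class in sorted(tools.items()):
        if action_class in register:
            covered.append(tool)
        else:
            uncovered.append(tool)  # no class assigned, or class not in the register
    used = {c for c in tools.values() if c}
    unused = sorted(set(register) - used)
    return {"covered": covered, "uncovered": uncovered, "unused_classes": unused}


tools = {
    "crm.update_contact": "record_write",
    "billing.issue_refund": "refund",
    "s3.copy_bucket": None,  # not yet classified
}
register = {"record_write": "ALLOW", "refund": "ESCALATE", "iam_change": "ESCALATE"}
report = coverage_report(tools, register)
```

Uncovered tools are the ones to classify or deny first. Unused classes suggest the register describes capabilities the agents do not currently have.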

Terms

Plain-language terms

EvidencePack

A small bundle of records used to verify one event or review path.

Use for replayable evidence slices.

ProofGraph

A record chain that helps replay and check what happened.

Use for HELM proof records and replay paths.

ALLOW

HELM lets the action run.

Use as a canonical verdict.

DENY

HELM blocks the action.

Use as a canonical verdict.

ESCALATE

HELM stops and asks for more facts, policy, or human approval.

Use as the canonical non-dispatch path for missing facts, policy hold, or approval.
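
To make "a record chain that helps replay and check what happened" concrete, here is a generic hash-chain sketch. It is not HELM's ProofGraph format, which this page does not describe. The record layout and the `append_record` and `verify_chain` helpers are assumptions chosen to show how linking each record to the previous one makes later edits detectable.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder hash for the first record's predecessor


def _digest(event: dict, prev_hash: str) -> str:
    body = {"event": event, "prev_hash": prev_hash}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(chain: List[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev_hash": prev_hash, "hash": _digest(event, prev_hash)}
    chain.append(record)
    return record


def verify_chain(chain: List[dict]) -> bool:
    """Replay the chain and confirm every link and hash still matches."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(record["event"], record["prev_hash"]):
            return False
        prev_hash = record["hash"]
    return True


chain: List[dict] = []
append_record(chain, {"action_class": "iam_change", "verdict": "ESCALATE"})
append_record(chain, {"action_class": "iam_change", "verdict": "ALLOW", "approved_by": "reviewer"})
```

Changing any earlier event after the fact changes its hash, which breaks the link held by every later record.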
