Scanners inspect inputs
Prompt scanners and content filters read what goes in and out. They can flag a risky request. They do not decide whether the resulting action may run.
AI agent security
Scanners and filters inspect inputs. The harder problem is the side effect an agent performs. HELM governs the consequential action and records a receipt, so security lives where the effect runs.
Govern the side effect. Deny the unknown by default. Prove the decision.
The gap in scanner-only security
Most AI agent security stops at the text: scan the prompt, filter the response, score the risk. Useful, but the consequence appears when a tool runs. If the side effect is ungoverned, the agent is still unguarded.
A tool-using agent does not just answer. It writes records, moves money, changes access, deploys code. Security has to govern that effect, not just the text around it.
A log says something happened. It does not say the action was checked, what the verdict was, or which policy applied. Evidence has to be bound to the decision.
Execution authority plus receipts
Real agent security is two things working together: a decision on the side effect, and evidence bound to that decision.
Return ALLOW, DENY, or ESCALATE for a proposed action, before the effect runs.
Bind the permitted effect to the verdict that authorized it, with scope and policy.
Sign a receipt and EvidencePack that anyone can verify offline, later.
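The three steps above (decide, bind, sign) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not HELM's implementation: the function names, the receipt fields, and the policy shape are all hypothetical, and HMAC from the Python standard library stands in for the signature. A receipt that anyone can verify offline would normally use an asymmetric signature so that verifiers never hold the signing secret.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. HMAC is a stdlib stand-in for a real signature scheme.
SIGNING_KEY = b"demo-key-not-for-production"


def canonical_hash(obj) -> str:
    """Hash a JSON-serializable object using a stable key order."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


def sign(body: dict) -> str:
    """Sign the canonical hash of a receipt body."""
    return hmac.new(SIGNING_KEY, canonical_hash(body).encode(), hashlib.sha256).hexdigest()


def decide(action: dict, policy: dict) -> str:
    """Return ALLOW, DENY, or ESCALATE. Unknown action kinds are denied."""
    return policy["verdicts"].get(action["kind"], "DENY")


def issue_receipt(action: dict, policy: dict, verdict: str) -> dict:
    """Bind the verdict to this exact action, scope, and policy, then sign it."""
    body = {
        "action_hash": canonical_hash(action),
        "policy_hash": canonical_hash(policy),
        "scope": action.get("scope"),
        "verdict": verdict,
    }
    return {**body, "signature": sign(body)}


def verify_receipt(receipt: dict, action: dict) -> bool:
    """Check the signature and that the receipt covers this exact action."""
    body = {k: v for k, v in receipt.items() if k != "signature"}
    signature_ok = hmac.compare_digest(sign(body), receipt["signature"])
    return signature_ok and body["action_hash"] == canonical_hash(action)


def govern(action: dict, policy: dict, effect) -> dict:
    """Decide before the effect runs; run it only on ALLOW; always return a receipt."""
    verdict = decide(action, policy)
    receipt = issue_receipt(action, policy, verdict)
    if verdict == "ALLOW":
        effect(action)
    return receipt
```

In this sketch, changing the verdict on a stored receipt or presenting the receipt for a different action makes `verify_receipt` return `False`, which is the property a plain log line lacks.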
Security by side effect
HELM secures what an agent can do, by side effect. Each action class carries a default verdict and the evidence HELM records when it runs.
| Side effect | Examples | Default verdict | Risk | Required evidence |
|---|---|---|---|---|
| Data export | Export a customer list, download records, push data to a destination | ESCALATE | Critical | Data hash, principal, policy hash, destination, signed receipt |
| Database / record write | Change a CRM, ticket, or policy-admin record | ALLOW | High | Before/after state hash, receipt, rollback semantics |
| IAM / access change | Grant a role, revoke a token, reset a password | ESCALATE | Critical | Delegation-chain receipt, access-change EvidencePack |
| Deployment / infra change | Deploy a service, update infrastructure, restart production | ESCALATE | Critical | Change receipt, CI evidence, rollback path |
| Code merge / PR action | Open a PR, modify code, merge a dependency bump | ESCALATE | High | PR receipt, diff hash, reviewer disposition |
| Refund / credit | Issue a refund, apply a credit, waive a fee | ESCALATE | High | Customer-action receipt, amount, policy, evidence |
| Customer communication | Send a support reply, an outbound email, or a notice | ESCALATE | Medium | Message receipt, template version, approval where required |
| Incident response | Quarantine a host, revoke a token, escalate a ticket | ESCALATE | Critical | Incident receipt, telemetry, disposition |
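The table above can be read as a lookup from action class to default verdict and required evidence. The sketch below encodes three of its rows as data, assuming hypothetical key names (`data_export`, `record_write`, `refund`) and hypothetical evidence field names. It is not HELM's policy format. Any action class missing from the table falls back to DENY, following the deny-the-unknown-by-default rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionClass:
    default_verdict: str
    risk: str
    required_evidence: tuple


# Three rows copied from the table; the keys and field names are hypothetical.
ACTION_CLASSES = {
    "data_export": ActionClass(
        "ESCALATE", "Critical",
        ("data_hash", "principal", "policy_hash", "destination", "signed_receipt"),
    ),
    "record_write": ActionClass(
        "ALLOW", "High",
        ("before_after_state_hash", "receipt", "rollback_semantics"),
    ),
    "refund": ActionClass(
        "ESCALATE", "High",
        ("customer_action_receipt", "amount", "policy", "evidence"),
    ),
}


def default_verdict(kind: str) -> str:
    """Look up the default verdict; unknown action classes are denied."""
    entry = ACTION_CLASSES.get(kind)
    return entry.default_verdict if entry else "DENY"


def missing_evidence(kind: str, evidence: dict) -> list:
    """List the required evidence fields that are absent for a known class."""
    entry = ACTION_CLASSES[kind]
    return [name for name in entry.required_evidence if name not in evidence]
```

Keeping the evidence requirements next to the verdict lets the same table drive both the decision and the completeness check on what gets recorded.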
Where HELM fits
Inspect prompts and responses for risky content.
Prove who or what is acting.
Route and observe tool and MCP traffic.
Reconstruct what happened from logs.
HELM decides whether the side effect may run, returns ALLOW, DENY, or ESCALATE, and records a signed receipt.
Questions
AI agent security is the discipline of bounding what a tool-using agent can do and proving what it did. Input scanning and identity are part of it, but the core question is whether a proposed side effect may run. That is execution authority, paired with evidence of the decision.
A scanner inspects the request. It cannot stop the side effect that follows or prove the action was authorized. HELM checks the proposed action against policy before it runs and records a receipt, so the control and the evidence live at the moment of execution.
When the agent proposes a consequential action, HELM returns ALLOW, DENY, or ESCALATE before the effect runs, denies anything unknown or unapproved by default, and binds a signed receipt to the action. External tool output and MCP servers are treated as untrusted unless explicitly normalized and approved.
Identity and observability do different jobs: identity proves who is acting, and observability reconstructs history. HELM decides whether a consequential action may execute and records proof that survives outside those tools.
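The treatment of external tool output and MCP servers described above, untrusted unless explicitly normalized and approved, can be illustrated with a small filter. Everything here is an assumption made for illustration: the server name `crm-mcp`, the approval registry, and the expected-field schema are hypothetical and do not describe HELM's actual mechanism.

```python
# Hypothetical registry of MCP servers that have been explicitly approved.
APPROVED_SERVERS = {"crm-mcp"}

# Hypothetical schema: the only fields, and types, accepted from tool output.
EXPECTED_FIELDS = {"ticket_id": str, "status": str}


def normalize(server: str, output: dict) -> dict:
    """Return a normalized copy of tool output, or raise ValueError.

    Output from an unapproved server is rejected outright. Output from an
    approved server is reduced to the expected fields, so extra content
    (for example, injected instructions) never reaches the decision step.
    """
    if server not in APPROVED_SERVERS:
        raise ValueError(f"untrusted server: {server}")
    clean = {}
    for name, expected_type in EXPECTED_FIELDS.items():
        value = output.get(name)
        if not isinstance(value, expected_type):
            raise ValueError(f"bad or missing field: {name}")
        clean[name] = value
    return clean
```

Rejecting by default and copying only known fields keeps the trust decision explicit: nothing from an external source is used unless a rule names it.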
Terms
A small bundle of records used to verify one event or review path. Use for replayable evidence slices.
A record chain that helps replay and check what happened. Use for HELM proof records and replay paths.
ALLOW: HELM lets the action run. Use as a canonical verdict.
DENY: HELM blocks the action. Use as a canonical verdict.
ESCALATE: HELM stops and asks for more facts, policy, or human approval. Use as the canonical non-dispatch path for missing facts, policy hold, or approval.
Bring one consequential agent action to the boundary and see the verdict and the receipt.