Policy-as-YAML: AI Agent Policy Enforcement That Scales Across Teams
The Problem: AI Agent Policy Enforcement Is Broken Across Teams
You have five agents. Each one touches production data. Each one has approval logic baked into its code. AI agent policy enforcement — the rules that determine what an agent is allowed to write, where, and how much — is scattered across every codebase instead of living in one place.
Agent A checks if rows > 500: require_approval(). Agent B does the same check, but the threshold is 200. Agent C skips it entirely because someone was in a hurry. Agent D was written by a contractor who left.
Now your security team asks: "What are the rules for sending customer data to Salesforce?" You don't have an answer. The answer lives in four different codebases, three different languages, and no one agrees.
This is the real cost of embedding governance logic in application code. It works for one agent. It fails at three. By ten, you have compliance debt you can't audit.
The fix isn't more code review. It's separating policy from logic entirely. If you're still at the stage of understanding why agents need a write-path checkpoint at all, start with Why AI Agents Need an AI Agent Commit Checkpoint — this post picks up where that one leaves off.
Policy-as-YAML: The Pattern
Policy-as-YAML treats approval rules as data, not code. You define what's allowed — data types, destinations, thresholds, expiry — in a YAML file. Every agent loads that file at runtime. The policy evaluates the intent before any human sees it.
Four properties make this work:
- Shareable. One file. Every team, every agent, every environment reads the same rules.
- Auditable. Git history shows who changed what, when, and why. Diff a policy the same way you diff code.
- Reviewable by non-engineers. Your compliance lead can read YAML. They cannot read TypeScript middleware.
- Versionable. Tag finance-high-risk@v1.2 the same way you tag a library release. Pin agents to specific versions. Roll back without a deploy.
Zehrava Gate implements this pattern natively. Policies are YAML files loaded at runtime, evaluated against every gate.propose() call before the intent hits the approval queue.
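To make "rules as data" concrete, here is a minimal sketch of what a policy might look like once the YAML has been parsed. The `Policy` dataclass and `policy_from_dict` helper are hypothetical names for illustration, not Gate's API. The field names mirror the YAML examples in this post, the defaults are assumptions, and the plain dict stands in for the output of a YAML parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory form of a policy file (not Gate's real type)."""
    id: str
    allowed_types: tuple
    destinations: tuple
    auto_approve_under: int = 0      # assumed default: nothing auto-approves
    require_approval_over: int = 0
    expiry_minutes: int = 60         # assumed default, for illustration only
    require_fields: tuple = ()
    block_if_terms: tuple = ()


def policy_from_dict(raw: dict) -> Policy:
    """Build a Policy from parsed YAML, failing loudly on missing keys."""
    missing = [k for k in ("id", "allowed_types", "destinations") if k not in raw]
    if missing:
        raise ValueError(f"policy is missing required keys: {missing}")
    return Policy(
        id=raw["id"],
        allowed_types=tuple(raw["allowed_types"]),
        destinations=tuple(raw["destinations"]),
        auto_approve_under=int(raw.get("auto_approve_under", 0)),
        require_approval_over=int(raw.get("require_approval_over", 0)),
        expiry_minutes=int(raw.get("expiry_minutes", 60)),
        require_fields=tuple(raw.get("require_fields", ())),
        block_if_terms=tuple(raw.get("block_if_terms", ())),
    )


# Stand-in for what a YAML parser would return for a CRM sync policy.
crm_low_risk = policy_from_dict({
    "id": "crm-low-risk",
    "allowed_types": ["csv", "json"],
    "destinations": ["salesforce.import", "hubspot.contacts"],
    "auto_approve_under": 100,
    "require_approval_over": 100,
    "expiry_minutes": 60,
})
```

Because the object is frozen and built from data alone, every agent that loads the same file ends up with the same rules, and a malformed file fails at load time rather than mid-run.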
Three Real Policy Examples
1. CRM Low-Risk: Routine Data Sync
# crm-low-risk.yaml
# For agents syncing contact data to CRM tools.
# Auto-approves small batches; escalates large ones.
id: crm-low-risk
allowed_types: [csv, json]
destinations: [salesforce.import, hubspot.contacts]
auto_approve_under: 100 # rows under 100 approve instantly
require_approval_over: 100 # rows at or above 100 queue for review
expiry_minutes: 60 # intent expires if not actioned in 1 hour
# execution: runner_exec (worker runs in your VPC, 15min token TTL)
This is the baseline. Small CRM syncs clear automatically. A 50-row contact upload at 2am doesn't wake anyone up. A 10,000-row export does.
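The threshold behavior is simple enough to sketch in a few lines. The function below is a hypothetical illustration, not Gate's implementation. It assumes "under" means strictly less than the threshold, so a batch of exactly 100 rows queues for review, matching the comments in the YAML above.

```python
def route_by_rows(rows: int, auto_approve_under: int) -> str:
    """Return 'approved' for batches strictly under the threshold, else 'pending'.

    Assumption: a threshold of 0 means nothing auto-approves, because no
    non-negative row count is strictly below 0.
    """
    if rows < 0:
        raise ValueError("rows must be non-negative")
    return "approved" if rows < auto_approve_under else "pending"


print(route_by_rows(50, 100))      # small nightly sync
print(route_by_rows(100, 100))     # boundary: exactly at the threshold
print(route_by_rows(10_000, 100))  # large export
```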
2. Finance High-Risk: Zero Auto-Approval
# finance-high-risk.yaml
# For agents that touch payment data, invoices, or financial records.
# Every intent requires human sign-off. No exceptions.
id: finance-high-risk
allowed_types: [json]
destinations: [quickbooks.journal, stripe.payout, internal.finance-db]
auto_approve_under: 0 # 0 disables auto-approval entirely
require_approval_over: 0 # every intent goes to queue
require_fields: [amount, currency, reason, requested_by]
block_if_terms: [DELETE, TRUNCATE, DROP] # SQL-like guards on payload text
expiry_minutes: 30 # tighter window; stale finance ops are dangerous
# execution: runner_exec (worker runs in your VPC)
require_fields forces agents to include context. An approver sees amount, currency, reason, and who requested it — not a raw JSON blob. block_if_terms catches destructive operations before they reach the queue at all.
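A rough sketch of what a require_fields check could look like follows. The helper name is hypothetical, and treating an empty string as missing is an assumption of this example rather than documented Gate behavior.

```python
def missing_required_fields(payload: dict, require_fields: list) -> list:
    """Return the required fields that are absent or empty in the payload."""
    return [
        name for name in require_fields
        if payload.get(name) in (None, "")  # assumption: empty string counts as missing
    ]


payload = {"amount": 4200.00, "currency": "USD", "reason": ""}
required = ["amount", "currency", "reason", "requested_by"]
print(missing_required_fields(payload, required))
```

Returning the list of missing names, rather than a bare boolean, lets the block reason tell the agent author exactly what context to add.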
3. Outbound Email: Content Gating
# outbound-email.yaml
# For agents that draft or send emails to external recipients.
# Always requires approval; blocks known spam indicators.
id: outbound-email
allowed_types: [text, html]
destinations: [sendgrid.transactional, postmark.send]
auto_approve_under: 0
require_approval_over: 0
require_fields: [recipient, subject, body, sender]
block_if_terms: [unsubscribe all, click here now, limited time offer]
expiry_minutes: 120 # emails can wait longer; 2-hour review window
# execution: runner_exec (worker runs in your VPC)
The block_if_terms list catches spam-pattern language before a human even sees the draft. Your compliance team maintains this list. They don't need to touch agent code to update it.
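One plausible way to match blocked phrases is a case-insensitive substring search over whitespace-normalized text. The sketch below assumes that approach; the function name is hypothetical and Gate's actual matching rules may differ.

```python
def find_blocked_terms(text: str, block_if_terms: list) -> list:
    """Return the blocked terms found in text, ignoring case and extra whitespace."""
    normalized = " ".join(text.lower().split())
    return [
        term for term in block_if_terms
        if " ".join(term.lower().split()) in normalized
    ]


terms = ["unsubscribe all", "click here now", "limited time offer"]
draft = "Limited  Time Offer: click here NOW!"
print(find_blocked_terms(draft, terms))
```

Substring matching is deliberately blunt: a short term such as DROP would also match "dropdown". A stricter variant could match on word boundaries, at the cost of missing obfuscated spellings.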
How Gate Loads and Applies Policies
Every intent runs through the same pipeline. The agent calls gate.propose() with a policy parameter. Gate loads the named policy, evaluates the payload against it, and routes accordingly.
TypeScript:
import { gate } from '@zehrava/gate';

const result = await gate.propose({
  policy: 'crm-low-risk',
  payload: {
    type: 'csv',
    destination: 'salesforce.import',
    rows: 47,
    data: contactBatch,
  },
  context: {
    agent: 'crm-sync-agent',
    triggered_by: 'nightly-cron',
  },
});

if (result.status === 'approved') {
  await syncToSalesforce(result.payload);
} else if (result.status === 'pending') {
  console.log(`Intent ${result.intentId} queued for review`);
} else {
  console.error(`Blocked: ${result.reason}`);
}
Python:
from zehrava import gate

# run_id, post_to_quickbooks, notify_team, and PolicyViolation are assumed
# to be defined elsewhere in your application.
result = gate.propose(
    policy="finance-high-risk",
    payload={
        "type": "json",
        "destination": "quickbooks.journal",
        "amount": 4200.00,
        "currency": "USD",
        "reason": "Q1 contractor payment",
        "requested_by": "finance-agent-v2",
    },
    context={
        "agent": "finance-agent-v2",
        "run_id": run_id,
    },
)

if result["status"] == "approved":
    post_to_quickbooks(result["payload"])
elif result["status"] == "pending":
    notify_team(f"Approval needed: {result['id']}")
elif result["status"] == "blocked":
    raise PolicyViolation(result["reason"])
Gate returns one of three statuses: approved, pending, or blocked. Your agent code handles outcomes, not policy decisions. The policy makes the call.
Evaluation order:
1. Load named policy from file or registry
2. Check allowed_types: reject if payload type not listed
3. Check destinations: reject if target not whitelisted
4. Check block_if_terms: block if any term found in payload
5. Check require_fields: block if required context missing
6. Compare row/size count to auto_approve_under threshold
7. Route: auto-approve, queue for review, or block
No intent reaches a human until it clears steps 1–5. Reviewers see clean, validated intents — not raw agent output.
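The evaluation order can be sketched as a single function. This is an illustrative approximation under stated assumptions, not Gate's source: the policy is a plain dict that has already been loaded (step 1), required fields live at the top level of the payload as in the examples above, and term matching is a case-insensitive substring search over the serialized payload.

```python
import json


def evaluate(policy: dict, payload: dict) -> dict:
    """Apply the checks in order and return a dict with 'status' and 'reason'."""
    # Steps 2-3: type and destination allowlists.
    if payload.get("type") not in policy["allowed_types"]:
        return {"status": "blocked", "reason": f"type {payload.get('type')!r} not allowed"}
    if payload.get("destination") not in policy["destinations"]:
        return {"status": "blocked", "reason": "destination not allowed"}

    # Step 4: blocked terms anywhere in the serialized payload (assumed matching rule).
    text = json.dumps(payload, default=str).lower()
    for term in policy.get("block_if_terms", []):
        if term.lower() in text:
            return {"status": "blocked", "reason": f"blocked term {term!r} found"}

    # Step 5: required context fields.
    missing = [f for f in policy.get("require_fields", []) if f not in payload]
    if missing:
        return {"status": "blocked", "reason": f"missing fields: {missing}"}

    # Steps 6-7: compare size to the threshold, then route.
    if payload.get("rows", 0) < policy.get("auto_approve_under", 0):
        return {"status": "approved", "reason": "under auto-approve threshold"}
    return {"status": "pending", "reason": "queued for human review"}


finance = {
    "allowed_types": ["json"],
    "destinations": ["quickbooks.journal", "stripe.payout", "internal.finance-db"],
    "auto_approve_under": 0,
    "require_fields": ["amount", "currency", "reason", "requested_by"],
    "block_if_terms": ["DELETE", "TRUNCATE", "DROP"],
}
journal_entry = {
    "type": "json",
    "destination": "quickbooks.journal",
    "amount": 4200.00,
    "currency": "USD",
    "reason": "Q1 contractor payment",
    "requested_by": "finance-agent-v2",
}
print(evaluate(finance, journal_entry)["status"])
```

Each check returns early, so the cheapest rejections happen first and the threshold comparison only ever sees payloads that have already passed validation.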
Versioning Policies with Git
Policies live in your repo. Treat them like infrastructure config.
policies/
  crm-low-risk.yaml
  finance-high-risk.yaml
  outbound-email.yaml
  support-reply.yaml
  blog-publish.yaml
Every change goes through pull request. Your compliance lead reviews the diff. The commit message is the audit trail.
# Example git log for a policy change
a3f2c1b feat(policy): lower finance-high-risk expiry from 60m to 30m
    Requested by security team after Q4 audit.
    Reviewed-by: legal@company.com
d7e90aa fix(policy): add DELETE to finance-high-risk block_if_terms
    Caught by pen test. Agents were generating DELETE statements.
b12cc3f feat(policy): add outbound-email policy
    New policy for email-sending agents going to prod.
When an auditor asks "what were the approval rules for the finance agent in January?", you run git log --follow policies/finance-high-risk.yaml. Done.
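If you want that audit query in a script rather than a terminal, a small wrapper around git is enough. The sketch below is an assumption-laden example: the helper names are hypothetical, the date is a placeholder, and it relies only on standard `git log` options (`--follow`, `--until`, `--date`, `--format`).

```python
import subprocess
from typing import List, Optional


def policy_history_command(path: str, until: Optional[str] = None) -> List[str]:
    """Build the git command that lists every commit touching one policy file."""
    cmd = ["git", "log", "--follow", "--date=short", "--format=%h %ad %an %s"]
    if until:
        cmd.append(f"--until={until}")  # only commits up to this date
    return cmd + ["--", path]


def policy_history(path: str, until: Optional[str] = None) -> List[str]:
    """Run the command in the current repository and return one line per commit."""
    out = subprocess.run(
        policy_history_command(path, until),
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()


# Placeholder date; substitute the end of the period the auditor asked about.
print(policy_history_command("policies/finance-high-risk.yaml", until="2025-01-31"))
```

The first line of the result is the most recent commit on or before the cutoff, which identifies the version of the policy that was in force at that time.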
Gate supports policy pinning. Reference a specific commit hash or tag in your agent config to lock a policy version across environments:
gate.propose({
  policy: 'finance-high-risk@v1.2.0', // pinned to tagged release
  ...
})
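A pinned reference is just a name and an optional version joined by `@`. As a rough illustration of how such a string might be split before lookup, here is a hypothetical parser; how Gate resolves the version internally is not shown in this post, so the sketch stops at parsing.

```python
from typing import NamedTuple, Optional


class PolicyRef(NamedTuple):
    name: str
    version: Optional[str]  # None means "no pin": use whatever is current


def parse_policy_ref(ref: str) -> PolicyRef:
    """Split 'name@version' into its parts; a bare 'name' has no pinned version."""
    name, sep, version = ref.partition("@")
    if not name or (sep and not version):
        raise ValueError(f"malformed policy reference: {ref!r}")
    return PolicyRef(name, version or None)


print(parse_policy_ref("finance-high-risk@v1.2.0"))
print(parse_policy_ref("crm-low-risk"))
```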
Extending Policies: Custom Rules
The built-in fields cover most cases. When they don't, Gate supports custom policy extensions.
Need to enforce data residency? Add a require_region field:
id: eu-data-handling
allowed_types: [json, csv]
destinations: [eu-west-db, gdpr-compliant-crm]
auto_approve_under: 0
require_approval_over: 0
require_fields: [data_region, gdpr_basis, retention_days]
require_region: eu-west # custom extension: Gate checks payload.data_region
expiry_minutes: 60
# execution: runner_exec (worker runs in your VPC)
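Conceptually, the region check compares one policy value against one payload field. A minimal sketch, assuming the comparison is exact string equality and that policies without require_region skip the check entirely (the function name is hypothetical):

```python
from typing import Optional


def check_region(policy: dict, payload: dict) -> Optional[str]:
    """Return a block reason if the payload's region is wrong, else None."""
    required = policy.get("require_region")
    if required is None:
        return None  # policy does not use this extension
    actual = payload.get("data_region")
    if actual != required:
        return f"data_region {actual!r} does not match required region {required!r}"
    return None


eu_policy = {"id": "eu-data-handling", "require_region": "eu-west"}
print(check_region(eu_policy, {"data_region": "us-east"}))
```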
Need to gate on dollar amount, not row count? Use auto_approve_under_amount:
id: procurement-medium-risk
allowed_types: [json]
destinations: [procurement.api, netsuite.po]
auto_approve_under_amount: 5000 # POs under $5k auto-approve
require_approval_over_amount: 5000
require_fields: [vendor, amount, currency, po_number]
expiry_minutes: 480 # 8-hour window for procurement reviews
# execution: runner_exec (worker runs in your VPC)
Extensions register in Gate's config, not in each agent. One registration, used everywhere.
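The "register once, use everywhere" idea maps naturally onto a small registry keyed by policy field. The sketch below is one way it could work, not Gate's actual extension API: `register_extension`, `run_extensions`, and the checker signature are all hypothetical.

```python
from typing import Callable, Dict, Optional

# Maps a policy key to a checker; each checker returns a status or None.
ExtensionCheck = Callable[[dict, dict], Optional[str]]
_EXTENSIONS: Dict[str, ExtensionCheck] = {}


def register_extension(key: str):
    """Decorator that registers a checker for one custom policy key."""
    def decorator(fn: ExtensionCheck) -> ExtensionCheck:
        _EXTENSIONS[key] = fn
        return fn
    return decorator


@register_extension("auto_approve_under_amount")
def amount_threshold(policy: dict, payload: dict) -> Optional[str]:
    """Route on dollar amount instead of row count."""
    amount = payload.get("amount")
    if amount is None:
        return "pending"  # no amount to compare: fall back to human review
    return "approved" if amount < policy["auto_approve_under_amount"] else "pending"


def run_extensions(policy: dict, payload: dict) -> Dict[str, Optional[str]]:
    """Run every registered extension whose key appears in the policy."""
    return {
        key: check(policy, payload)
        for key, check in _EXTENSIONS.items()
        if key in policy
    }


procurement = {"id": "procurement-medium-risk", "auto_approve_under_amount": 5000}
print(run_extensions(procurement, {"amount": 1200}))
```

Because the registry lives beside the policy engine, an agent never imports or calls the checker directly; it only names the policy.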
For a real-world look at what happens when agents don't have this layer — and how fast things go wrong — read 847 Records Overwritten, No Audit Trail: An AI Agent Production Incident. The CRM overwrite case maps directly to the crm-low-risk policy above.
Getting Started
Gate ships with five built-in policies: crm-low-risk, finance-high-risk, support-reply, outbound-email, and blog-publish. Most teams start with one of these and customize from there.
Install:
# TypeScript
npm install @zehrava/gate
# Python
pip install zehrava-gate
Initialize:
import { Gate } from '@zehrava/gate';

const gate = new Gate({
  apiKey: process.env.ZEHRAVA_API_KEY,
  policyDir: './policies', // local YAML files; or omit to use Gate's hosted registry
});
Submit your first intent:
const result = await gate.propose({
  policy: 'crm-low-risk',
  payload: { type: 'csv', destination: 'salesforce.import', rows: 12 },
});
Twelve rows, CRM destination, auto-approved. No queue. No human. Policy said so.
For teams moving fast with multiple agents, the fastest path: pick the closest built-in policy, drop it in ./policies/, adjust one threshold, commit. Your agents share a single source of truth from day one.
Full documentation and policy reference at zehrava.com.
Add a commit checkpoint to your agent stack.
MIT license. Self-hostable. Framework-agnostic. Takes under an hour to wrap your first agent.