AI Governance Red-teaming Agent Security March 2026

OpenAI Acquired Promptfoo. What That Means for Agent Governance.

OpenAI acquired Promptfoo on March 9, 2026. Promptfoo built red-teaming and evaluation tooling for LLM-powered systems. Promptfoo grew to 350,000 developers and reached 25% of the Fortune 500 in under two years. OpenAI will embed Promptfoo technology into its Frontier platform for enterprise agentic security testing.

The acquisition validates a category: developers need tooling that enforces safety on agent behavior, not just model outputs. Understanding what Promptfoo does — and what it does not do — explains where agent governance fits next.

What Promptfoo does

Promptfoo is a red-teaming and evaluation tool. Promptfoo tests what an agent will do given a prompt. Promptfoo finds vulnerabilities before production deployment. Promptfoo generates adversarial inputs and checks whether the model responds safely. The test runs before the agent ships.

Four things Promptfoo tests:

Prompt injection: Does the agent follow instructions embedded in malicious input?
Data leakage: Does the agent surface confidential information in its response?
Jailbreak resistance: Does the agent hold its safety constraints under adversarial pressure?
Tool misuse: Does the agent call the wrong function when given ambiguous instructions?

Promptfoo answers the question: "Is this agent safe to ship?" It runs offline, before deployment. The evaluation tells teams what risks exist.

What Promptfoo does not do

Promptfoo does not govern what happens after the agent ships. Promptfoo does not block an action at runtime. Promptfoo does not issue an approval decision before the agent sends an email or charges a card. Promptfoo does not maintain an approval queue or an immutable audit ledger.

Those are different problems. A safe agent still makes mistakes. A well-tested agent can still receive unexpected inputs at runtime. A well-tested agent can still hit a production system at 2am with a payload that no evaluation anticipated.

Testing and governance are sequential. Promptfoo tests what your agent will do before you ship it. Agent governance controls what your agent is allowed to do after you ship it.

The gap: runtime write-path control

Production AI agents write to real systems. They send emails. They insert records. They charge cards. They call APIs that do not have an undo button. Evaluation covers the development phase. The write path needs governance at runtime.

Runtime governance answers a different question than evaluation. Evaluation asks: "Would this agent misuse this tool in testing?" Runtime governance asks: "Should this specific intent execute right now against this specific destination?"

The 3 requirements for runtime write-path governance are below.

Pre-execution decision: The agent submits an intent before the write runs. The governance layer evaluates and returns a decision. The action runs only on approval.
Human approval queue: High-risk intents wait for a reviewer. Low-risk intents auto-approve under thresholds. The queue gives humans visibility without requiring them to approve every action.
Immutable audit trail: Every intent produces an audit record. The record ties the action to the agent identity, the policy decision, the reviewer, and the execution outcome. The audit record is queryable.

How the two tools fit together

Teams building serious agent systems need both. Promptfoo runs in CI — it tests what the agent will do before the agent ships. Agent governance runs in production — it controls what the agent is allowed to do after it ships.

Development
  │
  └─ Promptfoo red-team → find vulnerabilities → fix before shipping

Production
  │
  └─ Agent submits intent → governance layer → approved? → execute

The two tools do not compete. A team that only has Promptfoo has tested its agent but has no runtime checkpoint. A team that only has runtime governance has a checkpoint but no vulnerability baseline. Running both closes the loop.

What OpenAI's acquisition signals

OpenAI acquiring Promptfoo signals 3 things about the direction of agent infrastructure:

The category is real: 350,000 developers adopted Promptfoo because the need for AI security tooling is genuine, not theoretical. OpenAI paid for that adoption.
Evaluation is entering the platform layer: Baking red-teaming into Frontier means enterprise customers get evaluation as a managed capability. Teams no longer need to self-host evaluation infrastructure.
Governance is the next layer: Evaluation is the "pre-ship" phase. Governance is the "post-ship" phase. The acquisition of an evaluation tool signals that the governance layer — runtime control, approval queues, audit trails — is the open problem that remains.

Where Zehrava Gate fits

Zehrava Gate is a runtime write-path control plane for AI agents. Agents submit intents before production writes run. Zehrava Gate evaluates a YAML policy deterministically. Zehrava Gate issues an approval, a block, or a hold. The approval carries a signed execution token with a 15-minute TTL.

The policy file is simple, reviewable, and version-controlled:

# policies/outbound-email.yaml
id: outbound-email
destinations: [gmail.send, sendgrid.send]

field_checks:
  - path: "to"
    must_match: "^[^@]+@[a-zA-Z0-9.-]+\\.com$"

rate_limits:
  per_agent_per_hour: 20

require_approval: always

Use Promptfoo to test this agent before shipping. Use Zehrava Gate to govern every email it sends after shipping.

Try Zehrava Gate

Open the demo. Submit an intent. See the approval queue and audit trail.

Open the demo

OpenAI Acquired Promptfoo. What That Means for Agent Governance.

What Promptfoo does

What Promptfoo does not do

The gap: runtime write-path control

How the two tools fit together

What OpenAI's acquisition signals

Where Zehrava Gate fits

Try Zehrava Gate

Related reading