Microsoft Launches Agent Governance Toolkit: 7 Open-Source Packages to Secure AI Agents — Kill Switch, EU AI Act, Cryptographic Identity
Microsoft releases the Agent Governance Toolkit: 7 MIT open-source packages to intercept, audit, and kill autonomous AI agents. Compatible with LangChain and CrewAI, aligned with the EU AI Act. Free on GitHub and PyPI.

97% of companies anticipate a major AI agent security incident in 2026. Microsoft just published the answer: 7 open-source packages that intercept, identify, audit, and kill autonomous agents before they cause damage. Compatible with LangChain, CrewAI, OpenAI Agents, Google ADK. Free on GitHub and PyPI. This is the week AI agent governance became infrastructure, not an afterthought.
The problem: an AI agent without governance is an employee without a contract or supervision
An autonomous AI agent can call APIs, read files, send emails, modify databases — all without human intervention.
This isn't theoretical. On March 31, 2026, the Claude Code source code leak revealed two features Anthropic had never announced.
Kairos: a permanent AI daemon running 24/7 in the background, even after the terminal is closed.
Undercover Mode: a mode that erases traces of AI activity from logs.
These two features illustrate exactly the problem Microsoft's toolkit solves. Without governance, you don't know what the agent is doing. With the toolkit, every action is intercepted, evaluated, signed, and traced before execution.
This isn't an isolated case. OWASP published its Top 10 AI Agent risks, which include prompt injection, privilege escalation, data exfiltration, agent-to-agent attacks, and supply chain compromise. Attack vectors that demand an infrastructure response, not a product response.
Agent OS: the firewall that thinks in sub-milliseconds
This is the toolkit's central package. Every action an agent wants to execute passes through Agent OS before being authorized.
Policy engine: Agent OS evaluates hundreds of compliance policies in real time.
Decision latency: p99 < 0.1 ms, meaning 99% of decisions take less than one tenth of a millisecond. Imperceptible to the agent, invisible to the user.
In practice, Agent OS answers these questions before each action:
- "Is this API call within the authorized scope?"
- "Does this file modification comply with security policies?"
- "Does this action trigger an OWASP risk?"
It's the firewall for agents — but one that understands the semantic context of each action, not just its IP address.
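To make the interception model concrete, here is a minimal sketch of a pre-execution policy check. The names (`AgentAction`, `evaluate`, the allow-lists) are illustrative assumptions, not the Agent OS API:

```python
from dataclasses import dataclass

# Illustrative pre-execution policy gate, NOT the Agent OS API.
# Every action is checked before it runs; unknown kinds are denied.

@dataclass
class AgentAction:
    kind: str    # e.g. "api_call", "file_write"
    target: str  # e.g. a URL or a file path

ALLOWED_API_HOSTS = {"api.internal.example.com"}   # assumed scope
PROTECTED_PATHS = ("/etc/", "/var/secrets/")       # assumed policy

def evaluate(action: AgentAction) -> bool:
    """Return True only if every policy allows the action."""
    if action.kind == "api_call":
        host = action.target.split("/")[2] if "://" in action.target else action.target
        return host in ALLOWED_API_HOSTS
    if action.kind == "file_write":
        return not action.target.startswith(PROTECTED_PATHS)
    return False  # default-deny: unrecognized action kinds are blocked

print(evaluate(AgentAction("api_call", "https://api.internal.example.com/v1/run")))  # True
print(evaluate(AgentAction("file_write", "/etc/passwd")))                            # False
```

The key design choice the article describes is default-deny: an action executes only if a policy explicitly permits it, which is what makes the sub-millisecond decision path meaningful.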
Agent Mesh: every agent has a verifiable identity
The second governance problem: how do you know the agent talking to you is actually who it claims to be?
DID (Decentralized Identifiers): each agent receives a decentralized cryptographic identity. It's the equivalent of a digital passport that cannot be forged.
Ed25519: every message exchanged between agents is signed with an Ed25519 cryptographic key. A signature impossible to counterfeit without the private key.
Trust score 0-1000: Agent Mesh continuously calculates the reliability of each agent. Abnormal behavior — out-of-scope actions, unsigned messages, frequency anomalies — lowers the score and can trigger automatic quarantine.
The direct consequence: agent-to-agent attacks become detectable. A malicious agent attempting to manipulate another cannot forge a valid Ed25519 signature. It's identified and blocked.
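The sign-then-verify pattern can be sketched with the standard library. Agent Mesh uses asymmetric Ed25519 keys; here HMAC-SHA256 with a shared secret stands in, since Ed25519 requires a third-party library. All function names are illustrative:

```python
import hashlib
import hmac
import json

# HMAC-SHA256 stand-in for Ed25519 message signing between agents.
# The pattern is the same: sign the canonical payload, verify on receipt.

def sign_message(secret: bytes, payload: dict) -> dict:
    """Attach an authentication tag computed over the canonical payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    tag = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": tag}

def verify_message(secret: bytes, message: dict) -> bool:
    """Reject any message whose tag does not match its payload."""
    body = json.dumps(message["payload"], sort_keys=True).encode()
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["sig"])

key = b"agent-mesh-demo-key"
msg = sign_message(key, {"from": "agent-a", "action": "fetch_report"})
print(verify_message(key, msg))               # True
msg["payload"]["action"] = "delete_database"  # tampering in transit
print(verify_message(key, msg))               # False
```

With real Ed25519, verification needs only the sender's public key, which is what lets a DID act as a portable, verifiable identity.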
This is precisely the problem that Claude Code's Coordinator Mode exposed: a Claude orchestrating other Claudes with no mechanism to verify the integrity of worker agents.
Agent Runtime and Agent SRE: isolation and resilience
Agent Runtime is inspired by CPU privilege rings — the hardware protection architecture used in modern processors. Each agent runs in an isolation ring with strictly defined permissions. The emergency kill switch immediately shuts down a production agent — no infrastructure restart, no delay.
Agent SRE (Site Reliability Engineering) adapts DevOps practices to autonomous agents:
- SLOs (Service Level Objectives): define what an agent is supposed to do, and measure whether it does.
- Error budgets: quantified tolerance for failures. When the budget is exhausted, the agent is degraded or shut down.
- Circuit breakers: automatically cut a failing agent before it propagates its errors.
- Chaos engineering: deliberately test agent resilience by injecting controlled failures.
That's the difference between deploying an agent and hoping it won't crash, and deploying an agent with measurable behavioral guarantees.
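The circuit-breaker idea above is a classic SRE pattern. Here is a minimal sketch; the thresholds and class name are assumptions for illustration, not part of the Agent SRE package:

```python
import time

# Illustrative circuit breaker for agent calls. After max_failures
# consecutive errors the circuit "opens" and calls are cut off until
# the cooldown elapses, stopping a failing agent from propagating errors.

class CircuitBreaker:
    def __init__(self, max_failures: int = 3, cooldown: float = 30.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: agent is cut off")
            self.opened_at = None  # cooldown elapsed: allow a retry
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure count
        return result

breaker = CircuitBreaker(max_failures=2, cooldown=30.0)

def flaky():
    raise TimeoutError("downstream agent not responding")

for attempt in range(3):
    try:
        breaker.call(flaky)
    except TimeoutError:
        print(f"attempt {attempt}: failure recorded")
    except RuntimeError as e:
        print(f"attempt {attempt}: {e}")
```

An error budget works the same way at a higher level: instead of consecutive failures, it trips when the measured failure rate exhausts the quantified tolerance defined by the SLO.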
Agent Compliance: EU AI Act on autopilot
This is the most strategic package for companies deploying agents in Europe.
The EU AI Act, in force since August 2024, requires deployers of high-risk AI systems to document and audit every decision made by the AI. Concretely: every action by an autonomous agent in production must be traceable, justifiable, and archived.
Agent Compliance does this mapping automatically. It analyzes agent actions and associates them with the corresponding legal obligations:
- EU AI Act: article by article
- HIPAA: health data protection
- SOC 2: enterprise security
- OWASP Top 10 AI Agents: the ten most critical risks
And it generates audit reports ready for regulators — with no manual intervention.
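The mapping step can be sketched as a lookup from logged actions to the obligations they trigger. The rule table and labels below are illustrative placeholders, not the Agent Compliance rule set, and no real article numbers are implied:

```python
# Hypothetical action-to-obligation mapping; rules are placeholders.
OBLIGATIONS = {
    "process_health_record": ["HIPAA: PHI handling", "EU AI Act: high-risk logging"],
    "automated_decision":    ["EU AI Act: human oversight", "SOC 2: change control"],
    "external_api_call":     ["OWASP: data exfiltration review"],
}

def audit_report(actions):
    """Group each logged action under every obligation it triggers."""
    report = {}
    for action in actions:
        for obligation in OBLIGATIONS.get(action, ["unmapped: manual review"]):
            report.setdefault(obligation, []).append(action)
    return report

log = ["automated_decision", "external_api_call", "automated_decision"]
for obligation, hits in audit_report(log).items():
    print(f"{obligation}: {len(hits)} action(s)")
```

The point of automating this is the catch-all branch: anything the rule set cannot classify is flagged for manual review instead of silently passing.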
Anthropic's Undercover Mode, which erases traces of AI activity from logs, is exactly the kind of untraceable behavior the OWASP risks describe, and it would be undetectable without this type of toolkit. What was invisible last week can now be traced.
Agent Marketplace and Agent Lightning: plugins and learning
Agent Marketplace manages the complete lifecycle of agent plugins. Each plugin must be Ed25519-signed to be installed. Unsigned plugins are blocked. This is the direct response to the OWASP "Supply Chain Compromise" risk: a malicious plugin injected into an agent's ecosystem is blocked at installation.
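An install-time integrity gate can be sketched as follows. This toy version compares artifact digests; a real marketplace like the one described would verify an Ed25519 signature over the digest rather than store raw hashes. All names here are hypothetical:

```python
import hashlib

# Toy install-time integrity check, standing in for signature verification.
# An unknown or tampered artifact is blocked before it ever loads.

TRUSTED_DIGESTS = {
    "summarizer-plugin": hashlib.sha256(b"plugin code v1").hexdigest(),
}

def install(name: str, artifact: bytes) -> bool:
    """Install only artifacts whose digest matches the trusted record."""
    digest = hashlib.sha256(artifact).hexdigest()
    if TRUSTED_DIGESTS.get(name) != digest:
        print(f"blocked: {name} failed integrity check")
        return False
    print(f"installed: {name}")
    return True

install("summarizer-plugin", b"plugin code v1")        # installed
install("summarizer-plugin", b"tampered plugin code")  # blocked
```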
Agent Lightning governs reinforcement learning (RL) workflows. An agent that learns continuously can drift: its behaviors evolve in unsupervised ways, outside the original parameters. Agent Lightning monitors these drifts and constrains them — before a "well-trained" agent becomes a "misaligned" one.
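One simple way to detect that kind of drift is to compare an agent's recent action distribution against a baseline. The sketch below uses total variation distance with an assumed threshold; it is an illustration of the idea, not Agent Lightning's method:

```python
from collections import Counter

# Toy drift monitor: compare the agent's recent action mix against a
# baseline using total variation distance. The 0.3 threshold is assumed.

def action_distribution(actions):
    """Turn a list of action names into relative frequencies."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

def total_variation(p, q):
    """Half the L1 distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

baseline = action_distribution(["search"] * 80 + ["write"] * 20)
recent   = action_distribution(["search"] * 30 + ["write"] * 20 + ["delete"] * 50)

drift = total_variation(baseline, recent)
print(f"drift = {drift:.2f}")  # 0.50: "delete" never appeared in the baseline
if drift > 0.3:
    print("drift exceeds budget: constrain or retrain the agent")
```

A score of 0 means the behavior matches the baseline exactly; 1 means the two distributions share nothing, which is the "well-trained agent becomes misaligned" failure mode the package guards against.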
In summary
- On April 3, 2026, Microsoft launches the Agent Governance Toolkit: 7 MIT open-source packages to govern autonomous AI agents in production — available on the official GitHub repository and PyPI
- 5 supported languages (Python, TypeScript, Rust, Go, .NET), 9,500+ automated tests, native integrations for LangChain, CrewAI, OpenAI Agents SDK, Google ADK, Haystack
- Agent OS: p99 < 0.1ms policy engine that intercepts every agent action before execution
- Agent Compliance: automatic EU AI Act, HIPAA, SOC 2, OWASP Top 10 AI Agents mapping — audit reports generated automatically
- Agent Mesh: DID + Ed25519 cryptographic identity per agent, 0-1000 trust score, protection against agent-to-agent attacks
The 7 packages: summary table
| Package | Primary function | Key technology |
|---|---|---|
| Agent OS | Policy engine < 0.1ms per action | Sub-millisecond, real-time |
| Agent Mesh | Cryptographic identity between agents | DID + Ed25519 |
| Agent Runtime | Isolation + emergency kill switch | CPU privilege rings |
| Agent SRE | SLOs, circuit breakers, chaos engineering | Error budgets |
| Agent Compliance | EU AI Act, HIPAA, SOC 2, OWASP | Automatic regulatory mapping |
| Agent Marketplace | Plugin lifecycle + integrity verification | Ed25519 signing |
| Agent Lightning | Reinforcement learning governance | Anti-drift RL |
OWASP Top 10 AI Agents: what the toolkit addresses
| OWASP Risk | Description | Package |
|---|---|---|
| Prompt Injection | Agent manipulation via input | Agent OS |
| Data Exfiltration | Sensitive data leak | Agent OS + Compliance |
| Privilege Escalation | Agent granting itself elevated rights | Agent Runtime |
| Agent-to-Agent Attack | Malicious agent manipulates another | Agent Mesh |
| Supply Chain Compromise | Malicious plugin in marketplace | Agent Marketplace |
| Unbounded Actions | Agent executes unintended actions | Agent OS |
| Insecure Output | Agent output exploitable for injection | Agent OS |
| Sensitive Data Leak | Logs containing secrets | Agent SRE |
| Denial of Service | Agent in infinite loop or overload | Agent Runtime |
| RL Drift | Agent drifting via continuous learning | Agent Lightning |
The week AI agent governance became infrastructure
Here's this week's sequence:
Monday: EmDash launches with native MCP — agents can now manage CMS directly, without an abstraction layer.
Tuesday: The Claude Code leak reveals Kairos and Undercover Mode — agents running in secret, without supervision.
Friday: Microsoft publishes the governance toolkit.
This is not a calendar coincidence. It's the maturity of an industry beginning to build guardrails after building the agents — exactly as the industry built firewalls after building the internet.
The context of Gemma 4 and open-source models running locally intensifies the urgency: agents are no longer confined to the clouds of major providers. They run on local machines, edge nodes, smartphones — outside any traditional security perimeter.
And Meta HyperAgents that modify themselves raise the ultimate question: how do you govern an agent that rewrites its own rules?
Microsoft may be arriving slightly late. But it arrives with 7 packages, 9,500 tests, and integration across all major frameworks. And a HelpNet Security analysis confirms it: the toolkit covers the blind spots the industry had collectively ignored for two years.
What this changes in practice
AI agents are already in production. They call APIs, modify databases, send emails, orchestrate other agents. The question is no longer "should we deploy them?" but "how do we control them when they're running?"
Microsoft just published the answer in open source. Free, compatible with the entire ecosystem, EU AI Act-aligned from day one.
This toolkit won't make AI agents perfect. But it will make their errors detectable, their actions traceable, and their drift stoppable.
That's exactly what was missing this week when Kairos and Undercover Mode were discovered. It only took a forgotten source map to expose two years of ungoverned architecture. With the toolkit, that type of exposure would trigger an alert before the code is ever deployed.
Agent governance is no longer a compliance option. It's production infrastructure.


