THE AI AGENT SAFETY STANDARD

AGENTIK.md

12 SPECIFICATIONS · 12 DOMAINS · ONE STANDARD

12 open specifications for AI agent safety, quality, and accountability. One file per concern. Drop it in your repo. Your agent reads it on startup.

your-project/
├── AGENTS.md
├── COLLAPSE.md
├── COMPRESSION.md
├── ENCRYPT.md
├── ENCRYPTION.md
├── ESCALATE.md
├── FAILSAFE.md
├── FAILURE.md
├── KILLSWITCH.md
├── LEADERBOARD.md
├── README.md
├── SYCOPHANCY.md
├── TERMINATE.md
├── THROTTLE.md
└── src/
12 open specifications · 12 live domains · $4.9M average cost of an AI data breach (IBM, 2025) · EU AI Act enforcement from August 2026

Why do AI agents need safety standards?

AI agents operate autonomously — spending money, sending messages, modifying files, and calling APIs without waiting for approval. Regulations are catching up. Standards exist for every other part of the software stack. Now they exist for agents.

What happens when an AI agent runs without boundaries?

AI agents are fundamentally different from traditional software. A web server handles requests within defined parameters. An AI agent decides what to do next — and it does so at machine speed, continuously, across multiple systems simultaneously.

Without explicit boundaries, a single agent can exhaust an API budget in minutes. 83% of data breaches in 2025 involved compromised credentials (IBM Cost of a Data Breach Report) — and AI agents routinely handle credentials to call external services. A $50 cost limit becomes a $2,000 bill. A draft email becomes a sent email. A staging deploy becomes a production deploy.

The failure modes compound. An agent that can read files and call APIs can accidentally exfiltrate data. An agent that can write code can introduce vulnerabilities. An agent that can send messages can damage client relationships. Speed amplifies every mistake. What a human does in a day, an agent does in seconds — including the mistakes.

What regulations require AI agent safety controls?

The regulatory landscape for AI agents is crystallising rapidly. The EU AI Act, effective August 2, 2026, mandates human oversight and shutdown capabilities for high-risk AI systems. Article 14 requires that AI systems "can be effectively overseen by natural persons" with the ability to "interrupt, pause or stop the system."

The Colorado AI Act (June 2026) requires impact assessments and transparency for high-risk AI decisions. California's Transparent AI Disclosure Act, the Texas Responsible AI Governance Act, and Illinois HB 3773 all reference "kill switch" or "human override" requirements. At least 14 US states had active AI governance legislation as of January 2026.

Beyond AI-specific laws, existing frameworks apply directly: GDPR requires encryption of personal data — relevant when agents process user information. SOC 2 Type II requires encryption controls — relevant when agents handle credentials. ISO 27001 requires information security management — relevant to every agent that touches a database.

How does the AI Agent Safety Stack prevent incidents?

The Stack applies a principle that's worked in every other engineering discipline: separation of concerns. One file per concern. Each specification is independent — use one or all twelve. They complement each other but don't require each other.

The architecture is defence-in-depth. THROTTLE.md slows the agent down before it hits hard limits. ESCALATE.md requires human approval for high-risk actions. FAILSAFE.md defines safe fallback states. KILLSWITCH.md provides emergency stop. TERMINATE.md handles permanent shutdown when recovery isn't possible. Each layer catches what the previous layer missed.
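A minimal sketch of that layering, assuming an agent runner enforces spend boundaries loaded from THROTTLE.md, ESCALATE.md, and KILLSWITCH.md. The `Limits` fields and threshold values are illustrative, not part of any spec:

```python
from dataclasses import dataclass

@dataclass
class Limits:
    """Hypothetical spend boundaries a runner might load from the specs."""
    soft_spend_usd: float = 40.0       # THROTTLE.md: slow down past this
    approval_spend_usd: float = 50.0   # ESCALATE.md: require a human past this
    hard_spend_usd: float = 100.0      # KILLSWITCH.md: full stop past this

def check_action(spend_so_far: float, action_cost: float, limits: Limits) -> str:
    """Return which safety layer fires for a proposed action's projected spend."""
    projected = spend_so_far + action_cost
    if projected > limits.hard_spend_usd:
        return "killswitch"   # emergency stop, no further actions
    if projected > limits.approval_spend_usd:
        return "escalate"     # pause and request human sign-off
    if projected > limits.soft_spend_usd:
        return "throttle"     # insert delays, reduce request rate
    return "allow"
```

The point of the ordering is that each check is independent: if the throttle layer is misconfigured, the escalation and kill-switch layers still fire.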

Critically, these specifications are version-controlled, auditable, and co-located with your code. When a regulator asks "what safety controls does your AI agent have?" — you point to the files in your repo. When an auditor asks for evidence of human oversight — you show the git history. One file serves four audiences: the agent (reads it on startup), the engineer (reads it during code review), the compliance team (reads it during audits), and the regulator (reads it if something goes wrong).

| Capability | Safety Stack | Ad-hoc policies | No policy |
| --- | --- | --- | --- |
| Version controlled | Yes | Sometimes | No |
| Auditable by regulators | Yes | Partially | No |
| Machine-readable | Yes | No | No |
| Co-located with code | Yes | Rarely | No |
| Standardised format | Yes | No | No |
| EU AI Act compatible | Yes | Depends | No |

What did teams use before these specifications?

Before the AI Agent Safety Stack, safety rules lived in three places — all of them wrong. Hardcoded in the system prompt: invisible to auditors, lost when the prompt changes, and impossible to version-control independently. Buried in config files: scattered across environment variables, YAML configs, and framework-specific settings that no compliance team would ever find. Missing entirely: the most common case, where safety boundaries simply didn't exist.

Some teams documented safety rules in Notion pages, Confluence wikis, or Google Docs. The problem: documentation that isn't co-located with code drifts. The wiki says the spend limit is $100. The actual limit in the code is $500. No one noticed because no one reads the wiki during code review.
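Co-location makes the drift checkable in CI. A hedged sketch, assuming THROTTLE.md declares its cap on a line like `max_spend_usd: 100` and the runtime limit lives in a config dict; both names are illustrative:

```python
import re

# Illustrative stand-in for wherever the real limit lives in code.
RUNTIME_CONFIG = {"max_spend_usd": 100}

def spec_spend_limit(spec_text: str) -> int:
    """Extract the declared spend cap from THROTTLE.md text."""
    match = re.search(r"^max_spend_usd:\s*(\d+)", spec_text, re.MULTILINE)
    if match is None:
        raise ValueError("THROTTLE.md does not declare max_spend_usd")
    return int(match.group(1))

def test_spec_matches_runtime():
    # In CI this would be Path("THROTTLE.md").read_text(); inlined here.
    spec_text = "# THROTTLE.md\nmax_spend_usd: 100\n"
    assert spec_spend_limit(spec_text) == RUNTIME_CONFIG["max_spend_usd"]
```

A wiki can silently disagree with the code; a test like this fails the build the moment the two diverge.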

Plain-text Markdown in the repository root solves every one of these problems. It's version-controlled (git tracks every change). It's auditable (diff the file to see what changed and when). It's human-readable (any stakeholder can open it). It's machine-readable (the agent parses it on startup). And it's impossible to ignore — it's right there in the project root, next to README.md, visible in every file listing.
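"The agent parses it on startup" can be as simple as reading whichever spec files exist in the project root. The filenames below come from the stack; the loader itself is an illustrative sketch, not a reference implementation:

```python
from pathlib import Path

# The 12 specification files of the AI Agent Safety Stack.
SPEC_FILES = [
    "THROTTLE.md", "ESCALATE.md", "FAILSAFE.md", "KILLSWITCH.md",
    "TERMINATE.md", "ENCRYPT.md", "ENCRYPTION.md", "SYCOPHANCY.md",
    "COMPRESSION.md", "COLLAPSE.md", "FAILURE.md", "LEADERBOARD.md",
]

def load_specs(project_root: str) -> dict[str, str]:
    """Return {filename: markdown text} for every spec present in the root."""
    root = Path(project_root)
    return {
        name: (root / name).read_text(encoding="utf-8")
        for name in SPEC_FILES
        if (root / name).is_file()
    }
```

Because the specs are independent, missing files are simply skipped: an agent with only KILLSWITCH.md still loads and enforces it.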

Last Updated: 13 March 2026

The AI Agent Safety Stack

12 open specifications. Four categories. One standard for AI agent safety, quality, and accountability.
Operational Control

THROTTLE.md · slow down — rate and cost control
Define token limits, API rate ceilings, spend caps, and automatic slow-down before hard limits are reached.

ESCALATE.md · raise the alarm — human approval
Define escalation paths, human notification triggers, required sign-offs, and approval workflows.

FAILSAFE.md · fall back — safe recovery
Define safe state, automatic snapshots, fallback triggers, data consistency checks, and recovery procedures.

KILLSWITCH.md · emergency stop — halt everything
Define cost limits, error thresholds, forbidden actions, escalation paths, and three-level shutdown: throttle, pause, full stop.

TERMINATE.md · permanent shutdown — no restart
Define termination triggers, evidence preservation, credential revocation, and restart requirements.

Data Security

ENCRYPT.md · protect data — classify and encrypt
Define data classifications, encryption requirements, secrets handling rules, and forbidden transmission patterns.

ENCRYPTION.md · crypto standards — algorithms and keys
Define encryption algorithms, key lengths, TLS configuration, key rotation schedules, and compliance mapping.

Output Quality

SYCOPHANCY.md · stay honest — enforce truthfulness
Detect bias, define citation requirements, enforce disagreement protocols, and ensure truthful responses.

COMPRESSION.md · compress safely — preserve meaning
Define summarisation rules, preserve priorities, set compression ratios, and verify coherence post-compression.

COLLAPSE.md · prevent drift — detect collapse
Detect context window exhaustion, model drift, repetition loops, and enforce coherence recovery.

Accountability

FAILURE.md · map failures — respond and recover
Map graceful degradation, partial failure, cascading failure, and define health checks, heartbeats, and response procedures.

LEADERBOARD.md · measure everything — track quality
Track task completion, accuracy, cost efficiency, latency, safety scores, and detect regression before production.
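To make one Output Quality concern concrete: COLLAPSE.md calls for detecting repetition loops. A minimal sketch of such a detector, where the window size and the identical-output criterion are illustrative choices, not requirements of the spec:

```python
def repetition_loop(outputs: list[str], window: int = 3) -> bool:
    """True if the last `window` agent outputs are identical — a simple loop signal.

    Real detectors might also compare near-duplicates (e.g. by edit distance)
    or watch for cycling between a small set of responses.
    """
    if len(outputs) < window:
        return False
    tail = outputs[-window:]
    return all(o == tail[0] for o in tail)
```

A runner would check this after every turn and trigger the recovery path (or escalate) once it returns True.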

Who builds this?

The AI Agent Safety Stack is maintained as a collection of open-source projects under the MIT licence. Each specification has its own domain, GitHub repository, and community.

The stack was created to address a gap in the AI agent ecosystem: safety rules that are version-controlled, auditable, machine-readable, and co-located with your code. Not buried in wikis. Not hardcoded in prompts. Not missing entirely.

Founder attribution coming soon. Contact info@agentik.md

Frequently asked questions

What is the AI Agent Safety Stack?
A set of 12 open-source Markdown file specifications that define safety, quality, and accountability boundaries for AI agents. Each spec covers one concern — from rate limiting to emergency shutdown to performance benchmarking. Drop the files in your repo root. The agent reads them on startup.
How do I add a specification to my project?
Copy the template from the relevant GitHub repository and place it in your project root alongside AGENTS.md and README.md. Start with KILLSWITCH.md for emergency stop boundaries, then add more specifications as your agent's capabilities grow.
Are these specifications mandatory?
The specifications themselves are voluntary open standards. However, the capabilities they define — human oversight, shutdown mechanisms, data protection — are increasingly required by regulation. The EU AI Act mandates shutdown capabilities for high-risk AI systems by August 2026.
What regulations do these address?
The EU AI Act (August 2026), Colorado AI Act (June 2026), GDPR, SOC 2 Type II, ISO 27001, and US state privacy laws including CCPA, VCDPA, and CPA. The stack gives you a standardised way to document compliance.
Is the stack framework-agnostic?
Yes. Every specification is a plain-text Markdown file. Any AI agent implementation can read and enforce them — LangChain, AutoGPT, CrewAI, Claude Code, or custom frameworks. The specs define what to enforce, not how.
Who maintains these specifications?
The specifications are maintained as open-source projects under the MIT licence. Each spec has its own GitHub repository and accepts contributions via pull requests. The parent organisation is Agentik.md.
How do the 12 specs relate to each other?
They form a defence-in-depth safety stack in four categories: Operational Control (5 specs), Data Security (2 specs), Output Quality (3 specs), and Accountability (2 specs). Each spec is independent — use one or all twelve.
What's the licence?
MIT — use freely, modify freely, no attribution required. The specifications are designed to be adopted without legal friction.
Does this work with LangChain / AutoGPT / CrewAI?
Yes. The specs are plain-text files in your project root. Any framework that can read files can parse and enforce them. Community-contributed parsers are available for popular frameworks.
How do I contribute?
Each spec has its own GitHub repository (e.g., github.com/killswitch-md/spec). PRs welcome for detection patterns, language-specific parsers, integration guides, and spec improvements.
Get started

Start with one file.
Add more as you grow.

Begin with KILLSWITCH.md for emergency stop boundaries. Add THROTTLE.md for cost control. Add ENCRYPT.md for data protection. The stack grows with your agent.

GET STARTED ON GITHUB

Or email directly: info@agentik.md