What is a versioned skill for an AI agent?

A skill is a defined unit of capability: a description of a task, how to do it, and which tools it is allowed to use. Versioning it means it lives as a tracked artifact you can change, review, and roll back, rather than as text buried in one giant prompt. The agent's behavior becomes something you edit deliberately and audit, not something you tune by feel.
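As a rough illustration of what "a tracked artifact you can change, review, and roll back" could look like, here is a minimal sketch. The `Skill` fields and the `SkillHistory` class are hypothetical names chosen for this example, not the article's actual implementation; a real system might simply keep skill files in version control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """One versioned unit of capability (field names are illustrative)."""
    name: str
    version: int
    description: str               # what the task is
    instructions: str              # how to do it
    allowed_tools: frozenset[str]  # which tools it may use


class SkillHistory:
    """Keeps every published version so a change can be reviewed or rolled back."""

    def __init__(self) -> None:
        self._versions: dict[str, list[Skill]] = {}

    def publish(self, skill: Skill) -> None:
        versions = self._versions.setdefault(skill.name, [])
        if versions and skill.version <= versions[-1].version:
            raise ValueError("version must increase")
        versions.append(skill)

    def current(self, name: str) -> Skill:
        return self._versions[name][-1]

    def rollback(self, name: str) -> Skill:
        versions = self._versions[name]
        if len(versions) < 2:
            raise ValueError("nothing to roll back to")
        versions.pop()  # discard the latest version
        return versions[-1]


history = SkillHistory()
history.publish(Skill("draft_email", 1, "Draft a reply", "Be brief.",
                      frozenset({"read_thread", "write_draft"})))
history.publish(Skill("draft_email", 2, "Draft a reply", "Be brief.",
                      frozenset({"read_thread", "write_draft", "send_email"})))
restored = history.rollback("draft_email")  # version 2 granted too much; go back
```

The point of the sketch is that widening a skill's tool list is an explicit, reviewable change with a version number attached, and undoing it is a single operation.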

What does 'gating tools per skill and per turn' mean?

It means the set of tools the agent can call is restricted by which skill is active, and that set is re-evaluated each turn. A drafting skill exposes only drafting tools; it cannot reach a tool that moves money. Per-turn gating means the available tools track what the agent is currently doing, so it can never call something outside the current task.
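One way to picture this is a lookup that runs on every tool call. The sketch below assumes a hypothetical in-memory tool table and skill-to-tools mapping; the tool names (`write_draft`, `transfer_funds`) and the `call_tool` helper are made up for illustration.

```python
from typing import Callable


def write_draft(text: str) -> str:
    return f"draft: {text}"


def transfer_funds(amount: int) -> str:
    return f"moved {amount}"


# Every tool the system knows about (hypothetical).
TOOLS: dict[str, Callable[..., str]] = {
    "write_draft": write_draft,
    "transfer_funds": transfer_funds,
}

# Which tools each skill is allowed to expose (hypothetical).
SKILL_TOOLS: dict[str, frozenset[str]] = {
    "drafting": frozenset({"write_draft"}),
    "payments": frozenset({"transfer_funds"}),
}


def tools_for_turn(active_skill: str) -> dict[str, Callable[..., str]]:
    """Return only the tools the active skill allows; unknown skills get none."""
    allowed = SKILL_TOOLS.get(active_skill, frozenset())
    return {name: fn for name, fn in TOOLS.items() if name in allowed}


def call_tool(active_skill: str, tool_name: str, **kwargs) -> str:
    # The allowed set is recomputed on every call, so it follows the
    # skill that is active on this turn rather than a session-wide list.
    available = tools_for_turn(active_skill)
    if tool_name not in available:
        raise PermissionError(f"{tool_name!r} is not available to {active_skill!r}")
    return available[tool_name](**kwargs)
```

With this shape, `call_tool("drafting", "write_draft", text="hi")` succeeds, while `call_tool("drafting", "transfer_funds", amount=100)` raises `PermissionError` because the drafting skill never exposes that tool.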

How do you keep a gated agent from being uselessly restricted?

By making skills grant real capability, not by removing it. The point is not fewer tools, it's the right tools for the current task and nothing else. A senior colleague isn't less capable for staying in their lane; they're trusted because they do. Gating plus a confirmation step on irreversible actions gives you capable and safe at once.
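The confirmation step can be sketched as a thin wrapper around tool execution. This assumes each tool carries a hypothetical `irreversible` flag and that `confirm` is whatever callback asks a human (or a policy) for approval; neither name comes from the article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[..., str]
    irreversible: bool = False  # hypothetical flag set by whoever defines the tool


def execute(tool: Tool, confirm: Callable[[str], bool], **kwargs) -> str:
    """Run reversible tools directly; ask for confirmation before irreversible ones."""
    if tool.irreversible and not confirm(f"{tool.name} {kwargs}"):
        return "cancelled"
    return tool.run(**kwargs)


write_draft = Tool("write_draft", lambda text: f"draft: {text}")
transfer_funds = Tool("transfer_funds", lambda amount: f"moved {amount}",
                      irreversible=True)
```

Reversible work goes through without friction, which keeps the agent useful; only the small set of irreversible actions pays the cost of a confirmation.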

What is fail-closed skill loading?

It means if a skill can't be loaded or validated, the agent does without it rather than falling back to an unrestricted default. The dangerous failure mode is a missing guardrail silently leaving the agent more capable than intended. Fail-closed makes the absence of a skill reduce what the agent can do, never expand it.
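A minimal sketch of fail-closed loading follows, assuming skills are stored as JSON with `name`, `version`, and `allowed_tools` fields. That storage format and the `load_skill_tools` helper are assumptions made for this example. Every failure path returns an empty tool set, never a default one.

```python
import json

REQUIRED_FIELDS = {"name", "version", "allowed_tools"}


def load_skill_tools(raw: str, known_tools: frozenset[str]) -> frozenset[str]:
    """Return the tools a skill grants, or no tools at all if anything is wrong."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return frozenset()  # unreadable definition: grant nothing
    if not isinstance(data, dict) or not REQUIRED_FIELDS.issubset(data):
        return frozenset()  # missing fields: grant nothing
    tools = data["allowed_tools"]
    if not isinstance(tools, list) or not all(isinstance(t, str) for t in tools):
        return frozenset()  # malformed tool list: grant nothing
    requested = frozenset(tools)
    if not requested <= known_tools:
        return frozenset()  # asks for a tool we do not recognise: grant nothing
    return requested


KNOWN = frozenset({"read_thread", "write_draft", "transfer_funds"})
good = '{"name": "drafting", "version": 3, "allowed_tools": ["write_draft"]}'
```

Here `load_skill_tools(good, KNOWN)` yields the drafting tool, while a truncated or invalid definition yields an empty set, so a broken skill can only shrink what the agent is able to do.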

tsukumo

Versioned skills + gated tools: building a trustworthy AI agent · tsukumo

Agency · 17 June 2026 · 4 min read

Building a senior-colleague AI: versioned skills and gated tools

A loose-cannon agent is dangerous and a shackled one is useless. The way out is to put the judgment in versioned, fail-closed skill definitions and to gate which tools the agent can touch per skill and per turn. The result is an agent that is capable without being a liability.

The short answer

A loose-cannon agent is a liability and a shackled one is useless. The middle is built, not prompted: put the agent's judgment in versioned, fail-closed skill definitions, gate which tools it can call per skill and per turn, and put a confirmation step in front of anything irreversible. The agent stays capable because each skill grants real ability, and stays safe because it can only ever reach the tools the current skill allows.

Short version: there are two easy ways to build an agent and both are wrong. A loose-cannon agent with every tool and one giant prompt is a liability. A shackled agent that has to ask permission for everything is useless. The version worth shipping sits in the middle, and it is built, not prompted: the judgment lives in versioned, fail-closed skill definitions, the tools are gated per skill and per turn, and anything irreversible goes through a confirmation step. We built one for a fiduciary, where the agent has to be genuinely useful and genuinely safe at the same time.

Skill as a versioned artifact	Capability buried in a prompt
Edit and roll back deliberately	Change by feel, hope nothing regresses
Review like code	Behavior hidden in prose
Loaded from a system of record	Hard-coded and scattered
Fail-closed if missing	Silent, unpredictable fallback

Loose cannon	Shackled bot	Senior colleague
All tools, always	Asks before everything	Right tools for the task
Prompt-tuned by feel	Hard rules everywhere	Versioned, reviewable skills
Fails open	Fails by stopping	Fails closed, stays useful
Dangerous	Useless	Capable and safe

Why "one big prompt with all the tools" fails

Skills as versioned artifacts

Gating tools per skill and per turn

A confirmation step before the irreversible

Shackled, loose, or senior?

What this means for your team
