A Skill Is a Folder, Not a Prompt: What Anthropic Learned Running Hundreds of Them

TL;DR

Anthropic published a June 3, 2026 post describing how its engineering organization uses hundreds of Claude Code Skills. The confirmed development is a public account of Anthropic’s internal practice; the main unresolved point is how broadly its claimed quality gains apply outside Anthropic.

Anthropic’s June 3 post on the Claude blog, written by Claude Code engineer Thariq Shihipar, explains how the company packages reusable agent know-how as Skills and has run hundreds of them across its engineering organization. The account gives teams a clearer model for turning repeated AI-agent instructions into shared operational assets.

The post, titled “Lessons from building Claude Code: How we use skills”, presents a Skill as more than a saved prompt. According to Anthropic’s described setup, a Skill is a folder that can contain a SKILL.md file, reference material, scripts, templates, configuration, hooks and memory that an agent can access when a task calls for it.

The source material says Anthropic found its internal Skills clustered into nine categories: library and API reference, product verification, data fetching and analysis, business-process automation, code scaffolding and templates, code quality and review, CI/CD and deployment, runbooks, and infrastructure operations. Anthropic’s own measurement, as summarized by Thorsten Meyer AI, says verification Skills, meaning Skills that check an agent’s work, had the largest effect on output quality.

The confirmed news is the publication of Anthropic’s lessons and the described structure of its Skills system. The business framing in the July 1 Thorsten Meyer AI analysis, that Skills can become versioned operating procedures for AI agents, is an interpretation built on Anthropic’s account rather than an independently audited finding.

At a glance
When: published June 3, 2026; covered July 1,…
The development: Anthropic has published lessons from running hundreds of Claude Code Skills internally, describing Skills as reusable folders that agents can discover, read and run.
AI Dispatch · Insights · 1 July 2026

A Skill is a folder, not a prompt

Anthropic published what it learned running hundreds of Skills across its own engineering org. Read as a business memo, the point is bigger than a coding trick: this is how ad-hoc prompting becomes durable institutional capability — the SOPs your agents actually follow, versioned and shared.

✕ The misconception

“A Skill is just a clever markdown prompt you save in a file.”

✓ What it actually is

A folder the agent can discover, read & run — instructions, scripts, references, templates, config & on-demand hooks.

Anatomy of a Skill — the file system is context engineering
my-skill/          the unit you share & version
├─ SKILL.md        root instructions + a description written for the model (its trigger)
├─ references/     deep detail pulled in only when needed (progressive disclosure)
├─ scripts/        real code, so the agent composes instead of rebuilding boilerplate
├─ assets/         templates & files to copy into the output
├─ config.json     setup the agent asks for if it’s missing (e.g. which Slack channel)
└─ hooks + memory  on-demand guardrails + an append-only log so it remembers
Why it matters: the folder itself is the knowledge base. The agent reads the root, then reaches deeper only when the task demands it — the same way you’d hand a new hire a one-pager that points to the detailed docs.
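The layout above can be sketched as a small scaffolding script. This is an illustration under stated assumptions, not Anthropic's tooling: the function name `scaffold_skill`, the placeholder file contents, and the `memory.log` filename are hypothetical, and hooks are left out because the source does not say how they are laid out on disk. The `name` and `description` frontmatter follows the convention in the Claude Code Skills documentation; check the docs for the current schema.

```python
import json
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create a skeleton Skill folder mirroring the layout described above."""
    skill = root / name
    for sub in ("references", "scripts", "assets"):
        (skill / sub).mkdir(parents=True, exist_ok=True)

    # Root instructions; the description is written for the model, not humans.
    (skill / "SKILL.md").write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n"
        "# Instructions\n\n## Gotchas\n",
        encoding="utf-8",
    )
    # Setup values the agent should ask for when they are missing.
    (skill / "config.json").write_text(json.dumps({}, indent=2), encoding="utf-8")
    # Hypothetical filename for the append-only log.
    (skill / "memory.log").touch()
    return skill


with tempfile.TemporaryDirectory() as tmp:
    skill_dir = scaffold_skill(
        Path(tmp), "my-skill", "Use when verifying a release build before deploy."
    )
    created = sorted(p.relative_to(skill_dir).as_posix() for p in skill_dir.rglob("*"))
    print(created)
```

Because the result is an ordinary directory, it can be committed, reviewed, and versioned like any other code.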
The nine types — a gap-analysis map for your own library
1. Library / API reference
2. Product verification (★ top impact)
3. Data fetching & analysis
4. Business-process automation
5. Code scaffolding & templates
6. Code quality & review
7. CI/CD & deployment
8. Runbooks
9. Infrastructure operations
By Anthropic’s own measurement, verification Skills — the ones that check the work — moved output quality the most. If you build one category well, build that one.
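One way to use the nine categories as a gap-analysis map is to tag each Skill in your own library and list the categories nothing covers yet. The sketch below takes the category names from the list above; the `inventory` mapping and the Skill names in it are hypothetical.

```python
CATEGORIES = [
    "Library / API reference",
    "Product verification",
    "Data fetching & analysis",
    "Business-process automation",
    "Code scaffolding & templates",
    "Code quality & review",
    "CI/CD & deployment",
    "Runbooks",
    "Infrastructure operations",
]


def find_gaps(inventory: dict[str, str]) -> list[str]:
    """Return the categories that no Skill in the inventory covers."""
    covered = set(inventory.values())
    unknown = covered - set(CATEGORIES)
    if unknown:
        raise ValueError(f"Unknown categories: {sorted(unknown)}")
    return [c for c in CATEGORIES if c not in covered]


# Hypothetical inventory: Skill name -> category
inventory = {
    "billing-api-reference": "Library / API reference",
    "release-checklist": "CI/CD & deployment",
    "oncall-db-failover": "Runbooks",
}
gaps = find_gaps(inventory)
print(gaps)
```

If "Product verification" shows up in the gaps, the article's advice suggests filling that one first.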
The craft — what separates a good Skill from a useless one
Gotchas = highest-signal section
Describe for the model, not humans (it’s the trigger)
Don’t state the obvious
Ship scripts, not just prose
On-demand guardrail hooks (/careful, /freeze)
Let it remember (log / SQLite)
Don’t railroad; leave room to adapt
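The "let it remember" item can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not Anthropic's implementation: the `runs` table, its columns, and the helper names are assumptions, and the log is append-only by convention because the helpers only ever insert rows.

```python
import sqlite3
from datetime import datetime, timezone


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log database; ':memory:' keeps this demo off disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts TEXT NOT NULL, "
        "note TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, note: str) -> None:
    """Append one note with a UTC timestamp; existing rows are never modified."""
    conn.execute(
        "INSERT INTO runs (ts, note) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), note),
    )
    conn.commit()


def recall(conn: sqlite3.Connection, limit: int = 5) -> list[str]:
    """Return the most recent notes, newest first."""
    rows = conn.execute(
        "SELECT note FROM runs ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in rows]


conn = open_memory()
remember(conn, "Staging deploy needs VPN; first attempt timed out.")
remember(conn, "Retry with VPN succeeded.")
recent = recall(conn)
print(recent)
```

In a real Skill the database path would point at a file inside the Skill folder so notes survive between sessions.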
The take

The knowledge of how your organization actually operates can be captured, versioned, shared & executed — and the thing capturing it is a humble folder with a script and a gotchas list inside. For the builder, that’s context engineering with real tools attached. For whoever owns the budget, it’s the difference between AI that starts from zero every morning and an asset that compounds. Caveats: best practices are still evolving, checked-in Skills cost context, and curation beats accumulation. Start with one Skill, one gotcha, and the category that catches your mistakes.

Source: “Lessons from building Claude Code: How we use skills,” Thariq Shihipar (Anthropic), Claude blog, 3 June 2026. Categories, examples & measured claims are Anthropic’s; framing is the author’s. Docs: code.claude.com/docs/en/skills.
thorstenmeyerai.com

Skills Turn Prompts Into Assets

For teams already using coding agents, Anthropic’s account points to a shift from one-off prompting toward repeatable workflows. A Skill can bundle the way a company checks releases, queries data, reviews code or scaffolds a project, so the agent starts with shared instructions and runnable tools rather than a fresh prompt each time.

The potential value is practical: more consistent output, faster onboarding for new team members, and less dependence on knowledge held by one engineer or scattered across old docs. That value is not automatic. It depends on curation, maintenance, and whether teams keep Skills small enough for agents to use without adding noise.

How Anthropic Structures Skills

The described Skill layout starts with SKILL.md, which contains root instructions and a model-facing description that helps the agent decide when to use the Skill. Deeper material can live in references, while scripts give the agent real code to run instead of rebuilding boilerplate from prose.
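To make the model-facing description concrete, the sketch below pulls the frontmatter out of a SKILL.md file. It assumes the file opens with simple `key: value` lines between `---` markers, as the Claude Code Skills documentation describes; the parser is a deliberately minimal stand-in for a YAML library, and the `verify-release` Skill is hypothetical. How Claude Code itself discovers and ranks Skills is not covered by the source.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Parse simple `key: value` frontmatter between leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing marker: treat the file as having no frontmatter


skill_md = """---
name: verify-release
description: Use when asked to check a release build before deploy.
---

# Instructions
Run scripts/check.py and report failures.
"""
meta = read_frontmatter(skill_md)
print(meta["description"])
```

The description reads like a trigger condition ("Use when ...") rather than a summary for human readers, which is the point the article makes about writing it for the model.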

The source also describes assets for templates and files, config.json for missing setup choices, and hooks or memory for on-demand guardrails and logs. The design uses progressive disclosure: the agent reads the root first and pulls in deeper detail only when the task requires it.
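Progressive disclosure and the config.json check can be sketched together. This is an illustration under assumptions, not a description of Claude Code internals: the helper names, the `slack_channel` key, and the reference filenames are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def missing_config(skill_dir: Path, required: list[str]) -> list[str]:
    """Return required keys absent from config.json (all of them if no file)."""
    cfg_path = skill_dir / "config.json"
    cfg = json.loads(cfg_path.read_text(encoding="utf-8")) if cfg_path.exists() else {}
    return [key for key in required if key not in cfg]


def load_context(skill_dir: Path, wanted_refs: list[str]) -> str:
    """Read the root first; add only the reference files the task asks for."""
    parts = [(skill_dir / "SKILL.md").read_text(encoding="utf-8")]
    for name in wanted_refs:
        parts.append((skill_dir / "references" / name).read_text(encoding="utf-8"))
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "my-skill"
    (skill / "references").mkdir(parents=True)
    (skill / "SKILL.md").write_text(
        "Root instructions. See references/api.md.", encoding="utf-8"
    )
    (skill / "references" / "api.md").write_text("Detailed API notes.", encoding="utf-8")
    (skill / "references" / "history.md").write_text("Long changelog.", encoding="utf-8")
    (skill / "config.json").write_text(json.dumps({"team": "payments"}), encoding="utf-8")

    to_ask = missing_config(skill, ["team", "slack_channel"])
    shallow = load_context(skill, [])
    deep = load_context(skill, ["api.md"])

print(to_ask)
```

The shallow context holds only the root file, and the deep one adds a single reference while the long changelog stays on disk until something needs it.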

This is not presented in the source material as a new Claude Code product launch on July 1. The dated development is Anthropic’s June 3, 2026 public write-up, followed by a July 1, 2026 analysis from Thorsten Meyer AI that frames the approach for builders and budget owners.

Metrics Remain Partly Opaque

Several details remain unclear from the source material. Anthropic is described as running hundreds of Skills, but the exact count, the full set of tasks, the measurement method, and the size of the reported quality gains are not disclosed in the provided material.

It is also not yet clear how well Anthropic’s results carry over to companies with different codebases, review practices or agent usage. The source warns that best practices are still evolving, checked-in Skills can add context cost, and accumulation without review can make a library harder to use.
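The context-cost caveat can be made measurable with a rough audit. The sketch below uses character count as a crude proxy for how much context each root file consumes; the 4,000-character budget, the function name, and the two sample Skills are assumptions for illustration, not figures from Anthropic.

```python
import tempfile
from pathlib import Path


def context_report(library: Path, budget_chars: int) -> list[tuple[str, int, bool]]:
    """List (skill name, SKILL.md size in characters, over budget?) per Skill."""
    report = []
    for skill_md in sorted(library.glob("*/SKILL.md")):
        size = len(skill_md.read_text(encoding="utf-8"))
        report.append((skill_md.parent.name, size, size > budget_chars))
    return report


with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp)
    # Hypothetical library with one small and one oversized root file.
    for name, body in {"lean-skill": "x" * 800, "bloated-skill": "x" * 6000}.items():
        (lib / name).mkdir()
        (lib / name / "SKILL.md").write_text(body, encoding="utf-8")
    report = context_report(lib, budget_chars=4000)

for name, size, over in report:
    print(f"{name}: {size} chars{' (over budget)' if over else ''}")
```

A Skill flagged here is a candidate for moving detail out of the root file and into references.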

Teams Test Their First Skills

The next step for readers using Claude Code is likely experimentation rather than a broad rebuild of internal docs. The source recommends starting with one Skill, one hard-won gotcha, and the category most likely to catch mistakes, especially verification.
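A first verification Skill might ship a script along these lines. It is a minimal sketch, not a method taken from Anthropic's post: the two checks, the shape of the `change` dictionary, and the exit-code convention are all hypothetical.

```python
from typing import Callable, Optional

# A check returns an error message, or None when the change passes.
Check = Callable[[dict], Optional[str]]


def has_tests(change: dict) -> Optional[str]:
    return None if change.get("tests_added", 0) > 0 else "no tests were added"


def no_debug_prints(change: dict) -> Optional[str]:
    return "debug print left in diff" if "print(" in change.get("diff", "") else None


def verify(change: dict, checks: list[Check]) -> list[str]:
    """Run every check and collect failure messages instead of stopping early."""
    failures = []
    for check in checks:
        message = check(change)
        if message is not None:
            failures.append(f"{check.__name__}: {message}")
    return failures


# Hypothetical summary of an agent's change
change = {"diff": "+ total = a + b\n+ print(total)\n", "tests_added": 0}
failures = verify(change, [has_tests, no_debug_prints])
exit_code = 1 if failures else 0
print(failures, exit_code)
```

Collecting every failure in one pass gives the agent a complete list to fix, and each new gotcha the team discovers can become one more check function.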

Teams can watch Anthropic’s Claude Code Skills documentation for further guidance and compare their own before-and-after results. The practical test will be whether Skills reduce repeated prompting, improve review quality, and stay maintained as tools, policies and code change.
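A before-and-after comparison can start as simple arithmetic on counts a team already tracks. The numbers below are hypothetical, as is the choice of "pull requests needing rework" as the metric.

```python
def rate(events: int, total: int) -> float:
    """Fraction of items in which the event occurred."""
    if total <= 0:
        raise ValueError("total must be positive")
    return events / total


def compare(before: tuple[int, int], after: tuple[int, int]) -> dict[str, float]:
    """Compare two (events, total) samples and report the absolute change."""
    b, a = rate(*before), rate(*after)
    return {"before": b, "after": a, "absolute_change": a - b}


# Hypothetical counts: (pull requests needing rework, pull requests reviewed)
result = compare(before=(12, 40), after=(6, 40))
print(result)
```

Samples this small cannot separate the Skill's effect from ordinary variation, so the comparison is a prompt for closer review rather than proof.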

Key Questions

What is the actual news here?

Anthropic published a June 3, 2026 account of how it uses Claude Code Skills internally, including the claim that it has run hundreds across its engineering organization.

Is a Skill just a saved prompt?

No. In Anthropic’s description, a Skill is a folder that can include instructions, references, scripts, templates, configuration and hooks. The prompt-like text is only one part of the package.

Which kind of Skill had the biggest effect?

According to Anthropic’s own measurement as summarized in the source material, verification Skills, the Skills that check outputs, had the strongest effect on quality. The underlying benchmark details were not included in the provided material.

Why does this matter for companies using AI agents?

The approach could help companies turn repeated agent instructions into versioned workflows that are shared across teams. That may make AI-agent work more consistent, but the outcome depends on maintenance and adoption.

What should teams do next?

The source points to a narrow start: build one useful Skill, include one practical gotcha, and add scripts or checks where prose is not enough. Teams should measure whether the Skill improves real work output before expanding the library.

Source: Thorsten Meyer AI
