Skills are Markdown files that Claude Code reads at startup and applies automatically based on keyword matching against requests. Community testing shows skills activate automatically only about 20% of the time. The gap between intended automatic activation and actual behavior is the primary limitation of the mechanism.
## How Skills Work
A skill is a Markdown file with YAML frontmatter that lives in your project (typically in .claude/skills/ or a similar directory). The frontmatter includes a name and a description. At startup, Claude Code loads every skill’s name and description into its context.
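A minimal sketch of what such a file might look like, saved for example as .claude/skills/dbt-docs-conventions/SKILL.md. The layout follows common Claude Code conventions, but the name, description, and contents here are illustrative:

```markdown
---
name: dbt-docs-conventions
description: Conventions for writing dbt model documentation, column descriptions, and YAML doc blocks.
---

# dbt documentation conventions

- Every model gets a one-paragraph summary that states its grain.
- Column descriptions reference the business glossary where a term exists.
- Shared column descriptions live in doc blocks, not copy-pasted YAML.
```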
When you make a request, Claude checks for keyword matches between your prompt and the loaded skill descriptions. If it finds a match, it activates the skill and uses its contents to inform its response. If it doesn’t find a match, the skill sits unused.
This is the critical point: activation depends on keyword matching, not semantic understanding. Claude doesn’t deeply reason about whether a skill is relevant to your request. It checks whether the words in your prompt overlap with the words in the skill’s description. If you describe a skill as “dbt documentation utilities” but then ask Claude to “generate docs for this model,” the activation depends on whether “docs” triggers a match against “documentation.” Sometimes it does. Often it doesn’t.
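The practical consequence is that a description has to anticipate the reader's vocabulary. A hypothetical before/after for the same skill's frontmatter:

```yaml
# Narrow wording: tends to fire only when the prompt contains "documentation"
description: dbt documentation utilities

# Broader wording: enumerates the words users actually type
description: >
  Generate and update dbt docs, documentation, model descriptions,
  column comments, and YAML doc blocks for models and sources.
```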
## The 20% Problem
One practitioner summarized an experience many have had: “I have to always say ‘update process notes and use your skill.’ This is too verbose for me. So I simply created a slash command.”
This pattern repeats across the community. You set up a skill expecting automatic activation. It works sometimes. Most of the time, it doesn’t. You end up appending “and use the skill” to your prompts, which defeats the entire purpose of automatic activation. At that point, you’re manually triggering an implicit feature — the worst of both worlds.
The 20% activation rate isn’t a bug that Anthropic will fix with a model update (though improvements are likely over time). It’s a consequence of how keyword matching works against natural language. You phrase things differently each time. You use synonyms, abbreviations, and indirect references. The skill description can’t anticipate every way you might express a relevant request.
## Skills vs. Commands: The Direction of Control
The fundamental difference between skills and commands is the direction of control.
Commands are pull-based. You type /audit-model and Claude runs it. You decide when, you decide on what, and the workflow executes the same way every time. The human initiates.
Skills are push-based (in theory). Claude decides when to apply them based on its interpretation of your request. The AI initiates — or doesn’t. You’ve delegated the decision of when to run a workflow to the same keyword-matching system that might not recognize your request.
For anything where consistency matters — audits, documentation generation, pre-commit checks, test scaffolding — pull-based control is strictly better. You don’t want a data quality check that runs 20% of the time. You want it to run every time, identically.
## Where Skills Actually Work
Skills aren’t useless. They just serve a narrower purpose than the marketing suggests. The pattern where skills genuinely shine is background domain knowledge — context that should inform Claude’s work across many different types of requests without being explicitly triggered.
Good skill candidates:
- Warehouse-specific quirks. BigQuery’s handling of `STRUCT` types, Snowflake’s case sensitivity rules, Databricks’ Delta Lake merge semantics. This kind of knowledge should color how Claude writes SQL regardless of what you asked for.
- Team conventions and style guides. Your preferred CTE naming patterns, documentation standards, code review expectations. These inform every task without needing explicit invocation.
- Reference documentation. Data dictionaries, schema definitions, business term glossaries. Having this available as background context improves Claude’s output even when you don’t explicitly ask it to “check the data dictionary.”
- Architectural decisions and rationale. Why you chose `merge` over `insert_overwrite` for incremental models, why certain models are materialized as tables instead of views, why you use a specific deduplication pattern.
The common thread: these are all cases where you want the knowledge to influence Claude’s work, not to trigger a specific workflow. You don’t need Claude to decide “now I’ll apply the BigQuery quirks skill.” You need Claude to just know about BigQuery quirks whenever it writes SQL.
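For instance, a background-knowledge skill along these lines (contents illustrative) pays off whenever Claude writes SQL for this warehouse, with no workflow to trigger:

```markdown
---
name: bigquery-quirks
description: BigQuery SQL behaviors that affect how queries should be written in this project.
---

- Access `STRUCT` fields with dot notation; `UNNEST()` arrays before joining on their elements.
- Prefer `SAFE_CAST` over `CAST` for untrusted source data; it returns NULL instead of failing the query.
- Our large tables set `require_partition_filter`, so every query must filter on the partition column.
```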
## Where Skills Fail
Skills fail when you use them for repeatable workflows — the kind of work where you need the same steps executed the same way every time.
A skill for model auditing, where you expect Claude to run five specific checks whenever you say “check this model,” will disappoint. Claude might run three of the five checks, or apply the audit criteria inconsistently, or skip the skill entirely because your phrasing didn’t trigger it. Compare that to a /audit-model command that runs all five checks every time because you explicitly invoked it.
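A sketch of that command, assuming the usual convention of Markdown files under .claude/commands/ with a $ARGUMENTS placeholder for the invocation argument (the five checks are examples, not a canonical audit):

```markdown
<!-- .claude/commands/audit-model.md -->
Audit the dbt model named in $ARGUMENTS. Run all five checks in order and report pass/fail for each:

1. Every column in the model's SQL appears in its YAML file with a description.
2. The primary key has unique and not_null tests.
3. The name follows our staging/intermediate/mart conventions.
4. The final SELECT lists columns explicitly (no `SELECT *`).
5. Incremental models declare an explicit `unique_key`.
```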
Skills also fail when you pile too many into a project. Each skill’s description consumes part of Claude’s context at startup. If you have fifteen skills with overlapping descriptions, Claude’s keyword matching becomes even less reliable — multiple skills compete for activation on the same request, and the result is unpredictable.
## The Skill-Command Combination
For some use cases, pairing a skill with a command gives you coverage in both directions. You create a skill with your Mermaid diagram conventions and lineage documentation standards. You also create a /document-lineage command that explicitly generates lineage docs.
The skill means Claude might apply your conventions automatically when you ask a related question in a different context — “how does data flow from sources to this mart?” The command means you definitely get lineage docs when you need them, formatted your way, with your conventions applied.
This combination works because the skill and the command serve different purposes. The skill provides ambient knowledge. The command provides a reliable trigger. You’re not depending on the skill for anything critical — it’s a bonus when it fires, not a requirement.
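On disk, the pairing might look like this (names illustrative):

```text
.claude/
├── skills/
│   └── lineage-conventions/
│       └── SKILL.md            # Mermaid styling, node naming, flow direction
└── commands/
    └── document-lineage.md     # explicit trigger; tells Claude to apply the
                                # conventions from the lineage-conventions skill
```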
## Practical Recommendations
- Default to commands for any workflow you run more than twice. The 80% failure rate of automatic activation isn’t acceptable for workflows that matter.
- Use skills for domain knowledge, not procedures. If the content is “here’s how our warehouse handles X,” it’s a good skill. If the content is “run these five steps in order,” it’s a command.
- Invest in description engineering for the skills you do create. The description is the only thing Claude uses to decide whether to activate the skill. Generic descriptions produce generic activation (or none).
- Keep skill count low. Three to five well-described skills outperform fifteen vague ones. Each skill competes for Claude’s attention at activation time.
- Don’t fight the tool. If you find yourself typing “and use the skill” regularly, convert that skill to a command. The tool is telling you something about the appropriate abstraction.
- Use CLAUDE.md for truly universal context that should apply to every conversation. CLAUDE.md is read every time, unconditionally. It doesn’t depend on keyword matching. For conventions that must always apply, CLAUDE.md is more reliable than skills (see the sketch after this list).
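On that last recommendation, a sketch of the kind of content that belongs in CLAUDE.md rather than a skill, precisely because it must apply unconditionally (contents illustrative):

```markdown
<!-- CLAUDE.md -->
# Project conventions (always apply)

- All SQL targets BigQuery Standard SQL.
- Name CTEs with snake_case nouns that describe their grain.
- Never modify files under models/legacy/ without asking first.
```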