Why the Claude Skill Linter Exists (and the Fourteen Patterns It Catches Before You Ship)

The most expensive AI failure of 2025 wasn't a hack. It was two agents asking each other clarifying questions for 11 days while their team slept.

The team lost $47,000 in API charges before someone pulled the plug. The agents weren't doing anything malicious; they had no instruction telling them when to stop. A six-word clause in the skill text would have prevented it. Nobody on the team thought to write the clause because nobody on the team had ever seen the failure mode.

Most SMB-written Claude skills follow the same arc. The skill works in testing. It works in early production. Then a new edge case appears, the model asks for clarification, the clarification doesn't resolve, the loop runs through the night, and the owner wakes up to a bill.

The Claude Skill Linter catches 14 of these patterns in a single paste. Five-second analysis, no signup, no upload, browser-only.

What the tool checks

Fourteen rules across three severity tiers.

Five critical rules. Each is a "must fix before shipping" item that has been the root cause of a documented production-agent disaster. A sketch of how two of these checks might work follows the list.

  1. Missing iteration cap. No instruction telling the agent when to stop trying. Root cause of the $47K loop incident.
  2. Send-money action without human gate. The skill performs a financial action without requiring approval first. Biggest blast-radius failure mode in SMB skills.
  3. Possible hardcoded credential. API key or password literal inside the skill text.
  4. Public-post action without approval gate. Skill can publish (tweet, review reply, blog comment) without a human pause. One mistake creates a permanent public record.
  5. Delete/overwrite without confirmation. Destructive action with no soft-delete fallback. One-way operation; recovery may be impossible.
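
The fact-check notes below say the linter works by regex pattern matching rather than semantic understanding. As a rough illustration of what that implies (the patterns and names here are assumptions, not the tool's actual source), a critical check reduces to a predicate over the raw skill text:

    // Illustrative sketch only; the linter's real patterns are not published.
    // Heuristic: does the skill text state any stopping condition at all?
    const hasIterationCap = (skill: string): boolean =>
      /\b(stop after|max(imum)?\s+(of\s+)?\d+\s+(attempts|retries|iterations)|give up|escalate to a human)\b/i
        .test(skill);

    // Heuristic: does the text contain something shaped like a credential literal?
    const hasHardcodedCredential = (skill: string): boolean =>
      /(sk-[A-Za-z0-9]{20,}|api[_-]?key\s*[:=]\s*\S+|password\s*[:=]\s*\S+)/i.test(skill);

    console.log(hasIterationCap("Retry until the invoice is found."));      // false -> critical finding
    console.log(hasHardcodedCredential("api_key: example-not-a-real-key")); // true  -> critical finding

The limitation is visible immediately: a skill that says "halt after three tries" satisfies the spirit of the rule but misses this literal pattern, which is exactly why the heuristic disclaimer in the fact-check notes matters.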

Five warning rules. Each should be fixed or documented as intentional; one of the checks is sketched after the list.

  1. Vague instruction ("use your best judgment," "as appropriate").
  2. No defined failure mode (skill describes the happy path only).
  3. No permission-scope statement (skill doesn't say what's allowed vs. what's not).
  4. No loop-detection safeguard (skill encourages retries with no infinite-loop check).
  5. No rollback / undo plan (skill takes actions with no plan to reverse them).
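
The vague-instruction warning is the easiest of these to picture as a pattern match. A sketch, assuming a phrase list along the lines the rule description suggests:

    // Illustrative sketch; the phrase list is an assumption, not the tool's.
    const VAGUE_PHRASES = /\b(use your best judgment|as appropriate|as needed)\b/i;

    const hasVagueInstruction = (skill: string): boolean => VAGUE_PHRASES.test(skill);

    console.log(hasVagueInstruction("Escalate as appropriate.")); // true -> warning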

Four informational rules. Judgment calls; not always wrong, but worth noticing. The simplest check is sketched after the list.

  1. No cost-awareness clause. Acceptable for low-volume skills; risky for any skill that loops or pulls long context.
  2. No AI-identity disclosure rule. Some jurisdictions now require automated-system disclosure on certain communications.
  3. Skill is very short (under 60 words). Usually under-specified.
  4. No context-truncation rule. When a long thread comes in, the skill has no rule for what to keep vs. drop.
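
The word-count rule is the simplest of the fourteen; a sketch, assuming the stated 60-word threshold:

    // Illustrative sketch of the under-60-words check.
    const isVeryShort = (skill: string, minWords = 60): boolean =>
      skill.trim().split(/\s+/).filter(Boolean).length < minWords;

    console.log(isVeryShort("Chase unpaid invoices politely.")); // true -> info finding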

Each finding includes a "why" (what the failure mode looks like in production) and a "fix" (a concrete clause you can paste into the skill).
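
A minimal sketch of how a finding might be modeled, with field names assumed rather than taken from the tool, sorted into the critical-first reading order recommended below:

    // Illustrative model of a finding; the tool's actual output format may differ.
    type Severity = "critical" | "warning" | "info";

    interface Finding {
      rule: string;       // e.g. "missing-iteration-cap"
      severity: Severity;
      why: string;        // what the failure mode looks like in production
      fix: string;        // a concrete clause to paste into the skill
    }

    const ORDER: Record<Severity, number> = { critical: 0, warning: 1, info: 2 };

    // Critical first, warnings second, info last -- the reading order
    // recommended in the workflow below.
    const sortFindings = (findings: Finding[]): Finding[] =>
      [...findings].sort((a, b) => ORDER[a.severity] - ORDER[b.severity]);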

How to use it in 60 seconds

  1. Open your skill text in Claude's skill editor.
  2. Paste the full text into the linter's input box.
  3. Click "Lint this skill."
  4. Read the findings. Critical first, warnings second, info as time permits.
  5. If you want a rewrite, click "Copy fix prompt." The prompt is structured for Claude (or any LLM) to read the findings and rewrite the skill with every Critical and Warning addressed.
  6. Paste the rewrite back into your Claude skill editor. Lint again to confirm clean.

The whole flow takes 60 seconds for a 300-word skill, and about two minutes for a long one that needs many edits.
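
The "Copy fix prompt" step can be pictured as a template that stitches the findings and the original skill text into a single instruction for the model. A hypothetical sketch; the real prompt wording is the tool's own:

    // Hypothetical sketch of what "Copy fix prompt" might assemble;
    // the real prompt wording is the tool's own.
    interface LintFinding { severity: string; rule: string; fix: string; }

    function buildFixPrompt(skill: string, findings: LintFinding[]): string {
      const list = findings
        .map((f) => `- [${f.severity}] ${f.rule}: ${f.fix}`)
        .join("\n");
      return [
        "Rewrite the following Claude skill so that every critical and",
        "warning finding below is addressed. Keep the original intent.",
        "",
        "Findings:",
        list,
        "",
        "Skill text:",
        skill,
      ].join("\n");
    }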

An example: the launch invoice-chase skill

Anthropic shipped invoice-chase as one of Claude for Small Business's launch skills. The default version is reasonable, but the linter catches three issues in the out-of-the-box text:

  • No iteration cap. The default skill doesn't tell the agent when to stop trying to find an unpaid invoice. For a small AR list, this is fine; for a 5,000-customer list, it's the same shape as the $47K loop, just slower.
  • Vague instruction. The default uses "appropriate" and "as needed" in a couple of places. Fine for a polished demo. Risky in production when the agent has to make judgment calls about how aggressive to be.
  • No human-gate on send. The default ships in "draft-only" mode, but if you turn on auto-send (which is tempting once it's been working for a week), there's no instruction to pause before sending in edge cases (already-paid customers, mid-dispute customers, customers in a payment plan).

None of these findings breaks the skill. All three are the kind of thing that turns into an embarrassing email or a runaway bill at the 50th invoice instead of the 5th. The linter surfaces them before they bite.
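
To make the fixes concrete, clauses along these lines (illustrative wording, not taken from the tool's output) would address all three findings:

    Stop after 3 reminder attempts per invoice, then flag it for a human.
    Use only the reminder templates provided; do not improvise tone or urgency.
    Never auto-send to a customer who has already paid, is in a dispute, or is
    on a payment plan; draft the email and wait for human approval instead.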

The invoice-chase deep dive post has a full lint-clean rewrite of the skill that addresses all three findings plus another eight failure modes.

What the tool deliberately doesn't do

It doesn't certify your skill as safe. The lint rules are necessary but not sufficient. A skill can pass all 14 checks and still be wrong for your business if the prompt logic itself is misaligned with what you actually need.

It also doesn't catch every possible issue. Prompt injection, model bias, training-data inclusion, and a dozen other risks live outside what static text analysis can catch. For those, you need runtime monitoring and a human reviewing the agent's actions in production. The linter is the pre-flight check, not the airworthiness certification.

Where this fits in the broader skill-safety workflow

Think of skill safety as four layers, each catching what the previous layer misses:

  1. Pre-flight (lint). The Skill Linter. Catches the 14 most common shipping-time mistakes.
  2. In-flight (cost caps + iteration limits). Set at the platform level. Covers the agent cost-controls baseline.
  3. In-flight (permission scope). Set at the connector level. Covers the connector permission cheat sheet recommendations.
  4. Post-flight (human review). Read what the agent did for the first 30 days, and adjust the skill text based on what surprised you.

Each layer is fast and cheap individually. The linter is 60 seconds. The cost cap is two clicks in your billing dashboard. The permission scope is one OAuth flow. The human review is 20 minutes a day for the first month. None of these require a developer; all of them prevent the failure modes that send SMB owners back to "AI doesn't work for me."

The deeper version

The full argument for treating skill safety as part of operations (not just a launch checklist) is the spine of The $100 Network (Digital Empire series, $9.99 on Kindle). The book covers the broader pattern: that the under-$100 AI stack is now powerful enough to do real damage if shipped without thought, and the safeguards have to scale with the autonomy.

Fact-check notes and sources

  • $47K runaway-loop incident per Kusireddy on Towards AI, October 2025.
  • The 14 lint rules are derived from publicly documented production-agent failures observed across 2024-2026, plus the Anthropic Claude Skills documentation on safe skill authoring.
  • Claude for Small Business launch and skill catalog per Inc.com's May 13, 2026 announcement coverage.
  • Heuristic disclaimer: the linter uses regex pattern matching, not semantic understanding. False positives and false negatives both happen; treat findings as a checklist starting point, not a verdict.

This post is informational, not security-engineering or AI-safety consulting advice. The 14 rules are a baseline, not a comprehensive safety program. Mentions of Anthropic and other third-party services are nominative fair use. No affiliation is implied.
