YC Spring 2026 — Garry Tan

Thin Harness, Fat Skills

How to make AI agents actually understand your data. Five definitions that fit on an index card — skill files, harness, resolver, latent vs. deterministic, diarization — and the architecture that delivers 10x to 100x productivity.

Author Garry Tan
Published April 2026
Core concept Fat skills architecture

Thin Harness, Fat Skills is Garry Tan's architecture for AI agent productivity.

A skill file is a reusable markdown procedure that teaches the model HOW to do something. Not WHAT to do. The user supplies the specifics. The skill supplies the process. The harness — the program that runs the LLM — stays thin. The skills — markdown procedures encoding judgment, process, and domain knowledge — are where 90% of the value lives.

What makes fat skills different from prompt engineering

Skills as method calls

A skill file works like a function. It takes parameters. You invoke it with different arguments. The same procedure produces radically different capabilities depending on what you pass in.
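The function analogy can be made literal. Below is a minimal sketch (not from the essay; the skill text and parameter names are hypothetical) showing one fixed procedure bound to different arguments:

```python
# Sketch: a skill file behaves like a function call. The procedure is fixed;
# the invocation supplies the specifics. Skill text here is illustrative.

SKILL = """\
# Skill: match-attendees
Goal: group {count} attendees into {unit}s of {size}.
Process: read each profile, cluster by stage and sector,
then balance each {unit} for diversity of company stage.
"""

def invoke(skill: str, **params: object) -> str:
    """Bind parameters into the skill, producing a concrete prompt."""
    return skill.format(**params)

# Same skill, different invocations, completely different capabilities.
breakout = invoke(SKILL, count=6000, unit="room", size=40)
lunch = invoke(SKILL, count=6000, unit="table", size=8)
```

The procedure never changes; only the bound arguments do, which is exactly what makes a skill reusable across events.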

Markdown is actually code

A skill file is a more perfect encapsulation of capability than rigid source code, because it describes process, judgment, and context in the language the model already thinks in.

The resolver loads context

Skills say HOW. Resolvers say WHAT to load WHEN. Claude Code has a built-in resolver — every skill has a description field, and the model matches user intent automatically.

System that learns

After an event, the /improve skill reads surveys, extracts patterns, and writes new rules back into the matching skills. The skill file rewrites itself.
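The mechanics of that rewrite can be sketched in a few lines. This is an assumption about the shape of the loop, not the essay's implementation; `extract_rules` stands in for an LLM pass over the surveys:

```python
# Sketch of the /improve loop: read survey feedback, extract patterns,
# append the learned rules to the matching skill's own text.
# `extract_rules` is a hypothetical callable wrapping a real LLM call.

def improve(skill_text: str, surveys: list[str], extract_rules) -> str:
    """Return the skill file with newly learned rules appended."""
    rules = extract_rules(surveys)
    learned = "\n".join(f"- {r}" for r in rules)
    return f"{skill_text}\n\n## Learned rules\n{learned}\n"
```

The key property is that the output is the skill file itself: the next invocation runs with the lesson already baked in.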

Primary source

The essay from Garry Tan spells out the five definitions, the three-layer architecture, and the YC Startup School case study that proves the system works at 6,000-founder scale.

Core mechanic

A skill describes a process of judgment. The invocation supplies the world. Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.

Five definitions fix the bottleneck. The model already knows how to reason — it fails because it doesn't know your data.

01

Skill File

A reusable markdown procedure that teaches the model HOW to do something. It takes parameters, and the same procedure produces radically different capabilities depending on what you pass in. This is software design, using markdown as the programming language and human judgment as the runtime.

02

Harness

The program that runs the LLM. It does four things: runs the model in a loop, reads and writes your files, manages context, and enforces safety. The anti-pattern is a fat harness with thin skills — 40+ tool definitions eating half the context window. Build exactly what you need.
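Those four jobs fit in a page of code. A minimal sketch, assuming a `model` callable that stands in for a real LLM and a toy dict-based action format (both are illustrative, not any harness's actual API):

```python
# Minimal thin harness: run the model in a loop, read/write files,
# cap the loop (context management), enforce a safety allow-list.
from pathlib import Path

ALLOWED_TOOLS = {"read_file", "write_file"}  # safety: explicit allow-list
MAX_TURNS = 8                                # loop / context cap

def run(model, task: str, workdir: Path) -> str:
    history = [task]
    for _ in range(MAX_TURNS):
        action = model("\n".join(history))
        if action["tool"] == "done":
            return action["output"]
        if action["tool"] not in ALLOWED_TOOLS:
            history.append(f"error: tool {action['tool']} not allowed")
            continue
        target = workdir / action["path"]
        if action["tool"] == "read_file":
            history.append(target.read_text())
        else:
            target.write_text(action["output"])
            history.append(f"wrote {action['path']}")
    return "stopped: turn limit reached"
```

Everything else (the judgment, the process, the domain knowledge) lives in the skill files the loop reads, not in the loop itself.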

03

Resolver

A routing table for context. When task type X appears, load document Y first. Skills say HOW. Resolvers say WHAT to load WHEN. The fix for a 20,000-line CLAUDE.md: about 200 lines of pointers. The resolver loads the right document when it matters.
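"Routing table" can be taken at face value. A sketch with made-up task types and document paths (the mapping is the assumption; the essay specifies only the shape):

```python
# Sketch of a resolver: a small routing table instead of one giant
# context file. When task type X appears, load document Y first.

RESOLVER = {
    "billing": ["docs/pricing.md", "docs/refund-policy.md"],
    "matching": ["docs/founder-profiles.md"],
    "infra": ["docs/runbook.md"],
}

def resolve(task: str) -> list[str]:
    """Return the documents to load before working on `task`."""
    docs = []
    for task_type, paths in RESOLVER.items():
        if task_type in task.lower():
            docs.extend(paths)
    return docs
```

Two hundred lines of pointers like these replace the 20,000-line monolith: nothing loads until the task actually needs it.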

04

Latent vs. Deterministic

Latent space is where intelligence lives — the model reads, interprets, decides. Deterministic is where trust lives — same input, same output, every time. The worst systems put the wrong work on the wrong side. An LLM can seat 8 people; ask it to seat 800 and it hallucinates.
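The 800-person case belongs on the deterministic side, and the code is trivial once it's there. A sketch (the sort key and table size are illustrative choices, not the essay's):

```python
# Seating 800 people is arithmetic, not judgment: same input, same
# output, every time. The model decides the grouping criteria; this
# function just executes them.

def seat(names: list[str], table_size: int) -> list[list[str]]:
    """Deterministically split attendees into tables of `table_size`."""
    ordered = sorted(names)  # stable order guarantees identical output
    return [ordered[i:i + table_size]
            for i in range(0, len(ordered), table_size)]
```

Run it twice on the same roster and you get byte-identical tables, which is exactly the property the latent side cannot promise.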

05

Diarization

The model reads everything about a subject and writes a structured profile. Read 50 documents, produce 1 page of judgment. No SQL query produces this. No RAG pipeline produces this. The model has to actually read, hold contradictions in mind, and write structured intelligence.
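The shape of a diarization pass is simple even though the work inside it is not. A sketch, assuming a `summarize` callable that wraps a real LLM (the prompt wording is illustrative):

```python
# Sketch of diarization: feed the model everything about one subject
# in a single pass and ask for one structured page of judgment.
# `summarize` is a hypothetical stand-in for a real LLM call.

def diarize(subject: str, documents: list[str], summarize) -> str:
    corpus = "\n---\n".join(documents)  # read everything, together
    prompt = (
        f"Read all material about {subject}. Hold contradictions in mind.\n"
        "Write a one-page structured profile: who they are, what they\n"
        f"say, and what they are actually doing.\n\n{corpus}"
    )
    return summarize(prompt)
```

Note what this is not: there is no retrieval step and no query. The whole corpus goes in, because the contradictions between documents are the signal.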

It turns agent productivity into something that compounds.

Most AI work focuses on better models. The 2x people and the 100x people are using the same models. The difference is the architecture — thin harness, fat skills — and the discipline to codify everything. Every skill you write is a permanent upgrade. It never degrades. It never forgets. And when the next model drops, every skill instantly gets better.

Skills as method calls

The parameters change. The process stays. Same skill, different invocations, completely different capabilities.

Push execution down

Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.

Agent-native loop

The agent process becomes the thing being optimized. Retrieve, read, diarize, count, synthesize. Then survey, investigate, diarize, rewrite the skill.

Compounding system

Build it once. It runs forever. The system compounds every time you codify a new procedure into a skill file.

Three layers. One principle. Thin harness, fat skills.

Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.

Fat skills (top): Markdown procedures encoding judgment, process, and domain knowledge. 90% of the value lives here.
Thin CLI harness (middle): ~200 lines. JSON in, text out. Read-only by default. CLI first, add MCP later.
Your app (bottom): QueryDB. ReadDoc. Search. Timeline. The deterministic foundation.
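The middle layer's contract ("JSON in, text out, read-only by default") fits in a handful of lines. A sketch, where `query_db` stands in for your app's real lookup (the command name and request shape are assumptions):

```python
# Sketch of one deterministic CLI tool in the middle layer: JSON in,
# plain text out, read-only. The agent shells out to this instead of
# carrying 40 tool definitions in its context window.
import json

def handle(request_json: str, query_db) -> str:
    """JSON in, text out. Read-only: anything but a query is refused."""
    req = json.loads(request_json)
    if req.get("cmd") != "query":
        return "error: read-only harness, unknown command"
    rows = query_db(req["sql"])
    return "\n".join(str(r) for r in rows)
```

Because the harness is this small, swapping models or adding a new skill never requires touching it.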

Agent Decision Guide: Skill or Code?

| Question | If YES | If NO |
|---|---|---|
| Agent needs to think/adapt? | Skill (markdown) | Code |
| Same input, same output? | Code (CLI) | Skill |
| Requires judgment about env? | Skill | Code |
| Lookup, list, or status check? | Code | Probably skill |
| Changes with conversation? | Skill | Code |

YC Startup School: 6,000 founders, one skill file.

Chase Center. July 2026. 6,000 founders. Each one has a structured application, questionnaire answers, transcripts from 1:1 advisor chats, and public signals. Three invocations of the same matching skill produce three completely different strategies — breakout rooms, lunch tables, and live 1:1s. The model's judgment reclassifies founders no embedding could catch. After the event, the /improve skill reads NPS surveys and writes new rules back into the matching skill. July: 12% OK ratings. Next event: 4%.

FOUNDER: Maria Santos
COMPANY: Contrail (contrail.dev)
SAYS: "Datadog for AI agents"
ACTUALLY BUILDING: 80% of commits are in the billing module. She's building a FinOps tool disguised as observability.

The fastest answers to the questions people ask first.

Who created the thin harness, fat skills concept?
The concept comes from Garry Tan, CEO of Y Combinator. The essay "Thin Harness, Fat Skills" was published in April 2026 and lays out the architecture behind Claude Code's secret sauce.
What is a skill file?
A skill file is a reusable markdown procedure that teaches the model HOW to do something, not WHAT to do. It takes parameters like a function call. The same procedure produces radically different capabilities depending on what you pass in.
What does "thin harness" mean?
The harness is the program that runs the LLM. "Thin" means it does four things: run the model in a loop, read and write files, manage context, and enforce safety. The anti-pattern is a fat harness with 40+ tool definitions eating half the context window.
How does this compare to Claude Code?
Claude Code validated the thin harness, fat skills architecture, and it remains the best harness for coding. The same principle applies to any domain: email, calendar, meetings, research, alerts. Skills are method calls. The parameters change. The process stays.
What hardware do I need to use fat skills?
Fat skills are an architecture pattern, not a software package. Any LLM harness that supports markdown skill files, a resolver for context routing, and deterministic tooling can implement this pattern. Claude Code and OpenClaw are reference implementations.

Every claim on this page is grounded in the essay or linked public discussion.