Thin Harness, Fat Skills
How to make AI agents actually understand your data. Five definitions that fit on an index card — skill files, harness, resolver, latent vs. deterministic, diarization — and the architecture that delivers 10x to 100x productivity.
Thin Harness, Fat Skills is Garry Tan's architecture for AI agent productivity.
A skill file is a reusable markdown procedure that teaches the model HOW to do something. Not WHAT to do. The user supplies the specifics. The skill supplies the process. The harness — the program that runs the LLM — stays thin. The skills — markdown procedures encoding judgment, process, and domain knowledge — are where 90% of the value lives.
What makes fat skills different from prompt engineering
Skills as method calls
A skill file works like a function. It takes parameters. You invoke it with different arguments. The same procedure produces radically different capabilities depending on what you pass in.
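A minimal Python sketch of that analogy; `run_skill`, the skill text, and the parameter names are illustrative stand-ins for whatever a real harness does when it loads a skill file:

```python
def run_skill(procedure: str, **params) -> str:
    """Combine a markdown procedure with invocation parameters.
    Stub: a real harness would send the result to an LLM."""
    args = "\n".join(f"- {k}: {v}" for k, v in params.items())
    return f"{procedure}\n\nParameters:\n{args}"

MATCH_SKILL = """# Match founders
Read each profile, weigh stage and sector, then group."""

# Same skill, different invocations, completely different capabilities.
breakouts = run_skill(MATCH_SKILL, grouping="breakout rooms", size=8)
lunches = run_skill(MATCH_SKILL, grouping="lunch tables", size=10)
```

The procedure never changes; only the arguments do, which is the function-call analogy made literal.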
Markdown is actually code
A skill file is a more complete encapsulation of capability than rigid source code, because it describes process, judgment, and context in the language the model already thinks in.
The resolver loads context
Skills say HOW. Resolvers say WHAT to load WHEN. Claude Code has a built-in resolver — every skill has a description field, and the model matches user intent automatically.
System that learns
After an event, the /improve skill reads surveys, extracts patterns, and writes new rules back into the matching skills. The skill file rewrites itself.
Primary source
Garry Tan's essay spells out the five definitions, the three-layer architecture, and the YC Startup School case study that proves the system works at 6,000-founder scale.
Core mechanic
A skill describes a process of judgment. The invocation supplies the world. Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.
Five definitions fix the bottleneck. The model already knows how to reason — it fails because it doesn't know your data.
Skill File
A reusable markdown procedure that teaches the model HOW to do something. It takes parameters, and the same procedure produces radically different capabilities depending on what you pass in. This is software design, using markdown as the programming language and human judgment as the runtime.
Harness
The program that runs the LLM. It does four things: runs the model in a loop, reads and writes your files, manages context, and enforces safety. The anti-pattern is a fat harness with thin skills — 40+ tool definitions eating half the context window. Build exactly what you need.
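A hedged sketch of those four jobs in Python; `call_model` is a stub, and the loop shape is an assumption for illustration, not any tool's actual implementation:

```python
def call_model(messages):
    """Stub for an LLM call; a real harness would hit a model API."""
    return {"content": "done", "tool": None}

def run_harness(task, max_turns=10, context_limit=8000):
    """A thin harness: loop the model, expose files, cap context, gate writes."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):                       # 1. run the model in a loop
        reply = call_model(messages)
        if reply["tool"] == "read":                  # 2. read and write files
            with open(reply["content"]) as f:
                messages.append({"role": "tool", "content": f.read()})
        elif reply["tool"] == "write":
            raise PermissionError("writes need approval")   # 4. enforce safety
        else:
            return reply["content"]
        # 3. manage context: drop oldest turns once past the budget
        while sum(len(m["content"]) for m in messages) > context_limit:
            messages.pop(1)
    return None
```

Everything here is plumbing. None of it encodes domain judgment, which is exactly why it can stay thin.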
Resolver
A routing table for context. When task type X appears, load document Y first. Skills say HOW. Resolvers say WHAT to load WHEN. The fix for a 20,000-line CLAUDE.md: about 200 lines of pointers. The resolver loads the right document when it matters.
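As a sketch, a resolver can be as small as a dictionary; the task types and file paths below are invented for illustration:

```python
# A resolver is a routing table: task type X -> load documents Y first.
RESOLVER = {
    "event matching": ["docs/founder-profiles.md", "skills/match.md"],
    "survey analysis": ["docs/nps-surveys.md", "skills/improve.md"],
}

def resolve(task: str) -> list[str]:
    """Return the documents to load for this task, in priority order."""
    for task_type, docs in RESOLVER.items():
        if task_type in task.lower():
            return docs
    return []  # nothing matched: load no extra context
```

Two hundred lines of pointers like these replace a monolithic context file, because the right document arrives only when it matters.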
Latent vs. Deterministic
Latent space is where intelligence lives — the model reads, interprets, decides. Deterministic is where trust lives — same input, same output, every time. The worst systems put the wrong work on the wrong side. An LLM can seat 8 people; ask it to seat 800 and it hallucinates.
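A small Python sketch of the split, under the assumption that the model's latent output is a grouping instruction and plain code does the seating:

```python
def seat_deterministically(people, table_size):
    """Deterministic side: same input, same output, verifiable at 8 or 800."""
    ordered = sorted(people)  # a stable key stands in for the model's chosen one
    return [ordered[i:i + table_size]
            for i in range(0, len(ordered), table_size)]

# Latent side (not shown): the model reads profiles and decides HOW to
# group, e.g. "sort by sector, then stage". Code then executes that decision.
tables = seat_deterministically([f"founder-{n}" for n in range(800)], 10)
```

The model supplies judgment once; the code applies it 800 times without hallucinating a single seat.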
Diarization
The model reads everything about a subject and writes a structured profile. Read 50 documents, produce 1 page of judgment. No SQL query produces this. No RAG pipeline produces this. The model has to actually read, hold contradictions in mind, and write structured intelligence.
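A sketch of the shape of that pipeline; `summarize` is a stub for the model's actual reading and judgment, and every name here is hypothetical:

```python
def diarize(subject, documents, summarize):
    """Read every document about a subject and emit one structured profile.
    The judgment lives in `summarize` (here, a stand-in for the model)."""
    corpus = "\n---\n".join(documents)
    return f"# Profile: {subject}\n{summarize(corpus)}"

docs = ["application: B2B SaaS, pre-seed",
        "advisor chat: strong technical founder, unclear GTM"]
profile = diarize("founder-42", docs, summarize=lambda c: c[:80])
```

The deterministic part is only the plumbing that gathers documents; the one-page profile itself cannot come from a query.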
The architecture turns agent productivity into something that compounds.
Most AI work focuses on better models. The 2x people and the 100x people are using the same models. The difference is the architecture — thin harness, fat skills — and the discipline to codify everything. Every skill you write is a permanent upgrade. It never degrades. It never forgets. And when the next model drops, every skill instantly gets better.
Skills as method calls
The parameters change. The process stays. Same skill, different invocations, completely different capabilities.
Push execution down
Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.
Agent-native loop
The agent process becomes the thing being optimized. Retrieve, read, diarize, count, synthesize. Then survey, investigate, diarize, rewrite the skill.
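The second half of that loop can be sketched in Python; `extract_rules` is a stub for the model mining surveys, and the rule format is invented:

```python
def improve_skill(skill_text, survey_results, extract_rules):
    """Self-improving loop (sketch): mine surveys for patterns,
    then write the learned rules back into the skill file."""
    rules = extract_rules(survey_results)
    return (skill_text + "\n\n## Learned rules\n"
            + "\n".join(f"- {r}" for r in rules))

updated = improve_skill(
    "# Match founders",
    ["seated with wrong stage", "table too big"],
    extract_rules=lambda surveys: [f"avoid: {s}" for s in surveys],
)
```

Each pass leaves the skill file longer and sharper, which is what makes the loop compound rather than reset.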
Compounding system
Build it once. It runs forever. The system compounds every time you codify a new procedure into a skill file.
Three layers. One principle. Thin harness, fat skills.
Push intelligence UP into skills. Push execution DOWN into deterministic tooling. Keep the harness THIN.
Agent Decision Guide: Skill or Code?
YC Startup School: 6,000 founders, one skill file.
Chase Center. July 2026. 6,000 founders. Each one has a structured application, questionnaire answers, transcripts from 1:1 advisor chats, and public signals. Three invocations of the same matching skill produce three completely different strategies — breakout rooms, lunch tables, and live 1:1s. The model's judgment reclassifies founders no embedding could catch. After the event, the /improve skill reads NPS surveys and writes new rules back into the matching skill. July: 12% of ratings merely OK. Next event: 4%.