Sherpa: a static analysis index that cuts AI token consumption by 60%
How I built a pre-computed codebase index that replaces exploratory grep/file-read calls with a single manifest — and reduced token usage by 60% on structural questions.
Every time an AI coding tool needs to answer a structural question — where is this function defined, who calls it, what does this file export — it does the same thing: grep the codebase, read a few files, piece together the answer from the output. Three to five tool calls, a few thousand tokens, and a result that any TypeScript compiler could have produced in milliseconds.
I started noticing this pattern while working on a React project with around 110 TypeScript files. Before touching any code, Claude would spend a significant chunk of the session context just navigating — grepping for function names, reading type definitions, tracing import chains. The actual implementation work was faster than the exploration that preceded it.
The core insight was straightforward: these are deterministic lookups. The compiler already knows where every symbol is defined, what it exports, and who imports it. The problem is that this knowledge isn’t in a format the AI can load as context.
So I built sherpa.
What it generates
sherpa is a CLI that runs static analysis on a TypeScript or JavaScript codebase and writes .claude/manifest.md — a compact, pre-computed index in three sections.
Exports — one line per file, listing everything it exposes publicly:
src/types/index.ts: Task DisplayState VolumeState ContextMenuOption Position
src/actions/tasks.ts: closeApp closeAllApps launchApp TaskActionTypes
src/reducers/index.ts: RootState default
Import Graph — who imports each file, using → to show direction:
src/types/index.ts → src/actions/tasks.ts src/reducers/TasksReducer.ts src/reducers/DisplayReducer.ts $lib/Common/ContextMenu/ContextMenu.tsx
src/actions/tasks.ts → src/App.tsx src/reducers/TasksReducer.ts $lib/AppDrawer/AppDrawer.tsx $lib/Desktop/Desktop.tsx ...
Symbols — one line per exported symbol, with file location, kind, and TypeScript signature:
Task src/types/index.ts:1 interface
closeApp src/actions/tasks.ts:102 function (data: { _id: string }) => CloseAppAction
RootState src/reducers/index.ts:13 type
AppConfig src/config/apps.ts:5 interface
The manifest is loaded once per session via a reference in CLAUDE.md. From that point on, any structural question is answered by reading a few lines from an already-loaded file — no grep, no file reads.
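To make the Symbols format concrete, here is a minimal sketch of how one of those lines could be parsed back into a structured record. The `SymbolEntry` type and the `parseSymbolLine` function are hypothetical names used for illustration, not part of sherpa. The sketch assumes the fields are space-separated in the order shown above, with everything after the kind treated as the signature.

```typescript
// Hypothetical record shape for one line of the Symbols section.
interface SymbolEntry {
  name: string;
  file: string;
  line: number;
  kind: string;
  signature?: string;
}

// Parse "name file:line kind [signature]" (layout assumed from the examples above).
function parseSymbolLine(text: string): SymbolEntry | null {
  const match = /^(\S+)\s+(\S+):(\d+)\s+(\S+)(?:\s+(.*))?$/.exec(text.trim());
  if (!match) return null;
  const [, name, file, line, kind, signature] = match;
  // Only attach a signature when the line actually carries one.
  return { name, file, line: Number(line), kind, ...(signature ? { signature } : {}) };
}

const entry = parseSymbolLine(
  "closeApp src/actions/tasks.ts:102 function (data: { _id: string }) => CloseAppAction"
);
console.log(entry);
```

Lines without a signature, such as the `Task` interface entry, parse to a record with no `signature` field.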
The demonstration
Here’s a concrete example. Task: “I want to add a theme property to the Task type — where do I need to make changes?”
Without sherpa — 3 tool calls:
| Step | Tool | Output size | Tokens |
|---|---|---|---|
| Find all Task references | grep -rn "Task" src/ | 6,989 chars, 81 lines | ~1,747 |
| Read the type definition | cat src/types/index.ts | 460 chars | ~115 |
| Read the main consumer | cat src/reducers/TasksReducer.ts | 4,142 chars | ~1,035 |
| Total | | | ~2,897 |
And after all that, the answer still requires interpreting the grep output to identify which files actually use Task as a type (versus files that contain the string in a comment or variable name).
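The token figures in the table are consistent with the common rule of thumb of roughly four characters per token. That heuristic is an assumption here, since the post does not say how tokens were counted, but the arithmetic can be checked with a short sketch:

```typescript
// Assumption: ~4 characters per token, rounded down. Real tokenizers will differ.
function estimateTokens(chars: number): number {
  return Math.floor(chars / 4);
}

// Output sizes (in characters) of the three tool calls from the table above.
const outputSizes = [6989, 460, 4142];
const perCall = outputSizes.map(estimateTokens);
const total = perCall.reduce((sum, tokens) => sum + tokens, 0);

console.log(perCall, total);
```

Under that assumption the three calls come to 1,747, 115, and 1,035 tokens, which sums to the 2,897 shown in the Total row.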
With sherpa — manifest already in context:
## Exports
src/types/index.ts: Task DisplayState VolumeState ...
## Import Graph
src/types/index.ts → src/actions/tasks.ts src/reducers/DisplayReducer.ts src/reducers/VolumeReducer.ts $lib/Common/ContextMenu/ContextMenu.tsx
## Symbols
Task src/types/index.ts:1 interface
Three manifest lines answer the question completely:
- Task is an interface defined at src/types/index.ts:1
- The file is imported by exactly 4 files, and those are all the places that need updating
- No ambiguity, no false positives from string matches
| | Tool calls | Tokens |
|---|---|---|
| Without sherpa | 3 | ~2,897 |
| With sherpa | 0 (manifest already loaded) | ~69 |
| Saving | −3 calls | −97% |
The manifest is loaded once at the start of the session, so subsequent questions about the same codebase cost nothing extra.
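The lookup itself is mechanical. As a rough sketch of what answering the question from the manifest amounts to, the hypothetical `filesToReview` helper below finds the file that defines a symbol and then lists that file's importers. The in-memory shapes (`symbols`, `importers`) are assumptions for illustration; sherpa itself only writes the text manifest.

```typescript
// Hypothetical in-memory view of two manifest sections.
// symbol name -> defining file (from the Symbols section)
const symbols: Record<string, string> = {
  Task: "src/types/index.ts",
};

// file -> files that import it (from the Import Graph section)
const importers: Record<string, string[]> = {
  "src/types/index.ts": [
    "src/actions/tasks.ts",
    "src/reducers/DisplayReducer.ts",
    "src/reducers/VolumeReducer.ts",
    "$lib/Common/ContextMenu/ContextMenu.tsx",
  ],
};

// Return the defining file followed by every file that imports it.
function filesToReview(symbol: string): string[] {
  const definedIn = symbols[symbol];
  if (definedIn === undefined) return [];
  return [definedIn, ...(importers[definedIn] ?? [])];
}

console.log(filesToReview("Task"));
```

Because the graph works at file level, the result is an upper bound: it can include files that import a different symbol from the same module.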
Five rounds of optimization
The first version of sherpa generated a manifest using standard markdown — headers for each file, bold labels, backtick-wrapped values. Readable, but expensive.
On a ~110-file project, the initial format consumed 8,783 tokens. Over five optimization rounds, that number came down to 3,375.
| Round | Change | Tokens | Δ |
|---|---|---|---|
| 0 | Original markdown (### headers, bold, `backticks`) | 8,783 | — |
| 1 | Compact format — one line per entry, no markup overhead | 6,781 | −23% |
| 2 | Filter barrel index.ts files and string-literal constants from output | 6,015 | −11% |
| 3 | Drop redundant ← import lines + path alias ($lib/ for long prefixes) | 3,914 | −35% |
| 4 | Filter default-only files from Exports + fix absolute path leaks in signatures | 3,375 | −14% |
| Total | | | −62% |
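Each Δ is relative to the previous round rather than to the original, which is why the per-round figures do not add up to the total. A quick check of the arithmetic, using only the token counts from the table:

```typescript
// Token counts after each round (rounds 0 to 4), taken from the table above.
const tokensByRound = [8783, 6781, 6015, 3914, 3375];

// Percentage change from `before` to `after`, rounded to the nearest whole percent.
function percentChange(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

// Change of each round relative to the round before it.
const stepDeltas = tokensByRound
  .slice(1)
  .map((tokens, i) => percentChange(tokensByRound[i], tokens));

// Change of the final round relative to the original format.
const totalDelta = percentChange(tokensByRound[0], tokensByRound[tokensByRound.length - 1]);

console.log(stepDeltas, totalDelta);
```

This reproduces the −23%, −11%, −35%, and −14% steps and the −62% total.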
A few notes on what each round eliminated:
Round 1 — The original format spent five lines on each symbol (### name, - **file:**, - **kind:**, - **signature:**, and a blank line). Collapsing each to a single space-separated line cut the file from 1,134 lines to 329.
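The difference between the two layouts is easiest to see side by side. The sketch below is a reconstruction from the description above, not sherpa's actual formatter: the `Sym` type and both function names are hypothetical, and the exact markup of the original format is assumed.

```typescript
// Hypothetical symbol record used by both formatters.
interface Sym {
  name: string;
  location: string; // "file:line"
  kind: string;
  signature: string; // empty string when there is no signature
}

// Round 0 style: heading, three bold-labelled bullets, and a blank separator.
function formatVerbose(s: Sym): string[] {
  return [
    `### ${s.name}`,
    `- **file:** \`${s.location}\``,
    `- **kind:** \`${s.kind}\``,
    `- **signature:** \`${s.signature}\``,
    "",
  ];
}

// Round 1 style: one space-separated line per symbol, skipping empty fields.
function formatCompact(s: Sym): string {
  return [s.name, s.location, s.kind, s.signature]
    .filter((part) => part.length > 0)
    .join(" ");
}

const closeApp: Sym = {
  name: "closeApp",
  location: "src/actions/tasks.ts:102",
  kind: "function",
  signature: "(data: { _id: string }) => CloseAppAction",
};

console.log(formatVerbose(closeApp).join("\n"));
console.log(formatCompact(closeApp));
```

The compact form carries the same four fields; only the labels and markup are gone.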
Round 2 — Barrel files (index.ts that only re-export default) added entries to the Export Map with zero unique information — the Symbol Index already listed the component with its real name. Same for Redux action type strings ("CLOSE_APP") — they were in the Symbol Index as const with a string literal signature, but the reader gets no value from them.
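One way such a filter could look for the string-literal constants, as a sketch only: the `SymbolRow` shape and the function names are hypothetical, and the rule "kind is const and the signature is a bare quoted string" is inferred from the description above rather than taken from sherpa's source.

```typescript
// Hypothetical row shape for an entry in the symbol listing.
interface SymbolRow {
  name: string;
  kind: string;
  signature?: string;
}

// True when a signature is nothing but a quoted string literal, e.g. "CLOSE_APP".
function isStringLiteralType(signature: string | undefined): boolean {
  return signature !== undefined && /^(["'])[^"']*\1$/.test(signature.trim());
}

// Drop const symbols whose type is a bare string literal; keep everything else.
function dropLiteralConstants(rows: SymbolRow[]): SymbolRow[] {
  return rows.filter((row) => !(row.kind === "const" && isStringLiteralType(row.signature)));
}

const rows: SymbolRow[] = [
  { name: "CLOSE_APP", kind: "const", signature: '"CLOSE_APP"' },
  { name: "closeApp", kind: "function", signature: "(data: { _id: string }) => CloseAppAction" },
  { name: "Task", kind: "interface" },
];

console.log(dropLiteralConstants(rows).map((row) => row.name));
```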
Round 3 — The Import Graph had both ← (what a file imports) and → (who imports it). These are the same data from two perspectives — if A → B, then B ← A. Dropping ← cut the section in half with no information loss. The path alias replaced the 23-character prefix src/components/library/ (which appeared 276 times) with $lib/.
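The claim that dropping ← loses no information can be shown directly: the ← view is just the → view with every edge reversed. A minimal sketch, with `Graph` and `invert` as illustrative names:

```typescript
// file -> list of related files
type Graph = Record<string, string[]>;

// Given "file -> files that import it" (the → view),
// derive "file -> files it imports" (the ← view) by reversing every edge.
function invert(importedBy: Graph): Graph {
  const imports: Graph = {};
  for (const file of Object.keys(importedBy)) {
    for (const importer of importedBy[file]) {
      if (!imports[importer]) imports[importer] = [];
      imports[importer].push(file);
    }
  }
  return imports;
}

const importedBy: Graph = {
  "src/types/index.ts": ["src/actions/tasks.ts", "src/reducers/TasksReducer.ts"],
  "src/actions/tasks.ts": ["src/reducers/TasksReducer.ts"],
};

console.log(invert(importedBy));
```

Anything a reader needs from the ← direction can therefore be recomputed from the lines that remain.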
Round 4 — Files exporting only default were still appearing in the Export Map even when they weren’t barrel files. Since the Symbol Index already has the component listed with its readable name (e.g., Calculator instead of default), the Export Map entry was redundant. Also fixed a TypeScript compiler quirk where local absolute paths appeared in signatures: import("/abs/path/to/file").AppConfig[] became AppConfig[].
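The signature cleanup can be approximated with a single regular expression. This is a sketch of the idea, not necessarily how sherpa does it; `stripImportQualifiers` is a hypothetical name.

```typescript
// Remove import("...") qualifiers that the compiler prints in front of
// types declared in other files, leaving only the type name.
function stripImportQualifiers(signature: string): string {
  return signature.replace(/import\((["'])[^"']+\1\)\./g, "");
}

console.log(stripImportQualifiers('import("/abs/path/to/file").AppConfig[]'));
```

For the example from the text, this turns `import("/abs/path/to/file").AppConfig[]` into `AppConfig[]`.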
Installing it
sherpa is distributed via GitHub — no npm registry needed.
pnpm add -g github:Giovagni/sherpa
Then, from your project root:
sherpa init
This does four things:
- Runs a full analysis and writes .claude/manifest.md
- Adds .claude/manifest.md and .claude/manifest.cache.json to .gitignore (the manifest is local-only, never committed)
- Installs a git post-commit hook that runs sherpa generate after every commit
- Prints the snippet to add to your CLAUDE.md
Add the snippet to CLAUDE.md:
## Codebase Index
See @.claude/manifest.md for symbol definitions, exports, and dependency graph.
Run `sherpa init` once to generate it locally (gitignored — each developer generates their own).
The @ prefix tells Claude Code to load the file as context at session start.
For subsequent runs, incremental analysis only re-parses changed files:
sherpa generate # incremental — ~20ms if nothing changed, ~1–3s otherwise
sherpa generate --full # force full re-analysis
sherpa watch # watch for file changes and regenerate automatically
sherpa stats # show token count and size
To define custom path aliases, create sherpa.config.json at the project root:
{
  "aliases": {
    "$lib/": "src/components/library/",
    "$api/": "src/services/api/"
  }
}
Without a config file, sherpa defaults to $lib/ for src/components/library/.
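As an illustration of what an alias table like this does to a path, here is a small sketch. `applyAliases` is a hypothetical helper, and trying longer prefixes first is a design assumption of this example, not documented sherpa behaviour.

```typescript
// alias -> path prefix, mirroring the "aliases" object in sherpa.config.json
type AliasMap = Record<string, string>;

// Replace the longest matching path prefix with its alias.
function applyAliases(path: string, aliases: AliasMap): string {
  // Try longer prefixes first so the most specific alias wins.
  const entries = Object.keys(aliases)
    .map((alias) => ({ alias, prefix: aliases[alias] }))
    .sort((a, b) => b.prefix.length - a.prefix.length);
  for (const { alias, prefix } of entries) {
    if (path.slice(0, prefix.length) === prefix) {
      return alias + path.slice(prefix.length);
    }
  }
  return path; // no alias applies
}

const aliases: AliasMap = {
  "$lib/": "src/components/library/",
  "$api/": "src/services/api/",
};

console.log(applyAliases("src/components/library/Desktop/Desktop.tsx", aliases));
```

Paths that match no prefix, such as `src/App.tsx`, pass through unchanged.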
Honest limitations
A few things worth knowing before adopting it:
TypeScript and JavaScript only. There’s no support for other languages. JavaScript files work but without type information — signatures degrade to inferred types.
Incremental analysis costs 1–3 seconds on file changes. The 0-change case is instant (no TypeScript compiler at all — pure cache read). But when files change, sherpa builds a mini ts-morph project for those files and their direct imports. The TypeScript compiler startup is unavoidable in this architecture.
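The post does not describe the layout of manifest.cache.json, so the following is only a sketch of a generic fingerprint-based approach to the same idea: compare stored fingerprints with current ones, and skip the compiler entirely when nothing differs. All names here (`Fingerprints`, `diffFingerprints`, `isUpToDate`) are hypothetical.

```typescript
// Hypothetical cache shape: file path -> content fingerprint (a hash or mtime string).
type Fingerprints = Record<string, string>;

interface ChangeSet {
  changed: string[]; // new or modified files that would need re-parsing
  removed: string[]; // files present in the cache but no longer on disk
}

// Compare the cached fingerprints with the ones computed for the current tree.
function diffFingerprints(cached: Fingerprints, current: Fingerprints): ChangeSet {
  const changed = Object.keys(current).filter((file) => cached[file] !== current[file]);
  const removed = Object.keys(cached).filter((file) => !(file in current));
  return { changed, removed };
}

// The fast path: nothing changed, so the cached result can be reused as-is.
function isUpToDate(diff: ChangeSet): boolean {
  return diff.changed.length === 0 && diff.removed.length === 0;
}

const cached: Fingerprints = { "a.ts": "h1", "b.ts": "h2", "c.ts": "h3" };
const current: Fingerprints = { "a.ts": "h1", "b.ts": "h9", "d.ts": "h4" };

console.log(diffFingerprints(cached, current));
```

Only the files in `changed` would need to be handed to the compiler; an empty change set corresponds to the instant zero-change case described above.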
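sherpa watch has limitations on Linux. It uses fs.watch({ recursive: true }), which works reliably on macOS and Windows but is less dependable on Linux: recursive watching there is built on inotify, was only added in Node.js 19.1, and does not work on network file systems such as NFS.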
Third-party packages are not indexed. Only local imports are tracked. npm dependencies don’t appear in the manifest.
The source is at github.com/Giovagni/sherpa.