Capture a code idea as a clean semantic unit — regenerate it onto today's codebase.
Works with Claude Code and any MCP-capable client (Codex, GPT, …).
Git recovers history. It can't recover an entangled idea onto today's code.
You're deep into building an agent. You've tried twenty things — a restructured system prompt, splitting one overloaded tool into three, a re-ranking step before retrieval, a scratchpad for intermediate reasoning, two different few-shot sets — half of them commented in and out, all tangled together in one working tree. Now you want one of those ideas back. Not the stale commit it lived in. The idea itself, re-applied to the agent you have today.
Git can't do that. Its unit is a tree snapshot, not an idea. git checkout drags back everything from that moment and throws away all the infrastructure you've built since. You can't pull one feature forward without rolling back the rest.
research-git makes the idea the unit. It captures each change as a self-contained Feature Capsule, stores it in a graph, and — when you want it back — regenerates it onto your current code instead of pasting a stale patch. The capsule is a specification of intent; the code is always rebuilt against today's reality.
The intelligent steps (segmenting a diff into capsules, regenerating one onto changed code) run as subagents on your existing Claude subscription — there is no pay-per-use API anywhere.
One loop: capture each idea into a graph, then regenerate it onto today's code. The engine (blue) is free and deterministic; intelligence happens at exactly two points (green) — subagents dispatched onto your existing subscription, never a paid API.
flowchart LR
A["edit code /<br/>rgit run -- ..."] -->|"free, deterministic"| B["raw proposal<br/>(diff staged)"]
B -->|"/rgit-capture"| C{{"capsule-<br/>segmenter"}}
C --> D[("Feature Capsule<br/>graph (.rgit/)")]
D -->|"/rgit-recall «query»"| E["compose brief vs<br/>today's code"]
E --> F{{"capsule-<br/>regenerator"}}
F --> G["reviewable diff<br/>on today's code"]
G -.->|"rgit run — freeze + link variant"| D
classDef engine fill:#eef2ff,stroke:#5b6cff,color:#1e2a78;
classDef agent fill:#eafff0,stroke:#36a85f,color:#0f5132;
class A,B,D,E,G engine;
class C,F agent;
Every idea you keep becomes one capsule — a self-contained unit a future agent can read and bring back:
| Field | What it holds |
|---|---|
| intent | why this change existed — the hypothesis, not a diff restatement |
| code slices | the relevant snippets / files / symbols |
| knobs | parameters / flags / configs |
| dependencies | other capsules it needs + silent assumptions |
| result | metrics / notes / why it worked or didn't, linked to the runs it produced |
| resurrection guide | how to regenerate it onto a changed codebase |
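As a rough mental model, the table above could map onto a small record type. This is only a sketch: the class name `FeatureCapsule`, the field types, and the example values (`retrieval.py::rerank`, `top_k`) are assumptions for illustration, not research-git's actual storage schema.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureCapsule:
    """Hypothetical in-memory shape of a capsule; fields mirror the table above."""

    name: str
    intent: str  # the hypothesis, not a diff restatement
    code_slices: list[str] = field(default_factory=list)    # snippets / files / symbols
    knobs: dict[str, object] = field(default_factory=dict)  # parameters / flags / configs
    dependencies: list[str] = field(default_factory=list)   # other capsules + silent assumptions
    result: str = ""              # metrics / notes, linked to the runs it produced
    resurrection_guide: str = ""  # how to regenerate it onto a changed codebase


# Illustrative values only; none of these come from a real store.
capsule = FeatureCapsule(
    name="rerank-retrieval",
    intent="Re-ranking retrieved passages should improve answer grounding.",
    code_slices=["retrieval.py::rerank"],
    knobs={"top_k": 5},
    resurrection_guide="Insert the re-rank step between retrieval and prompt assembly.",
)
```

Keeping `intent` and `resurrection_guide` as free text reflects the idea that the capsule is a specification for a future agent to read, rather than a patch to apply.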
Capsules live in a small graph beside your repo (.rgit/), on top of normal git. Every run you launch through research-git also freezes a byte-exact, content-addressed snapshot of the code that ran — so "the code behind this result" is always a perfect replay, never at the mercy of an agent.
Five steps: install → init → run → capture → recall.
pip install research-git # or, from a clone: pip install -e .
# wire the plugin (agents + skills) and the MCP server into your client
rgit install claude-code # Claude Code (via the official `claude` CLI)
rgit install codex # Codex / Gemini / opencode: symlinks the skills into ~/.agents/skills/
rgit install --list # all platforms; add --dry-run to preview, --uninstall to remove

codex, gemini, and opencode share the ~/.agents/skills/ convention: the installer symlinks each skill there and prints the one-line MCP server entry to drop into that client's config. It also writes a managed research-git guidance block into the client's global guidance file when the platform has one (~/.codex/AGENTS.md, ~/.claude/CLAUDE.md, or ~/.gemini/GEMINI.md). On an interactive terminal you're asked how proactive capture should be (default, manual-only, or none); pass --guidance <mode> to choose non-interactively. Start a new agent session after install so the guidance is loaded. Prefer the manual route on Claude Code? /plugin marketplace add StepzeroLab/research-git then /plugin install research-git@research-git.
cd your-project
rgit init # creates .rgit/ (the store) at the git root

Launch your work through rgit run. It executes your command, freezes a reproducible artifact, records the run and any metrics, and stages what changed:

rgit run -- python eval_agent.py --retrieval rerank

Then turn that raw material into a clean capsule (in a Claude Code session):
/rgit-capture # segments the diff into Feature Capsules, then wires up graph edges
rgit review # list proposals
rgit review --approve <proposal_id> --name rerank-retrieval
Weeks later, after the agent has moved on:
/rgit-recall bring back the re-ranking retrieval step
Recall scores capsules against your query, surfaces each hit with its related neighbors, then dispatches a subagent that re-implements the idea onto today's structure — adapting to refactors and leaving you a reviewable diff.
That's the whole loop. The rest of the commands you'll meet as you need them — see More commands.
Anywhere you try many variations of one thing and later want a single one back — cleanly, on top of how the code looks now.
- 🤖 Agent / Prompt engineering — you tried four prompt structures, two tool-splitting schemes, and a different retrieval step. Last week's version scored better; bring that idea back onto the agent you've since rewritten.
- ⚙️ Backend / Systems — three caching strategies, two rate-limiters, a reworked query plan. Which won? Pull the winning variant forward without reverting everything built since.
- 🎨 Frontend — competing interaction flows and layout variants, half commented out. Resurrect the one that tested best onto the current component tree.
Also at home in ML research — different loss terms, attention blocks, augmentations. Same shape: the experiment is the idea, the metrics are the result, and you want one variant back on today's code.
The graph is served over MCP read-only (recall / compose / get, plus the query commands compare / ablation / provenance). Point a teammate's client at your rgit mcp server and they get the same Feature Capsules and the same answers — then their session regenerates an idea onto their code, on their subscription. The memory is shared; the intelligence is local.
The engine owns the durable, deterministic parts — the graph, content-addressed object store, git diffing, and the byte-exact run freeze. The agentic parts are delegated to subagents the host already provides. We don't reimplement an agent loop, and we never call a paid API.
A free, deterministic Phase 1 (libcst maps diff hunks to the functions/classes they touch) produces a rough candidate for every change. Phase 2 is a dispatched capsule-segmenter subagent that clusters the diff into coherent features, drops infrastructure noise, and writes the real intent, knobs, assumptions, and resurrection guide. Once a capsule is approved, the engine deterministically links same-region edges and over-produces depends_on candidates from name overlap, which an edge-judge subagent confirms or rejects.
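To make Phase 1 concrete, here is a minimal sketch of mapping changed line numbers to the functions and classes that contain them. It deliberately uses the standard-library `ast` module as a stand-in for libcst so the example has no dependencies; the function name `symbols_touched` and the sample source are hypothetical, and the real engine may work quite differently.

```python
import ast


def symbols_touched(source: str, changed_lines: set[int]) -> list[str]:
    """Return names of functions/classes whose line span overlaps the changed lines."""
    touched = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            span = range(node.lineno, node.end_lineno + 1)
            if any(line in span for line in changed_lines):
                touched.append(node.name)
    return sorted(touched)


# Hypothetical file contents; line 6 is the body of rerank().
SOURCE = """\
def retrieve(query):
    return search(query)


def rerank(hits):
    return sorted(hits, key=len)
"""

print(symbols_touched(SOURCE, {6}))  # prints ['rerank']
```

A step like this involves no model call, which is consistent with the description of Phase 1 as free and deterministic: the same diff always yields the same rough candidate.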
Recall scores every approved capsule against your query in plain Python — no embeddings, no SQL LIKE traps — and boosts a hit when a connected capsule also matches, so related work surfaces together. Each result carries its related subgraph.
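The scoring idea can be illustrated with a toy version: score each capsule by query-token overlap, then add a bonus when a connected capsule also matches. Everything here is an assumption made for illustration, including the tokenizer, the overlap formula, the `boost` weight of 0.5, and the sample capsules; the actual ranking function is not specified in this document.

```python
def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a deliberately naive tokenizer."""
    return set(text.lower().split())


def score(query: str, text: str) -> float:
    """Fraction of query tokens that appear in the capsule text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def recall(query, capsules, edges, boost=0.5):
    """Rank matching capsules; add `boost` when a connected capsule also matches."""
    base = {name: score(query, text) for name, text in capsules.items()}
    ranked = {}
    for name, s in base.items():
        if s == 0:
            continue  # non-matching capsules are not returned at all
        neighbor_hit = any(base.get(n, 0) > 0 for n in edges.get(name, ()))
        ranked[name] = s + (boost if neighbor_hit else 0.0)
    return sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical capsule texts and graph edges.
capsules = {
    "rerank-retrieval": "re-rank retrieved passages before retrieval step",
    "retrieval-cache": "cache retrieval results",
    "scratchpad": "scratchpad for intermediate reasoning",
}
edges = {
    "rerank-retrieval": ["retrieval-cache"],
    "retrieval-cache": ["rerank-retrieval"],
}

hits = recall("retrieval step", capsules, edges)
# hits == [("rerank-retrieval", 1.5), ("retrieval-cache", 1.0)]
```

In this toy run both retrieval capsules are lifted because each has a matching neighbor, while the unrelated scratchpad capsule drops out entirely.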
- MCP: shared memory (query-only). Returns graph snippets; safe to expose so a team shares one memory. Carries no intelligence.
- Plugin: local intelligence. Three subagents (capsule-segmenter, capsule-regenerator, edge-judge) and two skills (rgit-capture, rgit-recall) define how a session acts on those snippets, natively, on its own subscription.
The agent helps you author; it is never in the replay path. rgit run freezes the exact bytes that ran, content-addressed and immutable. "The code behind run X" is a byte-identical re-materialization of a stored blob.
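The replay guarantee rests on content addressing, which can be sketched in a few lines: a blob is stored under the hash of its own bytes, so reading it back either yields exactly those bytes or fails verification. The choice of SHA-256, the flat directory layout, and the names `freeze` and `materialize` are assumptions for this sketch, not a description of the real .rgit/ object store.

```python
import hashlib
import tempfile
from pathlib import Path


def freeze(store: Path, data: bytes) -> str:
    """Store bytes under their SHA-256 digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    blob = store / digest
    if not blob.exists():  # identical content always maps to the same blob
        blob.write_bytes(data)
    return digest


def materialize(store: Path, digest: str) -> bytes:
    """Read a blob back and verify it still hashes to its own name."""
    data = (store / digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError(f"blob {digest} is corrupted")
    return data


# Throwaway store in a temp directory; the code bytes are illustrative.
store = Path(tempfile.mkdtemp())
code = b"def rerank(hits):\n    return sorted(hits, key=len)\n"
blob_id = freeze(store, code)
replayed = materialize(store, blob_id)
```

Because the identifier is derived from the content, no agent sits between a run and its stored code: `replayed` is byte-identical to `code` or the read raises.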
The five-step loop above is the core. These show up as your store grows — run rgit <command> --help for any of them:
| Command | What it does |
|---|---|
| `rgit watch` | free, deterministic background capture: stages raw material as you edit, so fleeting in-between states aren't lost |
| `rgit install-hooks` | stage on every commit via a post-commit hook (won't touch an existing hook) |
| `rgit run --from <capsule>` | run a recalled variant and link the new run as a variant_of the original |
| `rgit compare <query>` | which variant won: ranked table, Δ vs baseline, ★ winner |
| `rgit provenance <run_id>` | per-feature clean (capsule) vs agent-adapted (frozen) diff for a run |
| `rgit mcp` | serve the graph read-only so a teammate's client can recall against it |
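For intuition about what a comparison like the one in the table involves, the sketch below ranks variants by a single metric, computes each delta against a baseline, and flags the winner. The function name, the "higher is better" rule, and the numbers (made-up counts of eval cases passed) are all hypothetical; this is not the output format of the real command.

```python
def compare(runs: dict[str, int], baseline: str) -> list[tuple[str, int, int, bool]]:
    """Rank variants by metric (higher is better) as (name, value, delta, is_winner)."""
    base = runs[baseline]
    ranked = sorted(runs.items(), key=lambda kv: kv[1], reverse=True)
    best = ranked[0][0]
    return [(name, value, value - base, name == best) for name, value in ranked]


# Made-up metric values, for illustration only.
runs = {"baseline": 61, "rerank-retrieval": 68, "scratchpad": 64}

for name, value, delta, winner in compare(runs, "baseline"):
    star = "★" if winner else " "
    print(f"{star} {name:<18} {value:>3} {delta:+d}")
```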
MIT © Stepzero Lab
Core contributors: Yuxiang Lin · Fengrong Wan · Jiajun Sun