token-optimization

Here are 285 public repositories matching this topic...

headroomlabs-ai / headroom

Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.

Updated Jul 2, 2026
Python

jgravelle / jcodemunch-mcp

Cut AI token costs 95%+ on code exploration. The leading MCP server for precise, symbol-level GitHub code retrieval via tree-sitter AST. Works with Claude Code, Cursor & any MCP client. 313B+ tokens saved.

Updated Jul 1, 2026
Python

alexgreensh / token-optimizer

Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.

token-usage context-window claude-code token-optimization context-engineering claude-plugin claude-code-skill token-optimizer agentskills ghost-tokens

Updated Jul 1, 2026
Python

lucasrosati / claude-code-memory-setup

Up to 71.5x fewer tokens per session on Claude Code with Obsidian + Graphify. Persistent memory, codebase knowledge graphs, and chat import pipeline. 🇧🇷 PT-BR included.

knowledge-graph obsidian zettelkasten developer-productivity second-brain ai-tools graphify claude-code token-optimization coding-agent

Updated Jun 1, 2026
Python

GMaN1911 / claude-cognitive

Working memory for Claude Code - persistent context and multi-instance coordination

productivity developer-tools claude-ai context-management claude-code token-optimization

Updated Jan 17, 2026
Python

juyterman1000 / entroly

Cut your Claude / OpenAI / Gemini bill 70–95% on AI coding. Local proxy that compresses context, keeps provider caches hot, and verifies LLM output ($0 hallucination guard). Drop-in for Cursor, Claude Code, Codex, Aider + 34 more and custom providers — 30s, no code changes

rust productivity open-source ai mcp cursor ai-agents claude rag llm chatgpt anthropic hallucination-detection context-compression mcp-server claude-code token-optimization llm-grounding ai-hallucination

Updated Jul 1, 2026
Python

Lap-Platform / LAP

Your agents are guessing at APIs. Give them the actual agent-native spec. 1500+ APIs available as ready-to-use skills. Compile any API spec into a lean, agent-native format that is 10× smaller. Supports OpenAPI, GraphQL, AsyncAPI, Protobuf, and Postman.

Updated Mar 26, 2026
Python

elusznik / mcp-server-code-execution-mode

An MCP server that executes Python code in isolated rootless containers with optional MCP server proxying. Implements Anthropic's and Cloudflare's ideas for reducing context bloat from MCP tool definitions.

python docker mcp orchestration agents code-execution claude podman anthropic agentic-ai model-context-protocol claude-code token-optimization

Updated Dec 5, 2025
Python

0xhimanshu / governor

Claude Code usage governor: compact professional output, context slimming, tool-output filtering, telemetry, and drift guardrails.

cli developer-tools ai-tools llm prompt-engineering claude-ai context-window claude-code token-optimization claude-code-plugin claude-skills

Updated Jun 20, 2026
Python

borhen68 / TokenTamer

A drop-in proxy that compresses bloated code context in real-time, cutting LLM API costs by 50–80% without losing what the model actually needs to know.

python proxy openai developer-tools llm cost-reduction anthropic context-compression token-optimization ai-coding-agent

Updated Jun 15, 2026
Python

abhisekjha / pith

Pith is the hook that makes Claude Code sessions last 3x longer.

anthropic llm-tools claude-code token-optimization claude-code-plugin

Updated May 6, 2026
Python

avilum / minrlm

A small Recursive Language Model: let any LLM run code on its context instead of stuffing it into the prompt.

agent inference ai-agents inference-engine cost-optimization rlms rlm inference-api llm llm-inference token-economics latency-optimization token-optimization recursive-language-model minrlm

Updated Jun 11, 2026
Python

capitalparser / notebooklm-wiki-pipeline

Turn Google Drive PDFs into Obsidian wiki notes via NotebookLM MCP without loading full PDFs into Claude context

pdf mcp google-drive obsidian knowledge-management notebooklm claude-code token-optimization

Updated May 25, 2026
Python

castnettech / mnemosyne

State-aware knowledge compression, ingestion, and hybrid retrieval engine. Zero dependencies. Sub-100ms queries.

python open-source developer-tools tfidf bm25 zero-dependencies code-retrieval llm context-compression token-optimization

Updated May 30, 2026
Python

elevanaltd / octave-mcp

OCTAVE protocol - structured AI communication with 3-20x token reduction. MCP server with lenient-to-canonical pipeline and schema validation.

python ai mcp protocol llm model-context-protocol token-optimization

Updated Jun 23, 2026
Python

KbWen / agentic-os

Governance framework for AI coding agents. It runs them through a five-step workflow (plan, build, review, test, ship) where no step counts as done without evidence. Drop-in rules and guardrails for Claude Code, Codex, Cursor, Copilot, and Antigravity, via AGENTS.md.

Updated Jul 2, 2026
Python

CarlosVallejoRuiz / slurp

Token-budget-aware graph navigation for AI coding agents. Serve exactly the noodles your LLM needs. 🍜

python cli knowledge-graph ai-tools graphify llm token-optimization

Updated Jun 15, 2026
Python

sheeki03 / Few-Word

Claude Code plugin that offloads large outputs to the filesystem and retrieves them when required.

ai-agents claude-code token-optimization context-engineering claude-code-hooks claude-code-plugin claude-code-plugins claude-code-skills claude-code-skill

Updated Jan 23, 2026
Python

JacobHuang91 / prompt-refiner

🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.

python machine-learning openai ai-agents cost-optimization rag llm prompt-engineering langchain anthropic function-calling prompt-optimization token-optimization

Updated Apr 12, 2026
Python

SonicBotMan / lobster-press

🦞 LobsterPress (龙虾饼) - Cognitive Memory System for AI Agents. A permanent memory engine for LLMs, grounded in cognitive science.

bash ai shell-script claude chatgpt context-compression token-optimization openclaw

Updated Jul 1, 2026
Python
