Your MCP tool descriptions are eating your context window.
I've been reviewing MCP implementations, and the same pattern keeps appearing: verbose tool schemas that burn thousands of tokens before the agent does any actual work. In a world where context is your scarcest resource, this matters more than most teams realize.
A recent proposal in the MCP repo measured a MySQL server exposing 106 tools: 207KB of schema data, roughly 54,600 tokens, sent on every initialization, even when the model only needs 2-3 of those tools.
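A back-of-the-envelope check makes it easy to see how a server racks up that many tokens. The sketch below uses the common ~4 characters-per-token rule of thumb; the ratio and the tool shapes are illustrative assumptions, not the proposal's methodology.

```python
import json

def estimate_schema_tokens(tools):
    """Rough token estimate for a list of tool schemas.

    Uses the ~4 characters-per-token heuristic; exact counts
    depend on the model's tokenizer.
    """
    return len(json.dumps(tools)) // 4

# Hypothetical server exposing 106 near-identical per-table tools.
tools = [
    {
        "name": f"query_table_{i}",
        "description": (
            "Run a read-only SQL query against this table. "
            "Supports WHERE, ORDER BY, and LIMIT clauses."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {"sql": {"type": "string"}},
            "required": ["sql"],
        },
    }
    for i in range(106)
]

print(estimate_schema_tokens(tools))
```

Even these deliberately terse schemas add up to thousands of tokens; real-world descriptions are usually far longer.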
The good news: this problem is getting serious attention at multiple layers of the stack.
Token efficiency isn't a single problem. It's three problems at three architectural layers:

- The server, where tool names, descriptions, and input schemas are written.
- The protocol, which governs how tools are discovered and advertised to clients.
- The host, which decides which of those tools actually reach the model.
That last one is crucial and often overlooked. The MCP host (Claude Desktop, your custom integration, etc.) doesn't have to forward every discovered tool to the model. It can implement its own filtering, search, or progressive disclosure before anything hits the context window.
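To make the host-side idea concrete, here is a minimal sketch of pre-filtering under a token budget. The helper, the keyword matching, and the budget number are all hypothetical illustrations, not part of any MCP SDK.

```python
import json

def filter_tools_for_context(tools, keywords, token_budget=2000):
    """Host-side filtering sketch (hypothetical helper, not a real MCP API).

    Forward only tools whose name or description matches a keyword,
    and stop once an estimated token budget is spent.
    """
    selected, spent = [], 0
    for tool in tools:
        text = (tool["name"] + " " + tool.get("description", "")).lower()
        if not any(k in text for k in keywords):
            continue
        cost = len(json.dumps(tool)) // 4  # ~4 chars per token heuristic
        if spent + cost > token_budget:
            break
        selected.append(tool)
        spent += cost
    return selected

tools = [
    {"name": "query_orders", "description": "Run SQL against the orders table."},
    {"name": "query_users", "description": "Run SQL against the users table."},
    {"name": "send_email", "description": "Send an email to a customer."},
]
print([t["name"] for t in filter_tools_for_context(tools, ["orders", "users"])])
# ['query_orders', 'query_users']
```

The model never sees the email tool here: it was filtered out before anything hit the context window, which is exactly the point.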
As of v2.1.7, Claude Code is rolling out MCP Tool Search, which triggers automatically when your MCP tool descriptions would consume more than 10% of the context window. Instead of preloading every tool, Claude Code loads them on demand via search. This directly addresses one of the most-requested features on GitHub: users had documented setups with 7+ MCP servers consuming 67k+ tokens.
Anthropic's engineering post on advanced tool use describes the underlying pattern. Their Tool Search Tool lets the agent query for relevant tools rather than receiving everything upfront. They report 85% token reduction for large tool libraries, with tool definitions dropping from 10K+ tokens to around 3K per request.
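The pattern itself is simple to sketch. The toy version below (my illustration, not Anthropic's implementation) scores tools by word overlap with a query and returns only the top matches; a production version would use BM25 or embeddings.

```python
import re

STOPWORDS = {"a", "an", "the", "for", "to", "of", "and", "by"}

def tokenize(text):
    """Lowercase word set, minus a few stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def search_tools(tools, query, top_k=3):
    """Sketch of the tool-search pattern: the agent queries for tools
    instead of receiving every definition upfront."""
    query_words = tokenize(query)

    def score(tool):
        return len(query_words & tokenize(
            tool["name"] + " " + tool.get("description", "")))

    ranked = sorted(tools, key=score, reverse=True)
    return [t["name"] for t in ranked[:top_k] if score(t) > 0]

tools = [
    {"name": "create_invoice",
     "description": "Create a new invoice for a customer."},
    {"name": "list_invoices",
     "description": "List invoices for a customer, filtered by status."},
    {"name": "delete_branch",
     "description": "Delete a git branch from the repository."},
]
print(search_tools(tools, "find unpaid invoices for a customer"))
# ['list_invoices', 'create_invoice']
```

Only the matching definitions get expanded into the context; the irrelevant git tool costs nothing.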
Let's look at what's happening at each layer.
Every word in your tool descriptions should earn its place.
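As a final illustration (a made-up tool, not from any real server), compare a description that restates what the schema already encodes with one that doesn't:

```python
import json

# "Before": the description repeats information the schema already carries.
verbose = {
    "name": "get_user",
    "description": (
        "This tool can be used to get a user. It takes a user_id parameter, "
        "which is a string representing the ID of the user you would like to "
        "retrieve, and it returns the user object for that ID if it exists."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "user_id": {"type": "string",
                        "description": "The ID of the user to get."},
        },
        "required": ["user_id"],
    },
}

# "After": the schema already says user_id is a required string.
concise = {
    "name": "get_user",
    "description": "Fetch a user record by ID.",
    "inputSchema": {
        "type": "object",
        "properties": {"user_id": {"type": "string"}},
        "required": ["user_id"],
    },
}

print(len(json.dumps(verbose)), len(json.dumps(concise)))
```

Same capability, a fraction of the size, and the model loses nothing it couldn't already read from the schema. Multiply that saving across a hundred tools and it is the difference between a usable context window and a wasted one.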