Open Source · CPU-Only · No AI Calls

Cut your LLM token bill
by 17–46% — no model changes,
no code rewrite

TextCompressor compresses prompts automatically as a local proxy or hosted API. Works with OpenAI, Anthropic, Ollama, LM Studio, and any OpenAI-compatible client.

Start Free View on GitHub
17–46%
Token savings per request
<10ms
Compression overhead
CPU-only
No model dependency

Real savings on real prompts

4,821
Original tokens
3,940
Compressed tokens
18.3% saved — typical OpenClaw session

Tested across 11,760 benchmark questions on War & Peace, legal (FAR), financial (SEC 10-K), and technical (RFC 7231) documents. Light compression saves ~17% with only ~2.7pp accuracy penalty. See full benchmark report →

Three steps, zero friction

1

Point your LLM client at localhost:8080

Start the TextCompressor proxy. Set your app's base URL — no other code changes needed.

textcompressor-proxy --level light --port 8080
# Then set: OPENAI_BASE_URL=http://localhost:8080
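In Python, that base-URL override is the only change your app needs. A minimal sketch (the `resolve_base_url` helper is illustrative, standing in for however your app reads its API endpoint at startup):

```python
import os

# Point any OpenAI-compatible client at the local TextCompressor proxy.
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080"

def resolve_base_url(default: str = "https://api.openai.com/v1") -> str:
    """Return the API base URL, honouring the proxy override if set."""
    return os.environ.get("OPENAI_BASE_URL", default)

print(resolve_base_url())  # http://localhost:8080
```

Clients that already respect `OPENAI_BASE_URL` need no code at all: exporting the variable before launch is enough.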
2

TextCompressor strips filler words from every prompt

Stop words, filler phrases, and redundant tokens are removed deterministically. Nouns, verbs, and meaning are preserved. Legal domain mode protects regulatory language.
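The idea can be sketched in a few lines of Python. This is a toy illustration with a tiny hand-picked stop-word list, not TextCompressor's actual rules, which are far more extensive and domain-aware:

```python
# Toy sketch of deterministic filler removal: drop stop words,
# keep content words in their original order.
STOP_WORDS = {"the", "a", "an", "that", "please", "note", "however", "very"}

def compress(prompt: str) -> str:
    """Remove stop words while preserving word order and meaning-bearing tokens."""
    kept = [w for w in prompt.split() if w.lower().strip(",.") not in STOP_WORDS]
    return " ".join(kept)

print(compress("Please note that the server returned a very unexpected error"))
# → "server returned unexpected error"
```

Because the rules are deterministic, the same prompt always compresses to the same output, with no model call and no randomness.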

3

Your LLM gets a tighter prompt — same answers, fewer tokens

Every request is forwarded to the real API with the savings reported in response headers: X-TC-Token-Reduction-Pct, X-TC-Tokens-Saved.
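Reading those headers from a proxied response might look like this. The header names come from the docs above; the plain dict stands in for whatever headers object your HTTP library returns, and the numbers echo the example session earlier on this page:

```python
# Sketch: report per-request savings from TextCompressor's response headers.
def report_savings(headers: dict) -> str:
    """Format the proxy's token-savings headers into a one-line summary."""
    pct = float(headers.get("X-TC-Token-Reduction-Pct", 0))
    saved = int(headers.get("X-TC-Tokens-Saved", 0))
    return f"saved {saved} tokens ({pct:.1f}%)"

print(report_savings({"X-TC-Token-Reduction-Pct": "18.3",
                      "X-TC-Tokens-Saved": "881"}))
# → saved 881 tokens (18.3%)
```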

Works with
OpenAI · Anthropic · Ollama · LM Studio · OpenClaw · Any OpenAI-compatible API

Start free, scale when you need to

Free
$0
100 requests / month
  • All 3 domains (general, legal, technical)
  • Light & medium compression
  • Local proxy (unlimited, offline)
  Not included: aggressive compression, priority support
Pro
$99/mo
10M tokens / month
  • All domains
  • All compression levels
  • Local proxy (unlimited)
  • Hosted API access
  • Priority support

Local proxy is always free and unlimited — the hosted API is for teams who don't want to run their own server. Enterprise pricing available.

Install the local proxy

🍎
macOS
macOS 12+, Apple Silicon & Intel
Download .zip
🪟
Windows
Windows 10/11, x64
Download .exe
🐧
Linux
Ubuntu, Debian, Arch, Fedora
Download .tar.gz

Or install via pip: pip install textcompressor

Common questions

Does it change the meaning of my prompts?
No. TextCompressor removes stop words and filler phrases only — words like "the", "a", "however", "please note that". Nouns, verbs, numbers, and domain-specific terms are untouched. Your LLM sees a denser version of the same message.
Does my data leave my machine?
Not with the local proxy. Compression runs entirely on your CPU — no external calls. Only the final compressed prompt is forwarded to your LLM API (OpenAI, Ollama, etc.), exactly as it would be without TextCompressor.
Will it work with my LLM?
If it has an OpenAI-compatible API, yes. Tested with GPT-4o, Claude, Ollama, LM Studio, and OpenClaw. Any client that respects OPENAI_BASE_URL works with zero code changes.
What's the accuracy impact?
Light compression (~17% token savings) shows only a ~2.7pp accuracy penalty across 11,760 benchmark tests — and on legal/regulatory text, GPT-4o actually performs slightly better with light compression than without. Medium compression saves ~34% with a ~5.1pp penalty. We recommend starting with light.
Is it open source?
The source is available on GitHub for review. The hosted API and paid plans fund ongoing development.