I kept getting wrecked by Claude API bills. So I built a middleware layer.
I'm a music composer. I run a sonic branding studio called Fable Audio. I am not, by any definition, an infrastructure engineer. And yet here I am building API tooling. Because Claude kept billing ...

Source: DEV Community
I'm a music composer. I run a sonic branding studio called Fable Audio. I am not, by any definition, an infrastructure engineer. And yet here I am building API tooling. Because Claude kept billing me into mild panic. The problem I was building Claude-powered workflows and token usage kept compounding faster than I expected. Each call in an agentic chain drags context from the last one. By call four or five you're paying for a lot of history the model doesn't need. Every fix I found required rewriting app logic. I wanted something that handled it at the layer between my app and the API — without touching my code. What I built Tokenly is an optimization layer that sits between your app and the Anthropic API. BYOK — your key goes straight to Anthropic. Still early, still figuring it out. tokenly.onrender.com If you're hitting the same wall on Claude API costs, especially on multi-turn or agentic workflows, I'd genuinely appreciate feedback on whether this is useful to you.