The AI tools we actually use at Orpheus
A short, opinionated list of AI tools that survived a year of real production use — and a longer list of ones that didn't.
Yunzhui Cai
Published May 13, 2026
Most AI-tools roundups are aspirational. This one is operational.
These are the tools we use every day to build Orpheus, with honest notes on what they replace and where they break. The list is short on purpose. The list of tools we tried and dropped is longer.
What we use, daily
Claude — for almost everything cognitive
API, chat, and Claude Code. Drafting code, reviewing PRs, summarizing meetings, debugging unfamiliar libraries. The reliability gap between Claude and the alternatives is the biggest single productivity multiplier on the team right now.
Cursor — IDE-level coding
Inline edits, function-level refactors, ambient autocomplete. Pairs with Claude Code (see Cursor vs Claude Code).
Cloudflare Workers AI — for in-app inference
When we need to run a small model in our request path, this is where it runs. Cheap, fast, no GPU procurement. Not the right tool for frontier-model work — that goes to Anthropic.
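For reference, the same models are also reachable outside a Worker over Cloudflare's REST API. A minimal sketch, assuming a text-generation model such as `@cf/meta/llama-3.1-8b-instruct` and placeholder account ID and API token (not our real config):

```python
import json
import urllib.request


def workers_ai_url(account_id: str, model: str) -> str:
    """Build the Workers AI REST endpoint for a given account and model."""
    return (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{model}"
    )


def run_model(account_id: str, api_token: str, model: str, prompt: str) -> str:
    """POST a prompt to a Workers AI text-generation model and return its reply."""
    req = urllib.request.Request(
        workers_ai_url(account_id, model),
        data=json.dumps({"prompt": prompt}).encode(),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Text-generation models return their output under result.response
        return json.load(resp)["result"]["response"]
```

Inside a Worker you'd use the `env.AI` binding instead; the REST path is handy for quick experiments from a laptop.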
Linear — task tracking that AI can read
Not an AI tool itself, but its API is the cleanest way to let an agent see "what's on the team's plate." We pipe Linear into Claude when we want a summary of the week.
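A minimal sketch of that pipe, assuming a Linear personal API key (sent raw in the `Authorization` header) and an Anthropic API key. The issue fields queried and the Claude model name are illustrative, not our exact setup:

```python
import json
import urllib.request

LINEAR_URL = "https://api.linear.app/graphql"
ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"


def fetch_issues(linear_key: str) -> list[dict]:
    """Pull recent issues from Linear's GraphQL API."""
    query = "{ issues(first: 50) { nodes { identifier title state { name } } } }"
    req = urllib.request.Request(
        LINEAR_URL,
        data=json.dumps({"query": query}).encode(),
        headers={"Authorization": linear_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["issues"]["nodes"]


def issues_to_prompt(issues: list[dict]) -> str:
    """Flatten issues into a plain-text list Claude can summarize."""
    lines = [
        f"- {i['identifier']}: {i['title']} ({i['state']['name']})" for i in issues
    ]
    return "Summarize the team's current workload:\n" + "\n".join(lines)


def summarize(anthropic_key: str, prompt: str) -> str:
    """Send the prompt to Claude via the Messages API."""
    body = {
        "model": "claude-sonnet-4-5",  # model name is an assumption
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        ANTHROPIC_URL,
        data=json.dumps(body).encode(),
        headers={
            "x-api-key": anthropic_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["content"][0]["text"]
```

The middle step is the whole trick: Linear's structured data flattens cleanly into a prompt, which is what makes its API "the cleanest way" for an agent to read the board.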
Granola — meeting notes that don't waste your time
Records, transcribes, summarizes. The summary quality is good enough that we stopped sending separate post-meeting recap emails.
What we tried and dropped
GitHub Copilot
Replaced by Cursor for editing, by Claude for everything else. Copilot was fine. The competitors got better faster.
Various "AI agent" platforms
We tried four different agent frameworks for automating internal ops. All of them died on the same problem: when something went wrong, debugging the agent was harder than just writing a script. We now use Claude Code for one-off agent tasks and Python scripts for recurring ones.
Most AI writing assistants
For long-form writing, raw Claude with a good prompt beats every wrapper. We tried four. We pay for none.
"AI inbox" tools
The premise — let AI manage your email — sounds good until you realize the cost of a missed email is asymmetric. We do triage manually.
How we evaluate a new tool
Three filters, in order:
- Does it replace something we currently spend time on? Not "is it cool." Tools that don't replace work just add another tab.
- Will we use it three times this week? If not, the marginal value is low and the context-switch cost wins.
- Can we eject in 5 minutes? Tool dependency is real. The ones that lock you into a proprietary format get a second look.
The bar is intentionally high. Most tools that get demoed look amazing in the demo and don't survive contact with real work.
A note on what we sell
Orpheus is also an AI tool — for turning spoken language into searchable text. We use it internally too. The discipline of being a customer of your own product is its own filter. If we wouldn't pay for it, we shouldn't sell it.
Updated quarterly. The version that's three months old is probably wrong already.