- Why not translate between OpenAI and Anthropic formats?
-
Because the differences aren't superficial. Tool-call schemas,
prompt-cache markers, citation blocks, streaming-event shapes,
and the way assistant turns are encoded all differ between OpenAI
Chat Completions and Anthropic Messages. A translator would have
to be perfect on every release of either API — and any drift
corrupts your inference in ways that are subtle and hard to
debug. Same-format pass-through is a cleaner contract: your
Claude request lands on a Claude account, your OpenAI-shape
request lands on an OpenAI-shape account. Failover happens
inside that contract, never across it.
- Is this a load balancer?
-
No — it's a single-tenant router on your Mac. It does not split a
request across providers or aggregate responses. One request, one
provider, picked by the policy above.
- What happens if every provider is exhausted?
-
The Hydrant returns a structured 503 to the client with the
specific cooled-down accounts and the elapsed time until the
earliest one recovers. Your IDE sees a clear error, not a hang.
- Can I disable routing for an account but still track it?
-
Yes. Toggle routingEnabled off on the
account; cost and quota continue to populate the dashboard while
the Hydrant skips it for outbound traffic.
- How long is the cool-down?
-
Five minutes for transient failures, rate-limits, and auth failures.
Exhausted-bucket accounts stay parked until the bucket rolls
(daily, weekly, or monthly, per provider).
- Is there a config file?
-
Routing rules are derived from the accounts you've added in the
macOS app and a per-account sortKey +
preferred-pin you can drag-to-reorder. No YAML, no hot-reload —
change the order in the UI, the new policy takes effect on the next
request.