What BurnBar recommended on 2026-05-11.
Frozen snapshot. Same data the router used to score requests that day, ordered by task and explained with source citations. Benchmark signals are advisory — runtime constraints (provider-family mode, pinning, auth, quota, safety, availability) always win.
- generated 12:00 UTC
- 3 task categories
- 5 sources
Rundown · 2026-05-11
Generated Mon, 11 May 2026 12:00:00 GMT · schema v1 · benchmarks advisory · runtime constraints win
- Artificial Analysis · unavailable
- Terminal-Bench (via Hugging Face) · error
- Design Arena · stale (42h old)
- Hugging Face · fresh
- Manual OpenBurnBar fixture · fresh
Benchmark data is advisory only. Provider-family mode, user pinning, account auth, quota state, safety policy, and availability are evaluated at runtime and override any ranking shown here.
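As a rough sketch of that precedence, assuming a hypothetical `Candidate` record and `route` helper (none of this is BurnBar's actual API): hard runtime constraints gate the candidate pool first, and the advisory benchmark composite only orders whatever survives.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    model: str
    family: str        # wire-format family, e.g. "anthropic"
    composite: int     # advisory benchmark composite, 0-100
    available: bool = True
    quota_ok: bool = True
    safety_ok: bool = True

def route(candidates, pinned=None, allowed_families=None):
    # Hard runtime constraints filter the pool before any ranking.
    pool = [c for c in candidates
            if c.available and c.quota_ok and c.safety_ok]
    if allowed_families is not None:       # provider-family mode
        pool = [c for c in pool if c.family in allowed_families]
    if pinned is not None:                 # user pinning beats any ranking
        pool = [c for c in pool if c.model == pinned] or pool
    # Only now does the advisory composite order the survivors.
    return max(pool, key=lambda c: c.composite, default=None)
```

Under these assumptions, a model that tops every benchmark still loses the slot the moment availability, quota, or safety checks fail.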
- Coding
Refactors, multi-file edits, repo-grounded code generation.
Today's pick: Claude Opus 4.7. It led the benchmark composite at 85/100, its evidence is fresh, and its 1M-token context window clears typical large-context work; runner-up GLM 5 is held in reserve for instant failover.
- #1 Claude Opus 4.7 · Anthropic · wire family: anthropic
  76 composite / 100 · evidence 86%
  - bench: 85
  - fresh: 85
  - rel: 86
  - latency: —
  - cost: 18
  - ctx: 1M
  - avail: common
Why this rank
- Composite benchmark score 85/100 across 1 source.
- Freshest evidence rated 85/100 — older sources are weighted down, not dropped.
- Premium-tier per-token cost.
- Context window: 1M tokens.
- Wire-format family: anthropic.
Source citations
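The repeated note that older sources are "weighted down, not dropped" suggests a decay weighting over evidence age. A minimal sketch, assuming an exponential decay with a made-up 48-hour half-life (not a published BurnBar parameter):

```python
import math

def freshness_weight(age_hours, half_life_hours=48.0):
    # Exponential decay: an older source still counts,
    # just with proportionally less weight.
    return math.exp(-math.log(2) * age_hours / half_life_hours)

def weighted_composite(scores):
    # scores: (benchmark_score, age_hours) pairs, one per source.
    total = sum(freshness_weight(age) for _, age in scores)
    return sum(s * freshness_weight(age) for s, age in scores) / total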
- #2 GLM 5 · Z.ai · wire family: openai_compat
  75 composite / 100 · evidence 86%
  - bench: 78
  - fresh: 85
  - rel: 82
  - latency: —
  - cost: 66
  - ctx: 256k
  - avail: common
Why this rank
- Composite benchmark score 78/100 across 1 source.
- Freshest evidence rated 85/100 — older sources are weighted down, not dropped.
- Mid-tier per-token cost.
- Context window: 256k tokens.
- Wire-format family: openai_compat.
Source citations
- Design
Website / UI / SVG / slide generation evaluated head-to-head.
Today's pick: Claude Opus 4.7. It led the benchmark composite at 81/100, its evidence is fresh, and its 1M-token context window clears typical large-context work.
- #1 Claude Opus 4.7 · Anthropic · wire family: anthropic
  75 composite / 100 · evidence 86%
  - bench: 81
  - fresh: 85
  - rel: 86
  - latency: —
  - cost: 18
  - ctx: 1M
  - avail: common
Why this rank
- Composite benchmark score 81/100 across 1 source.
- Freshest evidence rated 85/100 — older sources are weighted down, not dropped.
- Premium-tier per-token cost.
- Context window: 1M tokens.
- Wire-format family: anthropic.
Source citations
- Analysis
Long-context reasoning, summarization, structured extraction.
Today's pick: Claude Opus 4.7. It led the benchmark composite at 88/100, its evidence is fresh, and its 1M-token context window clears typical large-context work.
- #1 Claude Opus 4.7 · Anthropic · wire family: anthropic
  78 composite / 100 · evidence 86%
  - bench: 88
  - fresh: 85
  - rel: 86
  - latency: —
  - cost: 18
  - ctx: 1M
  - avail: common
Why this rank
- Composite benchmark score 88/100 across 1 source.
- Freshest evidence rated 85/100 — older sources are weighted down, not dropped.
- Premium-tier per-token cost.
- Context window: 1M tokens.
- Wire-format family: anthropic.
Source citations
Re-run today's routing locally.
Add an account, pick a model, and let the Fire Hydrant do the routing. Provider-family mode is the default; intelligent mode is opt-in.
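The mode split above can be sketched with a toy config; the field names and the default family are assumptions for illustration, not the Fire Hydrant's real settings:

```python
from dataclasses import dataclass

@dataclass
class RouterConfig:
    # "provider_family" (the default) keeps requests inside one
    # wire-format family; "intelligent" may cross families.
    mode: str = "provider_family"
    family: str = "anthropic"      # hypothetical default family

def eligible(models, cfg):
    # models: (name, wire_family) pairs, e.g. from today's rundown.
    if cfg.mode == "provider_family":
        return [name for name, fam in models if fam == cfg.family]
    return [name for name, _ in models]  # intelligent mode: everything
```

Under these assumptions, provider-family mode never considers GLM 5 because it rides a different wire format; opting into intelligent mode widens the pool.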