Operator log · daily intelligent router rundown

What BurnBar recommended on 2026-05-12.

Frozen snapshot. Same data the router used to score requests that day, ordered by task and explained with source citations. Benchmark signals are advisory — runtime constraints (provider-family mode, pinning, auth, quota, safety, availability) always win.

  • generated 12:00 UTC
  • 5 task categories
  • 5 sources

Rundown · 2026-05-12

Generated Tue, 12 May 2026 12:00:00 GMT · schema v1 · benchmarks advisory · runtime constraints win

  • Artificial Analysis · unavailable
  • Terminal-Bench (via Hugging Face) · stale · 14h old
  • Design Arena · stale · 42h old
  • Hugging Face · fresh
  • Manual OpenBurnBar fixture · fresh
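
How those labels could be derived is simple age bucketing. A minimal sketch, assuming a 12-hour stale cutoff and a SourceStatus shape (both illustrative, not BurnBar's documented schema):

    // Hypothetical freshness labeling. The 12h cutoff is an assumption picked so
    // the statuses above (fresh, vs. stale at 14h and 42h) come out consistent;
    // a failed fetch surfaces as "unavailable" rather than a guessed number.
    type SourceStatus = "fresh" | "stale" | "unavailable";

    function freshnessLabel(lastFetchHoursAgo: number | null): SourceStatus {
      if (lastFetchHoursAgo === null) return "unavailable"; // no fetch today
      return lastFetchHoursAgo <= 12 ? "fresh" : "stale";
    }

    // e.g. freshnessLabel(14) === "stale"; freshnessLabel(null) === "unavailable"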

Benchmark data is advisory only. Provider-family mode, user pinning, account auth, quota state, safety policy, and availability are evaluated at runtime and override any ranking shown here.
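
In code terms, those constraints act as a hard filter ahead of any ranking. A minimal sketch, assuming illustrative Candidate and RuntimeContext shapes (not BurnBar's actual API):

    // Hypothetical constraint gate: every check is a request-time veto, and the
    // advisory composite is only consulted among the survivors. A user pin
    // short-circuits ranking entirely.
    interface Candidate { model: string; family: string; composite: number; }
    interface RuntimeContext {
      providerFamilyMode: string | null; // e.g. "anthropic" when family mode is on
      pinnedModel: string | null;        // an explicit pin always wins
      authorized(c: Candidate): boolean;
      hasQuota(c: Candidate): boolean;
      passesSafety(c: Candidate): boolean;
      isAvailable(c: Candidate): boolean;
    }

    function route(candidates: Candidate[], ctx: RuntimeContext): Candidate | null {
      if (ctx.pinnedModel !== null) {
        return candidates.find((c) => c.model === ctx.pinnedModel) ?? null;
      }
      const eligible = candidates.filter((c) =>
        (ctx.providerFamilyMode === null || c.family === ctx.providerFamilyMode) &&
        ctx.authorized(c) && ctx.hasQuota(c) && ctx.passesSafety(c) && ctx.isAvailable(c));
      // Only now does the benchmark ranking matter.
      return eligible.sort((a, b) => b.composite - a.composite)[0] ?? null;
    }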

  1. Coding

    Refactors, multi-file edits, repo-grounded code generation.

    Today's pick: Claude Opus 4.7 — led the benchmark composite at 87/100; evidence is the freshest available, even though older than ideal; a 1M-token context window clears typical large-context work; runner-up Claude Sonnet 4.6 is held in reserve for instant failover.
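
    "Held in reserve for instant failover" reads as ordered fallback rather than re-scoring on error. A minimal sketch of that pattern; callModel and the model ids are assumptions:

      // Hypothetical failover loop: try the day's pick, then the runner-up.
      // callModel is an assumed helper; BurnBar's real retry policy isn't shown here.
      async function completeWithFailover(
        ranked: string[], // e.g. ["claude-opus-4.7", "claude-sonnet-4.6"] (illustrative ids)
        callModel: (model: string) => Promise<string>,
      ): Promise<string> {
        let lastError: unknown;
        for (const model of ranked) {
          try {
            return await callModel(model); // first success wins
          } catch (err) {
            lastError = err; // fall through to the reserve model
          }
        }
        throw lastError; // every candidate failed; surface the last error
      }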

    1. #1
      Claude Opus 4.7 · Anthropic · family: anthropic
      composite 75/100 · evidence 100%
      • bench 87
      • fresh 55
      • rel 88
      • latency 46
      • cost 18
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 87/100 across 1 source.
      • Freshest evidence rated 55/100 — older sources are weighted down, not dropped (weighting sketched below).
      • Premium-tier per-token cost.
      • Latency is acceptable for non-interactive work.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.
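
      One way the badge signals above could roll up into the 75/100 composite, sketched with placeholder weights. The weights are guesses and will not reproduce the exact number; only the shape is taken from the text: freshness scales a source's benchmark contribution down instead of dropping it.

          // Hypothetical composite: weighted mean of the badge signals. Weights are
          // assumptions; the freshness factor encodes "weighted down, not dropped".
          interface Signals { bench: number; fresh: number; rel: number; latency: number; cost: number; }

          function composite(s: Signals): number {
            const freshFactor = 0.5 + 0.5 * (s.fresh / 100); // stale evidence keeps half weight at worst
            const w = { bench: 0.45 * freshFactor, rel: 0.2, latency: 0.15, cost: 0.2 };
            const weighted = w.bench * s.bench + w.rel * s.rel + w.latency * s.latency + w.cost * s.cost;
            return Math.round(weighted / (w.bench + w.rel + w.latency + w.cost));
          }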


    2. #2
      Claude Sonnet 4.6 · Anthropic · family: anthropic
      composite 67/100 · evidence 86%
      • bench 79
      • fresh 55
      • rel 86
      • latency not reported
      • cost 42
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 79/100 across 1 source.
      • Freshest evidence rated 55/100 — older sources are weighted down, not dropped.
      • Mid-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.
      • Tier · mid. Counted behind flagship siblings at equivalent benchmark; pin the tier explicitly to invert this (tie-break sketched below).
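
      The tier rule as described is a near-tie tie-break. A minimal sketch; the tolerance and names are assumptions:

          // Hypothetical tier tie-break: within EPSILON composite points, the
          // flagship outranks its mid-tier sibling unless the user pinned "mid".
          type Tier = "flagship" | "mid";
          interface Ranked { model: string; tier: Tier; composite: number; }

          const EPSILON = 5; // assumed tolerance for "equivalent benchmark"
          const TIER_ORDER: Record<Tier, number> = { flagship: 0, mid: 1 };

          function compare(a: Ranked, b: Ranked, pinnedTier: Tier | null): number {
            if (pinnedTier !== null && a.tier !== b.tier) {
              return a.tier === pinnedTier ? -1 : 1; // explicit pin inverts the default
            }
            if (Math.abs(a.composite - b.composite) <= EPSILON) {
              return TIER_ORDER[a.tier] - TIER_ORDER[b.tier]; // flagship first on near-ties
            }
            return b.composite - a.composite; // otherwise the score decides
          }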


  2. Terminal

    Shell-loop agents that execute, observe, and self-correct.

    Today's pick: Claude Opus 4.7 — led the benchmark composite at 78/100; evidence is fresh; a 1M-token context window clears typical large-context work; runner-up Claude Sonnet 4.6 is held in reserve for instant failover.

    1. #1
      Claude Opus 4.7 · Anthropic · family: anthropic
      composite 76/100 · evidence 86%
      • bench 78
      • fresh 100
      • rel 88
      • latency not reported
      • cost 18
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 78/100 across 1 source.
      • Freshest evidence rated 100/100 — older sources are weighted down, not dropped.
      • Premium-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.


    2. #2
      Claude Sonnet 4.6 · Anthropic · family: anthropic
      composite 72/100 · evidence 86%
      • bench 73
      • fresh 100
      • rel 86
      • latency not reported
      • cost 42
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 73/100 across 1 source.
      • Freshest evidence rated 100/100 — older sources are weighted down, not dropped.
      • Mid-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.
      • Tier · mid. Counted behind flagship siblings at equivalent benchmark; pin the tier explicitly to invert this.


  3. Design

    Website / UI / SVG / slide generation evaluated head-to-head.

    Today's pick: Claude Opus 4.7 — led the benchmark composite at 83/100; evidence is fresh; a 1M-token context window clears typical large-context work.

    1. #1
      Claude Opus 4.7 · Anthropic · family: anthropic
      composite 76/100 · evidence 86%
      • bench 83
      • fresh 85
      • rel 88
      • latency not reported
      • cost 18
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 83/100 across 1 source.
      • Freshest evidence rated 85/100 — older sources are weighted down, not dropped.
      • Premium-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.


  4. Analysis

    Long-context reasoning, summarization, structured extraction.

    Today's pick: Claude Opus 4.7 — led the benchmark composite at 89/100; evidence is the freshest available, even though older than ideal; a 1M-token context window clears typical large-context work.

    1. #1
      Claude Opus 4.7 · Anthropic · family: anthropic
      composite 74/100 · evidence 86%
      • bench 89
      • fresh 55
      • rel 88
      • latency not reported
      • cost 18
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 89/100 across 1 source.
      • Freshest evidence rated 55/100 — older sources are weighted down, not dropped.
      • Premium-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.


  5. General

    Mixed-intent chat / one-shot questions / catch-all routing.

    Today's pick: Claude Opus 4.7 — led the benchmark composite at 87/100; evidence is the freshest available, even though older than ideal; a 1M-token context window clears typical large-context work.

    1. #1
      Claude Opus 4.7 · Anthropic · family: anthropic
      composite 73/100 · evidence 86%
      • bench 87
      • fresh 55
      • rel 88
      • latency not reported
      • cost 18
      • ctx 1M
      • avail common

      Why this rank

      • Composite benchmark score 87/100 across 1 source.
      • Freshest evidence rated 55/100 — older sources are weighted down, not dropped.
      • Premium-tier per-token cost.
      • Context window: 1M tokens.
      • Wire-format family: anthropic.


What this rundown is — and isn't

  • Benchmark snapshots are advisory only — runtime constraints (provider-family mode, user pinning, auth, quota, safety, and availability) override any ranking shown here.
  • BurnBar does not fabricate benchmark numbers. Missing data is reported as 'not reported', never guessed.
  • Daily snapshots are sampled from public or documented sources; raw provider keys, cookies, and bearer tokens are never written into snapshots or this rundown.
  • One or more sources were unavailable for this day; the rundown reflects only the sources that responded.

Operator notes

  • Yesterday's cached fixture — used to demonstrate that the archive renders multiple dates.

Re-run today's routing locally.

Add an account, pick a model, and let the Fire Hydrant do the routing. Provider-family mode by default; intelligent mode opt-in.
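
A hypothetical local config showing that default; every key below is illustrative, since the rundown doesn't document the actual file format:

    // Illustrative config sketch, not BurnBar's real schema.
    const routerConfig = {
      mode: "provider-family" as const, // the default; switch to "intelligent" to opt in
      account: { provider: "anthropic", credential: "env:ANTHROPIC_API_KEY" }, // assumed env-var reference
      pin: null as string | null, // e.g. "claude-sonnet-4.6" to force a model (illustrative id)
    };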