Run #692

success · fetched 2026-06-21 05:00:12 · 9.49 MB raw HTML · 540 models

Top quality: Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) (63.1 pts)

| # | Model | Vendor | Released | Cost | $/Q | Qual | ΔTop | Intel | Code | Agent | Pen | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) | Anthropic | 2026-06-09 | $6,228 | 98.77 | 63.1 | 0.0 | 59.9 | 76.5 | 52.8 | 37.9 | 63.1 |
| 2 | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-05-28 | $4,012 | 67.95 | 59.0 | -4.0 | 55.7 | 74.3 | 47.2 | 36.0 | 59.0 |
| 3 | GPT-5.5 (xhigh) | OpenAI | 2026-04-23 | $2,588 | 44.47 | 58.2 | -4.9 | 54.8 | 74.9 | 44.9 | 34.1 | 58.2 |
| 4 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-04-16 | $3,738 | 65.38 | 57.2 | -5.9 | 53.5 | 73.6 | 44.4 | 35.7 | 57.2 |
| 5 | GPT-5.5 (high) | OpenAI | 2026-04-23 | $1,775 | 31.67 | 56.1 | -7.0 | 53.1 | 71.6 | 43.5 | 32.5 | 56.1 |
| 6 | GPT-5.4 (xhigh) | OpenAI | 2026-03-05 | $2,261 | 41.48 | 54.5 | -8.5 | 51.4 | 71.1 | 41.1 | 33.5 | 54.5 |
| 7 | GLM-5.2 (max) | Z AI | 2026-06-16 | $869 | 16.00 | 54.3 | -8.8 | 51.1 | 68.8 | 43.1 | 29.4 | 54.3 |
| 8 | GPT-5.5 (medium) | OpenAI | 2026-04-23 | $969 | 18.20 | 53.2 | -9.8 | 50.4 | 71.5 | 37.8 | 29.9 | 53.2 |
| 9 | Gemini 3.5 Flash (high) | Google | 2026-05-19 | $1,142 | 21.70 | 52.6 | -10.5 | 50.2 | 70.1 | 37.4 | 30.6 | 52.6 |
| 10 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-02-17 | $3,356 | 66.67 | 50.3 | -12.7 | 47.2 | 63.0 | 40.8 | 35.3 | 50.3 |
| 11 | Qwen3.7 Max | Alibaba | 2026-05-19 | $1,159 | 24.39 | 47.5 | -15.5 | 46.0 | 66.0 | 30.6 | 30.6 | 47.5 |
| 12 | DeepSeek V4 Pro (Reasoning, Max Effort) | DeepSeek | 2026-04-24 | $180 | 3.85 | 46.7 | -16.4 | 44.3 | 59.4 | 36.4 | 22.5 | 46.7 |
| 13 | MiniMax-M3 | MiniMax | 2026-06-01 | $235 | 5.10 | 46.1 | -16.9 | 44.4 | 58.6 | 35.4 | 23.7 | 46.1 |
| 14 | Gemini 3.1 Pro Preview | Google | 2026-02-19 | $860 | 18.87 | 45.6 | -17.5 | 46.5 | 68.8 | 21.4 | 29.3 | 45.6 |
| 15 | GPT-5.5 (low) | OpenAI | 2026-04-23 | $390 | 8.68 | 44.9 | -18.1 | 43.5 | 60.9 | 30.4 | 25.9 | 44.9 |
| 16 | Kimi K2.7 Code | Kimi | 2026-06-12 | $530 | 12.03 | 44.1 | -19.0 | 41.9 | 60.8 | 29.6 | 27.2 | 44.1 |
| 17 | MiMo-V2.5-Pro | Xiaomi | 2026-04-22 | $99.1 | 2.26 | 43.8 | -19.2 | 42.2 | 60.2 | 29.1 | 20.0 | 43.8 |
| 18 | Kimi K2.6 | Kimi | 2026-04-20 | $839 | 19.48 | 43.0 | -20.0 | 42.8 | 56.0 | 30.3 | 29.2 | 43.0 |
| 19 | DeepSeek V4 Flash (Reasoning, Max Effort) | DeepSeek | 2026-04-24 | $78.4 | 1.84 | 42.5 | -20.6 | 40.3 | 56.2 | 31.1 | 18.9 | 42.5 |
| 20 | GPT-5.4 mini (xhigh) | OpenAI | 2026-03-17 | $1,158 | 27.51 | 42.1 | -21.0 | 40.0 | 56.1 | 30.2 | 30.6 | 42.1 |
| 21 | GLM-5.1 (Reasoning) | Z AI | 2026-04-07 | $674 | 16.07 | 41.9 | -21.1 | 40.2 | 55.8 | 29.9 | 28.3 | 41.9 |
| 22 | GPT-5.4 nano (xhigh) | OpenAI | 2026-03-17 | $289 | 7.11 | 40.6 | -22.4 | 38.2 | 56.1 | 27.5 | 24.6 | 40.6 |
| 23 | Qwen3.6 Plus | Alibaba | 2026-04-02 | $484 | 11.95 | 40.5 | -22.5 | 39.6 | 54.5 | 27.6 | 26.9 | 40.5 |
| 24 | Qwen3.6 27B (Reasoning) | Alibaba | 2026-04-22 | $668 | 17.01 | 39.3 | -23.8 | 37.1 | 53.7 | 27.0 | 28.2 | 39.3 |
| 25 | GPT-5.5 (Non-reasoning) | OpenAI | 2026-04-23 | $223 | 5.69 | 39.2 | -23.8 | 35.4 | 56.5 | 25.8 | 23.5 | 39.2 |
| 26 | MiniMax-M2.7 | MiniMax | 2026-03-18 | $144 | 3.71 | 38.8 | -24.3 | 38.1 | 52.6 | 25.6 | 21.6 | 38.8 |
| 27 | Qwen3.7 Plus | Alibaba | 2026-06-01 | $152 | 3.95 | 38.5 | -24.5 | 39.0 | 55.9 | 20.8 | 21.8 | 38.5 |
| 28 | Nemotron 3 Ultra 550B A55B (Reasoning) | NVIDIA | 2026-06-04 | $444 | 11.63 | 38.1 | -24.9 | 37.8 | 49.3 | 27.4 | 26.5 | 38.1 |
| 29 | Grok 4.3 (high) | xAI | 2026-04-30 | $319 | 9.20 | 34.6 | -28.4 | 37.6 | 42.2 | 24.1 | 25.0 | 34.6 |
| 30 | Qwen3.5 397B A17B (Reasoning) | Alibaba | 2026-02-16 | $528 | 15.57 | 33.9 | -29.1 | 33.7 | 48.2 | 19.8 | 27.2 | 33.9 |
| 31 | Qwen3.5 122B A10B (Reasoning) | Alibaba | 2026-02-24 | $447 | 13.59 | 32.9 | -30.2 | 32.3 | 45.7 | 20.7 | 26.5 | 32.9 |
| 32 | Mistral Medium 3.5 | Mistral | 2026-04-29 | $1,014 | 31.75 | 31.9 | -31.1 | 29.9 | 46.9 | 19.0 | 30.1 | 31.9 |
| 33 | Qwen3.6 35B A3B (Reasoning) | Alibaba | 2026-04-16 | $333 | 10.52 | 31.6 | -31.4 | 31.6 | 41.9 | 21.4 | 25.2 | 31.6 |
| 34 | Ring-2.6-1T | InclusionAI | 2026-05-08 | $459 | 14.93 | 30.8 | -32.3 | 30.6 | 42.8 | 18.9 | 26.6 | 30.8 |
| 35 | Claude 4.5 Haiku (Reasoning) | Anthropic | 2025-10-15 | $539 | 17.99 | 30.0 | -33.1 | 29.6 | 43.9 | 16.4 | 27.3 | 30.0 |
| 36 | Step 3.7 Flash | StepFun | 2026-05-29 | $320 | 10.85 | 29.5 | -33.5 | 29.7 | 37.3 | 21.5 | 25.1 | 29.5 |
| 37 | Grok 4.3 (Non-reasoning) | xAI | 2026-04-30 | $297 | 10.78 | 27.6 | -35.5 | 24.8 | 35.2 | 22.8 | 24.7 | 27.6 |
| 38 | Gemma 4 26B A4B (Reasoning) | Google | 2026-04-02 | $54.5 | 2.15 | 25.3 | -37.7 | 25.7 | 39.3 | 11.0 | 17.4 | 25.3 |
| 39 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | 2026-03-11 | $287 | 11.97 | 23.9 | -39.1 | 25.4 | 37.7 | 8.7 | 24.6 | 23.9 |
| 40 | gpt-oss-120b (high) | OpenAI | 2025-08-05 | $96.3 | 4.28 | 22.5 | -40.6 | 23.8 | 30.4 | 13.2 | 19.8 | 22.5 |
| 41 | Gemini 3.1 Flash-Lite | Google | 2026-03-03 | $94.8 | 4.32 | 22.0 | -41.1 | 25.0 | 34.7 | 6.2 | 19.8 | 22.0 |
| 42 | Nova 2.0 Pro Preview (medium) | Amazon | 2025-11-27 | $407 | 19.46 | 20.9 | -42.1 | 21.8 | 34.0 | 7.0 | 26.1 | 20.9 |
| 43 | gpt-oss-20B (high) | OpenAI | 2025-08-05 | $29.9 | 2.32 | 12.9 | -50.2 | 14.9 | 20.7 | 3.1 | 14.8 | 12.9 |
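
The ΔTop values appear to be each model's Qual minus the top Qual (63.1). That reading is inferred from the listed numbers, not from a documented definition. A minimal Python sketch under that assumption, using a few rows copied from the results above; the names `rows` and `delta_top` are hypothetical and not part of the page:

```python
# A few (model, Qual, published ΔTop) rows copied from the results above.
rows = [
    ("Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback)", 63.1, 0.0),
    ("Claude Opus 4.8 (Adaptive Reasoning, Max Effort)", 59.0, -4.0),
    ("GPT-5.5 (xhigh)", 58.2, -4.9),
    ("Claude Opus 4.7 (Adaptive Reasoning, Max Effort)", 57.2, -5.9),
    ("GPT-5.5 (high)", 56.1, -7.0),
]


def delta_top(rows):
    """Assumed definition: Qual minus the highest Qual, rounded to one decimal."""
    top = max(qual for _, qual, _ in rows)
    return {name: round(qual - top, 1) for name, qual, _ in rows}


deltas = delta_top(rows)

# Compare the recomputed gap with the published one for each row.
for name, qual, published in rows:
    print(f"{name}: recomputed {deltas[name]:+.1f}, published {published:+.1f}")
```

Recomputing from the rounded Qual values can differ from the published ΔTop by 0.1 (for example, -4.1 against the listed -4.0 for Claude Opus 4.8), presumably because the page subtracts unrounded scores before rounding.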