Run #750

success · fetched 2026-06-26 23:00:59 · 9.84 MB raw HTML · 542 models

Top quality: Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) (63.1 pts)

# | Model | Provider | Released | Cost | $/Q | Qual | ΔTop | Intel | Code | Agent | Pen | Score
1 | Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) | Anthropic | 2026-06-09 | $5,631 | 89.29 | 63.1 | 0.0 | 59.9 | 76.5 | 52.8 | 37.5 | 63.1
2 | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-05-28 | $3,753 | 63.56 | 59.0 | -4.0 | 55.7 | 74.3 | 47.2 | 35.7 | 59.0
3 | GPT-5.5 (xhigh) | OpenAI | 2026-04-23 | $2,630 | 45.19 | 58.2 | -4.9 | 54.8 | 74.9 | 44.9 | 34.2 | 58.2
4 | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-04-16 | $3,738 | 65.38 | 57.2 | -5.9 | 53.5 | 73.6 | 44.4 | 35.7 | 57.2
5 | GPT-5.5 (high) | OpenAI | 2026-04-23 | $1,655 | 29.52 | 56.1 | -7.0 | 53.1 | 71.6 | 43.5 | 32.2 | 56.1
6 | GPT-5.4 (xhigh) | OpenAI | 2026-03-05 | $2,132 | 39.11 | 54.5 | -8.5 | 51.4 | 71.1 | 41.1 | 33.3 | 54.5
7 | GLM-5.2 (max) | Z AI | 2026-06-16 | $983 | 18.10 | 54.3 | -8.8 | 51.1 | 68.8 | 43.1 | 29.9 | 54.3
8 | GPT-5.5 (medium) | OpenAI | 2026-04-23 | $870 | 16.34 | 53.2 | -9.8 | 50.4 | 71.5 | 37.8 | 29.4 | 53.2
9 | Gemini 3.5 Flash (high) | Google | 2026-05-19 | $1,041 | 19.79 | 52.6 | -10.5 | 50.2 | 70.1 | 37.4 | 30.2 | 52.6
10 | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 2026-02-17 | $3,356 | 66.67 | 50.3 | -12.7 | 47.2 | 63.0 | 40.8 | 35.3 | 50.3
11 | Qwen3.7 Max | Alibaba | 2026-05-19 | $1,159 | 24.39 | 47.5 | -15.5 | 46.0 | 66.0 | 30.6 | 30.6 | 47.5
12 | DeepSeek V4 Pro (Reasoning, Max Effort) | DeepSeek | 2026-04-24 | $176 | 3.78 | 46.7 | -16.4 | 44.3 | 59.4 | 36.4 | 22.5 | 46.7
13 | MiniMax-M3 | MiniMax | 2026-06-01 | $204 | 4.42 | 46.1 | -16.9 | 44.4 | 58.6 | 35.4 | 23.1 | 46.1
14 | Gemini 3.1 Pro Preview | Google | 2026-02-19 | $815 | 17.89 | 45.6 | -17.5 | 46.5 | 68.8 | 21.4 | 29.1 | 45.6
15 | GPT-5.5 (low) | OpenAI | 2026-04-23 | $358 | 7.97 | 44.9 | -18.1 | 43.5 | 60.9 | 30.4 | 25.5 | 44.9
16 | Kimi K2.7 Code | Kimi | 2026-06-12 | $525 | 11.89 | 44.1 | -19.0 | 41.9 | 60.8 | 29.6 | 27.2 | 44.1
17 | MiMo-V2.5-Pro | Xiaomi | 2026-04-22 | $98.5 | 2.25 | 43.8 | -19.2 | 42.2 | 60.2 | 29.1 | 19.9 | 43.8
18 | Kimi K2.6 | Kimi | 2026-04-20 | $782 | 18.16 | 43.0 | -20.0 | 42.8 | 56.0 | 30.3 | 28.9 | 43.0
19 | DeepSeek V4 Flash (Reasoning, Max Effort) | DeepSeek | 2026-04-24 | $72.4 | 1.70 | 42.5 | -20.6 | 40.3 | 56.2 | 31.1 | 18.6 | 42.5
20 | GPT-5.4 mini (xhigh) | OpenAI | 2026-03-17 | $1,088 | 25.85 | 42.1 | -21.0 | 40.0 | 56.1 | 30.2 | 30.4 | 42.1
21 | GLM-5.1 (Reasoning) | Z AI | 2026-04-07 | $674 | 16.07 | 41.9 | -21.1 | 40.2 | 55.8 | 29.9 | 28.3 | 41.9
22 | GPT-5.4 nano (xhigh) | OpenAI | 2026-03-17 | $267 | 6.57 | 40.6 | -22.4 | 38.2 | 56.1 | 27.5 | 24.3 | 40.6
23 | Qwen3.6 Plus | Alibaba | 2026-04-02 | $526 | 12.96 | 40.5 | -22.5 | 39.6 | 54.5 | 27.6 | 27.2 | 40.5
24 | Grok Build 0.1 0616 | xAI | - | $375 | 9.43 | 39.8 | -23.3 | 39.8 | 51.5 | 28.0 | 25.7 | 39.8
25 | Qwen3.6 27B (Reasoning) | Alibaba | 2026-04-22 | $668 | 17.01 | 39.3 | -23.8 | 37.1 | 53.7 | 27.0 | 28.2 | 39.3
26 | GPT-5.5 (Non-reasoning) | OpenAI | 2026-04-23 | $193 | 4.92 | 39.2 | -23.8 | 35.4 | 56.5 | 25.8 | 22.9 | 39.2
27 | MiniMax-M2.7 | MiniMax | 2026-03-18 | $137 | 3.54 | 38.8 | -24.3 | 38.1 | 52.6 | 25.6 | 21.4 | 38.8
28 | Qwen3.7 Plus | Alibaba | 2026-06-01 | $152 | 3.95 | 38.5 | -24.5 | 39.0 | 55.9 | 20.8 | 21.8 | 38.5
29 | Nemotron 3 Ultra 550B A55B (Reasoning) | NVIDIA | 2026-06-04 | $443 | 11.62 | 38.1 | -24.9 | 37.8 | 49.3 | 27.4 | 26.5 | 38.1
30 | KAT-Coder-Pro V1 | KwaiKAT | 2025-11-11 | $39.1 | 1.08 | 36.1 | -26.9 | 34.6 | 58.9 | 14.9 | 15.9 | 36.1
31 | Grok 4.3 (high) | xAI | 2026-04-30 | $300 | 8.66 | 34.6 | -28.4 | 37.6 | 42.2 | 24.1 | 24.8 | 34.6
32 | Qwen3.5 397B A17B (Reasoning) | Alibaba | 2026-02-16 | $528 | 15.57 | 33.9 | -29.1 | 33.7 | 48.2 | 19.8 | 27.2 | 33.9
33 | Qwen3.5 122B A10B (Reasoning) | Alibaba | 2026-02-24 | $447 | 13.59 | 32.9 | -30.2 | 32.3 | 45.7 | 20.7 | 26.5 | 32.9
34 | Mistral Medium 3.5 | Mistral | 2026-04-29 | $1,014 | 31.75 | 31.9 | -31.1 | 29.9 | 46.9 | 19.0 | 30.1 | 31.9
35 | Qwen3.6 35B A3B (Reasoning) | Alibaba | 2026-04-16 | $333 | 10.54 | 31.6 | -31.4 | 31.6 | 41.9 | 21.4 | 25.2 | 31.6
36 | Ring-2.6-1T | InclusionAI | 2026-05-08 | $459 | 14.94 | 30.8 | -32.3 | 30.6 | 42.8 | 18.9 | 26.6 | 30.8
37 | Claude 4.5 Haiku (Reasoning) | Anthropic | 2025-10-15 | $539 | 17.99 | 30.0 | -33.1 | 29.6 | 43.9 | 16.4 | 27.3 | 30.0
38 | Step 3.7 Flash | StepFun | 2026-05-29 | $320 | 10.84 | 29.5 | -33.5 | 29.7 | 37.3 | 21.5 | 25.1 | 29.5
39 | Grok 4.3 (Non-reasoning) | xAI | 2026-04-30 | $213 | 7.72 | 27.6 | -35.5 | 24.8 | 35.2 | 22.8 | 23.3 | 27.6
40 | Gemma 4 26B A4B (Reasoning) | Google | 2026-04-02 | $54.5 | 2.15 | 25.3 | -37.7 | 25.7 | 39.3 | 11.0 | 17.4 | 25.3
41 | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | 2026-03-11 | $295 | 12.30 | 23.9 | -39.1 | 25.4 | 37.7 | 8.7 | 24.7 | 23.9
42 | gpt-oss-120b (high) | OpenAI | 2025-08-05 | $96.3 | 4.28 | 22.5 | -40.6 | 23.8 | 30.4 | 13.2 | 19.8 | 22.5
43 | Gemini 2.5 Pro | Google | 2025-06-05 | $610 | 27.64 | 22.1 | -41.0 | 25.8 | 33.3 | 7.1 | 27.9 | 22.1
44 | Gemini 3.1 Flash-Lite | Google | 2026-03-03 | $93.7 | 4.27 | 22.0 | -41.1 | 25.0 | 34.7 | 6.2 | 19.7 | 22.0
45 | Nova 2.0 Pro Preview (medium) | Amazon | 2025-11-27 | $407 | 19.46 | 20.9 | -42.1 | 21.8 | 34.0 | 7.0 | 26.1 | 20.9
46 | Nova 2.0 Pro Preview (low) | Amazon | 2025-11-27 | $244 | 14.09 | 17.3 | -45.8 | 19.6 | 25.9 | 6.4 | 23.9 | 17.3
47 | Nova 2.0 Lite (high) | Amazon | 2025-10-29 | $635 | 42.95 | 14.8 | -48.3 | 18.2 | 23.0 | 3.1 | 28.0 | 14.8
48 | Mistral Large 3 | Mistral | 2025-12-02 | $71.0 | 5.13 | 13.8 | -49.2 | 15.9 | 20.1 | 5.5 | 18.5 | 13.8
49 | Magistral Medium 1.2 | Mistral | 2025-09-18 | $886 | 64.78 | 13.7 | -49.4 | 17.9 | 21.3 | 1.9 | 29.5 | 13.7
50 | gpt-oss-20B (high) | OpenAI | 2025-08-05 | $29.9 | 2.32 | 12.9 | -50.2 | 14.9 | 20.7 | 3.1 | 14.8 | 12.9
51 | Nova 2.0 Pro Preview (Non-reasoning) | Amazon | 2025-11-27 | $204 | 16.09 | 12.7 | -50.4 | 14.4 | 20.9 | 2.9 | 23.1 | 12.7
52 | Llama 4 Maverick | Meta | 2025-04-05 | $38.3 | 3.61 | 10.6 | -52.4 | 14.3 | 16.3 | 1.3 | 15.8 | 10.6
53 | NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | 2025-12-15 | $42.8 | 4.20 | 10.2 | -52.9 | 14.2 | 14.4 | 2.0 | 16.3 | 10.2
54 | Ministral 3 14B | Mistral | 2025-12-02 | $116 | 12.60 | 9.2 | -53.9 | 11.1 | 14.4 | 2.2 | 20.6 | 9.2
55 | Ministral 3 8B | Mistral | 2025-12-02 | $113 | 17.12 | 6.6 | -56.4 | 9.0 | 9.7 | 1.2 | 20.5 | 6.6
56 | Llama 4 Scout | Meta | 2025-04-05 | $11.4 | 1.77 | 6.4 | -56.6 | 10.0 | 8.2 | 1.1 | 10.6 | 6.4
57 | Ministral 3 3B | Mistral | 2025-12-02 | $91.5 | 21.01 | 4.4 | -58.7 | 6.8 | 4.8 | 1.5 | 19.6 | 4.4
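
The $/Q and ΔTop columns look like they are derived from Cost and Qual: dividing Cost by Qual reproduces $/Q to within rounding (for GPT-5.5 (xhigh), 2630 / 58.2 ≈ 45.19), and ΔTop appears to be each model's Qual minus the top Qual. The page does not state these definitions, so treat both as assumptions. The Python sketch below recomputes them for a handful of rows copied from the table and also derives a cost-versus-quality Pareto frontier over those rows. The names Row, dollars_per_point, delta_top, and pareto_frontier are illustrative and not part of the site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    model: str
    cost_usd: float  # "Cost" column, in US dollars
    qual: float      # "Qual" column, in points


# A few rows copied from the table above (rounded values as displayed).
ROWS = [
    Row("Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback)", 5631, 63.1),
    Row("Claude Opus 4.8 (Adaptive Reasoning, Max Effort)", 3753, 59.0),
    Row("GPT-5.5 (xhigh)", 2630, 58.2),
    Row("Claude Opus 4.7 (Adaptive Reasoning, Max Effort)", 3738, 57.2),
    Row("GLM-5.2 (max)", 983, 54.3),
    Row("DeepSeek V4 Flash (Reasoning, Max Effort)", 72.4, 42.5),
    Row("Llama 4 Scout", 11.4, 6.4),
]


def dollars_per_point(row: Row) -> float:
    """Assumed definition of $/Q: total cost divided by quality points."""
    return row.cost_usd / row.qual


def delta_top(row: Row, rows: list[Row]) -> float:
    """Assumed definition of ΔTop: quality minus the best quality in the set."""
    return row.qual - max(r.qual for r in rows)


def pareto_frontier(rows: list[Row]) -> list[Row]:
    """Rows that no other row beats on both cost (lower) and quality (higher).

    Walk the rows from cheapest to most expensive and keep each one that
    improves on the best quality seen so far.
    """
    frontier: list[Row] = []
    best_qual = float("-inf")
    for row in sorted(rows, key=lambda r: (r.cost_usd, -r.qual)):
        if row.qual > best_qual:
            frontier.append(row)
            best_qual = row.qual
    return frontier


if __name__ == "__main__":
    on_frontier = {r.model for r in pareto_frontier(ROWS)}
    for r in ROWS:
        flag = "*" if r.model in on_frontier else " "
        print(f"{flag} {r.model}: $/Q={dollars_per_point(r):.2f}, "
              f"dTop={delta_top(r, ROWS):+.1f}")
```

Because the inputs are the rounded values shown in the table, results can differ from the published columns in the last digit (ΔTop for Claude Opus 4.8 comes out as -4.1 here, against -4.0 on the page). A frontier computed on a subset of rows also says nothing about the full 542-model run.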