Run #713
success · fetched 2026-06-23 08:00:13 · 9.89 MB raw HTML · 540 models
Top quality: Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) (63.1 pts)
| # | Pareto | Model | Released | Cost | $/Q | Qual | ΔTop | Intel | Code | Agent | Pen | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ✓ | Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) Anthropic | 2026-06-09 | $5,631 | 89.29 | 63.1 | 0.0 | 59.9 | 76.5 | 52.8 | 37.5 | 63.1 |
| 2 | ✓ | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) Anthropic | 2026-05-28 | $3,753 | 63.56 | 59.0 | -4.0 | 55.7 | 74.3 | 47.2 | 35.7 | 59.0 |
| 3 | ✓ | GPT-5.5 (xhigh) OpenAI | 2026-04-23 | $2,630 | 45.19 | 58.2 | -4.9 | 54.8 | 74.9 | 44.9 | 34.2 | 58.2 |
| 4 | | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) Anthropic | 2026-04-16 | $3,738 | 65.38 | 57.2 | -5.9 | 53.5 | 73.6 | 44.4 | 35.7 | 57.2 |
| 5 | ✓ | GPT-5.5 (high) OpenAI | 2026-04-23 | $1,655 | 29.52 | 56.1 | -7.0 | 53.1 | 71.6 | 43.5 | 32.2 | 56.1 |
| 6 | | GPT-5.4 (xhigh) OpenAI | 2026-03-05 | $2,132 | 39.11 | 54.5 | -8.5 | 51.4 | 71.1 | 41.1 | 33.3 | 54.5 |
| 7 | ✓ | GLM-5.2 (max) Z AI | 2026-06-16 | $983 | 18.10 | 54.3 | -8.8 | 51.1 | 68.8 | 43.1 | 29.9 | 54.3 |
| 8 | ✓ | GPT-5.5 (medium) OpenAI | 2026-04-23 | $870 | 16.34 | 53.2 | -9.8 | 50.4 | 71.5 | 37.8 | 29.4 | 53.2 |
| 9 | | Gemini 3.5 Flash (high) | 2026-05-19 | $1,041 | 19.79 | 52.6 | -10.5 | 50.2 | 70.1 | 37.4 | 30.2 | 52.6 |
| 10 | | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) Anthropic | 2026-02-17 | $3,356 | 66.67 | 50.3 | -12.7 | 47.2 | 63.0 | 40.8 | 35.3 | 50.3 |
| 11 | | Qwen3.7 Max Alibaba | 2026-05-19 | $1,159 | 24.39 | 47.5 | -15.5 | 46.0 | 66.0 | 30.6 | 30.6 | 47.5 |
| 12 | ✓ | DeepSeek V4 Pro (Reasoning, Max Effort) DeepSeek | 2026-04-24 | $176 | 3.78 | 46.7 | -16.4 | 44.3 | 59.4 | 36.4 | 22.5 | 46.7 |
| 13 | | MiniMax-M3 MiniMax | 2026-06-01 | $204 | 4.42 | 46.1 | -16.9 | 44.4 | 58.6 | 35.4 | 23.1 | 46.1 |
| 14 | | Gemini 3.1 Pro Preview | 2026-02-19 | $815 | 17.89 | 45.6 | -17.5 | 46.5 | 68.8 | 21.4 | 29.1 | 45.6 |
| 15 | | GPT-5.5 (low) OpenAI | 2026-04-23 | $358 | 7.97 | 44.9 | -18.1 | 43.5 | 60.9 | 30.4 | 25.5 | 44.9 |
| 16 | | Kimi K2.7 Code Kimi | 2026-06-12 | $525 | 11.89 | 44.1 | -19.0 | 41.9 | 60.8 | 29.6 | 27.2 | 44.1 |
| 17 | ✓ | MiMo-V2.5-Pro Xiaomi | 2026-04-22 | $98.5 | 2.25 | 43.8 | -19.2 | 42.2 | 60.2 | 29.1 | 19.9 | 43.8 |
| 18 | | Kimi K2.6 Kimi | 2026-04-20 | $782 | 18.16 | 43.0 | -20.0 | 42.8 | 56.0 | 30.3 | 28.9 | 43.0 |
| 19 | ✓ | DeepSeek V4 Flash (Reasoning, Max Effort) DeepSeek | 2026-04-24 | $72.4 | 1.70 | 42.5 | -20.6 | 40.3 | 56.2 | 31.1 | 18.6 | 42.5 |
| 20 | | GPT-5.4 mini (xhigh) OpenAI | 2026-03-17 | $1,088 | 25.85 | 42.1 | -21.0 | 40.0 | 56.1 | 30.2 | 30.4 | 42.1 |
| 21 | | GLM-5.1 (Reasoning) Z AI | 2026-04-07 | $674 | 16.07 | 41.9 | -21.1 | 40.2 | 55.8 | 29.9 | 28.3 | 41.9 |
| 22 | | GPT-5.4 nano (xhigh) OpenAI | 2026-03-17 | $267 | 6.57 | 40.6 | -22.4 | 38.2 | 56.1 | 27.5 | 24.3 | 40.6 |
| 23 | | Qwen3.6 Plus Alibaba | 2026-04-02 | $525 | 12.94 | 40.5 | -22.5 | 39.6 | 54.5 | 27.6 | 27.2 | 40.5 |
| 24 | | Qwen3.6 27B (Reasoning) Alibaba | 2026-04-22 | $668 | 17.01 | 39.3 | -23.8 | 37.1 | 53.7 | 27.0 | 28.2 | 39.3 |
| 25 | | GPT-5.5 (Non-reasoning) OpenAI | 2026-04-23 | $193 | 4.92 | 39.2 | -23.8 | 35.4 | 56.5 | 25.8 | 22.9 | 39.2 |
| 26 | | MiniMax-M2.7 MiniMax | 2026-03-18 | $137 | 3.54 | 38.8 | -24.3 | 38.1 | 52.6 | 25.6 | 21.4 | 38.8 |
| 27 | | Qwen3.7 Plus Alibaba | 2026-06-01 | $152 | 3.95 | 38.5 | -24.5 | 39.0 | 55.9 | 20.8 | 21.8 | 38.5 |
| 28 | | Nemotron 3 Ultra 550B A55B (Reasoning) NVIDIA | 2026-06-04 | $443 | 11.62 | 38.1 | -24.9 | 37.8 | 49.3 | 27.4 | 26.5 | 38.1 |
| 29 | | Grok 4.3 (high) xAI | 2026-04-30 | $300 | 8.66 | 34.6 | -28.4 | 37.6 | 42.2 | 24.1 | 24.8 | 34.6 |
| 30 | | Qwen3.5 397B A17B (Reasoning) Alibaba | 2026-02-16 | $528 | 15.57 | 33.9 | -29.1 | 33.7 | 48.2 | 19.8 | 27.2 | 33.9 |
| 31 | | Qwen3.5 122B A10B (Reasoning) Alibaba | 2026-02-24 | $447 | 13.59 | 32.9 | -30.2 | 32.3 | 45.7 | 20.7 | 26.5 | 32.9 |
| 32 | | Mistral Medium 3.5 Mistral | 2026-04-29 | $1,014 | 31.75 | 31.9 | -31.1 | 29.9 | 46.9 | 19.0 | 30.1 | 31.9 |
| 33 | | Qwen3.6 35B A3B (Reasoning) Alibaba | 2026-04-16 | $333 | 10.52 | 31.6 | -31.4 | 31.6 | 41.9 | 21.4 | 25.2 | 31.6 |
| 34 | | Ring-2.6-1T InclusionAI | 2026-05-08 | $459 | 14.93 | 30.8 | -32.3 | 30.6 | 42.8 | 18.9 | 26.6 | 30.8 |
| 35 | | Claude 4.5 Haiku (Reasoning) Anthropic | 2025-10-15 | $539 | 17.99 | 30.0 | -33.1 | 29.6 | 43.9 | 16.4 | 27.3 | 30.0 |
| 36 | | Step 3.7 Flash StepFun | 2026-05-29 | $320 | 10.84 | 29.5 | -33.5 | 29.7 | 37.3 | 21.5 | 25.1 | 29.5 |
| 37 | | Grok 4.3 (Non-reasoning) xAI | 2026-04-30 | $213 | 7.72 | 27.6 | -35.5 | 24.8 | 35.2 | 22.8 | 23.3 | 27.6 |
| 38 | ✓ | Gemma 4 26B A4B (Reasoning) | 2026-04-02 | $54.5 | 2.15 | 25.3 | -37.7 | 25.7 | 39.3 | 11.0 | 17.4 | 25.3 |
| 39 | | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) NVIDIA | 2026-03-11 | $295 | 12.30 | 23.9 | -39.1 | 25.4 | 37.7 | 8.7 | 24.7 | 23.9 |
| 40 | | gpt-oss-120b (high) OpenAI | 2025-08-05 | $96.3 | 4.28 | 22.5 | -40.6 | 23.8 | 30.4 | 13.2 | 19.8 | 22.5 |
| 41 | | Gemini 3.1 Flash-Lite | 2026-03-03 | $93.7 | 4.27 | 22.0 | -41.1 | 25.0 | 34.7 | 6.2 | 19.7 | 22.0 |
| 42 | | Nova 2.0 Pro Preview (medium) Amazon | 2025-11-27 | $407 | 19.46 | 20.9 | -42.1 | 21.8 | 34.0 | 7.0 | 26.1 | 20.9 |
| 43 | ✓ | gpt-oss-20B (high) OpenAI | 2025-08-05 | $29.9 | 2.32 | 12.9 | -50.2 | 14.9 | 20.7 | 3.1 | 14.8 | 12.9 |
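
The page does not define the Pareto column. The check marks are consistent with one reading: a model is marked when no other model in the run has both a lower Cost and a higher Qual. The sketch below illustrates that reading on rows 3 to 7 of the table. The `Row` type and the `pareto_frontier` function are hypothetical names introduced for illustration, and the two-axis (Cost, Qual) rule is an assumption, not a documented definition.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    cost_usd: float  # "Cost" column
    quality: float   # "Qual" column


def pareto_frontier(rows):
    """Return rows that no other row beats on both lower cost and higher quality.

    Assumed rule: the page itself does not state how the Pareto mark is computed.
    """
    frontier = []
    best_quality = float("-inf")
    # Walk from cheapest to most expensive; a row is kept only if it improves
    # on the best quality seen among all cheaper rows.
    for row in sorted(rows, key=lambda r: (r.cost_usd, -r.quality)):
        if row.quality > best_quality:
            frontier.append(row)
            best_quality = row.quality
    return frontier


# Rows 3 to 7 of the table (Cost and Qual copied as displayed).
rows = [
    Row("GPT-5.5 (xhigh)", 2630, 58.2),
    Row("Claude Opus 4.7 (Adaptive Reasoning, Max Effort)", 3738, 57.2),
    Row("GPT-5.5 (high)", 1655, 56.1),
    Row("GPT-5.4 (xhigh)", 2132, 54.5),
    Row("GLM-5.2 (max)", 983, 54.3),
]

frontier_models = [r.model for r in pareto_frontier(rows)]
print(frontier_models)
```

Under this assumption the sketch keeps GLM-5.2 (max), GPT-5.5 (high), and GPT-5.5 (xhigh), which are the three rows in that range carrying a ✓ in the table. Claude Opus 4.7 and GPT-5.4 (xhigh) are dropped because a cheaper row in the sample has a higher Qual.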