Run #636
success · fetched 2026-06-16 22:00:11 · 9.76 MB raw HTML · 538 models
Top quality: Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) (67.5 pts)
| # | Pareto | Model | Released | Cost | $/Q | Qual | ΔTop | Intel | Code | Agent | Pen | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ✓ | Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) Anthropic | 2026-06-09 | $6,228 | 92.29 | 67.5 | 0.0 | 59.9 | 62.0 | 80.6 | 37.9 | 67.5 |
| 2 | ✓ | Claude Opus 4.8 (Adaptive Reasoning, Max Effort) Anthropic | 2026-05-28 | $3,736 | 58.93 | 63.4 | -4.1 | 55.7 | 56.7 | 77.8 | 35.7 | 63.4 |
| 3 | ✓ | GPT-5.5 (xhigh) OpenAI | 2026-04-23 | $2,865 | 45.70 | 62.7 | -4.8 | 54.8 | 59.1 | 74.1 | 34.6 | 62.7 |
| 4 | ✓ | GPT-5.5 (high) OpenAI | 2026-04-23 | $1,775 | 29.01 | 61.2 | -6.3 | 53.1 | 58.5 | 72.0 | 32.5 | 61.2 |
| 5 | | Claude Opus 4.7 (Adaptive Reasoning, Max Effort) Anthropic | 2026-04-16 | $3,738 | 63.23 | 59.1 | -8.4 | 53.5 | 52.5 | 71.3 | 35.7 | 59.1 |
| 6 | | GPT-5.4 (xhigh) OpenAI | 2026-03-05 | $2,357 | 40.05 | 58.9 | -8.6 | 51.4 | 57.2 | 68.0 | 33.7 | 58.9 |
| 7 | ✓ | Gemini 3.5 Flash (high) | 2026-05-19 | $1,071 | 19.41 | 55.2 | -12.3 | 50.2 | 45.0 | 70.3 | 30.3 | 55.2 |
| 8 | | Qwen3.7 Max Alibaba | 2026-05-19 | $1,643 | 30.30 | 54.2 | -13.3 | 46.0 | 50.1 | 66.6 | 32.2 | 54.2 |
| 9 | | Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) Anthropic | 2026-02-17 | $3,356 | 62.47 | 53.7 | -13.8 | 47.2 | 50.9 | 63.0 | 35.3 | 53.7 |
| 10 | ✓ | Gemini 3.1 Pro Preview | 2026-02-19 | $829 | 15.44 | 53.7 | -13.8 | 46.5 | 55.5 | 59.1 | 29.2 | 53.7 |
| 11 | ✓ | DeepSeek V4 Pro (Reasoning, Max Effort) DeepSeek | 2026-04-24 | $180 | 3.39 | 53.0 | -14.5 | 44.3 | 47.5 | 67.2 | 22.5 | 53.0 |
| 12 | | MiniMax-M3 MiniMax | 2026-06-01 | $260 | 4.98 | 52.2 | -15.3 | 44.4 | 43.4 | 68.6 | 24.1 | 52.2 |
| 13 | | Kimi K2.6 Kimi | 2026-04-20 | $839 | 16.13 | 52.0 | -15.5 | 42.8 | 47.1 | 66.0 | 29.2 | 52.0 |
| 14 | ✓ | MiMo-V2.5-Pro Xiaomi | 2026-04-22 | $99.1 | 1.92 | 51.7 | -15.7 | 42.2 | 45.5 | 67.4 | 20.0 | 51.7 |
| 15 | | GLM-5.1 (Reasoning) Z AI | 2026-04-07 | $689 | 13.72 | 50.2 | -17.3 | 40.2 | 43.4 | 67.1 | 28.4 | 50.2 |
| 16 | | Qwen3.7 Plus Alibaba | 2026-06-01 | $149 | 2.96 | 50.2 | -17.3 | 39.0 | 46.5 | 65.1 | 21.7 | 50.2 |
| 17 | | GPT-5.4 mini (xhigh) OpenAI | 2026-03-17 | $1,158 | 23.10 | 50.1 | -17.4 | 40.0 | 51.5 | 58.9 | 30.6 | 50.1 |
| 18 | | Grok 4.3 (high) xAI | 2026-04-30 | $332 | 6.89 | 48.2 | -19.3 | 37.6 | 41.0 | 65.9 | 25.2 | 48.2 |
| 19 | | Qwen3.6 Plus Alibaba | 2026-04-02 | $534 | 11.12 | 48.0 | -19.4 | 39.6 | 42.9 | 61.7 | 27.3 | 48.0 |
| 20 | | MiniMax-M2.7 MiniMax | 2026-03-18 | $144 | 3.05 | 47.2 | -20.3 | 38.1 | 41.9 | 61.5 | 21.6 | 47.2 |
| 21 | ✓ | DeepSeek V4 Flash (Reasoning, Max Effort) DeepSeek | 2026-04-24 | $89.9 | 1.92 | 46.8 | -20.7 | 40.3 | 38.7 | 61.3 | 19.5 | 46.8 |
| 22 | | Qwen3.6 27B (Reasoning) Alibaba | 2026-04-22 | $657 | 14.44 | 45.5 | -22.0 | 37.1 | 36.5 | 62.9 | 28.2 | 45.5 |
| 23 | | Nemotron 3 Ultra 550B A55B (Reasoning) NVIDIA | 2026-06-04 | $443 | 10.04 | 44.1 | -23.4 | 37.8 | 37.6 | 57.1 | 26.5 | 44.1 |
| 24 | | Qwen3.5 397B A17B (Reasoning) Alibaba | 2026-02-16 | $528 | 12.11 | 43.6 | -23.9 | 33.7 | 41.3 | 55.8 | 27.2 | 43.6 |
| 25 | | GPT-5.4 nano (xhigh) OpenAI | 2026-03-17 | $287 | 6.63 | 43.3 | -24.2 | 38.2 | 43.9 | 47.6 | 24.6 | 43.3 |
| 26 | | Step 3.7 Flash StepFun | 2026-05-29 | $320 | 7.61 | 42.1 | -25.4 | 29.7 | 37.1 | 59.5 | 25.1 | 42.1 |
| 27 | | Qwen3.5 122B A10B (Reasoning) Alibaba | 2026-02-24 | $446 | 11.14 | 40.0 | -27.5 | 32.3 | 34.7 | 53.0 | 26.5 | 40.0 |
| 28 | | Mistral Medium 3.5 Mistral | 2026-04-29 | $1,478 | 37.42 | 39.5 | -28.0 | 29.9 | 35.4 | 53.2 | 31.7 | 39.5 |
| 29 | | Ring-2.6-1T InclusionAI | 2026-05-08 | $458 | 11.91 | 38.5 | -29.0 | 30.6 | 33.3 | 51.5 | 26.6 | 38.5 |
| 30 | | Claude 4.5 Haiku (Reasoning) Anthropic | 2025-10-15 | $539 | 15.80 | 34.1 | -33.4 | 29.6 | 32.6 | 40.2 | 27.3 | 34.1 |
| 31 | | Nova 2.0 Pro Preview (medium) Amazon | 2025-11-27 | $407 | 12.31 | 33.0 | -34.4 | 21.8 | 30.4 | 47.0 | 26.1 | 33.0 |
| 32 | | Grok 4.3 (Non-reasoning) xAI | 2026-04-30 | $344 | 10.46 | 32.9 | -34.6 | 24.8 | 25.1 | 48.8 | 25.4 | 32.9 |
| 33 | | NVIDIA Nemotron 3 Super 120B A12B (Reasoning) NVIDIA | 2026-03-11 | $295 | 9.13 | 32.3 | -35.2 | 25.4 | 31.2 | 40.2 | 24.7 | 32.3 |
| 34 | | gpt-oss-120b (high) OpenAI | 2025-08-05 | $96.3 | 3.20 | 30.1 | -37.4 | 23.8 | 28.6 | 37.9 | 19.8 | 30.1 |
| 35 | | Gemini 3.1 Flash-Lite | 2026-03-03 | $94.8 | 3.52 | 26.9 | -40.5 | 25.0 | 30.1 | 25.7 | 19.8 | 26.9 |
| 36 | ✓ | Gemma 4 26B A4B (Reasoning) | 2026-04-02 | $54.5 | 2.04 | 26.8 | -40.7 | 25.7 | 22.4 | 32.1 | 17.4 | 26.8 |
| 37 | ✓ | gpt-oss-20B (high) OpenAI | 2025-08-05 | $29.9 | 1.47 | 20.3 | -47.1 | 14.9 | 18.5 | 27.6 | 14.8 | 20.3 |
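
The report does not say how the Pareto flag is computed. The flagged rows are consistent with a frontier over total cost (lower is better) and Qual (higher is better), so the sketch below assumes that definition. The `Entry` type and `pareto_frontier` function are hypothetical names chosen for illustration, and the sample rows are a small subset copied from the table.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    cost: float     # "Cost" column, in dollars
    quality: float  # "Qual" column


def pareto_frontier(entries):
    """Keep entries that no cheaper-or-equal entry matches or beats on quality.

    Assumption: the frontier is over (cost, quality); the source does not
    confirm this definition.
    """
    frontier = []
    best_quality = float("-inf")
    # Walk from cheapest to most expensive; on equal cost, higher quality first.
    for e in sorted(entries, key=lambda e: (e.cost, -e.quality)):
        # An entry survives only if it beats every cheaper entry on quality.
        if e.quality > best_quality:
            frontier.append(e)
            best_quality = e.quality
    return frontier


rows = [
    Entry("GPT-5.5 (high)", 1775, 61.2),
    Entry("GPT-5.4 (xhigh)", 2357, 58.9),
    Entry("Gemini 3.5 Flash (high)", 1071, 55.2),
    Entry("Qwen3.7 Max", 1643, 54.2),
    Entry("DeepSeek V4 Pro (Reasoning, Max Effort)", 180, 53.0),
    Entry("MiniMax-M3", 260, 52.2),
]
frontier_models = [e.model for e in pareto_frontier(rows)]
```

On this subset the sketch keeps DeepSeek V4 Pro, Gemini 3.5 Flash (high), and GPT-5.5 (high), which are the three of these six rows that carry a ✓ in the table.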