Connections Evaluation Box Score
Latest runs for 33 models (>=11 puzzles and >40 guesses each; sorted by solve rate, then accuracy, avg time per game, cost per game)
Model                              Ver    Date        GP   W  WIN PCT  ATT  HIT  MISS  ERR  ACC PCT     TIME   AVG/G      TOK   TOK/G   COST     $/G
openai/gpt-5.2-pro                 2.0.2  2025-12-11  20  20    1.000   80   80     0    0    1.000  148m21s   7m25s   164.4k    8.2k  $8.66  $0.433
google/gemini-3-pro-preview        2.0.2  2025-11-18  20  20    1.000   81   80     0    1     .987   49m16s   2m27s   350.2k   17.5k  $1.55  $0.078
openai/o3                          2.0.2  2025-12-12  20  20    1.000   81   80     1    0     .987    89m6s   4m27s   461.6k   23.1k  $1.58  $0.079
openai/gpt-5                       2.0.2  2025-10-03  11  11    1.000   45   44     1    0     .977   68m13s   6m12s   258.9k   23.5k  $1.13  $0.103
x-ai/grok-4                        2.0.2  2025-10-18  20  20    1.000   82   80     2    0     .975  139m16s   6m57s   543.7k   27.2k  $2.53  $0.127
openrouter/sherlock-think-alpha    2.0.2  2025-11-16  20  20    1.000   83   80     3    0     .963   50m39s   2m31s   396.8k   19.8k  $0.00  $0.000
x-ai/grok-4-fast                   2.0.2  2025-10-03  20  20    1.000   83   80     3    0     .963  114m16s   5m42s   322.7k   16.1k  $0.06  $0.003
google/gemini-3-flash-preview      2.0.2  2025-12-17  20  20    1.000   84   80     4    0     .952    8m59s     26s   274.6k   13.7k  $0.15  $0.007
openai/gpt-5.2                     2.0.2  2025-12-19  20  20    1.000  166   80     3    0     .481   112m6s   5m36s   624.3k   31.2k  $2.57  $0.128
openai/gpt-5-mini                  2.0.2  2025-12-19  20  20    1.000  168   80     4    0     .476  296m37s  14m49s  1262.8k   63.1k  $0.95  $0.048
anthropic/claude-opus-4.5          2.0.2  2025-12-19  20  20    1.000  176   80     8    0     .454  129m18s   6m27s  1450.6k   72.5k  $7.60  $0.380
z-ai/glm-4.7                       2.0.2  2026-01-30  20  19     .950   88   77    10    1     .875   212m4s  10m36s   840.7k   42.0k  $1.53  $0.077
moonshotai/kimi-k2.5               2.0.2  2026-01-29  20  19     .950   90   78     4    8     .866  300m16s   15m0s  2742.9k  137.1k  $3.09  $0.155
openrouter/sherlock-dash-alpha     2.0.2  2025-11-16  20  19     .950   91   77    14    0     .846   21m42s    1m5s   371.0k   18.6k  $0.00  $0.000
anthropic/claude-4.5-sonnet        2.0.2  2025-12-19  20  19     .950  172   78     8    0     .453   103m0s    5m9s  1143.7k   57.2k  $3.62  $0.181
openai/o4-mini                     2.0.2  2025-12-20  20  19     .950  194   78    19    0     .402  568m59s  28m26s  4681.0k  234.0k  $9.65  $0.483
google/gemini-2.5-pro              2.0.2  2025-10-18  20  18     .900   89   75    12    2     .842    41m4s    2m3s   457.1k   22.9k  $1.30  $0.065
deepseek/deepseek-v3.2             2.0.2  2025-12-02  20  17     .850   92   72    20    0     .782  174m33s   8m43s   537.7k   26.9k  $0.08  $0.004
moonshotai/kimi-k2-thinking        2.0.2  2025-11-12  20  17     .850  101   74    16   11     .732  337m26s  16m52s   716.8k   35.8k  $0.65  $0.032
deepseek/deepseek-r1-0528          2.0.2  2025-10-18  20  16     .800   99   69    30    0     .696  369m31s  18m28s   855.4k   42.8k  $0.98  $0.049
anthropic/claude-haiku-4.5         2.0.2  2025-12-19  20  16     .800  188   70    24    0     .372   99m43s   4m59s  1999.1k  100.0k  $2.23  $0.112
google/gemini-2.5-flash            2.0.2  2025-10-02  11   8     .727   49   32    17    0     .653    9m56s     54s   340.1k   30.9k  $0.17  $0.016
openai/gpt-oss-120b                2.0.2  2025-10-18  20  14     .700  111   65    25   21     .585   97m23s   4m52s   775.2k   38.8k  $0.12  $0.006
openai/o3-mini                     2.0.2  2025-12-20  20  14     .700  200   64    35    1     .320  422m11s   21m6s  3461.0k  173.1k  $6.89  $0.345
qwen/qwen3-max                     2.0.2  2025-10-18  20  13     .650  106   63    29   14     .594   86m25s   4m19s   549.7k   27.5k  $0.69  $0.034
moonshotai/kimi-k2-0905            2.0.2  2025-10-18  20  10     .500  100   53    47    0     .530   46m10s   2m18s   285.3k   14.3k  $0.15  $0.008
openai/gpt-oss-20b                 2.0.2  2025-10-18  20   8     .400  108   45    57    6     .416  162m48s    8m8s  1374.1k   68.7k  $0.12  $0.006
z-ai/glm-4.6                       2.0.2  2025-10-18  20   7     .350  115   45    48   22     .391   67m42s   3m23s   328.5k   16.4k  $0.17  $0.008
microsoft/phi-4                    2.0.2  2025-11-06  20   1     .050  102   17    79    6     .166   31m34s   1m34s   301.1k   15.1k  $0.01  $0.001
mistralai/mistral-large            2.0.2  2025-12-20  20   1     .050  198   23    76    0     .116  156m43s   7m50s  1690.9k   84.5k  $2.71  $0.136
meta-llama/llama-3.3-70b-instruct  2.0.2  2025-12-20  20   1     .050  194   19    77    1     .097  135m9s   6m45s   720.0k   36.0k  $0.11  $0.005
amazon/nova-pro-v1                 2.0.2  2025-12-20  20   1     .050  180   13    76    1     .072    31m4s   1m33s   748.9k   37.4k  $0.51  $0.025
baidu/ernie-4.5-21b-a3b-thinking   2.0.2  2025-10-18  20   0     .000   99   18    80    1     .181  134m58s   6m44s   891.8k   44.6k  $0.11  $0.005
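
The derived columns and the row order can be reproduced from the raw counts. The sketch below is one possible reading, not the tool's actual code: it assumes WIN PCT is W/GP, ACC PCT is HIT/ATT, and AVG/G is TIME/GP, each truncated rather than rounded (the rows shown are consistent with this, e.g. 80/81 appears as .987 and 8m59s over 20 games as 26s). It also assumes the sort puts higher solve rate and accuracy first and lower time and cost first. The names Row, parse_duration, format_duration, truncated_rate, and sort_key are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical record holding the raw counts from one table row."""
    model: str
    gp: int      # GP column, read here as games played
    wins: int    # W column
    att: int     # ATT column
    hit: int     # HIT column
    time: str    # TIME column, e.g. "148m21s"
    cost: float  # COST column, in dollars


def parse_duration(text: str) -> int:
    """Convert a duration such as '148m21s' or '26s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


def format_duration(seconds: int) -> str:
    """Format whole seconds the way the table does: '7m25s', '15m0s', '26s'."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}m{rest}s" if minutes else f"{rest}s"


def truncated_rate(numerator: int, denominator: int) -> str:
    """Three-decimal ratio, truncated (assumed) and without a leading zero."""
    if denominator == 0:
        return ".000"
    whole, frac = divmod(numerator * 1000 // denominator, 1000)
    return f"{whole if whole else ''}.{frac:03d}"


def sort_key(row: Row) -> tuple:
    """Assumed ordering: solve rate and accuracy descending, then time and cost per game ascending."""
    return (
        -row.wins / row.gp,
        -row.hit / row.att,
        parse_duration(row.time) / row.gp,
        row.cost / row.gp,
    )


# A few rows copied from the table, deliberately out of order.
rows = [
    Row("openai/gpt-oss-120b", 20, 14, 111, 65, "97m23s", 0.12),
    Row("x-ai/grok-4-fast", 20, 20, 83, 80, "114m16s", 0.06),
    Row("google/gemini-2.5-flash", 11, 8, 49, 32, "9m56s", 0.17),
    Row("openrouter/sherlock-think-alpha", 20, 20, 83, 80, "50m39s", 0.00),
    Row("openai/gpt-5.2-pro", 20, 20, 80, 80, "148m21s", 8.66),
]

ranked = sorted(rows, key=sort_key)
for row in ranked:
    avg = format_duration(parse_duration(row.time) // row.gp)
    print(row.model, truncated_rate(row.wins, row.gp),
          truncated_rate(row.hit, row.att), avg)
```

The two models tied at 1.000 and .963 are separated only by average time per game, which is why the third sort key matters in practice.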