Archive
Verified Runs
Every accepted leaderboard submission has a permanent static page. Runs are listed by accuracy, highest first; Created timestamps are in UTC.
| Rank | Model | Accuracy | Prompt | Dataset | Created | Run |
|---|---|---|---|---|---|---|
| 1 | gpt-5.5 | 95.60% | p001 | lexen-v1 | 2026-06-17T15:45:49.081775+00:00 | gpt-5.5-xhigh-reasoning-p001-lexen-v1-20260617 |
| 2 | gpt-5.5 | 95.25% | p001 | lexen-v1 | 2026-06-14T02:57:35.564261+00:00 | gpt-5.5-medium-reasoning-p001-lexen-v1-20260614 |
| 3 | gpt-5.5 | 95.19% | p001 | lexen-v1 | 2026-06-14T10:40:02.498375+00:00 | gpt-5.5-high-reasoning-p001-lexen-v1-20260614 |
| 4 | gpt-5.5 | 95.15% | p004 | lexen-v1 | 2026-06-16T02:04:07.572024+00:00 | gpt-5.5-medium-reasoning-p004-lexen-v1-20260616 |
| 5 | gpt-5.5 | 95.15% | p004 | lexen-v1 | 2026-06-17T16:05:54.293112+00:00 | gpt-5.5-xhigh-reasoning-p004-lexen-v1-20260617 |
| 6 | gpt-5.5 | 95.10% | p003 | lexen-v1 | 2026-06-17T08:02:30.444392+00:00 | gpt-5.5-high-reasoning-p003-lexen-v1-20260617 |
| 7 | gpt-5.5 | 95.00% | p003 | lexen-v1 | 2026-06-16T02:01:57.439526+00:00 | gpt-5.5-low-reasoning-p003-lexen-v1-20260616 |
| 8 | gpt-5.5 | 95.00% | p001 | lexen-v1 | 2026-06-14T10:38:21.202553+00:00 | gpt-5.5-low-reasoning-p001-lexen-v1-20260614 |
| 9 | gpt-5.5 | 94.98% | p003 | lexen-v1 | 2026-06-16T02:02:51.976097+00:00 | gpt-5.5-medium-reasoning-p003-lexen-v1-20260616 |
| 10 | gemini/gemini-3.1-pro-preview | 94.92% | p001 | lexen-v1 | 2026-06-14T13:03:44.802733+00:00 | gemini-3.1-pro-high-reasoning-p001-lexen-v1-20260614 |
| 11 | gemini/gemini-3.1-pro-preview | 94.61% | p001 | lexen-v1 | 2026-06-14T12:09:22.494330+00:00 | gemini-3.1-pro-low-reasoning-p001-lexen-v1-20260614 |
| 12 | claude-opus-4-8 | 94.57% | p001 | lexen-v1 | 2026-06-14T12:38:24.818958+00:00 | claude-opus-4.8-xhigh-reasoning-p001-lexen-v1-20260614 |
| 13 | gemini/gemini-3.1-pro-preview | 94.53% | p004 | lexen-v1 | 2026-06-16T02:57:54.138994+00:00 | gemini-3.1-pro-high-reasoning-p004-lexen-v1-20260616 |
| 14 | gemini/gemini-3.1-pro-preview | 94.43% | p001 | lexen-v1 | 2026-06-14T12:31:55.744716+00:00 | gemini-3.1-pro-medium-reasoning-p001-lexen-v1-20260614 |
| 15 | gemini/gemini-3.1-pro-preview | 94.40% | p003 | lexen-v1 | 2026-06-16T02:21:16.914989+00:00 | gemini-3.1-pro-low-reasoning-p003-lexen-v1-20260616 |
| 16 | gpt-5.5 | 94.22% | p002 | lexen-v1 | 2026-06-14T10:42:14.146514+00:00 | gpt-5.5-high-reasoning-p002-lexen-v1-20260614 |
| 17 | gemini/gemini-3.5-flash | 94.20% | p004 | lexen-v1 | 2026-06-16T03:05:35.303206+00:00 | gemini-3.5-flash-p004-lexen-v1-20260616 |
| 18 | claude-opus-4-8 | 94.16% | p001 | lexen-v1 | 2026-06-14T12:22:29.216965+00:00 | claude-opus-4.8-high-reasoning-p001-lexen-v1-20260614 |
| 19 | gemini/gemini-3.5-flash | 94.16% | p001 | lexen-v1 | 2026-06-14T11:40:41.558979+00:00 | gemini-3.5-flash-p001-lexen-v1-20260614 |
| 20 | gpt-5.5 | 94.03% | p002 | lexen-v1 | 2026-06-14T02:58:29.150879+00:00 | gpt-5.5-medium-reasoning-p002-lexen-v1-20260614 |
| 21 | claude-opus-4-8 | 93.85% | p001 | lexen-v1 | 2026-06-17T11:56:28.079841+00:00 | claude-opus-4.8-p001-lexen-v1-20260617 |
| 22 | claude-opus-4-8 | 93.81% | p004 | lexen-v1 | 2026-06-16T02:20:31.037218+00:00 | claude-opus-4.8-xhigh-reasoning-p004-lexen-v1-20260616 |
| 23 | google/gemma-4-31B-it | 93.73% | p003 | lexen-v1 | 2026-06-16T03:47:55.278110+00:00 | vllm-gemma-4-31b-fp8-h100-p003-lexen-v1-20260616-thinking |
| 24 | claude-opus-4-8 | 93.68% | p001 | lexen-v1 | 2026-06-14T12:02:22.453692+00:00 | claude-opus-4.8-low-reasoning-p001-lexen-v1-20260614 |
| 25 | claude-opus-4-8 | 93.62% | p001 | lexen-v1 | 2026-06-14T12:12:07.036543+00:00 | claude-opus-4.8-medium-reasoning-p001-lexen-v1-20260614 |
| 26 | google/gemma-4-31B-it | 93.44% | p004 | lexen-v1 | 2026-06-16T04:34:22.281346+00:00 | vllm-gemma-4-31b-fp8-h100-p004-lexen-v1-20260616-thinking |
| 27 | google/gemma-4-31B-it | 93.38% | p001 | lexen-v1 | 2026-06-15T16:44:33.637582+00:00 | vllm-gemma-4-31b-fp8-h100-p001-lexen-v1-20260615-thinking |
| 28 | openrouter/z-ai/glm-5 | 93.38% | p001 | lexen-v1 | 2026-06-14T19:07:58.881361+00:00 | glm-5-low-reasoning-p001-lexen-v1-20260614 |
| 29 | openrouter/moonshotai/kimi-k2.7-code | 93.31% | p001 | lexen-v1 | 2026-06-17T22:30:28.396407+00:00 | kimi-k2.7-code-xhigh-reasoning-p001-lexen-v1-20260617 |
| 30 | openrouter/x-ai/grok-4.3 | 93.13% | p001 | lexen-v1 | 2026-06-14T21:16:55.424172+00:00 | grok-4.3-medium-reasoning-p001-lexen-v1-20260614 |
| 31 | openrouter/x-ai/grok-4.3 | 93.09% | p001 | lexen-v1 | 2026-06-14T21:09:10.548112+00:00 | grok-4.3-low-reasoning-p001-lexen-v1-20260614 |
| 32 | claude-opus-4-7 | 93.01% | p001 | lexen-v1 | 2026-06-14T11:43:13.094657+00:00 | claude-opus-4.7-medium-reasoning-p001-lexen-v1-20260614 |
| 33 | gpt-5.5 | 92.88% | p003 | lexen-v1 | 2026-06-17T11:29:06.305785+00:00 | gpt-5.5-none-reasoning-p003-lexen-v1-20260617 |
| 34 | openrouter/x-ai/grok-4.3 | 92.88% | p001 | lexen-v1 | 2026-06-14T21:25:12.643492+00:00 | grok-4.3-high-reasoning-p001-lexen-v1-20260614 |
| 35 | openrouter/qwen/qwen3.7-max | 92.82% | p002 | lexen-v1 | 2026-06-14T21:04:35.848779+00:00 | qwen3.7-max-medium-reasoning-p002-lexen-v1-20260614 |
| 36 | openrouter/qwen/qwen3.7-plus | 92.70% | p001 | lexen-v1 | 2026-06-14T17:27:03.417524+00:00 | qwen3.7-plus-p001-lexen-v1-20260614 |
| 37 | claude-opus-4-6 | 92.70% | p001 | lexen-v1 | 2026-06-14T11:52:50.971670+00:00 | claude-opus-4.6-medium-reasoning-p001-lexen-v1-20260614 |
| 38 | openrouter/moonshotai/kimi-k2.5 | 92.66% | p001 | lexen-v1 | 2026-06-14T16:01:24.343537+00:00 | kimi-k2.5-p001-lexen-v1-20260614 |
| 39 | gemini/gemini-3-flash-preview | 92.57% | p002 | lexen-v1 | 2026-06-14T11:28:13.627340+00:00 | gemini-3-flash-p002-lexen-v1-20260614 |
| 40 | openrouter/deepseek/deepseek-v4-pro | 92.41% | p001 | lexen-v1 | 2026-06-14T18:07:45.068845+00:00 | deepseek-v4-pro-high-reasoning-p001-lexen-v1-20260614 |
| 41 | google/gemma-4-31B-it | 92.39% | p001 | lexen-v1 | 2026-06-14T13:05:53.315665+00:00 | vllm-gemma-4-31b-fp8-h100-p001-lexen-v1-20260614 |
| 42 | openrouter/moonshotai/kimi-k2.5 | 92.39% | p002 | lexen-v1 | 2026-06-14T16:49:12.308221+00:00 | kimi-k2.5-p002-lexen-v1-20260614 |
| 43 | google/gemma-4-31B-it | 92.29% | p003 | lexen-v1 | 2026-06-16T03:16:10.855287+00:00 | vllm-gemma-4-31b-fp8-h100-p003-lexen-v1-20260616 |
| 44 | openrouter/z-ai/glm-5 | 92.22% | p002 | lexen-v1 | 2026-06-14T19:49:03.779025+00:00 | glm-5-low-reasoning-p002-lexen-v1-20260614 |
| 45 | claude-opus-4-7 | 92.22% | p003 | lexen-v1 | 2026-06-17T08:24:14.201902+00:00 | claude-opus-4.7-medium-reasoning-p003-lexen-v1-20260617 |
| 46 | claude-opus-4-8 | 91.79% | p002 | lexen-v1 | 2026-06-14T12:49:29.194985+00:00 | claude-opus-4.8-high-reasoning-p002-lexen-v1-20260614 |
| 47 | openrouter/deepseek/deepseek-v4-pro | 91.50% | p002 | lexen-v1 | 2026-06-14T18:31:09.158241+00:00 | deepseek-v4-pro-high-reasoning-p002-lexen-v1-20260614 |
| 48 | openrouter/qwen/qwen3.7-plus | 91.46% | p002 | lexen-v1 | 2026-06-14T17:43:42.620560+00:00 | qwen3.7-plus-p002-lexen-v1-20260614 |
| 49 | gpt-5-mini | 91.22% | p003 | lexen-v1 | 2026-06-17T07:47:29.768045+00:00 | gpt-5-mini-high-reasoning-p003-lexen-v1-20260617 |
| 50 | google/gemma-4-31B-it | 91.20% | p002 | lexen-v1 | 2026-06-14T13:07:36.382312+00:00 | vllm-gemma-4-31b-fp8-h100-p002-lexen-v1-20260614 |
| 51 | gpt-5-mini | 90.99% | p003 | lexen-v1 | 2026-06-17T07:45:03.292014+00:00 | gpt-5-mini-medium-reasoning-p003-lexen-v1-20260617 |
| 52 | openrouter/deepseek/deepseek-v4-flash | 90.93% | p001 | lexen-v1 | 2026-06-14T10:48:09.221591+00:00 | deepseek-v4-flash-high-reasoning-p001-lexen-v1-20260614 |
| 53 | gpt-5.4-mini | 90.89% | p003 | lexen-v1 | 2026-06-17T07:38:06.024744+00:00 | gpt-5.4-mini-low-reasoning-p003-lexen-v1-20260617 |
| 54 | gpt-5-mini | 90.72% | p001 | lexen-v1 | 2026-06-14T03:01:27.567145+00:00 | gpt-5-mini-high-reasoning-p001-lexen-v1-20260614 |
| 55 | claude-sonnet-4-6 | 90.72% | p001 | lexen-v1 | 2026-06-14T11:25:24.877127+00:00 | claude-sonnet-4.6-low-reasoning-p001-lexen-v1-20260614 |
| 56 | openrouter/minimax/minimax-m3 | 90.62% | p001 | lexen-v1 | 2026-06-14T17:01:11.417107+00:00 | minimax-m3-p001-lexen-v1-20260614 |
| 57 | gpt-5-mini | 90.54% | p001 | lexen-v1 | 2026-06-14T02:59:58.106259+00:00 | gpt-5-mini-medium-reasoning-p001-lexen-v1-20260614 |
| 58 | gpt-5-mini | 90.48% | p003 | lexen-v1 | 2026-06-17T07:42:53.328684+00:00 | gpt-5-mini-low-reasoning-p003-lexen-v1-20260617 |
| 59 | claude-haiku-4-5 | 90.48% | p001 | lexen-v1 | 2026-06-14T10:58:15.856929+00:00 | claude-haiku-4.5-low-reasoning-p001-lexen-v1-20260614 |
| 60 | gemini/gemini-3.1-flash-lite | 90.41% | p001 | lexen-v1 | 2026-06-14T10:40:32.982176+00:00 | gemini-3.1-flash-lite-p001-lexen-v1-20260614 |
| 61 | gpt-5.4-mini | 90.10% | p001 | lexen-v1 | 2026-06-14T10:36:37.746436+00:00 | gpt-5.4-mini-low-reasoning-p001-lexen-v1-20260614 |
| 62 | gpt-5-mini | 90.04% | p002 | lexen-v1 | 2026-06-14T03:02:45.517164+00:00 | gpt-5-mini-medium-reasoning-p002-lexen-v1-20260614 |
| 63 | gpt-5-mini | 89.94% | p001 | lexen-v1 | 2026-06-14T02:59:23.338570+00:00 | gpt-5-mini-low-reasoning-p001-lexen-v1-20260614 |
| 64 | openrouter/deepseek/deepseek-v4-flash | 89.69% | p002 | lexen-v1 | 2026-06-14T11:07:29.161962+00:00 | deepseek-v4-flash-high-reasoning-p002-lexen-v1-20260614 |
| 65 | gemini/gemini-2.5-flash | 89.61% | p002 | lexen-v1 | 2026-06-14T11:07:54.336960+00:00 | gemini-2.5-flash-p002-lexen-v1-20260614 |
| 66 | Qwen/Qwen3.6-27B-FP8 | 89.51% | p001 | lexen-v1 | 2026-06-14T03:37:04.989093+00:00 | vllm-qwen3.6-27b-fp8-h200-p001-lexen-v1-20260614 |
| 67 | gpt-5.4-mini | 89.49% | p002 | lexen-v1 | 2026-06-14T10:37:18.500343+00:00 | gpt-5.4-mini-low-reasoning-p002-lexen-v1-20260614 |
| 68 | google/gemma-4-26B-A4B-it | 89.43% | p001 | lexen-v1 | 2026-06-14T13:02:48.331932+00:00 | vllm-gemma-4-26b-a4b-fp8-a100-p001-lexen-v1-20260614 |
| 69 | gemini/gemini-2.5-flash | 89.41% | p001 | lexen-v1 | 2026-06-14T11:03:39.454582+00:00 | gemini-2.5-flash-p001-lexen-v1-20260614 |
| 70 | Qwen/Qwen3.6-27B-FP8 | 89.36% | p001 | lexen-v1 | 2026-06-14T09:20:49.931691+00:00 | vllm-qwen3.6-27b-fp8-h100-p001-lexen-v1-20260614 |
| 71 | google/gemma-4-26B-A4B-it | 89.32% | p001 | lexen-v1 | 2026-06-14T03:43:54.686558+00:00 | vllm-gemma-4-26b-a4b-fp8-h200-p001-lexen-v1-20260614 |
| 72 | Qwen/Qwen3.5-397B-A17B | 89.24% | p001 | lexen-v1 | 2026-06-14T17:58:20.590182+00:00 | vllm-qwen3.5-397b-a17b-gptq-b300-p001-lexen-v1-20260614 |
| 73 | google/gemma-4-26B-A4B-it | 89.20% | p001 | lexen-v1 | 2026-06-14T12:56:57.382019+00:00 | vllm-gemma-4-26b-a4b-fp8-h100-p001-lexen-v1-20260614 |
| 74 | gpt-4.1 | 89.14% | p003 | lexen-v1 | 2026-06-17T07:33:57.499800+00:00 | gpt-4.1-p003-lexen-v1-20260617 |
| 75 | google/gemma-4-26B-A4B-it | 89.12% | p003 | lexen-v1 | 2026-06-16T02:50:20.177857+00:00 | vllm-gemma-4-26b-a4b-fp8-h100-p003-lexen-v1-20260616 |
| 76 | gemini/gemini-3.1-flash-lite | 89.08% | p002 | lexen-v1 | 2026-06-14T10:51:07.533822+00:00 | gemini-3.1-flash-lite-p002-lexen-v1-20260614 |
| 77 | Qwen/Qwen3.5-27B | 89.06% | p001 | lexen-v1 | 2026-06-14T21:46:09.965074+00:00 | vllm-qwen3.5-27b-bf16-a100-p001-lexen-v1-20260614 |
| 78 | Qwen/Qwen3.5-27B | 88.97% | p001 | lexen-v1 | 2026-06-14T21:14:44.906254+00:00 | vllm-qwen3.5-27b-bf16-h200-p001-lexen-v1-20260614 |
| 79 | Qwen/Qwen3.5-27B | 88.89% | p001 | lexen-v1 | 2026-06-14T18:22:27.863955+00:00 | vllm-qwen3.5-27b-bf16-h100-p001-lexen-v1-20260614 |
| 80 | claude-sonnet-4-6 | 88.89% | p002 | lexen-v1 | 2026-06-14T11:34:12.718972+00:00 | claude-sonnet-4.6-low-reasoning-p002-lexen-v1-20260614 |
| 81 | zai-org/GLM-4.6 | 88.71% | p001 | lexen-v1 | 2026-06-14T14:17:06.058829+00:00 | vllm-glm-4.6-awq-b300-p001-lexen-v1-20260614 |
| 82 | gpt-4.1 | 88.69% | p001 | lexen-v1 | 2026-06-14T10:22:33.163281+00:00 | gpt-4.1-p001-lexen-v1-20260614 |
| 83 | gpt-4.1 | 87.86% | p002 | lexen-v1 | 2026-06-14T10:31:43.235840+00:00 | gpt-4.1-p002-lexen-v1-20260614 |
| 84 | claude-haiku-4-5 | 87.60% | p002 | lexen-v1 | 2026-06-14T11:17:37.678627+00:00 | claude-haiku-4.5-low-reasoning-p002-lexen-v1-20260614 |
| 85 | meta-llama/Llama-4-Maverick-17B-128E-Instruct | 87.55% | p001 | lexen-v1 | 2026-06-14T15:13:31.931784+00:00 | vllm-llama-4-maverick-int4-b300-p001-lexen-v1-20260614 |
| 86 | google/gemma-4-26B-A4B-it | 87.25% | p002 | lexen-v1 | 2026-06-14T12:57:33.409617+00:00 | vllm-gemma-4-26b-a4b-fp8-h100-p002-lexen-v1-20260614 |
| 87 | google/gemma-4-26B-A4B-it | 87.25% | p002 | lexen-v1 | 2026-06-14T03:44:42.664730+00:00 | vllm-gemma-4-26b-a4b-fp8-h200-p002-lexen-v1-20260614 |
| 88 | Qwen/Qwen3.6-27B-FP8 | 87.20% | p002 | lexen-v1 | 2026-06-14T09:22:18.123666+00:00 | vllm-qwen3.6-27b-fp8-h100-p002-lexen-v1-20260614 |
| 89 | Qwen/Qwen3.6-27B-FP8 | 87.18% | p002 | lexen-v1 | 2026-06-14T03:38:33.776628+00:00 | vllm-qwen3.6-27b-fp8-h200-p002-lexen-v1-20260614 |
| 90 | zai-org/GLM-4.6 | 87.12% | p002 | lexen-v1 | 2026-06-14T14:21:11.936952+00:00 | vllm-glm-4.6-awq-b300-p002-lexen-v1-20260614 |
| 91 | google/gemma-4-26B-A4B-it | 87.10% | p002 | lexen-v1 | 2026-06-14T13:04:18.320439+00:00 | vllm-gemma-4-26b-a4b-fp8-a100-p002-lexen-v1-20260614 |
| 92 | Qwen/Qwen3.5-122B-A10B | 87.06% | p001 | lexen-v1 | 2026-06-14T20:59:20.246395+00:00 | vllm-qwen3.5-122b-a10b-fp8-h200-p001-lexen-v1-20260614 |
| 93 | Qwen/Qwen3.5-122B-A10B | 87.02% | p001 | lexen-v1 | 2026-06-14T17:26:39.154524+00:00 | vllm-qwen3.5-122b-a10b-fp8-b300-p001-lexen-v1-20260614 |
| 94 | Qwen/Qwen3.5-397B-A17B | 86.69% | p002 | lexen-v1 | 2026-06-14T18:01:16.443151+00:00 | vllm-qwen3.5-397b-a17b-gptq-b300-p002-lexen-v1-20260614 |
| 95 | gpt-5.4-nano | 86.63% | p003 | lexen-v1 | 2026-06-17T07:39:04.268726+00:00 | gpt-5.4-nano-low-reasoning-p003-lexen-v1-20260617 |
| 96 | Qwen/Qwen3-235B-A22B-Instruct-2507 | 86.59% | p001 | lexen-v1 | 2026-06-14T20:01:55.579872+00:00 | vllm-qwen3-235b-2507-awq-h200-p001-lexen-v1-20260614 |
| 97 | Qwen/Qwen3.5-27B | 86.55% | p002 | lexen-v1 | 2026-06-14T18:24:27.165748+00:00 | vllm-qwen3.5-27b-bf16-h100-p002-lexen-v1-20260614 |
| 98 | Qwen/Qwen3.5-27B | 86.55% | p002 | lexen-v1 | 2026-06-14T21:16:43.650053+00:00 | vllm-qwen3.5-27b-bf16-h200-p002-lexen-v1-20260614 |
| 99 | gpt-5-nano | 86.48% | p003 | lexen-v1 | 2026-06-17T07:52:26.335235+00:00 | gpt-5-nano-medium-reasoning-p003-lexen-v1-20260617 |
| 100 | Qwen/Qwen3.5-27B | 86.46% | p002 | lexen-v1 | 2026-06-14T21:51:15.471462+00:00 | vllm-qwen3.5-27b-bf16-a100-p002-lexen-v1-20260614 |
| 101 | gpt-5.4-nano | 86.26% | p001 | lexen-v1 | 2026-06-14T10:26:49.278471+00:00 | gpt-5.4-nano-low-reasoning-p001-lexen-v1-20260614 |
| 102 | google/gemma-4-E4B-it | 85.58% | p003 | lexen-v1 | 2026-06-16T12:12:21.657535+00:00 | vllm-gemma-4-e4b-fp8-h100-p003-lexen-v1-20260616-thinking |
| 103 | Qwen/Qwen3.6-35B-A3B-FP8 | 85.56% | p001 | lexen-v1 | 2026-06-14T19:07:57.673409+00:00 | vllm-qwen3.6-35b-a3b-fp8-h100-p001-lexen-v1-20260614 |
| 104 | Qwen/Qwen3.6-35B-A3B-FP8 | 85.50% | p001 | lexen-v1 | 2026-06-14T03:54:43.714581+00:00 | vllm-qwen3.6-35b-a3b-fp8-h200-p001-lexen-v1-20260614 |
| 105 | gpt-5-nano | 85.50% | p002 | lexen-v1 | 2026-06-14T03:09:12.248467+00:00 | gpt-5-nano-medium-reasoning-p002-lexen-v1-20260614 |
| 106 | Qwen/Qwen3.5-122B-A10B | 85.48% | p002 | lexen-v1 | 2026-06-14T21:01:03.796592+00:00 | vllm-qwen3.5-122b-a10b-fp8-h200-p002-lexen-v1-20260614 |
| 107 | Qwen/Qwen3.5-122B-A10B | 85.29% | p002 | lexen-v1 | 2026-06-14T17:28:29.170586+00:00 | vllm-qwen3.5-122b-a10b-fp8-b300-p002-lexen-v1-20260614 |
| 108 | Qwen/Qwen3-235B-A22B | 85.27% | p001 | lexen-v1 | 2026-06-14T17:02:46.131355+00:00 | vllm-qwen3-235b-a22b-fp8-b300-p001-lexen-v1-20260614 |
| 109 | CohereLabs/c4ai-command-a-03-2025 | 85.23% | p001 | lexen-v1 | 2026-06-14T12:51:16.703271+00:00 | vllm-command-a-fp8-h200-p001-lexen-v1-20260614 |
| 110 | google/gemma-4-E4B-it | 85.19% | p001 | lexen-v1 | 2026-06-15T16:54:34.206811+00:00 | vllm-gemma-4-e4b-fp8-h100-p001-lexen-v1-20260615-thinking |
| 111 | gpt-5-mini | 85.00% | p001 | lexen-v1 | 2026-06-14T02:58:57.465838+00:00 | gpt-5-mini-minimal-reasoning-p001-lexen-v1-20260614 |
| 112 | Qwen/Qwen2.5-72B-Instruct | 84.84% | p002 | lexen-v1 | 2026-06-14T21:23:56.668649+00:00 | vllm-qwen2.5-72b-awq-a100-p002-lexen-v1-20260614 |
| 113 | Qwen/Qwen2.5-72B-Instruct | 84.82% | p002 | lexen-v1 | 2026-06-14T18:10:51.795477+00:00 | vllm-qwen2.5-72b-awq-h100-p002-lexen-v1-20260614 |
| 114 | Qwen/Qwen2.5-72B-Instruct | 84.82% | p002 | lexen-v1 | 2026-06-14T19:36:35.051728+00:00 | vllm-qwen2.5-72b-awq-h200-p002-lexen-v1-20260614 |
| 115 | Qwen/Qwen3-235B-A22B-Instruct-2507 | 84.74% | p002 | lexen-v1 | 2026-06-14T20:04:54.740808+00:00 | vllm-qwen3-235b-2507-awq-h200-p002-lexen-v1-20260614 |
| 116 | Qwen/Qwen3.5-35B-A3B | 84.45% | p001 | lexen-v1 | 2026-06-14T21:29:51.511649+00:00 | vllm-qwen3.5-35b-a3b-bf16-h200-p001-lexen-v1-20260614 |
| 117 | gpt-5-nano | 84.32% | p001 | lexen-v1 | 2026-06-14T03:08:14.245995+00:00 | gpt-5-nano-high-reasoning-p001-lexen-v1-20260614 |
| 118 | Qwen/Qwen3-235B-A22B | 84.08% | p001 | lexen-v1 | 2026-06-14T20:30:58.208934+00:00 | vllm-qwen3-235b-a22b-awq-h200-p001-lexen-v1-20260614 |
| 119 | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | 83.93% | p003 | lexen-v1 | 2026-06-17T08:49:01.056118+00:00 | vllm-nemotron-3-super-120b-fp8-h200-p003-lexen-v1-20260617 |
| 120 | gpt-5.4-nano | 83.93% | p002 | lexen-v1 | 2026-06-14T10:27:23.808460+00:00 | gpt-5.4-nano-low-reasoning-p002-lexen-v1-20260614 |
| 121 | gpt-5-mini | 83.77% | p003 | lexen-v1 | 2026-06-17T07:40:43.737804+00:00 | gpt-5-mini-minimal-reasoning-p003-lexen-v1-20260617 |
| 122 | gpt-4.1-mini | 83.69% | p001 | lexen-v1 | 2026-06-14T03:10:31.779383+00:00 | gpt-4.1-mini-p001-lexen-v1-20260614 |
| 123 | gpt-5-nano | 83.56% | p003 | lexen-v1 | 2026-06-17T07:49:45.901920+00:00 | gpt-5-nano-low-reasoning-p003-lexen-v1-20260617 |
| 124 | gpt-5-nano | 83.56% | p001 | lexen-v1 | 2026-06-14T03:04:55.641744+00:00 | gpt-5-nano-medium-reasoning-p001-lexen-v1-20260614 |
| 125 | google/gemma-4-E4B-it | 83.44% | p001 | lexen-v1 | 2026-06-14T03:25:03.326707+00:00 | vllm-gemma-4-e4b-fp8-h200-p001-lexen-v1-20260614 |
| 126 | Qwen/Qwen2.5-72B-Instruct | 83.44% | p001 | lexen-v1 | 2026-06-14T21:09:10.349327+00:00 | vllm-qwen2.5-72b-awq-a100-p001-lexen-v1-20260614 |
| 127 | Qwen/Qwen2.5-72B-Instruct | 83.42% | p001 | lexen-v1 | 2026-06-14T19:32:09.858826+00:00 | vllm-qwen2.5-72b-awq-h200-p001-lexen-v1-20260614 |
| 128 | Qwen/Qwen2.5-72B-Instruct | 83.40% | p001 | lexen-v1 | 2026-06-14T18:06:23.582043+00:00 | vllm-qwen2.5-72b-awq-h100-p001-lexen-v1-20260614 |
| 129 | google/gemma-4-E4B-it | 83.30% | p002 | lexen-v1 | 2026-06-16T12:07:56.371195+00:00 | vllm-gemma-4-e4b-fp8-h100-p002-lexen-v1-20260616-thinking |
| 130 | gpt-4o-mini | 82.90% | p002 | lexen-v1 | 2026-06-14T03:13:09.202699+00:00 | gpt-4o-mini-p002-lexen-v1-20260614 |
| 131 | Qwen/Qwen3.6-35B-A3B-FP8 | 82.72% | p002 | lexen-v1 | 2026-06-14T19:08:36.001596+00:00 | vllm-qwen3.6-35b-a3b-fp8-h100-p002-lexen-v1-20260614 |
| 132 | Qwen/Qwen3.6-35B-A3B-FP8 | 82.70% | p002 | lexen-v1 | 2026-06-14T03:55:37.865365+00:00 | vllm-qwen3.6-35b-a3b-fp8-h200-p002-lexen-v1-20260614 |
| 133 | gpt-4o-mini | 82.70% | p001 | lexen-v1 | 2026-06-14T03:11:36.212233+00:00 | gpt-4o-mini-p001-lexen-v1-20260614 |
| 134 | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | 82.35% | p002 | lexen-v1 | 2026-06-14T19:12:56.557834+00:00 | vllm-nemotron-3-super-120b-fp8-h200-p002-lexen-v1-20260614 |
| 135 | Qwen/Qwen3-235B-A22B | 82.27% | p002 | lexen-v1 | 2026-06-14T17:04:02.610404+00:00 | vllm-qwen3-235b-a22b-fp8-b300-p002-lexen-v1-20260614 |
| 136 | Qwen/Qwen3.5-35B-A3B | 82.10% | p002 | lexen-v1 | 2026-06-14T21:30:47.805630+00:00 | vllm-qwen3.5-35b-a3b-bf16-h200-p002-lexen-v1-20260614 |
| 137 | nvidia/NVIDIA-Nemotron-3-Super-120B-A12B | 81.86% | p001 | lexen-v1 | 2026-06-14T19:11:26.351859+00:00 | vllm-nemotron-3-super-120b-fp8-h200-p001-lexen-v1-20260614 |
| 138 | meta-llama/Llama-4-Maverick-17B-128E-Instruct | 81.53% | p002 | lexen-v1 | 2026-06-14T15:17:21.457755+00:00 | vllm-llama-4-maverick-int4-b300-p002-lexen-v1-20260614 |
| 139 | google/gemma-4-E4B-it | 81.44% | p003 | lexen-v1 | 2026-06-16T02:47:50.795294+00:00 | vllm-gemma-4-e4b-fp8-h100-p003-lexen-v1-20260616 |
| 140 | Qwen/Qwen3-235B-A22B | 81.03% | p002 | lexen-v1 | 2026-06-14T20:34:00.314833+00:00 | vllm-qwen3-235b-a22b-awq-h200-p002-lexen-v1-20260614 |
| 141 | Qwen/Qwen3.5-9B | 80.91% | p001 | lexen-v1 | 2026-06-14T22:26:23.261470+00:00 | vllm-qwen3.5-9b-bf16-a100-p001-lexen-v1-20260614 |
| 142 | Qwen/Qwen3.5-9B | 80.89% | p001 | lexen-v1 | 2026-06-14T21:37:26.714544+00:00 | vllm-qwen3.5-9b-bf16-h200-p001-lexen-v1-20260614 |
| 143 | Qwen/Qwen3.5-9B | 80.79% | p001 | lexen-v1 | 2026-06-14T18:55:28.122562+00:00 | vllm-qwen3.5-9b-bf16-h100-p001-lexen-v1-20260614 |
| 144 | gpt-4.1-mini | 80.54% | p002 | lexen-v1 | 2026-06-14T03:10:47.348916+00:00 | gpt-4.1-mini-p002-lexen-v1-20260614 |
| 145 | zai-org/GLM-4-9B-0414 | 80.46% | p002 | lexen-v1 | 2026-06-14T13:13:51.676910+00:00 | vllm-glm-4-9b-bf16-a100-p002-lexen-v1-20260614 |
| 146 | zai-org/GLM-4-9B-0414 | 79.78% | p001 | lexen-v1 | 2026-06-14T13:12:11.870521+00:00 | vllm-glm-4-9b-bf16-a100-p001-lexen-v1-20260614 |
| 147 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 79.39% | p002 | lexen-v1 | 2026-06-14T04:09:21.457204+00:00 | vllm-mistral-small-3.2-fp8-h200-p002-lexen-v1-20260614 |
| 148 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 79.28% | p002 | lexen-v1 | 2026-06-14T17:01:31.952947+00:00 | vllm-mistral-small-3.2-fp8-h100-p002-lexen-v1-20260614 |
| 149 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 79.00% | p002 | lexen-v1 | 2026-06-14T19:43:05.518865+00:00 | vllm-mistral-small-3.2-fp8-a100-p002-lexen-v1-20260614 |
| 150 | google/gemma-4-E4B-it | 78.89% | p002 | lexen-v1 | 2026-06-14T03:25:43.646502+00:00 | vllm-gemma-4-e4b-fp8-h200-p002-lexen-v1-20260614 |
| 151 | google/gemma-4-E2B-it | 78.89% | p003 | lexen-v1 | 2026-06-16T11:55:39.619700+00:00 | vllm-gemma-4-e2b-fp8-h100-p003-lexen-v1-20260616-thinking |
| 152 | google/gemma-4-E4B-it | 78.87% | p002 | lexen-v1 | 2026-06-14T09:10:08.959121+00:00 | vllm-gemma-4-e4b-fp8-h100-p002-lexen-v1-20260614 |
| 153 | tencent/Hunyuan-A13B-Instruct | 78.73% | p001 | lexen-v1 | 2026-06-14T13:56:24.339501+00:00 | vllm-hunyuan-a13b-int4-a100-p001-lexen-v1-20260614 |
| 154 | zai-org/GLM-4.7-Flash | 77.60% | p001 | lexen-v1 | 2026-06-14T13:25:17.068123+00:00 | vllm-glm-4.7-flash-bf16-a100-p001-lexen-v1-20260614 |
| 155 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 77.17% | p001 | lexen-v1 | 2026-06-14T19:37:17.468158+00:00 | vllm-mistral-small-3.2-fp8-a100-p001-lexen-v1-20260614 |
| 156 | google/gemma-4-E2B-it | 77.12% | p002 | lexen-v1 | 2026-06-16T11:51:52.132443+00:00 | vllm-gemma-4-e2b-fp8-h100-p002-lexen-v1-20260616-thinking |
| 157 | gpt-5-nano | 76.98% | p001 | lexen-v1 | 2026-06-14T03:03:34.449002+00:00 | gpt-5-nano-low-reasoning-p001-lexen-v1-20260614 |
| 158 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 76.75% | p001 | lexen-v1 | 2026-06-14T17:00:10.756548+00:00 | vllm-mistral-small-3.2-fp8-h100-p001-lexen-v1-20260614 |
| 159 | mistralai/Mistral-Small-3.2-24B-Instruct-2506 | 76.61% | p001 | lexen-v1 | 2026-06-14T04:08:01.202862+00:00 | vllm-mistral-small-3.2-fp8-h200-p001-lexen-v1-20260614 |
| 160 | Qwen/Qwen3.5-9B | 76.34% | p002 | lexen-v1 | 2026-06-14T18:56:17.224420+00:00 | vllm-qwen3.5-9b-bf16-h100-p002-lexen-v1-20260614 |
| 161 | Qwen/Qwen3.5-9B | 76.28% | p002 | lexen-v1 | 2026-06-14T21:38:16.517970+00:00 | vllm-qwen3.5-9b-bf16-h200-p002-lexen-v1-20260614 |
| 162 | zai-org/GLM-4.7-Flash | 75.70% | p002 | lexen-v1 | 2026-06-14T13:26:48.163045+00:00 | vllm-glm-4.7-flash-bf16-a100-p002-lexen-v1-20260614 |
| 163 | allenai/Olmo-3.1-32B-Instruct | 73.98% | p001 | lexen-v1 | 2026-06-14T17:48:27.265217+00:00 | vllm-olmo-3.1-32b-fp8-h100-p001-lexen-v1-20260614 |
| 164 | allenai/Olmo-3.1-32B-Instruct | 73.32% | p002 | lexen-v1 | 2026-06-14T17:49:57.208847+00:00 | vllm-olmo-3.1-32b-fp8-h100-p002-lexen-v1-20260614 |
| 165 | gpt-4.1-nano | 73.28% | p002 | lexen-v1 | 2026-06-14T03:11:19.453636+00:00 | gpt-4.1-nano-p002-lexen-v1-20260614 |
| 166 | ibm-granite/granite-4.1-8b-fp8 | 70.17% | p001 | lexen-v1 | 2026-06-14T03:08:55.286627+00:00 | vllm-granite-4.1-8b-fp8-h200-p001-lexen-v1-20260614 |
| 167 | google/gemma-4-E2B-it | 70.13% | p001 | lexen-v1 | 2026-06-14T03:19:39.961246+00:00 | vllm-gemma-4-e2b-fp8-h200-p001-lexen-v1-20260614 |
| 168 | gpt-4.1-nano | 70.07% | p001 | lexen-v1 | 2026-06-14T03:10:59.879737+00:00 | gpt-4.1-nano-p001-lexen-v1-20260614 |
| 169 | ibm-granite/granite-4.1-8b-fp8 | 69.92% | p001 | lexen-v1 | 2026-06-14T08:55:32.038625+00:00 | vllm-granite-4.1-8b-fp8-h100-p001-lexen-v1-20260614 |
| 170 | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | 69.22% | p002 | lexen-v1 | 2026-06-14T17:08:45.708843+00:00 | vllm-nemotron-3-nano-30b-fp8-h100-p002-lexen-v1-20260614 |
| 171 | ibm-granite/granite-4.1-8b | 69.02% | p001 | lexen-v1 | 2026-06-14T13:38:08.800623+00:00 | vllm-granite-4.1-8b-bf16-a100-p001-lexen-v1-20260614 |
| 172 | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | 65.34% | p003 | lexen-v1 | 2026-06-17T08:25:49.592029+00:00 | vllm-nemotron-3-nano-30b-fp8-h100-p003-lexen-v1-20260617 |
| 173 | google/gemma-4-E2B-it | 65.07% | p003 | lexen-v1 | 2026-06-16T02:48:05.643909+00:00 | vllm-gemma-4-e2b-fp8-h100-p003-lexen-v1-20260616 |
| 174 | microsoft/Phi-4-mini-instruct | 63.98% | p001 | lexen-v1 | 2026-06-14T20:23:39.820467+00:00 | vllm-phi-4-mini-bf16-a100-p001-lexen-v1-20260614 |
| 175 | nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | 63.55% | p001 | lexen-v1 | 2026-06-14T17:08:07.789507+00:00 | vllm-nemotron-3-nano-30b-fp8-h100-p001-lexen-v1-20260614 |
| 176 | google/gemma-4-E2B-it | 61.65% | p002 | lexen-v1 | 2026-06-14T09:05:03.172757+00:00 | vllm-gemma-4-e2b-fp8-h100-p002-lexen-v1-20260614 |
| 177 | google/gemma-4-E2B-it | 61.63% | p002 | lexen-v1 | 2026-06-14T03:20:22.722243+00:00 | vllm-gemma-4-e2b-fp8-h200-p002-lexen-v1-20260614 |
| 178 | gpt-5-nano | 59.06% | p001 | lexen-v1 | 2026-06-14T03:03:01.790422+00:00 | gpt-5-nano-minimal-reasoning-p001-lexen-v1-20260614 |
| 179 | meta-llama/Llama-3.1-8B-Instruct | 58.82% | p001 | lexen-v1 | 2026-06-14T14:08:58.621756+00:00 | vllm-llama-3.1-8b-bf16-a100-p001-lexen-v1-20260614 |
| 180 | meta-llama/Llama-3.1-8B-Instruct | 58.05% | p001 | lexen-v1 | 2026-06-14T03:14:37.607430+00:00 | vllm-llama-3.1-8b-fp8-h200-p001-lexen-v1-20260614 |
| 181 | meta-llama/Llama-3.1-8B-Instruct | 57.99% | p001 | lexen-v1 | 2026-06-14T08:59:47.839377+00:00 | vllm-llama-3.1-8b-fp8-h100-p001-lexen-v1-20260614 |
| 182 | meta-llama/Llama-3.1-8B-Instruct | 44.31% | p002 | lexen-v1 | 2026-06-14T09:00:22.409515+00:00 | vllm-llama-3.1-8b-fp8-h100-p002-lexen-v1-20260614 |
| 183 | meta-llama/Llama-3.1-8B-Instruct | 44.21% | p002 | lexen-v1 | 2026-06-14T03:15:18.793310+00:00 | vllm-llama-3.1-8b-fp8-h200-p002-lexen-v1-20260614 |
| 184 | meta-llama/Llama-3.1-8B-Instruct | 39.85% | p002 | lexen-v1 | 2026-06-14T14:10:27.680965+00:00 | vllm-llama-3.1-8b-bf16-a100-p002-lexen-v1-20260614 |
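
Because many models appear several times (different prompts, reasoning settings, or hardware), it can help to reduce the table to one best run per model. The following is a minimal sketch, assuming the table is available as pipe-delimited text with the seven columns shown above. The names `Run`, `parse_rows`, `best_per_model`, and `SAMPLE` are hypothetical helpers introduced for illustration and are not part of the leaderboard's own tooling. `SAMPLE` copies four rows from the table.

```python
from datetime import datetime
from typing import NamedTuple


class Run(NamedTuple):
    rank: int
    model: str
    accuracy: float      # percent, e.g. 95.60
    prompt: str
    dataset: str
    created: datetime    # timezone-aware (the table uses a +00:00 offset)
    run_id: str


def parse_rows(table_text: str) -> list[Run]:
    """Parse pipe-delimited table rows; skip the header and separator lines."""
    runs = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 7 or not cells[0].isdigit():
            continue  # not a data row
        rank, model, acc, prompt, dataset, created, run_id = cells
        runs.append(Run(int(rank), model, float(acc.rstrip("%")), prompt,
                        dataset, datetime.fromisoformat(created), run_id))
    return runs


def best_per_model(runs: list[Run]) -> dict[str, Run]:
    """Keep the highest-accuracy run for each model name."""
    best: dict[str, Run] = {}
    for run in runs:
        if run.model not in best or run.accuracy > best[run.model].accuracy:
            best[run.model] = run
    return best


# Four rows copied from the table above, plus the header and separator.
SAMPLE = """\
| Rank | Model | Accuracy | Prompt | Dataset | Created | Run |
|---|---|---|---|---|---|---|
| 1 | gpt-5.5 | 95.60% | p001 | lexen-v1 | 2026-06-17T15:45:49.081775+00:00 | gpt-5.5-xhigh-reasoning-p001-lexen-v1-20260617 |
| 10 | gemini/gemini-3.1-pro-preview | 94.92% | p001 | lexen-v1 | 2026-06-14T13:03:44.802733+00:00 | gemini-3.1-pro-high-reasoning-p001-lexen-v1-20260614 |
| 12 | claude-opus-4-8 | 94.57% | p001 | lexen-v1 | 2026-06-14T12:38:24.818958+00:00 | claude-opus-4.8-xhigh-reasoning-p001-lexen-v1-20260614 |
| 18 | claude-opus-4-8 | 94.16% | p001 | lexen-v1 | 2026-06-14T12:22:29.216965+00:00 | claude-opus-4.8-high-reasoning-p001-lexen-v1-20260614 |
"""

if __name__ == "__main__":
    for model, run in best_per_model(parse_rows(SAMPLE)).items():
        print(f"{model}: {run.accuracy:.2f}% ({run.run_id})")
```

The parser identifies data rows by a numeric first cell, so the header and separator lines need no special handling. On accuracy ties, the run that appears first in the table wins, which preserves the published rank order.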