Archive

Verified Runs

Every accepted leaderboard submission has a permanent static page. Runs are listed below in descending order of accuracy.

Rank Model Accuracy Prompt Dataset Created Run
1 gpt-5.5 95.60% p001 lexen-v1 2026-06-17T15:45:49.081775+00:00 gpt-5.5-xhigh-reasoning-p001-lexen-v1-20260617
2 gpt-5.5 95.25% p001 lexen-v1 2026-06-14T02:57:35.564261+00:00 gpt-5.5-medium-reasoning-p001-lexen-v1-20260614
3 gpt-5.5 95.19% p001 lexen-v1 2026-06-14T10:40:02.498375+00:00 gpt-5.5-high-reasoning-p001-lexen-v1-20260614
4 gpt-5.5 95.15% p004 lexen-v1 2026-06-16T02:04:07.572024+00:00 gpt-5.5-medium-reasoning-p004-lexen-v1-20260616
5 gpt-5.5 95.15% p004 lexen-v1 2026-06-17T16:05:54.293112+00:00 gpt-5.5-xhigh-reasoning-p004-lexen-v1-20260617
6 gpt-5.5 95.10% p003 lexen-v1 2026-06-17T08:02:30.444392+00:00 gpt-5.5-high-reasoning-p003-lexen-v1-20260617
7 gpt-5.5 95.00% p003 lexen-v1 2026-06-16T02:01:57.439526+00:00 gpt-5.5-low-reasoning-p003-lexen-v1-20260616
8 gpt-5.5 95.00% p001 lexen-v1 2026-06-14T10:38:21.202553+00:00 gpt-5.5-low-reasoning-p001-lexen-v1-20260614
9 gpt-5.5 94.98% p003 lexen-v1 2026-06-16T02:02:51.976097+00:00 gpt-5.5-medium-reasoning-p003-lexen-v1-20260616
10 gemini/gemini-3.1-pro-preview 94.92% p001 lexen-v1 2026-06-14T13:03:44.802733+00:00 gemini-3.1-pro-high-reasoning-p001-lexen-v1-20260614
11 gemini/gemini-3.1-pro-preview 94.61% p001 lexen-v1 2026-06-14T12:09:22.494330+00:00 gemini-3.1-pro-low-reasoning-p001-lexen-v1-20260614
12 claude-opus-4-8 94.57% p001 lexen-v1 2026-06-14T12:38:24.818958+00:00 claude-opus-4.8-xhigh-reasoning-p001-lexen-v1-20260614
13 gemini/gemini-3.1-pro-preview 94.53% p004 lexen-v1 2026-06-16T02:57:54.138994+00:00 gemini-3.1-pro-high-reasoning-p004-lexen-v1-20260616
14 gemini/gemini-3.1-pro-preview 94.43% p001 lexen-v1 2026-06-14T12:31:55.744716+00:00 gemini-3.1-pro-medium-reasoning-p001-lexen-v1-20260614
15 gemini/gemini-3.1-pro-preview 94.40% p003 lexen-v1 2026-06-16T02:21:16.914989+00:00 gemini-3.1-pro-low-reasoning-p003-lexen-v1-20260616
16 gpt-5.5 94.22% p002 lexen-v1 2026-06-14T10:42:14.146514+00:00 gpt-5.5-high-reasoning-p002-lexen-v1-20260614
17 gemini/gemini-3.5-flash 94.20% p004 lexen-v1 2026-06-16T03:05:35.303206+00:00 gemini-3.5-flash-p004-lexen-v1-20260616
18 claude-opus-4-8 94.16% p001 lexen-v1 2026-06-14T12:22:29.216965+00:00 claude-opus-4.8-high-reasoning-p001-lexen-v1-20260614
19 gemini/gemini-3.5-flash 94.16% p001 lexen-v1 2026-06-14T11:40:41.558979+00:00 gemini-3.5-flash-p001-lexen-v1-20260614
20 gpt-5.5 94.03% p002 lexen-v1 2026-06-14T02:58:29.150879+00:00 gpt-5.5-medium-reasoning-p002-lexen-v1-20260614
21 claude-opus-4-8 93.85% p001 lexen-v1 2026-06-17T11:56:28.079841+00:00 claude-opus-4.8-p001-lexen-v1-20260617
22 claude-opus-4-8 93.81% p004 lexen-v1 2026-06-16T02:20:31.037218+00:00 claude-opus-4.8-xhigh-reasoning-p004-lexen-v1-20260616
23 google/gemma-4-31B-it 93.73% p003 lexen-v1 2026-06-16T03:47:55.278110+00:00 vllm-gemma-4-31b-fp8-h100-p003-lexen-v1-20260616-thinking
24 claude-opus-4-8 93.68% p001 lexen-v1 2026-06-14T12:02:22.453692+00:00 claude-opus-4.8-low-reasoning-p001-lexen-v1-20260614
25 claude-opus-4-8 93.62% p001 lexen-v1 2026-06-14T12:12:07.036543+00:00 claude-opus-4.8-medium-reasoning-p001-lexen-v1-20260614
26 google/gemma-4-31B-it 93.44% p004 lexen-v1 2026-06-16T04:34:22.281346+00:00 vllm-gemma-4-31b-fp8-h100-p004-lexen-v1-20260616-thinking
27 google/gemma-4-31B-it 93.38% p001 lexen-v1 2026-06-15T16:44:33.637582+00:00 vllm-gemma-4-31b-fp8-h100-p001-lexen-v1-20260615-thinking
28 openrouter/z-ai/glm-5 93.38% p001 lexen-v1 2026-06-14T19:07:58.881361+00:00 glm-5-low-reasoning-p001-lexen-v1-20260614
29 openrouter/moonshotai/kimi-k2.7-code 93.31% p001 lexen-v1 2026-06-17T22:30:28.396407+00:00 kimi-k2.7-code-xhigh-reasoning-p001-lexen-v1-20260617
30 openrouter/x-ai/grok-4.3 93.13% p001 lexen-v1 2026-06-14T21:16:55.424172+00:00 grok-4.3-medium-reasoning-p001-lexen-v1-20260614
31 openrouter/x-ai/grok-4.3 93.09% p001 lexen-v1 2026-06-14T21:09:10.548112+00:00 grok-4.3-low-reasoning-p001-lexen-v1-20260614
32 claude-opus-4-7 93.01% p001 lexen-v1 2026-06-14T11:43:13.094657+00:00 claude-opus-4.7-medium-reasoning-p001-lexen-v1-20260614
33 gpt-5.5 92.88% p003 lexen-v1 2026-06-17T11:29:06.305785+00:00 gpt-5.5-none-reasoning-p003-lexen-v1-20260617
34 openrouter/x-ai/grok-4.3 92.88% p001 lexen-v1 2026-06-14T21:25:12.643492+00:00 grok-4.3-high-reasoning-p001-lexen-v1-20260614
35 openrouter/qwen/qwen3.7-max 92.82% p002 lexen-v1 2026-06-14T21:04:35.848779+00:00 qwen3.7-max-medium-reasoning-p002-lexen-v1-20260614
36 openrouter/qwen/qwen3.7-plus 92.70% p001 lexen-v1 2026-06-14T17:27:03.417524+00:00 qwen3.7-plus-p001-lexen-v1-20260614
37 claude-opus-4-6 92.70% p001 lexen-v1 2026-06-14T11:52:50.971670+00:00 claude-opus-4.6-medium-reasoning-p001-lexen-v1-20260614
38 openrouter/moonshotai/kimi-k2.5 92.66% p001 lexen-v1 2026-06-14T16:01:24.343537+00:00 kimi-k2.5-p001-lexen-v1-20260614
39 gemini/gemini-3-flash-preview 92.57% p002 lexen-v1 2026-06-14T11:28:13.627340+00:00 gemini-3-flash-p002-lexen-v1-20260614
40 openrouter/deepseek/deepseek-v4-pro 92.41% p001 lexen-v1 2026-06-14T18:07:45.068845+00:00 deepseek-v4-pro-high-reasoning-p001-lexen-v1-20260614
41 google/gemma-4-31B-it 92.39% p001 lexen-v1 2026-06-14T13:05:53.315665+00:00 vllm-gemma-4-31b-fp8-h100-p001-lexen-v1-20260614
42 openrouter/moonshotai/kimi-k2.5 92.39% p002 lexen-v1 2026-06-14T16:49:12.308221+00:00 kimi-k2.5-p002-lexen-v1-20260614
43 google/gemma-4-31B-it 92.29% p003 lexen-v1 2026-06-16T03:16:10.855287+00:00 vllm-gemma-4-31b-fp8-h100-p003-lexen-v1-20260616
44 openrouter/z-ai/glm-5 92.22% p002 lexen-v1 2026-06-14T19:49:03.779025+00:00 glm-5-low-reasoning-p002-lexen-v1-20260614
45 claude-opus-4-7 92.22% p003 lexen-v1 2026-06-17T08:24:14.201902+00:00 claude-opus-4.7-medium-reasoning-p003-lexen-v1-20260617
46 claude-opus-4-8 91.79% p002 lexen-v1 2026-06-14T12:49:29.194985+00:00 claude-opus-4.8-high-reasoning-p002-lexen-v1-20260614
47 openrouter/deepseek/deepseek-v4-pro 91.50% p002 lexen-v1 2026-06-14T18:31:09.158241+00:00 deepseek-v4-pro-high-reasoning-p002-lexen-v1-20260614
48 openrouter/qwen/qwen3.7-plus 91.46% p002 lexen-v1 2026-06-14T17:43:42.620560+00:00 qwen3.7-plus-p002-lexen-v1-20260614
49 gpt-5-mini 91.22% p003 lexen-v1 2026-06-17T07:47:29.768045+00:00 gpt-5-mini-high-reasoning-p003-lexen-v1-20260617
50 google/gemma-4-31B-it 91.20% p002 lexen-v1 2026-06-14T13:07:36.382312+00:00 vllm-gemma-4-31b-fp8-h100-p002-lexen-v1-20260614
51 gpt-5-mini 90.99% p003 lexen-v1 2026-06-17T07:45:03.292014+00:00 gpt-5-mini-medium-reasoning-p003-lexen-v1-20260617
52 openrouter/deepseek/deepseek-v4-flash 90.93% p001 lexen-v1 2026-06-14T10:48:09.221591+00:00 deepseek-v4-flash-high-reasoning-p001-lexen-v1-20260614
53 gpt-5.4-mini 90.89% p003 lexen-v1 2026-06-17T07:38:06.024744+00:00 gpt-5.4-mini-low-reasoning-p003-lexen-v1-20260617
54 gpt-5-mini 90.72% p001 lexen-v1 2026-06-14T03:01:27.567145+00:00 gpt-5-mini-high-reasoning-p001-lexen-v1-20260614
55 claude-sonnet-4-6 90.72% p001 lexen-v1 2026-06-14T11:25:24.877127+00:00 claude-sonnet-4.6-low-reasoning-p001-lexen-v1-20260614
56 openrouter/minimax/minimax-m3 90.62% p001 lexen-v1 2026-06-14T17:01:11.417107+00:00 minimax-m3-p001-lexen-v1-20260614
57 gpt-5-mini 90.54% p001 lexen-v1 2026-06-14T02:59:58.106259+00:00 gpt-5-mini-medium-reasoning-p001-lexen-v1-20260614
58 gpt-5-mini 90.48% p003 lexen-v1 2026-06-17T07:42:53.328684+00:00 gpt-5-mini-low-reasoning-p003-lexen-v1-20260617
59 claude-haiku-4-5 90.48% p001 lexen-v1 2026-06-14T10:58:15.856929+00:00 claude-haiku-4.5-low-reasoning-p001-lexen-v1-20260614
60 gemini/gemini-3.1-flash-lite 90.41% p001 lexen-v1 2026-06-14T10:40:32.982176+00:00 gemini-3.1-flash-lite-p001-lexen-v1-20260614
61 gpt-5.4-mini 90.10% p001 lexen-v1 2026-06-14T10:36:37.746436+00:00 gpt-5.4-mini-low-reasoning-p001-lexen-v1-20260614
62 gpt-5-mini 90.04% p002 lexen-v1 2026-06-14T03:02:45.517164+00:00 gpt-5-mini-medium-reasoning-p002-lexen-v1-20260614
63 gpt-5-mini 89.94% p001 lexen-v1 2026-06-14T02:59:23.338570+00:00 gpt-5-mini-low-reasoning-p001-lexen-v1-20260614
64 openrouter/deepseek/deepseek-v4-flash 89.69% p002 lexen-v1 2026-06-14T11:07:29.161962+00:00 deepseek-v4-flash-high-reasoning-p002-lexen-v1-20260614
65 gemini/gemini-2.5-flash 89.61% p002 lexen-v1 2026-06-14T11:07:54.336960+00:00 gemini-2.5-flash-p002-lexen-v1-20260614
66 Qwen/Qwen3.6-27B-FP8 89.51% p001 lexen-v1 2026-06-14T03:37:04.989093+00:00 vllm-qwen3.6-27b-fp8-h200-p001-lexen-v1-20260614
67 gpt-5.4-mini 89.49% p002 lexen-v1 2026-06-14T10:37:18.500343+00:00 gpt-5.4-mini-low-reasoning-p002-lexen-v1-20260614
68 google/gemma-4-26B-A4B-it 89.43% p001 lexen-v1 2026-06-14T13:02:48.331932+00:00 vllm-gemma-4-26b-a4b-fp8-a100-p001-lexen-v1-20260614
69 gemini/gemini-2.5-flash 89.41% p001 lexen-v1 2026-06-14T11:03:39.454582+00:00 gemini-2.5-flash-p001-lexen-v1-20260614
70 Qwen/Qwen3.6-27B-FP8 89.36% p001 lexen-v1 2026-06-14T09:20:49.931691+00:00 vllm-qwen3.6-27b-fp8-h100-p001-lexen-v1-20260614
71 google/gemma-4-26B-A4B-it 89.32% p001 lexen-v1 2026-06-14T03:43:54.686558+00:00 vllm-gemma-4-26b-a4b-fp8-h200-p001-lexen-v1-20260614
72 Qwen/Qwen3.5-397B-A17B 89.24% p001 lexen-v1 2026-06-14T17:58:20.590182+00:00 vllm-qwen3.5-397b-a17b-gptq-b300-p001-lexen-v1-20260614
73 google/gemma-4-26B-A4B-it 89.20% p001 lexen-v1 2026-06-14T12:56:57.382019+00:00 vllm-gemma-4-26b-a4b-fp8-h100-p001-lexen-v1-20260614
74 gpt-4.1 89.14% p003 lexen-v1 2026-06-17T07:33:57.499800+00:00 gpt-4.1-p003-lexen-v1-20260617
75 google/gemma-4-26B-A4B-it 89.12% p003 lexen-v1 2026-06-16T02:50:20.177857+00:00 vllm-gemma-4-26b-a4b-fp8-h100-p003-lexen-v1-20260616
76 gemini/gemini-3.1-flash-lite 89.08% p002 lexen-v1 2026-06-14T10:51:07.533822+00:00 gemini-3.1-flash-lite-p002-lexen-v1-20260614
77 Qwen/Qwen3.5-27B 89.06% p001 lexen-v1 2026-06-14T21:46:09.965074+00:00 vllm-qwen3.5-27b-bf16-a100-p001-lexen-v1-20260614
78 Qwen/Qwen3.5-27B 88.97% p001 lexen-v1 2026-06-14T21:14:44.906254+00:00 vllm-qwen3.5-27b-bf16-h200-p001-lexen-v1-20260614
79 Qwen/Qwen3.5-27B 88.89% p001 lexen-v1 2026-06-14T18:22:27.863955+00:00 vllm-qwen3.5-27b-bf16-h100-p001-lexen-v1-20260614
80 claude-sonnet-4-6 88.89% p002 lexen-v1 2026-06-14T11:34:12.718972+00:00 claude-sonnet-4.6-low-reasoning-p002-lexen-v1-20260614
81 zai-org/GLM-4.6 88.71% p001 lexen-v1 2026-06-14T14:17:06.058829+00:00 vllm-glm-4.6-awq-b300-p001-lexen-v1-20260614
82 gpt-4.1 88.69% p001 lexen-v1 2026-06-14T10:22:33.163281+00:00 gpt-4.1-p001-lexen-v1-20260614
83 gpt-4.1 87.86% p002 lexen-v1 2026-06-14T10:31:43.235840+00:00 gpt-4.1-p002-lexen-v1-20260614
84 claude-haiku-4-5 87.60% p002 lexen-v1 2026-06-14T11:17:37.678627+00:00 claude-haiku-4.5-low-reasoning-p002-lexen-v1-20260614
85 meta-llama/Llama-4-Maverick-17B-128E-Instruct 87.55% p001 lexen-v1 2026-06-14T15:13:31.931784+00:00 vllm-llama-4-maverick-int4-b300-p001-lexen-v1-20260614
86 google/gemma-4-26B-A4B-it 87.25% p002 lexen-v1 2026-06-14T12:57:33.409617+00:00 vllm-gemma-4-26b-a4b-fp8-h100-p002-lexen-v1-20260614
87 google/gemma-4-26B-A4B-it 87.25% p002 lexen-v1 2026-06-14T03:44:42.664730+00:00 vllm-gemma-4-26b-a4b-fp8-h200-p002-lexen-v1-20260614
88 Qwen/Qwen3.6-27B-FP8 87.20% p002 lexen-v1 2026-06-14T09:22:18.123666+00:00 vllm-qwen3.6-27b-fp8-h100-p002-lexen-v1-20260614
89 Qwen/Qwen3.6-27B-FP8 87.18% p002 lexen-v1 2026-06-14T03:38:33.776628+00:00 vllm-qwen3.6-27b-fp8-h200-p002-lexen-v1-20260614
90 zai-org/GLM-4.6 87.12% p002 lexen-v1 2026-06-14T14:21:11.936952+00:00 vllm-glm-4.6-awq-b300-p002-lexen-v1-20260614
91 google/gemma-4-26B-A4B-it 87.10% p002 lexen-v1 2026-06-14T13:04:18.320439+00:00 vllm-gemma-4-26b-a4b-fp8-a100-p002-lexen-v1-20260614
92 Qwen/Qwen3.5-122B-A10B 87.06% p001 lexen-v1 2026-06-14T20:59:20.246395+00:00 vllm-qwen3.5-122b-a10b-fp8-h200-p001-lexen-v1-20260614
93 Qwen/Qwen3.5-122B-A10B 87.02% p001 lexen-v1 2026-06-14T17:26:39.154524+00:00 vllm-qwen3.5-122b-a10b-fp8-b300-p001-lexen-v1-20260614
94 Qwen/Qwen3.5-397B-A17B 86.69% p002 lexen-v1 2026-06-14T18:01:16.443151+00:00 vllm-qwen3.5-397b-a17b-gptq-b300-p002-lexen-v1-20260614
95 gpt-5.4-nano 86.63% p003 lexen-v1 2026-06-17T07:39:04.268726+00:00 gpt-5.4-nano-low-reasoning-p003-lexen-v1-20260617
96 Qwen/Qwen3-235B-A22B-Instruct-2507 86.59% p001 lexen-v1 2026-06-14T20:01:55.579872+00:00 vllm-qwen3-235b-2507-awq-h200-p001-lexen-v1-20260614
97 Qwen/Qwen3.5-27B 86.55% p002 lexen-v1 2026-06-14T18:24:27.165748+00:00 vllm-qwen3.5-27b-bf16-h100-p002-lexen-v1-20260614
98 Qwen/Qwen3.5-27B 86.55% p002 lexen-v1 2026-06-14T21:16:43.650053+00:00 vllm-qwen3.5-27b-bf16-h200-p002-lexen-v1-20260614
99 gpt-5-nano 86.48% p003 lexen-v1 2026-06-17T07:52:26.335235+00:00 gpt-5-nano-medium-reasoning-p003-lexen-v1-20260617
100 Qwen/Qwen3.5-27B 86.46% p002 lexen-v1 2026-06-14T21:51:15.471462+00:00 vllm-qwen3.5-27b-bf16-a100-p002-lexen-v1-20260614
101 gpt-5.4-nano 86.26% p001 lexen-v1 2026-06-14T10:26:49.278471+00:00 gpt-5.4-nano-low-reasoning-p001-lexen-v1-20260614
102 google/gemma-4-E4B-it 85.58% p003 lexen-v1 2026-06-16T12:12:21.657535+00:00 vllm-gemma-4-e4b-fp8-h100-p003-lexen-v1-20260616-thinking
103 Qwen/Qwen3.6-35B-A3B-FP8 85.56% p001 lexen-v1 2026-06-14T19:07:57.673409+00:00 vllm-qwen3.6-35b-a3b-fp8-h100-p001-lexen-v1-20260614
104 Qwen/Qwen3.6-35B-A3B-FP8 85.50% p001 lexen-v1 2026-06-14T03:54:43.714581+00:00 vllm-qwen3.6-35b-a3b-fp8-h200-p001-lexen-v1-20260614
105 gpt-5-nano 85.50% p002 lexen-v1 2026-06-14T03:09:12.248467+00:00 gpt-5-nano-medium-reasoning-p002-lexen-v1-20260614
106 Qwen/Qwen3.5-122B-A10B 85.48% p002 lexen-v1 2026-06-14T21:01:03.796592+00:00 vllm-qwen3.5-122b-a10b-fp8-h200-p002-lexen-v1-20260614
107 Qwen/Qwen3.5-122B-A10B 85.29% p002 lexen-v1 2026-06-14T17:28:29.170586+00:00 vllm-qwen3.5-122b-a10b-fp8-b300-p002-lexen-v1-20260614
108 Qwen/Qwen3-235B-A22B 85.27% p001 lexen-v1 2026-06-14T17:02:46.131355+00:00 vllm-qwen3-235b-a22b-fp8-b300-p001-lexen-v1-20260614
109 CohereLabs/c4ai-command-a-03-2025 85.23% p001 lexen-v1 2026-06-14T12:51:16.703271+00:00 vllm-command-a-fp8-h200-p001-lexen-v1-20260614
110 google/gemma-4-E4B-it 85.19% p001 lexen-v1 2026-06-15T16:54:34.206811+00:00 vllm-gemma-4-e4b-fp8-h100-p001-lexen-v1-20260615-thinking
111 gpt-5-mini 85.00% p001 lexen-v1 2026-06-14T02:58:57.465838+00:00 gpt-5-mini-minimal-reasoning-p001-lexen-v1-20260614
112 Qwen/Qwen2.5-72B-Instruct 84.84% p002 lexen-v1 2026-06-14T21:23:56.668649+00:00 vllm-qwen2.5-72b-awq-a100-p002-lexen-v1-20260614
113 Qwen/Qwen2.5-72B-Instruct 84.82% p002 lexen-v1 2026-06-14T18:10:51.795477+00:00 vllm-qwen2.5-72b-awq-h100-p002-lexen-v1-20260614
114 Qwen/Qwen2.5-72B-Instruct 84.82% p002 lexen-v1 2026-06-14T19:36:35.051728+00:00 vllm-qwen2.5-72b-awq-h200-p002-lexen-v1-20260614
115 Qwen/Qwen3-235B-A22B-Instruct-2507 84.74% p002 lexen-v1 2026-06-14T20:04:54.740808+00:00 vllm-qwen3-235b-2507-awq-h200-p002-lexen-v1-20260614
116 Qwen/Qwen3.5-35B-A3B 84.45% p001 lexen-v1 2026-06-14T21:29:51.511649+00:00 vllm-qwen3.5-35b-a3b-bf16-h200-p001-lexen-v1-20260614
117 gpt-5-nano 84.32% p001 lexen-v1 2026-06-14T03:08:14.245995+00:00 gpt-5-nano-high-reasoning-p001-lexen-v1-20260614
118 Qwen/Qwen3-235B-A22B 84.08% p001 lexen-v1 2026-06-14T20:30:58.208934+00:00 vllm-qwen3-235b-a22b-awq-h200-p001-lexen-v1-20260614
119 nvidia/NVIDIA-Nemotron-3-Super-120B-A12B 83.93% p003 lexen-v1 2026-06-17T08:49:01.056118+00:00 vllm-nemotron-3-super-120b-fp8-h200-p003-lexen-v1-20260617
120 gpt-5.4-nano 83.93% p002 lexen-v1 2026-06-14T10:27:23.808460+00:00 gpt-5.4-nano-low-reasoning-p002-lexen-v1-20260614
121 gpt-5-mini 83.77% p003 lexen-v1 2026-06-17T07:40:43.737804+00:00 gpt-5-mini-minimal-reasoning-p003-lexen-v1-20260617
122 gpt-4.1-mini 83.69% p001 lexen-v1 2026-06-14T03:10:31.779383+00:00 gpt-4.1-mini-p001-lexen-v1-20260614
123 gpt-5-nano 83.56% p003 lexen-v1 2026-06-17T07:49:45.901920+00:00 gpt-5-nano-low-reasoning-p003-lexen-v1-20260617
124 gpt-5-nano 83.56% p001 lexen-v1 2026-06-14T03:04:55.641744+00:00 gpt-5-nano-medium-reasoning-p001-lexen-v1-20260614
125 google/gemma-4-E4B-it 83.44% p001 lexen-v1 2026-06-14T03:25:03.326707+00:00 vllm-gemma-4-e4b-fp8-h200-p001-lexen-v1-20260614
126 Qwen/Qwen2.5-72B-Instruct 83.44% p001 lexen-v1 2026-06-14T21:09:10.349327+00:00 vllm-qwen2.5-72b-awq-a100-p001-lexen-v1-20260614
127 Qwen/Qwen2.5-72B-Instruct 83.42% p001 lexen-v1 2026-06-14T19:32:09.858826+00:00 vllm-qwen2.5-72b-awq-h200-p001-lexen-v1-20260614
128 Qwen/Qwen2.5-72B-Instruct 83.40% p001 lexen-v1 2026-06-14T18:06:23.582043+00:00 vllm-qwen2.5-72b-awq-h100-p001-lexen-v1-20260614
129 google/gemma-4-E4B-it 83.30% p002 lexen-v1 2026-06-16T12:07:56.371195+00:00 vllm-gemma-4-e4b-fp8-h100-p002-lexen-v1-20260616-thinking
130 gpt-4o-mini 82.90% p002 lexen-v1 2026-06-14T03:13:09.202699+00:00 gpt-4o-mini-p002-lexen-v1-20260614
131 Qwen/Qwen3.6-35B-A3B-FP8 82.72% p002 lexen-v1 2026-06-14T19:08:36.001596+00:00 vllm-qwen3.6-35b-a3b-fp8-h100-p002-lexen-v1-20260614
132 Qwen/Qwen3.6-35B-A3B-FP8 82.70% p002 lexen-v1 2026-06-14T03:55:37.865365+00:00 vllm-qwen3.6-35b-a3b-fp8-h200-p002-lexen-v1-20260614
133 gpt-4o-mini 82.70% p001 lexen-v1 2026-06-14T03:11:36.212233+00:00 gpt-4o-mini-p001-lexen-v1-20260614
134 nvidia/NVIDIA-Nemotron-3-Super-120B-A12B 82.35% p002 lexen-v1 2026-06-14T19:12:56.557834+00:00 vllm-nemotron-3-super-120b-fp8-h200-p002-lexen-v1-20260614
135 Qwen/Qwen3-235B-A22B 82.27% p002 lexen-v1 2026-06-14T17:04:02.610404+00:00 vllm-qwen3-235b-a22b-fp8-b300-p002-lexen-v1-20260614
136 Qwen/Qwen3.5-35B-A3B 82.10% p002 lexen-v1 2026-06-14T21:30:47.805630+00:00 vllm-qwen3.5-35b-a3b-bf16-h200-p002-lexen-v1-20260614
137 nvidia/NVIDIA-Nemotron-3-Super-120B-A12B 81.86% p001 lexen-v1 2026-06-14T19:11:26.351859+00:00 vllm-nemotron-3-super-120b-fp8-h200-p001-lexen-v1-20260614
138 meta-llama/Llama-4-Maverick-17B-128E-Instruct 81.53% p002 lexen-v1 2026-06-14T15:17:21.457755+00:00 vllm-llama-4-maverick-int4-b300-p002-lexen-v1-20260614
139 google/gemma-4-E4B-it 81.44% p003 lexen-v1 2026-06-16T02:47:50.795294+00:00 vllm-gemma-4-e4b-fp8-h100-p003-lexen-v1-20260616
140 Qwen/Qwen3-235B-A22B 81.03% p002 lexen-v1 2026-06-14T20:34:00.314833+00:00 vllm-qwen3-235b-a22b-awq-h200-p002-lexen-v1-20260614
141 Qwen/Qwen3.5-9B 80.91% p001 lexen-v1 2026-06-14T22:26:23.261470+00:00 vllm-qwen3.5-9b-bf16-a100-p001-lexen-v1-20260614
142 Qwen/Qwen3.5-9B 80.89% p001 lexen-v1 2026-06-14T21:37:26.714544+00:00 vllm-qwen3.5-9b-bf16-h200-p001-lexen-v1-20260614
143 Qwen/Qwen3.5-9B 80.79% p001 lexen-v1 2026-06-14T18:55:28.122562+00:00 vllm-qwen3.5-9b-bf16-h100-p001-lexen-v1-20260614
144 gpt-4.1-mini 80.54% p002 lexen-v1 2026-06-14T03:10:47.348916+00:00 gpt-4.1-mini-p002-lexen-v1-20260614
145 zai-org/GLM-4-9B-0414 80.46% p002 lexen-v1 2026-06-14T13:13:51.676910+00:00 vllm-glm-4-9b-bf16-a100-p002-lexen-v1-20260614
146 zai-org/GLM-4-9B-0414 79.78% p001 lexen-v1 2026-06-14T13:12:11.870521+00:00 vllm-glm-4-9b-bf16-a100-p001-lexen-v1-20260614
147 mistralai/Mistral-Small-3.2-24B-Instruct-2506 79.39% p002 lexen-v1 2026-06-14T04:09:21.457204+00:00 vllm-mistral-small-3.2-fp8-h200-p002-lexen-v1-20260614
148 mistralai/Mistral-Small-3.2-24B-Instruct-2506 79.28% p002 lexen-v1 2026-06-14T17:01:31.952947+00:00 vllm-mistral-small-3.2-fp8-h100-p002-lexen-v1-20260614
149 mistralai/Mistral-Small-3.2-24B-Instruct-2506 79.00% p002 lexen-v1 2026-06-14T19:43:05.518865+00:00 vllm-mistral-small-3.2-fp8-a100-p002-lexen-v1-20260614
150 google/gemma-4-E4B-it 78.89% p002 lexen-v1 2026-06-14T03:25:43.646502+00:00 vllm-gemma-4-e4b-fp8-h200-p002-lexen-v1-20260614
151 google/gemma-4-E2B-it 78.89% p003 lexen-v1 2026-06-16T11:55:39.619700+00:00 vllm-gemma-4-e2b-fp8-h100-p003-lexen-v1-20260616-thinking
152 google/gemma-4-E4B-it 78.87% p002 lexen-v1 2026-06-14T09:10:08.959121+00:00 vllm-gemma-4-e4b-fp8-h100-p002-lexen-v1-20260614
153 tencent/Hunyuan-A13B-Instruct 78.73% p001 lexen-v1 2026-06-14T13:56:24.339501+00:00 vllm-hunyuan-a13b-int4-a100-p001-lexen-v1-20260614
154 zai-org/GLM-4.7-Flash 77.60% p001 lexen-v1 2026-06-14T13:25:17.068123+00:00 vllm-glm-4.7-flash-bf16-a100-p001-lexen-v1-20260614
155 mistralai/Mistral-Small-3.2-24B-Instruct-2506 77.17% p001 lexen-v1 2026-06-14T19:37:17.468158+00:00 vllm-mistral-small-3.2-fp8-a100-p001-lexen-v1-20260614
156 google/gemma-4-E2B-it 77.12% p002 lexen-v1 2026-06-16T11:51:52.132443+00:00 vllm-gemma-4-e2b-fp8-h100-p002-lexen-v1-20260616-thinking
157 gpt-5-nano 76.98% p001 lexen-v1 2026-06-14T03:03:34.449002+00:00 gpt-5-nano-low-reasoning-p001-lexen-v1-20260614
158 mistralai/Mistral-Small-3.2-24B-Instruct-2506 76.75% p001 lexen-v1 2026-06-14T17:00:10.756548+00:00 vllm-mistral-small-3.2-fp8-h100-p001-lexen-v1-20260614
159 mistralai/Mistral-Small-3.2-24B-Instruct-2506 76.61% p001 lexen-v1 2026-06-14T04:08:01.202862+00:00 vllm-mistral-small-3.2-fp8-h200-p001-lexen-v1-20260614
160 Qwen/Qwen3.5-9B 76.34% p002 lexen-v1 2026-06-14T18:56:17.224420+00:00 vllm-qwen3.5-9b-bf16-h100-p002-lexen-v1-20260614
161 Qwen/Qwen3.5-9B 76.28% p002 lexen-v1 2026-06-14T21:38:16.517970+00:00 vllm-qwen3.5-9b-bf16-h200-p002-lexen-v1-20260614
162 zai-org/GLM-4.7-Flash 75.70% p002 lexen-v1 2026-06-14T13:26:48.163045+00:00 vllm-glm-4.7-flash-bf16-a100-p002-lexen-v1-20260614
163 allenai/Olmo-3.1-32B-Instruct 73.98% p001 lexen-v1 2026-06-14T17:48:27.265217+00:00 vllm-olmo-3.1-32b-fp8-h100-p001-lexen-v1-20260614
164 allenai/Olmo-3.1-32B-Instruct 73.32% p002 lexen-v1 2026-06-14T17:49:57.208847+00:00 vllm-olmo-3.1-32b-fp8-h100-p002-lexen-v1-20260614
165 gpt-4.1-nano 73.28% p002 lexen-v1 2026-06-14T03:11:19.453636+00:00 gpt-4.1-nano-p002-lexen-v1-20260614
166 ibm-granite/granite-4.1-8b-fp8 70.17% p001 lexen-v1 2026-06-14T03:08:55.286627+00:00 vllm-granite-4.1-8b-fp8-h200-p001-lexen-v1-20260614
167 google/gemma-4-E2B-it 70.13% p001 lexen-v1 2026-06-14T03:19:39.961246+00:00 vllm-gemma-4-e2b-fp8-h200-p001-lexen-v1-20260614
168 gpt-4.1-nano 70.07% p001 lexen-v1 2026-06-14T03:10:59.879737+00:00 gpt-4.1-nano-p001-lexen-v1-20260614
169 ibm-granite/granite-4.1-8b-fp8 69.92% p001 lexen-v1 2026-06-14T08:55:32.038625+00:00 vllm-granite-4.1-8b-fp8-h100-p001-lexen-v1-20260614
170 nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B 69.22% p002 lexen-v1 2026-06-14T17:08:45.708843+00:00 vllm-nemotron-3-nano-30b-fp8-h100-p002-lexen-v1-20260614
171 ibm-granite/granite-4.1-8b 69.02% p001 lexen-v1 2026-06-14T13:38:08.800623+00:00 vllm-granite-4.1-8b-bf16-a100-p001-lexen-v1-20260614
172 nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B 65.34% p003 lexen-v1 2026-06-17T08:25:49.592029+00:00 vllm-nemotron-3-nano-30b-fp8-h100-p003-lexen-v1-20260617
173 google/gemma-4-E2B-it 65.07% p003 lexen-v1 2026-06-16T02:48:05.643909+00:00 vllm-gemma-4-e2b-fp8-h100-p003-lexen-v1-20260616
174 microsoft/Phi-4-mini-instruct 63.98% p001 lexen-v1 2026-06-14T20:23:39.820467+00:00 vllm-phi-4-mini-bf16-a100-p001-lexen-v1-20260614
175 nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B 63.55% p001 lexen-v1 2026-06-14T17:08:07.789507+00:00 vllm-nemotron-3-nano-30b-fp8-h100-p001-lexen-v1-20260614
176 google/gemma-4-E2B-it 61.65% p002 lexen-v1 2026-06-14T09:05:03.172757+00:00 vllm-gemma-4-e2b-fp8-h100-p002-lexen-v1-20260614
177 google/gemma-4-E2B-it 61.63% p002 lexen-v1 2026-06-14T03:20:22.722243+00:00 vllm-gemma-4-e2b-fp8-h200-p002-lexen-v1-20260614
178 gpt-5-nano 59.06% p001 lexen-v1 2026-06-14T03:03:01.790422+00:00 gpt-5-nano-minimal-reasoning-p001-lexen-v1-20260614
179 meta-llama/Llama-3.1-8B-Instruct 58.82% p001 lexen-v1 2026-06-14T14:08:58.621756+00:00 vllm-llama-3.1-8b-bf16-a100-p001-lexen-v1-20260614
180 meta-llama/Llama-3.1-8B-Instruct 58.05% p001 lexen-v1 2026-06-14T03:14:37.607430+00:00 vllm-llama-3.1-8b-fp8-h200-p001-lexen-v1-20260614
181 meta-llama/Llama-3.1-8B-Instruct 57.99% p001 lexen-v1 2026-06-14T08:59:47.839377+00:00 vllm-llama-3.1-8b-fp8-h100-p001-lexen-v1-20260614
182 meta-llama/Llama-3.1-8B-Instruct 44.31% p002 lexen-v1 2026-06-14T09:00:22.409515+00:00 vllm-llama-3.1-8b-fp8-h100-p002-lexen-v1-20260614
183 meta-llama/Llama-3.1-8B-Instruct 44.21% p002 lexen-v1 2026-06-14T03:15:18.793310+00:00 vllm-llama-3.1-8b-fp8-h200-p002-lexen-v1-20260614
184 meta-llama/Llama-3.1-8B-Instruct 39.85% p002 lexen-v1 2026-06-14T14:10:27.680965+00:00 vllm-llama-3.1-8b-bf16-a100-p002-lexen-v1-20260614