Submit A Run

How to add a verified SenseBench run to the public leaderboard.

Workflow

Run the benchmark locally with the registered dataset and prompt, verify the result, then add the complete run directory under results/<run-id>/ in a pull request.

Submissions must identify the runner: pass --github-handle to sensebench run, or stamp an existing run with sensebench set-runner.

Pull request CI rebuilds the site and fails if any submitted result is invalid. A maintainer reviews every submission, and a run appears on the leaderboard only after its pull request is merged.
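The copy step of the workflow above can be sketched as a small shell helper. This is a minimal sketch, not part of SenseBench: stage_run is a hypothetical name, and it assumes the run was written to runs/<run-id> (the path used by the verify command) and that the script is run from the root of a checkout of the repository.

```shell
# Sketch: copy a locally verified run into results/<run-id>/ for a pull request.
# Assumption: the run lives at runs/<run-id> relative to the current directory.
# stage_run is a hypothetical helper name, not a sensebench command.
stage_run() {
  local run_id="${1:-}"
  local src="runs/${run_id}"
  local dest="results/${run_id}"

  if [ -z "$run_id" ]; then
    echo "usage: stage_run <run-id>" >&2
    return 2
  fi
  if [ ! -d "$src" ]; then
    echo "missing run directory: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    # Refuse to overwrite a result that is already staged.
    echo "already exists: $dest" >&2
    return 1
  fi

  mkdir -p results
  # Copy the whole directory, since the submission must be the complete run.
  cp -R "$src" "$dest"
  echo "staged $dest"
}
```

After staging, the directory can be committed on a branch with the usual git add and git commit and opened as a pull request. Refusing to overwrite an existing results/<run-id>/ is a design choice of this sketch, meant to avoid silently replacing a run that is already in the repository.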

Commands

Generate a run with:

    sensebench run --model <model> --prompt p001 --github-handle <your-handle>

Verify it with:

    sensebench verify runs/<run-id> --dataset lexen-v1 --prompt p001
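The two commands above share the prompt id, so it can help to wrap them in shell functions that keep the dataset and prompt in one place. The following is a minimal sketch under stated assumptions: generate_run, verify_run, and the SENSEBENCH_BIN variable are hypothetical names introduced here, not SenseBench features, and only the flags shown on this page are used. How sensebench run reports the run id is not stated here, so the id is passed in by hand.

```shell
# Sketch: wrap the generate and verify commands with shared settings.
# SENSEBENCH_BIN is a hypothetical override used only by this sketch
# (for example, set it to "echo" to print the commands instead of running them).
SENSEBENCH_BIN="${SENSEBENCH_BIN:-sensebench}"
DATASET="lexen-v1"
PROMPT="p001"

# generate_run <model> <github-handle>
generate_run() {
  "$SENSEBENCH_BIN" run --model "$1" --prompt "$PROMPT" --github-handle "$2"
}

# verify_run <run-id>
verify_run() {
  "$SENSEBENCH_BIN" verify "runs/$1" --dataset "$DATASET" --prompt "$PROMPT"
}
```

Setting SENSEBENCH_BIN=echo before calling either function prints the exact command line, which is a cheap way to check the arguments before a long benchmark run.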