OpenClaw

ml-model-eval-benchmark

Compare model candidates using weighted metrics and deterministic ranking outputs. Use for benchmark leaderboards and model promotion decisions.
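
The listing does not show how the skill computes its scores, so the following is only a minimal sketch of what "weighted metrics and deterministic ranking" could look like. The function name `rank_candidates`, the metric names, and the tie-breaking rule (alphabetical by candidate name) are all assumptions made for illustration, not the skill's actual API.

```python
from typing import Dict, List, Tuple


def rank_candidates(
    candidates: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted best-first.

    Hypothetical helper: assumes every metric is "higher is better"
    and that each candidate reports every metric named in `weights`.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    scored = []
    for name, metrics in candidates.items():
        # Weighted average of the candidate's metrics, normalized by total weight.
        score = sum(weights[m] * metrics[m] for m in weights) / total
        # Round so tiny floating-point differences cannot reorder candidates.
        scored.append((name, round(score, 6)))

    # Descending score, then ascending name: ties resolve the same way on every run.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Illustrative numbers only, not real benchmark results.
    example = {
        "model-b": {"accuracy": 0.90, "f1": 0.80},
        "model-a": {"accuracy": 0.80, "f1": 0.90},
        "model-c": {"accuracy": 0.70, "f1": 0.70},
    }
    for name, score in rank_candidates(example, {"accuracy": 0.5, "f1": 0.5}):
        print(name, score)
```

The explicit secondary sort key is what makes the output deterministic: two candidates with equal weighted scores always appear in the same order, regardless of input ordering, which matters when a leaderboard feeds a promotion decision.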

2.8k stars

openclaw/skills · skills/0x-professor/ml-model-eval-benchmark · March 14, 2026

Install command

python "$CODEX_HOME/skills/.system/skill-installer/scripts/install-skill-from-github.py" --repo openclaw/skills --path skills/0x-professor/ml-model-eval-benchmark
