Testing & Review
vcanonici/mahout-bench
mahout-bench
CLI benchmark for measuring and mitigating sycophancy in LLMs. Supports multi-provider execution, configurable judges, and long-running evaluation campaigns.
Suggested install command
npx skills add vcanonici/mahout-bench/mahout-bench
Always inspect the linked repository and skill instructions before running commands. Skills are instructions; permissions and execution still matter.
Compatibility
Agent support matrix
3 supported
| Agent | Status |
|---|---|
| Claude Code | Supported |
| OpenCode | Not listed |
| Cursor | Supported |
| MCP | Not listed |
| GitHub Copilot | Not listed |
| Windsurf |