Tool-Call Benchmark

Full tool-calling benchmark for glm-5.1: 18 of 18 executed cases passed (100%), with 0 misfires. 2 cases were skipped because the harness has no ToolSearch. Categories tested: Bash Execution, File Operations, MCP Tool Calls, Skill Invocations, Generation.
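
The headline figure appears to count only cases that actually ran, so the 2 skipped cases do not lower the pass rate. As a rough sketch of that arithmetic, assuming each case carries one of the hypothetical status strings "passed", "failed", or "skipped" (the real Results JSON schema is not shown here and may differ):

```python
from collections import Counter


def summarize(statuses):
    """Summarize per-case statuses, excluding skipped cases from the pass rate.

    The status names ("passed", "failed", "skipped") are assumptions made
    for illustration, not the benchmark's documented schema.
    """
    counts = Counter(statuses)
    passed = counts["passed"]
    failed = counts["failed"]
    skipped = counts["skipped"]
    executed = passed + failed  # skipped cases never ran, so they are left out
    pass_rate = 100.0 * passed / executed if executed else 0.0
    return {
        "passed": passed,
        "executed": executed,
        "skipped": skipped,
        "pass_rate": pass_rate,
    }


# Hypothetical input reproducing the counts reported above.
summary = summarize(["passed"] * 18 + ["skipped"] * 2)
print(f"{summary['passed']}/{summary['executed']} passed "
      f"({summary['pass_rate']:.0f}%), {summary['skipped']} skipped")
```

Counting skipped cases as failures instead would give 18 of 20, so it is worth confirming against the Report which convention the benchmark uses.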

Files

Prompt
Report
Results JSON