Beta feature. The audit ships as beta while we collect early feedback. The detector catalog and report format may change before the next stable cut. Please open an issue if anything looks off.
The audit is now exposed as the /audit dashboard page, not a CLI subcommand. Open it from the dashboard navbar (between Policies and Projects), or visit http://localhost:8020/audit directly when running failproofai locally.
failproofai          # open the dashboard, then click "Audit"
The dashboard scans past agent CLI transcripts on this machine (Claude Code, Codex, Copilot, Cursor, OpenCode, Pi, Gemini) and reports how often the agent did things failproofai is built to stop — env-var checks, force pushes, redundant cd <cwd> prefixes, sleep-polling loops, re-reading files just edited, and more. For each transcript, every tool-use event is replayed through the 39 builtin policies and through 8 audit-only detectors that catch patterns not yet covered by runtime policies. Counts are aggregated per policy / detector across all sessions.
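The replay-and-count step described above can be pictured with a minimal sketch. This is not failproofai's implementation: the event shape (`tool`, `input`), the `audit` function, and the two toy checks are assumptions made for illustration only. It shows every event of every session passing through each check, with hits tallied per check name across all sessions.

```python
from collections import Counter
from typing import Callable, Iterable

# Hypothetical event shape: one dict per tool-use event in a transcript.
Event = dict
# Hypothetical check signature: True when the event matches the pattern.
Check = Callable[[Event], bool]


def audit(sessions: Iterable[list[Event]], checks: dict[str, Check]) -> Counter:
    """Replay every event through every check and tally hits per check name."""
    counts: Counter = Counter()
    for events in sessions:
        for event in events:
            for name, check in checks.items():
                if check(event):
                    counts[name] += 1
    return counts


# Two toy checks standing in for real policies and detectors.
checks = {
    "git-commit-no-verify": lambda e: e["tool"] == "Bash" and "--no-verify" in e["input"],
    "find-from-root": lambda e: e["tool"] == "Bash" and e["input"].startswith("find / "),
}

# Two made-up sessions, each a list of tool-use events.
sessions = [
    [
        {"tool": "Bash", "input": "git commit -m wip --no-verify"},
        {"tool": "Bash", "input": "ls"},
    ],
    [
        {"tool": "Bash", "input": "find / -name '*.log'"},
        {"tool": "Bash", "input": "git commit --no-verify -m fix"},
    ],
]

totals = audit(sessions, checks)
print(dict(totals))
```

Because the counter is keyed by check name rather than by session, a pattern that appears once in each of two sessions is reported as two occurrences.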

What you get

The /audit page composes six sections:
  1. Identity: your agent classified into one of 8 archetypes (optimist, cowboy, explorer, goldfish, paranoid architect, precision builder, hammer, ghost) based on the weighted signal across every audited transcript.
  2. Strengths: real numbers derived from the scan (clean-call %, “0 credential leaks”, etc.), gated on the relevant sanitize policies actually firing.
  3. Score: 0-100 with S/A/B/C/D/F bands and a projected uplift if every recommended policy were enabled.
  4. Findings: per-policy cards showing what happened, the cost, captured evidence, and the exact failproofai policy add <slug> command that enables the runtime builtin that would have caught it.
  5. Prescribed policies: aggregated install list with a one-shot failproofai policies --install command.
  6. Re-audit reminder: “come back better.” Set a 7-day email reminder via the api-server (requires sign-in; see failproofai auth).

Audit-only detectors

These detect “stupid behavior” patterns not (yet) enforced in real time. They run only during the audit and never block a live tool call.
Detector                    What it counts
redundant-cd-cwd            Bash commands starting with cd <cwd> && … even though commands already run in cwd.
prefer-edit-over-read-cat   cat/head/tail/less/more on a single source file; use the Read tool.
prefer-edit-over-sed-awk    sed -i / awk … > file in-place edits; use the Edit tool.
prefer-write-over-heredoc   Heredoc / multi-line echo > file writing files; use the Write tool.
sleep-polling-loop          Long sleep N (≥ 30s) or while …; sleep …; done polling loops.
find-from-root              find /, find /home, find /usr, etc.; scope to cwd instead.
git-commit-no-verify        git commit … --no-verify / -n, skipping hooks.
reread-after-edit           Read of a file that was just Edit/Write in the same session.
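To make the table concrete, here is a rough sketch of how four of these patterns could be recognised. The function names, the event fields (`tool`, `path`), and the matching rules are assumptions for illustration; the real detectors may match differently. In particular, "just Edit/Write" is read here as "the previous operation on that path in the session was an Edit or Write".

```python
import re
import shlex


def redundant_cd_cwd(command: str, cwd: str) -> bool:
    """True when the command begins with `cd <cwd> &&` (unquoted paths only)."""
    m = re.match(r"\s*cd\s+(\S+)\s*&&", command)
    return m is not None and m.group(1).rstrip("/") == cwd.rstrip("/")


def sleep_polling_loop(command: str, threshold: int = 30) -> bool:
    """True for `sleep N` with N >= threshold, or a while ... sleep ... done loop."""
    if re.search(r"\bwhile\b.*\bsleep\b.*\bdone\b", command, re.DOTALL):
        return True
    return any(int(n) >= threshold for n in re.findall(r"\bsleep\s+(\d+)\b", command))


def git_commit_no_verify(command: str) -> bool:
    """True for `git commit` carrying --no-verify or the short flag -n."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes: treat as no match in this sketch
        return False
    for i in range(len(tokens) - 1):
        if tokens[i] == "git" and tokens[i + 1] == "commit":
            return any(t in ("--no-verify", "-n") for t in tokens[i + 2:])
    return False


def count_reread_after_edit(events: list[dict]) -> int:
    """Count Read events whose previous touch of the same path was Edit or Write."""
    last_tool: dict[str, str] = {}
    hits = 0
    for event in events:
        path = event.get("path")
        if path is None:
            continue
        if event["tool"] == "Read" and last_tool.get(path) in ("Edit", "Write"):
            hits += 1
        last_tool[path] = event["tool"]
    return hits


session = [
    {"tool": "Read", "path": "app.py"},
    {"tool": "Edit", "path": "app.py"},
    {"tool": "Read", "path": "app.py"},   # counted: read right after an edit
    {"tool": "Read", "path": "app.py"},   # not counted: previous touch was a Read
]
print(count_reread_after_edit(session))
```

The first three checks are stateless and look at one command at a time, while reread-after-edit has to track per-file state across a session, which is why it takes the whole event list.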

Caches

  • Per-transcript cache at ~/.failproofai/cache/audit/<sha1>.json keyed by (mtime, size, engineVersion, detectorVersion). Invalidates automatically when policy or detector code changes.
  • Whole-result cache at ~/.failproofai/audit-dashboard.json (mode 0600). Lets the dashboard render instantly on navigation without re-running. Click [ re-audit now ] from the dashboard to refresh.
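The invalidation behaviour of the per-transcript cache can be sketched as follows. This is an assumption-laden illustration, not the actual cache code: it assumes the `<sha1>` in the file name is a hash of the transcript path and that the (mtime, size, engineVersion, detectorVersion) tuple is stored inside the entry and compared on load. The version strings and helper names are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path

ENGINE_VERSION = "engine-1"        # hypothetical version string
DETECTOR_VERSION = "detectors-1"   # hypothetical version string


def fingerprint(transcript: Path) -> list:
    """The (mtime, size, engineVersion, detectorVersion) tuple as a JSON-friendly list."""
    st = transcript.stat()
    return [st.st_mtime_ns, st.st_size, ENGINE_VERSION, DETECTOR_VERSION]


def cache_path(cache_dir: Path, transcript: Path) -> Path:
    """Assumed naming scheme: SHA-1 of the resolved transcript path."""
    digest = hashlib.sha1(str(transcript.resolve()).encode()).hexdigest()
    return cache_dir / f"{digest}.json"


def store(cache_dir: Path, transcript: Path, result: dict) -> None:
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {"fingerprint": fingerprint(transcript), "result": result}
    cache_path(cache_dir, transcript).write_text(json.dumps(entry))


def load(cache_dir: Path, transcript: Path):
    """Return the cached result, or None when missing or stale."""
    path = cache_path(cache_dir, transcript)
    if not path.exists():
        return None
    entry = json.loads(path.read_text())
    if entry["fingerprint"] != fingerprint(transcript):
        return None  # transcript or engine/detector version changed
    return entry["result"]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    transcript = root / "session.jsonl"
    transcript.write_text('{"tool": "Bash"}\n')
    store(root / "cache", transcript, {"find-from-root": 1})
    hit = load(root / "cache", transcript)    # fingerprint still matches
    transcript.write_text('{"tool": "Bash"}\n{"tool": "Read"}\n')  # size changes
    miss = load(root / "cache", transcript)   # stale entry is ignored
    print(hit, miss)
```

Including the engine and detector versions in the fingerprint is what lets a code change invalidate every entry without touching the transcripts themselves.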

Notes

  • No mutation. The audit replays in read-only mode. warn-repeated-tool-calls is skipped because its per-session sidecar would otherwise be modified.
  • Workflow policies skipped. require-*-before-stop policies fire only on Stop events and run execSync against the live git state, so they have no meaningful “what would have happened in 2025” interpretation and do not appear in audit counts.
  • Custom policies skipped. User-supplied custom hooks are not replayed (they may have changed since the original session).
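Taken together, the three notes above amount to a filter over the policy list before replay. A minimal sketch, assuming each policy is described by a dict with a `name` and an optional `custom` flag (both hypothetical, as are the example names other than warn-repeated-tool-calls):

```python
from fnmatch import fnmatchcase

# Skipped because replaying it would modify its per-session sidecar.
SKIP_EXACT = {"warn-repeated-tool-calls"}
# Skipped because they depend on Stop events and the live git state.
SKIP_GLOBS = ["require-*-before-stop"]


def is_replayable(policy: dict) -> bool:
    """True when the policy can safely be replayed against old transcripts."""
    if policy.get("custom", False):
        return False  # user-supplied hooks may have changed since the session
    name = policy["name"]
    if name in SKIP_EXACT:
        return False
    return not any(fnmatchcase(name, glob) for glob in SKIP_GLOBS)


# Example policies; all names except warn-repeated-tool-calls are made up.
policies = [
    {"name": "block-force-push"},
    {"name": "warn-repeated-tool-calls"},
    {"name": "require-commit-before-stop"},
    {"name": "my-team-hook", "custom": True},
]

replayed = [p["name"] for p in policies if is_replayable(p)]
print(replayed)
```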