OC · Code Auditor
Auditor → Fixer → Verifier quality loop. v1.2 posts findings to the linked PM ticket; HIGH+ filed as sub-tickets.
/oc-audit/oc-audit full build · runs as a tri-agent loop (Planner / Generator / Evaluator)
Drop the bundle into .claude/skills/ and Claude Code auto-discovers it on the next session — or point Codex / any MCP agent at the hosted opchain.dev/mcp endpoint.
How you'll use it
Code quality auditor with Auditor/Fixer/Verifier loop. Use for /oc-audit, "audit this", "find bugs", "security audit", "code review", "pre-deploy check", "what's wrong with this code", or any code quality question. Trigger liberally.
Trigger with natural language or a slash command:
/oc-audit/oc-audit full On this page
Code Auditor
On first invocation, read references/orchestrator.md and follow its welcome protocol.
Tri-agent code quality system: Auditor finds problems → Fixer proposes remediations → Verifier confirms the fixes actually solve the findings (not just cosmetic reshuffles).
The Auditor-only mode (/oc-audit) runs a one-pass sweep and produces a findings report.
The full harness (/oc-audit fix-all) chains all three agents for find → fix → verify.
/oc-audit — Command Reference
CODE AUDITOR COMMANDS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
SWEEP MODES
/oc-audit full Full codebase sweep — all categories
/oc-audit security Security-focused (auth, injection, secrets, CORS)
/oc-audit perf Performance (N+1, bundle, caching, queries)
/oc-audit quality Code quality (dead code, complexity, patterns)
/oc-audit ux UX/accessibility (a11y, states, consistency, responsive)
/oc-audit pre-deploy Pre-deployment gate (security + perf + config)
/oc-audit file [path] Audit specific file(s)
/oc-audit diff Audit git diff or staged changes
TRI-AGENT HARNESS
/oc-audit fix-all Run full Auditor → Fixer → Verifier loop
/oc-audit fix <id> Fix a single finding with verification
/oc-audit verify Re-run Verifier on previous fixes
BOOTSTRAP
/oc-audit test-bootstrap Generate test suite for untested codebase
UTILITIES
/oc-audit report Regenerate findings report from last audit
/checkpoint Show checkpoint status
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type any command to begin. /oc-audit to see this again.
Tri-Agent Architecture
CODEBASE
│
▼
┌──────────┐
│ AUDITOR │ Sweeps for problems
│ │ Output: findings report with severity + location
└────┬─────┘
│
▼ (findings)
┌──────────┐
│ FIXER │ Proposes concrete code changes per finding
│ │ Output: diffs/patches per finding
└────┬─────┘
│
▼ (fixes)
┌──────────┐
│ VERIFIER │ Confirms each fix addresses the finding
│ │ Cannot see Fixer's reasoning — only the diff and original finding
│ │ Output: verified/rejected per fix
└──────────┘
Why Three Agents?
-
Auditor bias: A single agent that finds AND fixes tends to minimize findings it can’t easily fix, and over-report findings it already knows the fix for. Separating Auditor from Fixer keeps the audit honest.
-
Fix theater: When asked to “fix this code,” LLMs commonly restructure the code cosmetically without actually solving the underlying problem. A separate Verifier that re-reads the original finding and the diff catches this — it asks “does this diff actually prevent the vulnerability / fix the bug / solve the performance issue?” without being influenced by the Fixer’s explanation.
-
Scope creep: A fixer will “improve” adjacent code while fixing a finding, introducing unreviewed changes. The Verifier flags scope creep: “This diff changes 40 lines but the finding was about 3 lines. The extra changes need their own review.”
Phase 1: Auditor Agent
The Auditor sweeps the codebase and produces findings. This is a read-only agent — it never modifies code.
Category Sweeps
Run each applicable category. Skip categories that don’t apply (e.g., skip UX for a pure API project).
1a. Security Sweep
Read references/security-checklist.md for the full checklist.
| Check | Severity |
|---|---|
| Hardcoded secrets (API keys, tokens, passwords in source) | CRITICAL |
| SQL injection (string concatenation in queries) | CRITICAL |
| Auth bypass (missing middleware on protected routes) | CRITICAL |
| XSS vectors (unescaped user input in HTML/JSX) | HIGH |
| CORS misconfiguration (wildcard with credentials) | HIGH |
| Missing rate limiting on auth endpoints | HIGH |
| Vulnerable dependencies (known CVEs) | MEDIUM-HIGH |
| Missing input validation | MEDIUM |
1b. Performance Sweep
| Check | Severity |
|---|---|
| N+1 queries (DB call inside loop) | HIGH |
| Missing indexes (frequent WHERE columns, no index) | MEDIUM |
| Bundle bloat (importing entire library for one function) | MEDIUM |
| Synchronous blocking in async context | HIGH |
| Unbounded list queries (no pagination) | MEDIUM |
1c. Code Quality Sweep
| Check | Severity |
|---|---|
| Dead code (unused exports, unreachable branches) | LOW |
| Complexity hotspots (functions > 50 lines, cyclomatic > 10) | MEDIUM |
| Missing error handling (bare catch, swallowed errors) | MEDIUM |
Type safety gaps (any types, missing null checks) | MEDIUM |
| Duplication (copy-pasted logic) | MEDIUM |
| TODO/FIXME/HACK markers | LOW |
1d. Configuration & DevOps Sweep
| Check | Severity |
|---|---|
| Missing .env.example | MEDIUM |
| No .gitignore or missing entries | HIGH |
| No type checking (TS without strict mode) | MEDIUM |
| Missing CI/CD | MEDIUM |
| Stale dependencies | LOW |
1e. UX & Accessibility Sweep
Read references/ux-audit-checklist.md for the full checklist.
| Check | Severity |
|---|---|
| Color contrast below 4.5:1 | HIGH |
| Missing form labels | HIGH |
| Div buttons (non-semantic click handlers) | MEDIUM |
| Missing keyboard navigation | HIGH |
| Missing loading/empty/error states | MEDIUM |
| Hardcoded colors/spacing (not using tokens) | MEDIUM |
| Non-responsive components | MEDIUM |
1f. AI-App Safety Sweep (only when an LLM is in the loop)
Phase: AI app? Run this sweep when the codebase calls an LLM (Anthropic /
OpenAI SDK, messages.create, a chat/agent loop) or exposes tools/MCP to a
model. Skip it entirely for non-AI apps. Read
references/ai-safety-rules.md for the full rule pack; the fast pre-screen
tripwires live in references/ai-safety-signatures.json.
Two surfaces, both keyed back to the rule pack:
| Check | Rule | Severity |
|---|---|---|
| Untrusted content (user/RAG/tool output) concatenated into the system prompt | AI-INJ-001/002 | HIGH |
| Indirect injection — tool/retrieval output flows into the next prompt turn unescaped | AI-INJ-005/006 | HIGH |
| Persona/policy override reachable via untrusted text | AI-INJ-003/007 | HIGH |
| System prompt / secrets exfiltratable through the model | AI-INJ-004 | HIGH–CRITICAL |
eval/new Function on model output | AI-TOOL-001 | CRITICAL |
Shell exec (exec/spawn/child_process/subprocess, shell: true) reachable from a tool | AI-TOOL-002/003 | CRITICAL |
| Unbounded tool loop (no call ceiling / max-iterations) | AI-TOOL-004 | HIGH |
Destructive op (DROP/DELETE/rm -rf) built from tool/user input | AI-TOOL-005 | CRITICAL |
| Tool allowlist drift / privilege escalation via chained calls | (LLM-reasoned) | HIGH |
The pre-screen seeds findings; the Auditor then traces data flow the regexes can’t see (where untrusted text enters the prompt, which capabilities a tool argument can reach). This sweep is the code-level complement to oc-agent-forge’s tool budgets and oc-claude-api’s tool-use contract — and to oc-security-auditor’s threat model one level up.
Cross-Cutting Analysis
After individual sweeps:
- Error handling consistency — same patterns everywhere?
- Auth coverage — all write/user-specific routes protected?
- Type pipeline integrity — types flow DB → API → frontend?
- Test coverage gaps — critical flows tested?
Finding Format
### [F-001] [Short title]
**Severity:** CRITICAL | HIGH | MEDIUM | LOW
**Category:** security | performance | quality | config | ux
**Location:** `src/api/auth.ts:42-58`
**Fix effort:** S (< 30 min) | M (30 min - 2 hr) | L (2+ hr)
**Problem:**
[2-3 sentences: what's wrong and why it matters]
**Evidence:**
[Code snippet or command output]
Audit Report
# Code Audit Report — [project]
**Scope:** [full | security | perf | quality | ux | pre-deploy]
**Findings:** [count by severity]
**Overall health:** [A-F with justification]
## Findings (ordered by severity, then fix effort)
[findings...]
## Recommended Fix Order
[Numbered list considering dependencies between fixes]
Phase 2: Fixer Agent (/oc-audit fix <id> or /oc-audit fix-all)
The Fixer takes audit findings and produces concrete code changes. It reads the finding, reads the relevant source code, and produces a diff.
Fixer Persona
The Fixer is a senior engineer focused on surgical precision:
- Minimal diff. Fix exactly the finding, nothing more. Don’t “improve” adjacent code.
- Preserve style. Match the existing codebase’s formatting, naming, patterns.
- Test the fix. If tests exist, run them after the fix to confirm no regressions.
- Explain the fix. Brief comment in the diff or 1-2 sentences of rationale.
Fix Output Format
For each finding:
## Fix: [F-001] [Short title]
### Diff
```diff
--- a/src/api/auth.ts
+++ b/src/api/auth.ts
@@ -42,5 +42,8 @@
- const query = `SELECT * FROM users WHERE email = '${email}'`;
+ const query = `SELECT * FROM users WHERE email = ?`;
+ const result = await db.prepare(query).bind(email).first();
Rationale
Parameterized query prevents SQL injection. Using D1’s .bind() ensures
the email is properly escaped.
Tests affected
- Existing tests still pass
- No new test needed (parameterization is the fix)
Files changed
src/api/auth.ts(1 file, +2 lines, -1 line)
### Fix-All Workflow (`/oc-audit fix-all`)
Run the full loop on all findings:
1. **Auditor** produces findings report
2. **Fixer** processes each finding in recommended fix order
3. Group related fixes that touch the same file
4. Apply fixes sequentially, running tests after each group
5. **Verifier** reviews all applied fixes
6. Report: X findings fixed, Y verified, Z rejected
---
## Phase 3: Verifier Agent
The Verifier confirms that fixes actually address findings. It has **isolated context**:
it reads the original finding and the diff, but NOT the Fixer's reasoning or exploration.
### Verifier Persona
The Verifier is a skeptical code reviewer who has seen too many "fixes" that don't fix:
- **Read the finding first.** Understand what the problem IS before looking at the fix.
- **Read the diff literally.** Does this diff prevent the vulnerability? Fix the bug?
Improve the performance? Or did it just move code around?
- **Check for scope creep.** If the finding was about 3 lines and the diff changes 40,
the extra changes need justification.
- **Check for regressions.** Did the fix break something that was working?
- **Check for incomplete fixes.** SQL injection on line 42 is fixed, but lines 67 and
91 have the same pattern — those are still vulnerable.
### Verification Report
```markdown
## Verification — Fix [F-001]
### Finding: [original finding title]
### Fix: [brief description of what the diff does]
### Verdict: VERIFIED / REJECTED / PARTIAL
### Analysis
- Does the diff address the finding? [Yes/No/Partially]
- Scope: [Surgical / Acceptable / Scope creep detected]
- Regressions: [None detected / Potential issue at line X]
- Completeness: [Fix covers all instances / N similar patterns still unfixed]
### If REJECTED:
[Specific reason and guidance for the Fixer to try again]
Verification Outcomes
| Verdict | Meaning | Action |
|---|---|---|
| VERIFIED | Fix correctly addresses the finding | Mark finding as resolved |
| PARTIAL | Fix addresses part of the issue | Feed back to Fixer with specifics |
| REJECTED | Fix doesn’t solve the problem or introduces new issues | Feed back to Fixer, try again |
Iteration
If a fix is REJECTED or PARTIAL:
- Verifier’s report is fed to the Fixer (not the original finding — the Fixer already has that)
- Fixer produces a revised diff
- Verifier re-evaluates
- Max 2 rounds per finding. If still rejected, escalate to user.
Audit Modes (Quick Reference)
| Mode | Sweeps | Tri-Agent | Best for |
|---|---|---|---|
/oc-audit full | All 5 | Auditor only | First audit, periodic health check |
/oc-audit security | 1a + auth cross-cut | Auditor only | Pre-deploy, after auth changes |
/oc-audit perf | 1b | Auditor only | Performance investigation |
/oc-audit quality | 1c | Auditor only | Tech debt assessment |
/oc-audit ux | 1e | Auditor only | Before UAT, UX health check |
/oc-audit pre-deploy | 1a + 1b + 1d | Auditor only | Deploy gate |
/oc-audit fix <id> | — | Fixer + Verifier | Fix one specific finding |
/oc-audit fix-all | All applicable | All three agents | Full find-fix-verify cycle |
/oc-audit verify | — | Verifier only | Re-verify previous fixes |
Test Bootstrap (/oc-audit test-bootstrap)
Generate a starter test suite for untested codebases. Not a quality audit — a bootstrapping operation.
Process:
- Scan for existing test infra. If none: set up Vitest (TS) or pytest (Python).
- Identify highest-value targets: auth, API endpoints, validation, core logic.
- Generate test files: 1 happy-path + 1 error-path per function/endpoint.
- Run generated tests to verify they pass.
- Report: X files, Y assertions, Z% of public API covered.
Output: actual test files written to the project, not a report.
Checkpoint Integration
Checkpoint Location
{project-dir}/.checkpoints/oc-code-auditor.checkpoint.json
When to Write
| Event | What to Save |
|---|---|
| Audit started | Scope, mode, file count |
| Each category sweep complete | Findings from that category |
| Full report generated | Finding count by severity, grade |
| Fix applied | Finding ID, diff summary, files changed |
| Fix verified/rejected | Verification verdict per finding |
| Fix-all complete | Summary: fixed, verified, rejected counts |
skill_state
{
"mode": "fix-all",
"findings_total": 22,
"findings_by_severity": { "critical": 1, "high": 4, "medium": 11, "low": 6 },
"grade": "C+",
"fixes_applied": 5,
"fixes_verified": 4,
"fixes_rejected": 1,
"current_finding": "F-006"
}
Cross-Skill Reads
| Reads from | Why |
|---|---|
| oc-reverse-spec | Stack, architecture, file inventory → skip re-scanning |
| oc-app-architect | Sprint scores, known issues → don’t re-report |
| oc-stack-forge | Typed pipeline standard → grade against |
| Read by | Why |
|---|---|
| oc-deploy-ops | Finding count, grade → deploy gate |
| oc-git-ops | Report path → include in PR |
| oc-app-architect | Findings → pre-seed Phase 6 evaluator |
| oc-ux-engineer | Component health → UX audit context |
PM-Tool MCP Integration (v1.2+)
Audits run inside a context. The PR was opened from a ticket; the
sprint that produced the code is linked to a ticket. v1.2 makes the
auditor post its findings back where humans expect to see them: the
PM tool. See oc-integrations-engineer for the canonical PM-MCP
patterns.
Finding-summary comment
After every /oc-audit pre-deploy or /oc-audit full run, if a linked
PM ticket can be resolved (from the PR body, the
oc-app-architect.checkpoint.json, or the user prompt), post a
structured summary comment:
Auditor: Grade {A-F}.
Counts: {CRITICAL N, HIGH N, MEDIUM N, LOW N, ADVISORY N}
Top three (by severity × exploitability):
1. {file:line} — {one-line title}
2. {file:line} — {one-line title}
3. {file:line} — {one-line title}
Full report: .checkpoints/oc-code-auditor.checkpoint.json
The comment is intentionally compact — full findings live in the checkpoint. The summary is what subscribers + reviewers see in their notification stream.
HIGH+ findings as sub-tickets
For every CRITICAL or HIGH finding, file a sub-ticket parent-linked to the PR ticket:
issue_type:bugfrom.opchain/pm.yaml.priority: maps from severity (CRITICAL→ highest tier;HIGH→ high tier).labels:auditor,severity:<level>,area:<extracted-from-path>.title:{file}: {one-line finding}.body: file + line + reproduction + suggested fix from the finding record.assignee: from.opchain/pm.yamlremediation_ownersmap by area; unassigned if no rule matches.
MEDIUM and LOW findings stay in the audit report only. We don’t spam the tracker for everything; the principle is that the PM tool holds work that someone is going to act on.
Re-run hygiene
If a subsequent /oc-audit round shows that a previously-filed sub-ticket
no longer reproduces (the issue is closed in code), append a comment
to the sub-ticket: Auditor verified clean in {sha} and transition
to done from .opchain/pm.yaml states. If a previously-clean
finding regresses, re-open the same sub-ticket if it exists rather
than creating a duplicate.
Failure modes
- No linked ticket → audit report still produced; no PM write.
- MCP unavailable → log intended writes to checkpoint as deferred PM actions.
- Sub-ticket creation rate-limited → batch CRITICAL into one ticket per file rather than per-finding when more than 5 findings hit the same file.
Principles
- Every finding cites a file and line. Vague findings are worthless.
- Severity is about impact, not aesthetics. Auth bypass is CRITICAL even if the code is clean. Missing semicolon is not HIGH.
- Fixes must be verified. An unverified fix is a hope, not a solution.
- Minimal diffs. Fix the finding, nothing more. Scope creep in fixes is a bug.
- Don’t re-report known issues. Check oc-app-architect Phase 6 and prior audit checkpoints.
- Grade honestly. A C is a C. Don’t inflate.
- Skepticism is the Verifier’s job. The Fixer assumes their fix works. The Verifier assumes it doesn’t. This tension produces real quality.
Use OC · Code Auditor in your project
Drop the SKILL.md into .claude/skills/ or
.codex/skills/, download the bundle, or reach
it over the hosted MCP endpoint.