product-data-capture-analysis
Jochem de Boer edited this page 2026-04-13 14:33:49 +00:00

Data Capture Analysis

Analysis of what data claude-permit captures today, where the gaps are, and what needs to change to support the observability user stories.

Date: 2026-04-09

Current Data Sources

| Source | Format | Location | Purpose |
|---|---|---|---|
| Audit log | JSONL (append-only) | ~/.claude/claude-permit/audit.jsonl | Every decision logged |
| Auto-rules | TOML | ~/.claude/claude-permit/auto-rules.toml | LLM-promoted allow rules |
| Pending rules | JSONL (transient) | ~/.claude/claude-permit/pending-rules.jsonl | YELLOW stashes awaiting confirmation |
| Config | TOML | ~/.claude/claude-permit/config.toml | Manual rules + settings |

Audit Log Fields

Every entry always contains: timestamp, event, session_id, tool_name, tool_input (truncated to 256 chars), cwd, tier, decision.

Conditional fields:

| Field | When Present | Content |
|---|---|---|
| rule_matched | Rule fires | Positional ID like "deny[2]: tool=Bash" |
| llm_score | LLM evaluation | GREEN / YELLOW / RED |
| llm_reasoning | LLM evaluation | Explanation text |
| llm_latency_ms | LLM evaluation | Call duration in milliseconds |
| tool_success | PostToolUse | Boolean |
| tool_result_length | PostToolUse | Byte count |
| tool_result_preview | PostToolUse | First 256 chars |
| auto_promoted | Rule promoted | Boolean |
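Putting the always-present and conditional fields together, a rule-matched entry could look like the following (illustrative only; the values and the exact JSON shape of tool_input are hypothetical, and the log stores one object per line rather than pretty-printed):

```json
{
  "timestamp": "2026-04-09T12:00:00Z",
  "event": "PreToolUse",
  "session_id": "abc123",
  "tool_name": "Bash",
  "tool_input": "{\"command\": \"rm -rf build/\"}",
  "cwd": "/home/user/project",
  "tier": "deny",
  "decision": "deny",
  "rule_matched": "deny[2]: tool=Bash"
}
```

An LLM-evaluated entry would instead carry llm_score, llm_reasoning, and llm_latency_ms, and a PostToolUse entry the tool_result_* fields.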

Data Availability per User Story

Already captured and sufficient

| User Story | Data | Notes |
|---|---|---|
| US-4: Why was I blocked? | rule_matched, tier, decision | Works, but only shows a positional ID, not the rule's purpose |
| US-5: LLM reasoning | llm_score, llm_reasoning | Captured well, but only in the log; not surfaced at decision time |
| US-7: LLM failing? | decision=error_passthrough | Captured, but no alerting |
| US-8: LLM latency | llm_latency_ms | Good data, no aggregation |
| US-13: Tool breakdown | tool_name on every entry | Good data, no aggregation |

Captured but with gaps

| User Story | What's There | What's Missing |
|---|---|---|
| US-1/2: Session scorecard | Every decision has session_id | No session start/end events; we can count per session but can't tell when a session ended to trigger a summary |
| US-6: Auto-promote notify | auto_promoted=true in log | Not surfaced in real time; auto-rules.toml lacks a link to the LLM reasoning or decision tier |
| US-10: Rule hit counts | rule_matched identifies rule | Rules have no names/IDs, only positional indices like allow[3]; reordering the config breaks traceability |
| US-11: Auto-rule provenance | auto-rules.toml has a timestamp comment | LLM reasoning is lost when written to TOML; no session_id, no decision tier, no audit link |
| US-12: Near misses | Tool input logged for passthrough | Only the winning rule is recorded; which rules were checked and almost matched is not logged |
| US-14: Frequent commands | tool_input logged (truncated) | No aggregation; requires parsing JSON, extracting fields, and grouping manually |

Not captured at all

| User Story | What's Needed | Gap |
|---|---|---|
| US-3: Trends over time | Time series of decisions by type | No aggregation layer; the raw data is there but there is no way to compute trends |
| US-9: Too many passthroughs? | Ratio of passthrough vs. matched | No threshold/alerting; passthrough is also ambiguous (gap vs. expected) |
| US-15: Compare sessions | Session-level aggregates | No session lifecycle, no aggregation |
| US-16: Security digest | Periodic aggregation | No scheduled/periodic reporting |
| US-17: Trust timeline | Auto-rules change history | auto-rules.toml is overwritten, not versioned; only a per-rule timestamp comment |
| US-18: Health check | Status verification | No status command; validate checks the config but not hooks or binary invocation |

Structural Issues

1. Rules need identity

Rules are anonymous regex patterns identified by array index (deny[2]). Reordering the config shifts the indices. For any story involving "which rule did X" (US-4, US-10, US-11, US-12), we need named rules: an optional name field in TOML.

Tracked in: Issue #2 — Add optional name field to rules
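A minimal sketch of what a named rule could look like. The name key is the proposed addition; the pattern key and the exact table layout are assumptions about the current config schema, shown only to illustrate where the name would sit:

```toml
# Hypothetical config.toml fragment: same rule as today's deny[2],
# but addressable by name instead of position.
[[deny]]
name = "block-destructive-bash"
tool = "Bash"
pattern = "rm -rf"
```

With this, rule_matched in the audit log could record "deny:block-destructive-bash" and survive any reordering of the config.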

2. Auto-rules.toml loses provenance

When a rule is promoted, the LLM reasoning, session_id, decision tier, and audit entry link are all discarded. The TOML comment is the only breadcrumb. For US-11, we should embed provenance metadata.

Tracked in: Issue #3 — Enrich auto-rules.toml with provenance
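One possible shape for an enriched auto-rule entry. Everything below the pattern line is a proposed addition; the field names and the rule-table layout are hypothetical:

```toml
# Hypothetical auto-rules.toml fragment with embedded provenance
# instead of a lone timestamp comment.
[[allow]]
name = "auto-cargo-check"
tool = "Bash"
pattern = "cargo check"
# Provenance (proposed fields)
promoted_at = "2026-04-09T12:00:00Z"
session_id = "abc123"
llm_score = "GREEN"
llm_reasoning = "Read-only build check; no writes outside target/"
```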

3. No aggregation layer

Every user story about trends, comparisons, or summaries (US-1, US-2, US-3, US-9, US-15, US-16) hits the same wall: the audit log is append-only with no reader. The data is there, but there is no claude-permit stats command. This is the single biggest gap: not in data capture, but in data consumption.

Tracked in: Issue #1 — Add stats subcommand
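To show how little is needed, a stats prototype could fold the audit log into per-decision and per-tool counts. The field names follow the tables above; the function itself is an illustrative sketch, not the proposed CLI:

```python
import json
from collections import Counter
from pathlib import Path

def audit_stats(path: Path) -> dict:
    """Fold an audit.jsonl file into per-decision and per-tool counts."""
    decisions, tools = Counter(), Counter()
    with path.open() as log:
        for line in log:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the append-only log
            entry = json.loads(line)
            decisions[entry.get("decision", "unknown")] += 1
            tools[entry.get("tool_name", "unknown")] += 1
    return {"decisions": dict(decisions), "tools": dict(tools)}
```

The same fold extended with a groupby on session_id would cover the session scorecard stories, and the passthrough-to-matched ratio for US-9 falls straight out of the decisions counter.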


Recommendations

  1. Add a stats subcommand — reads audit.jsonl, computes aggregates. Unlocks US-1, 2, 3, 8, 9, 10, 13, 14, 15.
  2. Add optional name field to rules — small schema change, big traceability win. Unlocks US-4, 10.
  3. Enrich auto-rules.toml with provenance — add reasoning, tier, session_id. Unlocks US-11.
  4. Add a status subcommand — checks binary, config, hooks, last audit timestamp. Unlocks US-18.
  5. Surface LLM reasoning at decision time — hook output JSON message field for YELLOW/RED. Unlocks US-5.
  6. Log deny_promote blocks — currently silent. Unlocks US-12 and general transparency.
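A status subcommand along the lines of recommendation 4 could start as a handful of file checks. The paths come from the data-sources table above; hook and binary verification are omitted here because their mechanics are installation-specific, so this is only a sketch of the file-level half:

```python
import time
from pathlib import Path

# Default data directory, per the data-sources table.
BASE = Path.home() / ".claude" / "claude-permit"

def status_report(base: Path = BASE) -> dict:
    """Report config/audit presence and how stale the audit log is."""
    audit = base / "audit.jsonl"
    report = {
        "config_present": (base / "config.toml").exists(),
        "audit_present": audit.exists(),
        "last_audit_age_s": None,
    }
    if report["audit_present"]:
        # Seconds since the log was last appended to.
        report["last_audit_age_s"] = round(time.time() - audit.stat().st_mtime)
    return report
```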