Table of Contents
- Observability User Stories
- Context
- Priority Tier 1 — High Value
- US-1: Session Scorecard
- US-2: Session Summary
- US-4: Understand Denials
- US-5: See LLM Reasoning
- US-6: Auto-Promote Notifications
- US-10: Rule Hit Counts
- US-18: Health Check
- Priority Tier 2 — Important
- US-3: Trends Over Time
- US-7: LLM Health
- US-8: LLM Latency Trends
- US-9: Passthrough Alerts
- US-11: Auto-Rule Provenance
- US-14: Frequent Commands/Paths
- Priority Tier 3 — Nice to Have
- US-12: Near Misses
- US-13: Tool Type Breakdown
- US-15: Compare Sessions
- US-16: Security Digest
- US-17: Trust Timeline
- Related
Observability User Stories
This document collects research into what users need to understand about claude-permit's runtime behavior. The user stories below inform the observability features on the roadmap.
Date: 2026-04-09
Context
claude-permit works silently by design — it makes hundreds of decisions per session, but the user sees almost nothing beyond the occasional permission dialog. Everything is captured in audit.jsonl, but that's a raw JSONL file requiring jq to query. These user stories capture what users actually want to know.
Priority Tier 1 — High Value
US-1: Session Scorecard
As a user, I want to see how many permission prompts claude-permit saved me, so I know if it's worth keeping installed.
Example: "You were auto-approved 342 times today, asked 4 times, blocked 2."
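A minimal sketch of how such a scorecard could be derived from audit.jsonl. The field name `decision` and its values (`allow`, `ask`, `deny`) are assumptions made for illustration; the actual audit schema is not described in this document.

```python
import json
from collections import Counter


def scorecard(lines):
    """Tally decisions from an iterable of JSONL lines.

    Assumes each record carries a hypothetical "decision" field.
    Records without it are counted under "unknown".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        counts[record.get("decision", "unknown")] += 1
    return counts


def format_scorecard(counts):
    """Render the tallies as a one-line summary."""
    return (
        f"auto-approved: {counts['allow']}, "
        f"asked: {counts['ask']}, "
        f"blocked: {counts['deny']}"
    )


# Hypothetical sample records, not real audit output.
sample = [
    '{"tool": "Bash", "decision": "allow"}',
    '{"tool": "Read", "decision": "allow"}',
    '{"tool": "Write", "decision": "ask"}',
    '{"tool": "Bash", "decision": "deny"}',
]
print(format_scorecard(scorecard(sample)))
```

In practice the lines would come from iterating over the opened audit.jsonl file instead of the in-memory sample.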
US-2: Session Summary
As a user, I want to see a session summary when a Claude Code session ends — a quick scorecard of what happened: total tool calls, auto-approved, prompted, denied.
US-4: Understand Denials
As a user, when a tool call is blocked, I want to understand why — which rule matched, what the tool was trying to do, and how to override it if it's a false positive.
US-5: See LLM Reasoning
As a user, when the LLM makes a decision, I want to see its reasoning — not buried in a log file, but surfaced at the moment it matters (especially for YELLOW/RED).
US-6: Auto-Promote Notifications
As a user, I want to know when a rule was auto-promoted — something just got permanently added to my allow list; I should be aware of that.
US-10: Rule Hit Counts
As a user, I want a dashboard of my rules and how often each fires — which rules are doing heavy lifting, which are dead weight.
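One way this could be approached is to count matches per rule and seed the tally with every configured rule, so rules that never fire show up with a zero. The `rule_id` field and the rule names below are hypothetical; the real audit schema may differ.

```python
import json
from collections import Counter


def rule_hit_counts(lines, known_rules):
    """Return (rule, hits) pairs, most frequently hit first.

    known_rules seeds the tally with zeros so that rules which never
    fire ("dead weight") still appear in the result.
    Assumes a hypothetical "rule_id" field on each audit record.
    """
    hits = Counter({rule: 0 for rule in known_rules})
    for line in lines:
        if not line.strip():
            continue
        rule = json.loads(line).get("rule_id")
        if rule is not None:  # records decided without a rule are skipped
            hits[rule] += 1
    return hits.most_common()


# Hypothetical rule names and records for illustration only.
rules = ["allow-git-status", "deny-rm-rf", "allow-ls"]
log = [
    '{"rule_id": "allow-git-status"}',
    '{"rule_id": "allow-git-status"}',
    '{"rule_id": "deny-rm-rf"}',
    '{"decision": "ask"}',
]
for rule, count in rule_hit_counts(log, rules):
    print(f"{rule}: {count}")
```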
US-18: Health Check
As a user, I want to know if claude-permit is even running — a simple health check or status indicator confirming hooks are wired up and the binary is responding.
Priority Tier 2 — Important
US-3: Trends Over Time
As a user, I want to see cumulative stats over time — am I getting fewer prompts as auto-rules accumulate? Is the system learning?
US-7: LLM Health
As a user, I want to know if the LLM is slow or failing — if LLM calls are timing out or erroring, I'm getting error_passthrough (fail-open) and losing the safety net without knowing it.
US-8: LLM Latency Trends
As a user, I want to see LLM latency trends — is Haiku adding 200ms or 2000ms to my workflow? Is it getting worse?
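A rough sketch of a latency summary, assuming each LLM-decided audit record carries a hypothetical `llm_latency_ms` field. Median and 95th percentile are used here as an illustrative choice, since a single average would hide occasional multi-second calls.

```python
import json
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_summary(lines):
    """Summarize LLM latency from JSONL lines.

    Assumes a hypothetical "llm_latency_ms" field; records without it
    (for example, decisions made by static rules) are ignored.
    Returns None when no record has a latency value.
    """
    latencies = []
    for line in lines:
        if not line.strip():
            continue
        value = json.loads(line).get("llm_latency_ms")
        if value is not None:
            latencies.append(value)
    if not latencies:
        return None
    return {
        "count": len(latencies),
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
    }


# Hypothetical records for illustration only.
log = [
    '{"llm_latency_ms": 200}',
    '{"llm_latency_ms": 250}',
    '{"llm_latency_ms": 300}',
    '{"llm_latency_ms": 2000}',
    '{"rule_id": "allow-ls"}',
]
print(latency_summary(log))
```

Running the same summary per day or per session and comparing the results would give the trend this story asks for.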
US-9: Passthrough Alerts
As a user, I want to be alerted if a suspiciously large number of operations are passing through unmatched — that might mean my rules are stale or misconfigured.
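As an illustration, an alert could compare the share of unmatched operations against a threshold. The `decision` field, the `passthrough` value, and the 20% default threshold are all assumptions for this sketch, not documented behavior.

```python
import json


def passthrough_ratio(lines):
    """Fraction of records whose decision is a passthrough.

    Assumes a hypothetical "decision" field with the value
    "passthrough" for operations that matched no rule.
    Returns 0.0 for an empty log.
    """
    total = 0
    passthrough = 0
    for line in lines:
        if not line.strip():
            continue
        total += 1
        if json.loads(line).get("decision") == "passthrough":
            passthrough += 1
    return passthrough / total if total else 0.0


def should_alert(lines, threshold=0.2):
    """True when the passthrough share exceeds the (assumed) threshold."""
    return passthrough_ratio(lines) > threshold


# Hypothetical records for illustration only.
log = [
    '{"decision": "allow"}',
    '{"decision": "passthrough"}',
    '{"decision": "passthrough"}',
    '{"decision": "deny"}',
]
print(passthrough_ratio(log), should_alert(log))
```

A sensible threshold would likely depend on how complete the user's rule set is, so it is left as a parameter.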
US-11: Auto-Rule Provenance
As a user, I want to see which auto-promoted rules exist and how they got there — with the original LLM reasoning, so I can decide if I trust them.
US-14: Frequent Commands/Paths
As a user, I want to see which commands/paths are most frequent — hot spots in my workflow that might deserve explicit rules.
Priority Tier 3 — Nice to Have
US-12: Near Misses
As a user, I want to see "near misses" — operations that almost matched a deny rule, or that the LLM scored borderline YELLOW/GREEN.
US-13: Tool Type Breakdown
As a user, I want a breakdown by tool type — how many Bash calls vs. Read vs. Write? Are certain tools dominating?
US-15: Compare Sessions
As a user, I want to compare sessions — did this session behave differently from yesterday's? More denials? Different tools?
US-16: Security Digest
As a user, I want a periodic security digest — weekly summary of all RED/YELLOW decisions, auto-promotions, and anything unusual.
US-17: Trust Timeline
As a user, I want to see a "trust timeline" — how my rule set has evolved over time as auto-promotion adds rules.
Related
- Data Capture Analysis — maps these stories to available data and identifies gaps
- Issues tracking implementation: see labels observability + enhancement