Field Note — Verification Architecture / April 2026

From Bash Script to Protocol

The stellaraccident report showed what happens when agent behavior degrades without a verification layer. Here's the concrete architecture that turns forensic reconstruction into real-time detection — using a context manager, a local signing handler, and cryptographic behavioral attestation.

ZH · nukez.xyz · github.com/nukez-xyz · Refs: claude-code #42796

In the previous posts, we analyzed what went wrong: 234,760 tool calls revealing a 70% drop in research-before-editing, a fleet failure costing $42,000 in wasted compute, and a model that couldn't tell from the inside whether it was thinking deeply or not.

Now let's build the thing that fixes it.

The solution has three components: a context manager that wraps agent sessions, a local signing handler that holds key material and never transmits it, and the NukezAgent service that handles attestation and on-chain anchoring. Together, they replace forensic session-log analysis with real-time cryptographic behavioral verification.

The Problem, Restated as Architecture

stellaraccident's stop-phrase-guard.sh was a behavioral monitor. It watched agent outputs for known failure patterns and forced corrections. It caught 173 violations. It worked.

But it operated at the wrong level of abstraction:

stop-phrase-guard.sh	SessionGuard + NukezAgent
Catches symptoms after they occur	Detects causes in real time
String-matching on known phrases	Behavioral metrics from attested data
No attestation — can't prove violations happened	Cryptographic proof chain on every checkpoint
No baseline — can't quantify drift	Verified baseline from previous session
Single-session — no fleet coordination	Fleet-wide correlated drift detection
Reactive — fires after the damage is done	Proactive — alerts before waste accumulates

Architecture

Three components, clean separation of concerns, private keys never leave the developer's machine:

┌─────────────────────────────────────────────────────────────────┐
│  Developer's Machine                                            │
│                                                                 │
│  SessionGuard  ←  wraps agent session (Claude Code, etc.)       │
│      │  records tool calls → computes behavioral metrics        │
│      │  detects drift against attested baseline                 │
│      │                                                          │
│  SigningHandler  ←  holds Ed25519/secp256k1 key material        │
│      │  signs behavioral snapshots (ADR-4 OEV pre-commitment)   │
│      │  computes content hash BEFORE transmission               │
│      │  private key never leaves this box                       │
│      │                                                          │
└──────┼──────────────────────────────────────────────────────────┘
       │  signed snapshot + content hash
       ▼
┌─────────────────────────────────────────────────────────────────┐
│  NukezAgent (hosted service)                                    │
│      │  stores behavioral snapshot in developer's locker        │
│      │  computes merkle root                                    │
│      │  anchors to Solana/Monad                                 │
│      │  returns receipt_id                                      │
│      │                                                          │
│  Two independent attestations:                                  │
│    1. Agent's Ed25519 signature  (proves authorship)            │
│    2. On-chain merkle root       (proves existence at time T)   │
│  Both must agree. Neither party can modify unilaterally.        │
└─────────────────────────────────────────────────────────────────┘

The signing handler implements Operator-Excluded Verification (ADR-4 OEV). The developer computes a content hash of the behavioral snapshot before transmitting it to NukezAgent. Because that hash is already signed and included in the envelope, NukezAgent cannot modify the data after receiving it without the mismatch being detectable. Nukez is mathematically excluded from its own trust chain.

What It Looks Like in Code

from nukez_session_guard import SessionGuard, SigningHandler, DriftThresholds
from pynukez.auth import Keypair

# Local signing — key material stays on your machine
signer = SigningHandler(Keypair.from_file("~/.config/solana/id.json"))

# Configure drift thresholds calibrated from the stellaraccident data
thresholds = DriftThresholds(
    read_edit_ratio_drop=0.30,    # alert if ratio drops 30%+
    edit_without_read_rise=0.15,  # alert if blind edits rise 15%+
    interrupt_rate_rise=3.0,      # alert if interrupts 3x baseline
)

async with SessionGuard(
    signing_handler=signer,
    nukez_agent_url="https://aaap.nukez.xyz",
    project_id="iree-loom",
    drift_thresholds=thresholds,
    on_drift=lambda d: print(f"⚠ DRIFT: {d.summary}"),
) as guard:

    # Your agent session runs here.
    # Every tool call gets recorded:
    guard.record_read("src/compiler/passes.cc")
    guard.record_read("src/compiler/utils.h")
    guard.record_grep("rg 'class PassManager' --type cpp")
    guard.record_read("tests/compiler_test.cc")
    guard.record_edit("src/compiler/passes.cc", edit_type="surgical")

    # read:edit ratio (grep counted as research) = (3 reads + 1 grep) / 1 edit = 4.0
    # Baseline from last session was 6.6
    # Drift: -39% → ALERT fires

    drift = guard.current_drift()
    if drift and drift.any_alert:
        print(drift.summary)
        # "1 alert(s): read:edit ratio dropped 39%
        #  (6.6 → 4.0)"

# Session record attested on exit:
print(f"Receipt: {guard.session_receipt_id}")
print(f"Baseline verified: {guard.baseline_receipt_id}")

How It Catches Each Failure from the Report

Let's walk through the specific failure modes stellaraccident documented and show where SessionGuard would have intervened.

Failure 1: Read:Edit Ratio Collapse (6.6 → 2.0)

Session Start — Baseline Verification

SessionGuard recalls the previous session's attested baseline from NukezAgent. Baseline read:edit ratio: 6.6. Verified against on-chain merkle root. Passes.

Tool Call #12 — First Checkpoint

After 12 tool calls, current read:edit ratio: 4.8. Drift: −27%. Below the 30% threshold. No alert, but the trend is visible in the checkpoint data.

Tool Call #28 — Drift Alert

Current read:edit ratio: 3.1. Drift: −53%. Alert fires. The developer is notified before a single wrong edit ships. In the stellaraccident timeline, this is mid-February — six weeks before the forensic analysis would discover the problem.

The drift detection is calibrated from real data. A 30% drop threshold would have caught the transition period (6.6 → 2.8) — weeks before the full degradation to 2.0 that prompted the investigation.
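The percentages in this timeline follow from a plain relative-change calculation. Here is a minimal sketch, assuming drift is measured as (current − baseline) / baseline and that an alert fires once the drop reaches the threshold. The function names are hypothetical and are not taken from the SessionGuard API.

```python
def relative_drift(baseline: float, current: float) -> float:
    """Signed relative change from baseline (negative means a drop)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def ratio_drop_alert(baseline: float, current: float, max_drop: float = 0.30) -> bool:
    """True when the ratio has fallen by at least max_drop relative to baseline."""
    return relative_drift(baseline, current) <= -max_drop


# Checkpoint values from the walkthrough, against the attested baseline of 6.6.
for current in (4.8, 3.1):
    drift = relative_drift(6.6, current)
    print(f"ratio {current}: drift {drift:+.0%}, alert={ratio_drop_alert(6.6, current)}")
```

With a baseline of 6.6, a ratio of 4.8 stays under the 30% threshold, while 3.1 crosses it.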

Failure 2: Blind Edits (6.2% → 33.7%)

Every tool call records whether the target file was read before being edited. The had_prior_read flag is computed from the session's own event history — not inferred from logs after the fact. When blind edit rate rises above the threshold, the alert includes the specific ratio change and the attested baseline it's measured against.
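One way to derive a had_prior_read flag from the session's own event history is sketched below. The class and its fields are illustrative assumptions, not the actual SessionGuard internals.

```python
from dataclasses import dataclass, field


@dataclass
class BlindEditTracker:
    """Tracks which files were read before being edited in one session."""
    files_read: set = field(default_factory=set)
    edits: int = 0
    blind_edits: int = 0

    def record_read(self, path: str) -> None:
        self.files_read.add(path)

    def record_edit(self, path: str) -> bool:
        """Record an edit and return the had_prior_read flag for it."""
        had_prior_read = path in self.files_read
        self.edits += 1
        if not had_prior_read:
            self.blind_edits += 1
        return had_prior_read

    @property
    def blind_edit_rate(self) -> float:
        return self.blind_edits / self.edits if self.edits else 0.0


tracker = BlindEditTracker()
tracker.record_read("src/compiler/passes.cc")
tracker.record_edit("src/compiler/passes.cc")  # read earlier in the session
tracker.record_edit("src/compiler/utils.h")    # never read: counts as blind
print(tracker.blind_edit_rate)
```

The flag depends only on the order of recorded events, so it can be computed live at each tool call rather than reconstructed from logs.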

Key Difference from the Bash Script

The bash script couldn't detect blind edits at all. It caught phrases like “good stopping point” but had no visibility into whether the model read a file before editing it. SessionGuard tracks the actual tool call sequence, not the model's verbal claims about its process.

Failure 3: Fleet-Wide Correlated Degradation

The FleetMonitor layer sits above individual SessionGuards. When multiple sessions show simultaneous drift, it raises a fleet-level alert:

from nukez_session_guard import FleetMonitor

fleet = FleetMonitor(
    alert_threshold=0.5,  # alert when 50%+ of sessions show drift
    on_fleet_alert=lambda msg, details: (
        print(f"🚨 {msg}"),
        pause_all_agents(),
    ),
)

# Register each concurrent agent session
for project in ["iree-loom", "iree-amdgpu", "iree-remoting", "bureau"]:
    guard = await start_guarded_session(project)
    fleet.register(guard)

# Later: check fleet health
health = fleet.fleet_health()
# {
#   "total_sessions": 10,
#   "drifting_sessions": 8,
#   "drift_fraction": 0.8,
#   "fleet_alert": "CORRELATED DRIFT: 8/10 sessions (80%)
#     showing behavioral drift. Possible model-level regression."
# }

This is the detection that would have saved $42,000. In the stellaraccident scenario, 10+ agents degraded simultaneously over the course of March. Without fleet monitoring, the user discovered the problem only after weeks of accumulated waste. With FleetMonitor, the correlated drift triggers within hours — as soon as enough sessions cross their individual drift thresholds to establish a pattern.
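The fleet-level check reduces to counting how many sessions have crossed their own thresholds. The sketch below assumes each session exposes a single boolean drift flag. The function and the dictionary keys mirror the example output above but are hypothetical, not the FleetMonitor implementation.

```python
def fleet_health(session_drifting: dict, alert_threshold: float = 0.5) -> dict:
    """Summarize per-session drift flags into a fleet-level health report."""
    total = len(session_drifting)
    drifting = sum(1 for flagged in session_drifting.values() if flagged)
    fraction = drifting / total if total else 0.0
    alert = None
    if total and fraction >= alert_threshold:
        alert = f"CORRELATED DRIFT: {drifting}/{total} sessions ({fraction:.0%})"
    return {
        "total_sessions": total,
        "drifting_sessions": drifting,
        "drift_fraction": fraction,
        "fleet_alert": alert,
    }


# Hypothetical fleet: 8 of 10 sessions have crossed their own thresholds.
flags = {f"session-{i}": i < 8 for i in range(10)}
report = fleet_health(flags)
print(report["fleet_alert"])
```

A single drifting session may be a hard task. A majority drifting at once is better explained by something the sessions share, such as the model.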

Detection time without SessionGuard: 3 weeks
Detection time with fleet monitoring: < 1 day
Estimated waste prevented: ~$41,000

Failure 4: The Model Can't Verify Itself

Claude wrote: “I cannot tell from the inside whether I am thinking deeply or not.” SessionGuard doesn't try to solve this from the inside. It provides external verification — behavioral metrics computed from attested tool call data, compared against cryptographically verified baselines.

The agent doesn't need to introspect. The proof chain does the work: content hashes → merkle tree → on-chain root. The question “is this agent operating within its baseline parameters?” has a binary answer backed by mathematics, not by the agent's self-report.
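To make the content hashes → merkle tree → root step concrete, here is a small sketch of folding checkpoint hashes into one root. It assumes SHA-256 leaves, pairwise concatenation, and duplication of the last node on odd-sized levels. The merkle construction Nukez actually publishes may differ on any of these points.

```python
import hashlib


def merkle_root(leaf_hashes: list) -> bytes:
    """Fold a list of 32-byte leaf hashes into a single root hash."""
    if not leaf_hashes:
        raise ValueError("need at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical checkpoint payloads, hashed into leaves.
checkpoints = [b'{"read_edit_ratio":6.6}', b'{"read_edit_ratio":4.8}', b'{"read_edit_ratio":3.1}']
leaves = [hashlib.sha256(c).digest() for c in checkpoints]
root = merkle_root(leaves)

# Changing any one checkpoint produces a different root.
tampered = [hashlib.sha256(b'{"read_edit_ratio":6.0}').digest()] + leaves[1:]
print(root != merkle_root(tampered))
```

Only the 32-byte root needs to be anchored on-chain, and it still commits to every checkpoint beneath it.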

The Signing Flow

The critical property: private keys never leave the developer's machine. NukezAgent never sees key material. It receives only signed artifacts and content hashes.

import hashlib
import json

# 1. SessionGuard computes behavioral snapshot
snapshot = metrics.snapshot()
# → BehavioralSnapshot(read_edit_ratio=4.8, blind_edit_rate=0.12, ...)

# 2. SigningHandler computes ADR-4 OEV pre-commitment
#    Content hash computed LOCALLY before transmission
canonical = json.dumps(snapshot.to_dict(), sort_keys=True, separators=(",", ":"))
canonical_bytes = canonical.encode("utf-8")  # hash and sign bytes, not str
content_hash = hashlib.sha256(canonical_bytes).hexdigest()
signature = local_keypair.sign(canonical_bytes)  # assumes sign() accepts bytes

# 3. Payload sent to NukezAgent includes pre-commitment
payload = {
    "behavioral_snapshot": snapshot.to_dict(),
    "oev": {
        "content_hash": content_hash,    # ← computed BEFORE transmission
        "signer_identity": pubkey,
        "signature": signature,
    }
}

# 4. NukezAgent stores, computes merkle root, anchors on-chain
# 5. Two independent attestations now exist:
#    - Agent's Ed25519 signature (proves agent authored this data)
#    - On-chain merkle root      (proves data existed at time T)
# If NukezAgent modifies the data after receiving it,
# the pre-commitment hash won't match. Tampering is detectable.

This is Operator-Excluded Verification. The developer computes the content hash before NukezAgent ever sees the data. NukezAgent stores it, anchors it, and returns a receipt — but it cannot modify what was signed. The two attestations (developer signature and on-chain root) must agree. Any discrepancy is independently detectable by any third party with the receipt ID and the published merkle algorithm.
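The hash half of that third-party check can be sketched with the standard library alone. This assumes the pre-commitment is a hex-encoded SHA-256 digest of the same canonical JSON used in the signing flow. The helper names are hypothetical. Verifying the Ed25519 signature is omitted because it needs a signing library.

```python
import hashlib
import json


def canonical_bytes(snapshot: dict) -> bytes:
    """Same canonical form as the signing flow: sorted keys, compact separators."""
    return json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")


def matches_precommitment(stored_snapshot: dict, content_hash: str) -> bool:
    """Recompute the hash of the stored snapshot and compare to the pre-commitment."""
    return hashlib.sha256(canonical_bytes(stored_snapshot)).hexdigest() == content_hash


original = {"read_edit_ratio": 4.8, "blind_edit_rate": 0.12}
committed = hashlib.sha256(canonical_bytes(original)).hexdigest()

# Key order does not matter because the canonical form sorts keys.
reordered = {"blind_edit_rate": 0.12, "read_edit_ratio": 4.8}
print(matches_precommitment(reordered, committed))

# Any change to a value breaks the match.
modified = {**original, "read_edit_ratio": 6.6}
print(matches_precommitment(modified, committed))
```

Canonicalization matters here. Without sorted keys and fixed separators, two honest serializations of the same snapshot could hash differently and look like tampering.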

Integration with Claude Code

The ClaudeCodeAdapter translates Claude Code's tool call stream into SessionGuard events. Claude Code writes session JSONL files to ~/.claude/projects/ — the adapter watches these files and classifies each tool call:

from nukez_session_guard import ClaudeCodeAdapter

adapter = ClaudeCodeAdapter(guard)

# From Claude Code session JSONL:
tool_call = {
    "type": "tool_use",
    "name": "View",
    "input": {"file_path": "src/compiler/passes.cc"}
}
adapter.process_tool_call(tool_call)  # → guard.record_read()

tool_call = {
    "type": "tool_use",
    "name": "Edit",
    "input": {"file_path": "src/compiler/passes.cc", ...}
}
adapter.process_tool_call(tool_call)  # → guard.record_edit()

# Bash commands are classified automatically:
# "grep -r 'PassManager'"   → record_grep()
# "cmake --build ."         → record_build()
# "pytest tests/"           → record_test()

# User interrupts (Escape key):
adapter.process_interrupt()  # → guard.record_interrupt()

For Claude Code specifically, the adapter can be wired as a post-tool hook in .claude/hooks.json. Every tool call triggers a callback, the adapter classifies it, and the SessionGuard updates its running metrics. Checkpoints persist to NukezAgent at configurable intervals — every 50 tool calls by default, or on manual trigger.
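The automatic classification of Bash commands could be as simple as a lookup on the executable name. The sketch below is an assumption about how such a classifier might work, not the ClaudeCodeAdapter's actual logic. The category table is hypothetical, and pipelines or environment-variable prefixes are not handled.

```python
import shlex

# Hypothetical mapping from executable name to event category.
COMMAND_CATEGORIES = {
    "grep": "grep", "rg": "grep",
    "cmake": "build", "make": "build", "ninja": "build",
    "pytest": "test", "ctest": "test",
}


def classify_bash(command: str) -> str:
    """Classify a shell command by its first token; unknown commands are 'other'."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return "other"
    if not tokens:
        return "other"
    executable = tokens[0].rsplit("/", 1)[-1]  # strip any leading path
    return COMMAND_CATEGORIES.get(executable, "other")


for cmd in ("grep -r 'PassManager'", "cmake --build .", "pytest tests/", "ls -la"):
    print(cmd, "->", classify_bash(cmd))
```

Commands that fall into "other" still count toward the total tool calls. They just do not move the research or mutation counters.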

What the Attested Data Looks Like

Each checkpoint produces a BehavioralSnapshot that's persisted to the developer's Nukez locker. Here's the exact schema — every field maps to a metric from the stellaraccident report:

{
  "session_id": "session_1712937600_iree-loom",
  "project_id": "iree-loom",
  "timestamp": 1712937842.5,

  // Section 4 of the report: Read:Edit ratio
  "total_reads": 47,
  "total_edits": 8,
  "total_writes": 1,       // full-file rewrites
  "total_greps": 12,
  "read_edit_ratio": 5.88,
  "research_mutation_ratio": 6.56,

  // Appendix A.1: Blind edits
  "edits_without_prior_read": 1,
  "blind_edit_rate": 0.1111,

  // Section 4: Surgical precision
  "write_pct_of_mutations": 0.1111,

  // Appendix A.7: Thrashing
  "files_edited_3plus_times": 0,
  "thrash_rate": 0.0,

  // Appendix A.5: User interrupts
  "total_interrupts": 0,
  "interrupt_rate_per_1k": 0.0,

  // Session health
  "total_tool_calls": 68,
  "session_duration_seconds": 1842.5
}

Every one of these metrics is computable from the tool call event stream. Every one is attested — signed by the developer's key, stored in their locker, anchored on-chain. And every one is the exact metric that stellaraccident had to forensically reconstruct from raw session logs weeks after the degradation occurred.
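The ratio fields in the example snapshot can be reproduced from its raw counts. The formulas below are inferred from the example numbers and are an assumption, not a published specification. In particular, they treat edits plus full-file writes as "mutations".

```python
def derived_metrics(reads: int, edits: int, writes: int, greps: int,
                    blind_edits: int, interrupts: int, tool_calls: int) -> dict:
    """Derive ratio fields from raw counts (formulas inferred from the example)."""
    mutations = edits + writes
    return {
        "read_edit_ratio": reads / edits if edits else 0.0,
        "research_mutation_ratio": (reads + greps) / mutations if mutations else 0.0,
        "blind_edit_rate": blind_edits / mutations if mutations else 0.0,
        "write_pct_of_mutations": writes / mutations if mutations else 0.0,
        "interrupt_rate_per_1k": 1000 * interrupts / tool_calls if tool_calls else 0.0,
    }


# Raw counts taken from the example snapshot.
m = derived_metrics(reads=47, edits=8, writes=1, greps=12,
                    blind_edits=1, interrupts=0, tool_calls=68)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed formulas, the counts give 47/8 for read_edit_ratio and 59/9 for research_mutation_ratio, which round to the 5.88 and 6.56 shown in the snapshot.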

The same analysis that took 6,852 session files and weeks of forensic work becomes a single query against attested data.

What This Means

The stellaraccident report is a case study in the cost of unverifiable agent behavior. The solution isn't better models (though better models help). It isn't better prompts (though better prompts help). It's a verification layer that operates independently of the model, the platform, and the agent's own self-assessment.

SessionGuard is that layer, built on NukezAgent primitives: cryptographically signed behavioral snapshots, attested baselines, drift detection against verified data, fleet-wide correlated monitoring, and Operator-Excluded Verification that prevents any single party — including Nukez — from modifying the record.

A bash script is what verification looks like when you have to build it yourself in an emergency. A protocol is what it looks like when someone builds it properly.

The source is available. The service is live. The math doesn't care what time of day it is.