# 📸 Agent Trace Format

A normalized JSON shape for capturing one agent run: input, tool calls, output, and a fingerprint. Used by agentsnap to diff runs and detect silent regressions.

## Full example

```json
{
  "version": 1,
  "model": "claude-sonnet-4-6",
  "input": "search for python tutorials",
  "output": "Here are 3 results.",
  "tools": [
    { "name": "web_search", "args": { "q": "python tutorials" }, "result_hash": "abc123" },
    { "name": "fetch_page", "args": { "url": "https://example.com" }, "result_hash": "def456" }
  ],
  "error": null,
  "fingerprint": { "node": "20.0", "agentsnap": "0.1.0" }
}
```
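A trace in this shape is easy to sanity-check before storing. The sketch below is illustrative only; `validate_trace` and the key sets are hypothetical helpers, not part of agentsnap:

```python
# Required keys for a version-1 trace and for each tool-call entry
# (hypothetical validator; derived from the example trace above).
REQUIRED_KEYS = {"version", "model", "input", "output", "tools", "error", "fingerprint"}
TOOL_KEYS = {"name", "args", "result_hash"}

def validate_trace(trace: dict) -> list[str]:
    """Return a list of problems; an empty list means the trace looks well-formed."""
    problems = []
    missing = REQUIRED_KEYS - trace.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if trace.get("version") != 1:
        problems.append("unsupported schema version")
    for i, call in enumerate(trace.get("tools", [])):
        if TOOL_KEYS - call.keys():
            problems.append(f"tools[{i}] is missing keys")
    return problems
```

Running this against the example trace returns an empty list.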

## Fields

| Field | Type | Notes |
| --- | --- | --- |
| `version` | int | Schema version. Currently `1`. |
| `model` | string | Model identifier. Used to skip diffs across model upgrades. |
| `input` | string | The user prompt that started the run. |
| `output` | string | Final agent response. |
| `tools` | array | Ordered list of tool calls. Each entry is `{name, args, result_hash}`. |
| `tools[].name` | string | Tool identifier (dotted path like `filesystem.read_file`). |
| `tools[].args` | object | Args passed to the tool. Recorded literally. |
| `tools[].result_hash` | string | Hash of the tool's return value. Avoid storing PII / large payloads in the trace. |
| `error` | string \| null | Run-level error message, if the run failed. |
| `fingerprint` | object | Environment metadata. `node` + `agentsnap` version are recommended; add your own keys. |
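For static type checking, the schema above can be modeled with `TypedDict`. This is a sketch based solely on the field table; agentsnap may ship its own types:

```python
from typing import Any, Optional, TypedDict

class ToolCall(TypedDict):
    name: str                 # tool identifier, e.g. "filesystem.read_file"
    args: dict[str, Any]      # args passed to the tool, recorded literally
    result_hash: str          # hash of the tool's return value

class Trace(TypedDict):
    version: int              # schema version, currently 1
    model: str
    input: str
    output: str
    tools: list[ToolCall]     # ordered list of tool calls
    error: Optional[str]      # run-level error message, or None
    fingerprint: dict[str, str]
```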

## Why hash the tool result?

Tool results are often large (files, API payloads, search results). Hashing keeps the trace small and avoids leaking PII into your snapshot store. The hash is enough to detect "the result changed"; to answer "how did it change?", re-run with full payloads enabled.

## Diffing two traces

```python
from agentsnap import diff

result = diff(baseline_trace, current_trace)
print(result.status)    # "match" | "drift" | "regression"
for change in result.changes:
    print(change.path, change.from_, "→", change.to)
```
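Conceptually, a diff like this compares the two traces' tool sequences and outputs. The sketch below is a naive illustration of one possible classification, not agentsnap's actual implementation; the drift/regression split shown here is an assumption:

```python
def naive_diff(baseline: dict, current: dict) -> str:
    """Classify a pair of traces as 'match', 'drift', or 'regression' (sketch)."""
    if current.get("error"):
        return "regression"          # current run failed outright
    if baseline["model"] != current["model"]:
        return "match"               # skip diffs across model upgrades
    same_tools = [c["name"] for c in baseline["tools"]] == \
                 [c["name"] for c in current["tools"]]
    same_output = baseline["output"] == current["output"]
    if same_tools and same_output:
        return "match"
    # Same tool sequence but different output: drift.
    # Different tool sequence: treated here as a regression.
    return "drift" if same_tools else "regression"
```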

## Sample traces

The `agent-trace-samples` dataset has 10 example traces (good + regressed pairs) you can drop into your tests.