Sign intermediate messages for model visibility #31
Summary
- Add `_hidden_sig` field to intermediate messages at creation time in `context.py`
- Apply `[HIDDEN:{sig}]` prefix at read time in `session.get_history()` so the model sees which messages were hidden

Files changed
- `nanobot/agent/visibility.py`: new `compute_signature()` function (returns hex only)
- `nanobot/agent/context.py`: `add_assistant_message()` and `add_tool_result()` store `_hidden_sig`
- `nanobot/session/manager.py`: `get_history()` applies `[HIDDEN:sig]` prefix at read time

Test plan
32faed5c1c to d90c3b4a24

Code Review
Design
The split between write-time signature storage (`_hidden_sig`) and read-time prefix application (`get_history()`) is a good call for prompt caching: content bytes in the session file stay stable, and Anthropic's cache sees identical prefixes across turns.

No double-signing risk: suppress_output messages go through `sign_content` (no tool_calls → no `_hidden_sig`), while intermediate messages go through `_hidden_sig` (never through `sign_content`). The two paths don't overlap.

Issues to fix before merge
1. `compute_signature` duplicates `sign_content` internals
`sign_content()` (visibility.py:31-35) has its own inline HMAC computation identical to the new `compute_signature()`. Two copies of the same HMAC logic will drift. Refactor so that `sign_content()` calls `compute_signature()` instead of inlining the HMAC.

2. No tests for the actual changes in this PR
All 15 existing tests cover the pre-existing `sign_content`/suppress_output system. None test what this PR changes:
- `compute_signature()` directly
- `_hidden_sig` being added by `add_tool_result()` and `add_assistant_message(tool_calls=...)`
- `get_history()` applying `[HIDDEN:{sig}]` prefix when `_hidden_sig` is present
- Messages without `_hidden_sig` NOT getting prefixed
- `get_history()` still produces correct prefix

Optional improvements
3. `_hidden_sig` not verified at read time
In `get_history()` (manager.py:73-75), `sig` from `_hidden_sig` is interpolated directly into the prefix without checking that it matches the content. In contrast, the suppress_output path verifies via `has_forged_marker()`. Low priority since session files are local, but worth noting the asymmetry.

4. All tool results get `_hidden_sig` unconditionally
`add_tool_result()` always adds `_hidden_sig`. When a user says "read file X", the tool result is marked `[HIDDEN]` even though the user explicitly requested that action. If intentional, the system prompt visibility section should clarify: "[HIDDEN] means the raw result wasn't sent verbatim, not that the user is unaware of the action."

5. Minor: `Tuple` import (pre-existing)
`visibility.py:7` uses `from typing import Tuple`; project convention is lowercase `tuple` (Python 3.11+). Not introduced by this PR but worth cleaning up while touching the file.
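
For issues 1 and 3, here is a minimal sketch of one possible shape, not the project's actual code. It assumes HMAC-SHA256, a module-level key named `_SECRET_KEY`, a `(signed_text, sig)` return value for `sign_content()`, and a hypothetical helper `apply_hidden_prefix()` standing in for the prefix step inside `get_history()`. None of these details are confirmed by the PR.

```python
import hashlib
import hmac

# Hypothetical key for illustration; the real module presumably loads its own secret.
_SECRET_KEY = b"example-secret"


def compute_signature(content: str, key: bytes = _SECRET_KEY) -> str:
    """Return the hex HMAC of content (hex only). SHA-256 is an assumption."""
    return hmac.new(key, content.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_content(content: str, key: bytes = _SECRET_KEY) -> tuple[str, str]:
    """Issue 1: delegate to compute_signature() so only one HMAC copy exists.

    The marker format and the return shape are assumptions for this sketch.
    """
    sig = compute_signature(content, key)
    return f"[HIDDEN:{sig}]{content}", sig


def apply_hidden_prefix(message: dict, key: bytes = _SECRET_KEY) -> str:
    """Issue 3: verify the stored signature before interpolating it.

    Hypothetical stand-in for the read-time step in get_history().
    """
    content = message.get("content", "")
    sig = message.get("_hidden_sig")
    if sig is None:
        # Messages without _hidden_sig are returned unprefixed.
        return content
    expected = compute_signature(content, key)
    # Constant-time comparison of the stored and recomputed hex digests.
    if not hmac.compare_digest(sig, expected):
        raise ValueError("stored _hidden_sig does not match message content")
    return f"[HIDDEN:{sig}]{content}"
```

Whether a mismatch should raise, log, or silently drop the prefix is a design choice for the author. The sketch raises only to make the asymmetry with the suppress_output path visible.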