Fix consolidation checkpoint-based trim to preserve history #17

Merged
code-server merged 2 commits from fix/consolidation-checkpoint-trim into main 2026-03-04 16:32:56 +01:00
Collaborator

Problem

When memory_consolidate is called mid-turn, deferred trim was using a relative count (_pending_trim) that gets applied after the turn completes. This caused the trim to recalculate the cut point based on the FINAL session size, and _trim_to_clean_boundary would walk backward to find a user message, often landing at the START of the current turn and wiping all prior history.

Example bug flow:

  1. Session has 426 messages
  2. Consolidate called mid-turn, sets pending_trim=10
  3. Turn continues, session grows to 440 messages
  4. Deferred trim calculates cut = 440 - 10 = 430
  5. No user messages in range [430-439] (all tool chain)
  6. Walks backward, finds user message at position 426 (current turn start)
  7. Wipes messages 0-425

Solution

Replace relative count with absolute checkpoint position:

  • At consolidation time: calculate checkpoint = len(session) - keep_count
  • Find clean boundary at or before checkpoint
  • Store absolute position in session._trim_checkpoint
  • At trim time: simply slice session.messages[checkpoint:]

This preserves the intended trim point regardless of messages added during the remainder of the turn.

Testing

Hot-patched to production and verified:

  • Before: Consolidation wiped all history, kept only current turn (16 messages)
  • After: Consolidation preserved history correctly (23 messages from before consolidation)

Changes

  • Added _find_clean_boundary_before() to find clean user message boundary at checkpoint time
  • Modified _consolidate_memory() to set _trim_checkpoint instead of _pending_trim
  • Modified deferred trim to use checkpoint: session.messages[checkpoint:]
## Problem When `memory_consolidate` is called mid-turn, deferred trim was using a relative count (`_pending_trim`) that gets applied after the turn completes. This caused the trim to recalculate the cut point based on the FINAL session size, and `_trim_to_clean_boundary` would walk backward to find a user message, often landing at the START of the current turn and wiping all prior history. **Example bug flow:** 1. Session has 426 messages 2. Consolidate called mid-turn, sets `pending_trim=10` 3. Turn continues, session grows to 440 messages 4. Deferred trim calculates `cut = 440 - 10 = 430` 5. No user messages in range [430-439] (all tool chain) 6. Walks backward, finds user message at position 426 (current turn start) 7. Wipes messages 0-425 ❌ ## Solution Replace relative count with absolute checkpoint position: - At consolidation time: calculate `checkpoint = len(session) - keep_count` - Find clean boundary at or before checkpoint - Store absolute position in `session._trim_checkpoint` - At trim time: simply slice `session.messages[checkpoint:]` This preserves the intended trim point regardless of messages added during the remainder of the turn. ## Testing Hot-patched to production and verified: - **Before:** Consolidation wiped all history, kept only current turn (16 messages) - **After:** Consolidation preserved history correctly (23 messages from before consolidation) ## Changes - Added `_find_clean_boundary_before()` to find clean user message boundary at checkpoint time - Modified `_consolidate_memory()` to set `_trim_checkpoint` instead of `_pending_trim` - Modified deferred trim to use checkpoint: `session.messages[checkpoint:]`
code-server added 1 commit 2026-03-04 16:05:24 +01:00
fix(consolidation): use checkpoint-based trim instead of relative count
Build Nanobot OAuth / build (pull_request) Successful in 6m50s
Build Nanobot OAuth / cleanup (pull_request) Has been skipped
dc94aa76cc
**Problem:**
When memory_consolidate is called mid-turn, deferred trim was using a
relative count (_pending_trim) that gets applied after the turn completes.
This caused the trim to recalculate the cut point based on the FINAL session
size (after messages were added), and _trim_to_clean_boundary would walk
backward to find a user message, often landing at the START of the current
turn and wiping all prior history.

Example: Session with 426 messages, consolidate sets pending_trim=10, turn
grows to 440 messages, trim calculates cut=430, finds no user messages in
430-439 (all tool chain), walks back to position 426 (current turn start),
wipes messages 0-425.

**Solution:**
Replace relative count with absolute checkpoint position:
- At consolidation time: calculate checkpoint = len(session) - keep_count
- Find clean boundary at or before checkpoint (not after turn completes)
- Store absolute position in session._trim_checkpoint
- At trim time: simply slice session.messages[checkpoint:]

This preserves the intended trim point regardless of messages added during
the remainder of the turn.

**Testing:**
Hot-patched and verified:
- Before: consolidation wiped all history, kept only current turn (16 msgs)
- After: consolidation preserved history correctly (23 msgs from before consolidation)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Author
Collaborator

Code Review: Checkpoint-Based Trim Fix

Summary

CONDITIONAL APPROVAL - Fix is sound but incomplete. One critical issue must be addressed before merge.


What Works Well

  1. Root cause correctly identified: Relative count (_pending_trim) recalculates from final session size, causing history wipe
  2. Solution is elegant: Absolute checkpoint position preserves intended trim point
  3. Edge cases handled: _find_clean_boundary_before() properly handles empty lists, bounds, and missing user messages
  4. Production validated: Hot-patch confirmed fix (16 msgs → 23 msgs preserved)
  5. Logging improved: Shows old/new sizes + checkpoint value for debugging

CRITICAL: Missing Deferred Trim in System Handler

The _process_system_message() method (lines 611-812) does NOT apply deferred trims.

Impact:

  • Subagents can trigger memory_consolidate tool
  • Checkpoint gets set on session
  • But system handler never checks/applies _trim_checkpoint
  • Session grows unbounded until next regular user message

Required fix (add after line 804):

self.sessions.save(session)

# Deferred trim: same logic as _process_message
checkpoint = getattr(session, '_trim_checkpoint', None)
if checkpoint is not None:
    old_size = len(session.messages)
    session.messages = session.messages[checkpoint:]
    session._trim_checkpoint = None
    self.sessions.save(session)
    logger.info(f"Deferred trim applied: {old_size} -> {len(session.messages)} messages (checkpoint={checkpoint})")

📝 Verification of Fix Logic

Bug scenario reproduced:

  1. Session: 426 messages
  2. Consolidate mid-turn: old code sets _pending_trim=10
  3. Turn continues → 440 messages
  4. Old trim: cut = 440-10 = 430, walks back to msg 426 (turn start), wipes history

New behavior:

  1. Session: 426 messages
  2. Consolidate calculates checkpoint = 426-10 = 416, finds user msg at 415
  3. Turn continues → 440 messages
  4. New trim: messages[415:] keeps 25 messages from checkpoint

Fix verified!


📊 Test Coverage Gap

Missing tests:

  • No tests for _find_clean_boundary_before() edge cases
  • No regression test for this specific bug
  • No tests verifying checkpoint behavior vs old pending_trim

Recommended (not blocking):

def test_deferred_trim_checkpoint_stable_across_growth():
    """Verify checkpoint preserves cut point when session grows."""
    session = create_session_with_messages("test:stable", 426)
    
    # Set checkpoint (as consolidation would)
    checkpoint = 415  # Simulated clean boundary
    session._trim_checkpoint = checkpoint
    
    # Session grows during turn
    for i in range(14):
        session.add_raw_message({"role": "tool", "content": f"r{i}"})
    assert len(session.messages) == 440
    
    # Apply deferred trim
    session.messages = session.messages[checkpoint:]
    
    # Should keep from checkpoint onwards (25 messages)
    assert len(session.messages) == 25
    assert session.messages[0]["content"] == "msg415"

🔍 Code Quality Notes

  1. _trim_to_clean_boundary still needed: Used at line 991 for non-mem0 immediate trim. Not dead code.
  2. Checkpoint calculation is sound: max(0, len-keep_count) prevents negative indices
  3. Boundary search is safe: Falls back to target_pos if no user message found
  4. Enhanced logging is helpful: checkpoint={checkpoint}, current_size={len} aids debugging

���� Recommendation

DO NOT MERGE until deferred trim is added to system message handler.

Once fixed: LGTM

This is a well-designed fix for a critical bug. The checkpoint approach is more maintainable than the old relative-count method.

## Code Review: Checkpoint-Based Trim Fix ### Summary **CONDITIONAL APPROVAL** - Fix is sound but incomplete. One critical issue must be addressed before merge. --- ### ✅ What Works Well 1. **Root cause correctly identified**: Relative count (`_pending_trim`) recalculates from final session size, causing history wipe 2. **Solution is elegant**: Absolute checkpoint position preserves intended trim point 3. **Edge cases handled**: `_find_clean_boundary_before()` properly handles empty lists, bounds, and missing user messages 4. **Production validated**: Hot-patch confirmed fix (16 msgs → 23 msgs preserved) 5. **Logging improved**: Shows old/new sizes + checkpoint value for debugging --- ### ❌ CRITICAL: Missing Deferred Trim in System Handler The `_process_system_message()` method (lines 611-812) **does NOT apply deferred trims**. **Impact**: - Subagents can trigger `memory_consolidate` tool - Checkpoint gets set on session - But system handler never checks/applies `_trim_checkpoint` - Session grows unbounded until next regular user message **Required fix** (add after line 804): ```python self.sessions.save(session) # Deferred trim: same logic as _process_message checkpoint = getattr(session, '_trim_checkpoint', None) if checkpoint is not None: old_size = len(session.messages) session.messages = session.messages[checkpoint:] session._trim_checkpoint = None self.sessions.save(session) logger.info(f"Deferred trim applied: {old_size} -> {len(session.messages)} messages (checkpoint={checkpoint})") ``` --- ### 📝 Verification of Fix Logic **Bug scenario reproduced**: 1. Session: 426 messages 2. Consolidate mid-turn: old code sets `_pending_trim=10` 3. Turn continues → 440 messages 4. Old trim: `cut = 440-10 = 430`, walks back to msg 426 (turn start), **wipes history** ❌ **New behavior**: 1. Session: 426 messages 2. Consolidate calculates `checkpoint = 426-10 = 416`, finds user msg at 415 3. Turn continues → 440 messages 4. New trim: `messages[415:]` keeps 25 messages from checkpoint ✅ **Fix verified!** --- ### 📊 Test Coverage Gap **Missing tests**: - ❌ No tests for `_find_clean_boundary_before()` edge cases - ❌ No regression test for this specific bug - ❌ No tests verifying checkpoint behavior vs old pending_trim **Recommended** (not blocking): ```python def test_deferred_trim_checkpoint_stable_across_growth(): """Verify checkpoint preserves cut point when session grows.""" session = create_session_with_messages("test:stable", 426) # Set checkpoint (as consolidation would) checkpoint = 415 # Simulated clean boundary session._trim_checkpoint = checkpoint # Session grows during turn for i in range(14): session.add_raw_message({"role": "tool", "content": f"r{i}"}) assert len(session.messages) == 440 # Apply deferred trim session.messages = session.messages[checkpoint:] # Should keep from checkpoint onwards (25 messages) assert len(session.messages) == 25 assert session.messages[0]["content"] == "msg415" ``` --- ### 🔍 Code Quality Notes 1. **`_trim_to_clean_boundary` still needed**: Used at line 991 for non-mem0 immediate trim. Not dead code. 2. **Checkpoint calculation is sound**: `max(0, len-keep_count)` prevents negative indices 3. **Boundary search is safe**: Falls back to target_pos if no user message found 4. **Enhanced logging is helpful**: `checkpoint={checkpoint}, current_size={len}` aids debugging --- ### ���� Recommendation **DO NOT MERGE** until deferred trim is added to system message handler. Once fixed: **LGTM** ✅ This is a well-designed fix for a critical bug. The checkpoint approach is more maintainable than the old relative-count method.
code-server added 1 commit 2026-03-04 16:15:55 +01:00
fix: add deferred trim to system message handler
Build Nanobot OAuth / build (pull_request) Successful in 6m58s
Build Nanobot OAuth / cleanup (pull_request) Has been skipped
e782318338
System messages (including subagents) can trigger memory_consolidate,
which sets _trim_checkpoint. The system handler must also check and
apply deferred trims to prevent unbounded session growth.

Addresses review feedback from PR #17.
Author
Collaborator

Fixed in commit e782318

Added deferred trim logic to _process_system_message() (lines 806-815):

# Deferred trim: same logic as _process_message
# System messages (including subagents) can trigger consolidation
checkpoint = getattr(session, '_trim_checkpoint', None)
if checkpoint is not None:
    old_size = len(session.messages)
    session.messages = session.messages[checkpoint:]
    session._trim_checkpoint = None
    self.sessions.save(session)
    logger.info(f"Deferred trim applied: {old_size} -> {len(session.messages)} messages (checkpoint={checkpoint})")

Now both message handlers apply deferred trims. Ready for merge.

✅ **Fixed in commit e782318** Added deferred trim logic to `_process_system_message()` (lines 806-815): ```python # Deferred trim: same logic as _process_message # System messages (including subagents) can trigger consolidation checkpoint = getattr(session, '_trim_checkpoint', None) if checkpoint is not None: old_size = len(session.messages) session.messages = session.messages[checkpoint:] session._trim_checkpoint = None self.sessions.save(session) logger.info(f"Deferred trim applied: {old_size} -> {len(session.messages)} messages (checkpoint={checkpoint})") ``` Now both message handlers apply deferred trims. Ready for merge.
code-server merged commit eee9c38953 into main 2026-03-04 16:32:56 +01:00
code-server deleted branch fix/consolidation-checkpoint-trim 2026-03-04 16:32:56 +01:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: wylab/nanobot#17