Phase 5–6 Deep Dive

Intelligence

Persistent memory, autonomous extraction, session trees, verification agents, MCP integration, autonomy, and intelligent commits. The features that make the agent learn and self-improve.

R5 → R14: Memory → Autonomous Pipeline

Tier 2 · Supersession

Manual MEMORY.md (R5) evolves into LLM-powered autonomous extraction with secrets scanning and cross-session consolidation (R14).

R5: Manual memory → R14: Autonomous pipeline

Manual

ManualMemoryEntry

User says “remember X” or agent proposes at session end. Always pinned—never overwritten by autonomous extraction.

source: "manual", pinned: true

Extracted

ExtractedMemoryEntry

LLM reads session history, extracts decisions/constraints/workflows as candidate facts. Confidence score 0–1.

source: "extracted", sessionId

Consolidated

ConsolidatedMemoryEntry

Cross-session merge. Deduplicates, resolves contradictions (newest wins, pinned always wins). Tracks merge provenance.

source: "consolidated", mergedFrom[]
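
The merge rule for consolidated entries (newest wins, pinned always wins) can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the `entry_id`, `key`, and `timestamp` fields are assumptions added for the example, and `merged_from` mirrors the `mergedFrom[]` field named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemoryEntry:
    entry_id: str       # hypothetical identifier used for merge provenance
    key: str            # hypothetical: the topic a fact is about
    value: str
    source: str         # "manual", "extracted", or "consolidated"
    timestamp: float    # hypothetical: when the fact was recorded
    pinned: bool = False
    merged_from: List[str] = field(default_factory=list)  # mirrors mergedFrom[]


def consolidate(entries: List[MemoryEntry]) -> Dict[str, MemoryEntry]:
    """Merge entries per key: pinned always wins, otherwise newest wins."""
    by_key: Dict[str, List[MemoryEntry]] = {}
    for entry in entries:
        by_key.setdefault(entry.key, []).append(entry)

    merged: Dict[str, MemoryEntry] = {}
    for key, group in by_key.items():
        # Rank pinned entries above unpinned ones, then rank by recency.
        winner = max(group, key=lambda e: (e.pinned, e.timestamp))
        if winner.pinned or len(group) == 1:
            merged[key] = winner  # pinned or unique entries are kept unchanged
            continue
        merged[key] = MemoryEntry(
            entry_id=f"merged:{key}",
            key=key,
            value=winner.value,
            source="consolidated",
            timestamp=winner.timestamp,
            merged_from=[e.entry_id for e in group],
        )
    return merged
```

Grouping by key also covers deduplication: identical facts collapse into one entry whose `merged_from` list records where they came from.
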

memory pipeline
store: ~/.subq/memory/MEMORY.md (index) + topic files
scope: per-project (git root) + global (~/.subq/memory/)
limit: 200 lines (~3K tokens); 25KB cap unnecessary
inject: after AGENTS.md, before rules, in categorized key-value format
trigger: on session_end, queue extraction job
model: default or slow for extraction; smol hallucinates
locking: rename-based atomic writes + PID stale detection
staleness: facts unreferenced in 5+ sessions → demoted → pruned
skills: 3+ similar tool sequences → auto-generated skill playbook
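
The locking row above (rename-based atomic writes plus PID stale detection) could look roughly like this. It is a sketch under assumptions: the lock file is assumed to contain only the owner's PID, the function names are hypothetical, and the liveness probe is POSIX-only.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: Path, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def pid_is_alive(pid: int) -> bool:
    """POSIX liveness probe: signal 0 checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_held(lock_path: Path) -> bool:
    """A lock counts as held only if it names a PID that is still running."""
    try:
        pid = int(lock_path.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # missing or unreadable lock: treat as stale
    return pid_is_alive(pid)
```

Readers never observe a half-written MEMORY.md because the rename either happens completely or not at all; a crashed writer leaves behind only a lock whose PID no longer exists.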

CRITICAL-3

Memory Secrets Leakage

Prompt injection writes malicious instructions to persistent memory. Fix: hardcoded extraction prompt (not influenceable by session context), secrets regex scanning before disk write, reject entries that look like instructions/commands.

Before Phase 5
Secrets Regex Patterns
sk_live_ | sk_test_
ghp_ | gho_ | github_pat_
xoxb- | xoxp-
AKIA[0-9A-Z]{16}
AIza[0-9A-Za-z\-_]{35}
ya29\.
Bearer [a-zA-Z0-9...]+=*
-----BEGIN PRIVATE KEY-----
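
A minimal sketch of scanning candidate memory entries before they reach disk, using the patterns listed above. The Bearer pattern is deliberately left out because its character class is elided in the source; the function names are hypothetical.

```python
import re
from typing import List

# Patterns copied from the list above. The Bearer pattern is omitted because
# its character class is elided in the source document.
SECRET_PATTERNS = [
    r"sk_live_|sk_test_",
    r"ghp_|gho_|github_pat_",
    r"xoxb-|xoxp-",
    r"AKIA[0-9A-Z]{16}",
    r"AIza[0-9A-Za-z\-_]{35}",
    r"ya29\.",
    r"-----BEGIN PRIVATE KEY-----",
]
SECRET_RE = re.compile("|".join(f"(?:{p})" for p in SECRET_PATTERNS))


def contains_secret(text: str) -> bool:
    """True if any secret pattern occurs anywhere in the text."""
    return SECRET_RE.search(text) is not None


def filter_entries(candidates: List[str]) -> List[str]:
    """Drop every candidate fact that matches a secret pattern."""
    return [c for c in candidates if not contains_secret(c)]
```

Prefix-only patterns such as `sk_live_` match substrings, so an entry can be rejected without containing a real key. For a memory store that trade-off favours rejection.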

R15: Session Tree

Tier 4 · Novel

Sessions form a tree, not a linear list. Fork, resume, cross-project. Export to HTML/JSONL with custom handlers.

session tree operations
/fork: branch from current point → session.fork()
/resume: switch to existing session → SessionManager.open()
/sessions: list all sessions → SessionManager.list(cwd)
/tree: display session tree → sessionManager.getTree()
/export: export to HTML or JSONL
--continue: reopen most recent → SessionManager.continueRecent()
storage: append-only JSONL with id/parentId tree structure
cross-project: rebuild system prompt for target project + inject warning
Security note: Cross-project resume strips tool output from parent session history (conversation text only). Prevents proprietary code from project A leaking into project B.
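
To illustrate the storage row and the security note, here is a sketch of reading an append-only JSONL log with `id`/`parentId` links, reconstructing one branch, and dropping tool output for a cross-project resume. Only `id` and `parentId` come from the text above; the `role` and `text` fields and all function names are assumptions.

```python
import json
from typing import Dict, List, Optional


def load_entries(jsonl_text: str) -> Dict[str, dict]:
    """Parse append-only JSONL into a map of id -> entry."""
    entries: Dict[str, dict] = {}
    for line in jsonl_text.splitlines():
        if line.strip():
            entry = json.loads(line)
            entries[entry["id"]] = entry
    return entries


def path_to_root(entries: Dict[str, dict], leaf_id: str) -> List[dict]:
    """Follow parentId links from a leaf to the root; return root-first."""
    path: List[dict] = []
    current: Optional[str] = leaf_id
    while current is not None:
        entry = entries[current]
        path.append(entry)
        current = entry.get("parentId")
    return list(reversed(path))


def strip_tool_output(path: List[dict]) -> List[dict]:
    """Keep conversation text only (assumes a hypothetical 'role' field)."""
    return [e for e in path if e.get("role") != "tool"]
```

In this model a fork is just a new line whose `parentId` points at an earlier entry; nothing already written is modified, which is what keeps the log append-only.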

R6: Verification Agent

Tier 2 · Planned

An adversarial verification subagent independently checks non-trivial work. The parent cannot self-assign PASS.

Verification Flow
Trigger (automatic)

API/Routes

Any change to **/api/**, **/routes/**

DB/Middleware

Any change to **/db/**, **/middleware/**

Signatures

Modified function signature or return type

>100 Lines

Any change touching more than 100 lines
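
The four automatic triggers above could be combined into a single predicate along these lines. This is a sketch: the glob patterns are approximated by checking directory names in each path, and the function and parameter names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Directory names taken from the trigger globs (**/api/**, **/routes/**, ...).
SENSITIVE_DIRS = {"api", "routes", "db", "middleware"}
LINE_THRESHOLD = 100


def needs_verification(changed_files: Iterable[str], lines_changed: int,
                       signature_changed: bool) -> bool:
    """True if any automatic trigger fires for this change set."""
    if signature_changed or lines_changed > LINE_THRESHOLD:
        return True
    for name in changed_files:
        # Only directory components count; a file named api.ts does not match.
        directories = PurePosixPath(name).parts[:-1]
        if SENSITIVE_DIRS.intersection(directories):
            return True
    return False
```

For example, `needs_verification(["src/api/users.ts"], 5, False)` fires on the path alone, while a 100-line edit to `README.md` does not, because the threshold is "more than 100 lines".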

Adversarial Check (max 3 retries)

Read All Changed

Read every modified file from parent’s file list

Run Tests

Execute project test command, include full stdout

Lint/Typecheck

Run lint and typecheck, include full stdout

3 Adversarial Checks

Must attempt ≥3 adversarial checks before PASS

Cross-model verification: For high-stakes changes, use a different model family. If primary is Qwen, verify with Claude. Same-model verification shares blind spots.
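
One way to make the PASS gate mechanical is to derive the verdict from the verifier's report rather than from anything the parent asserts. The sketch below assumes a hypothetical report structure and reads "max 3 retries" as one initial attempt plus up to three retries; both are assumptions, not the plan's specification.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_RETRIES = 3
MIN_ADVERSARIAL_CHECKS = 3


@dataclass
class VerificationReport:
    files_read: List[str]
    tests_passed: bool
    lint_passed: bool
    adversarial_checks: List[str] = field(default_factory=list)
    adversarial_failures: List[str] = field(default_factory=list)


def verdict(report: VerificationReport, changed_files: List[str]) -> str:
    """Return "PASS" only when every gate in the flow is satisfied."""
    if set(changed_files) - set(report.files_read):
        return "FAIL"  # the verifier skipped a modified file
    if not (report.tests_passed and report.lint_passed):
        return "FAIL"
    if len(report.adversarial_checks) < MIN_ADVERSARIAL_CHECKS:
        return "FAIL"  # too few adversarial attempts to justify PASS
    if report.adversarial_failures:
        return "FAIL"
    return "PASS"


def verify_with_retries(run_verifier: Callable[[int], VerificationReport],
                        changed_files: List[str]) -> str:
    """Run the verifier once, then retry up to MAX_RETRIES times on failure."""
    for attempt in range(1 + MAX_RETRIES):
        if verdict(run_verifier(attempt), changed_files) == "PASS":
            return "PASS"
    return "FAIL"
```

Because `verdict` takes only the report and the parent's file list, the parent has no code path for assigning PASS to itself.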

R10: Token Budget Mode

Tier 3 · Planned

The user specifies a budget via +500k or --budget 500000. Output tokens are displayed each turn. The agent auto-continues if it stops early.

Budget Controls
+500k syntax
--budget flag
TUI footer display
Task complete signal
$50 hard ceiling
Auto-continue on early stop

Auto-continue prompt: directed, not generic. It asks the agent to assess four things: is the task complete, are edge cases untested, do related files need changes, and was verification skipped? It also carries an explicit anti-padding rule: do NOT re-read files, add docs, or refactor working code.
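
As a rough sketch of the budget controls, the snippet below parses the `+500k` form and the plain number given to `--budget`, then decides whether to auto-continue. The `m` suffix, the function names, and the exact stopping rule are assumptions; the $50 ceiling comes from the list above.

```python
import re

_SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000}  # "m" is an assumed extension
_BUDGET_RE = re.compile(r"^\+?(\d+(?:\.\d+)?)([km]?)$", re.IGNORECASE)


def parse_budget(text: str) -> int:
    """Parse "+500k" or "500000" into a token count."""
    match = _BUDGET_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised budget: {text!r}")
    number, suffix = match.groups()
    return int(float(number) * _SUFFIX[suffix.lower()])


def should_auto_continue(used_tokens: int, budget: int, task_complete: bool,
                         cost_usd: float, cost_ceiling_usd: float = 50.0) -> bool:
    """Continue only while the task is open and both limits have headroom."""
    if task_complete or cost_usd >= cost_ceiling_usd:
        return False
    return used_tokens < budget
```

Here `parse_budget("+500k")` and `parse_budget("500000")` both yield 500000, so the inline syntax and the flag share one code path.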

R9 & R11: MCP Integration & Autonomous Mode

Tier 3

Two independent capabilities shipping in Phase 6b.

R9

MCP Integration

Minimal viable: tools/list, tools/call, server instructions. Delta-enabled to avoid cache busting on late connects. Namespace prefix prevents collisions.

Phase 6b · User-level config only
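
The namespace prefix and the delta behaviour on late connects can be illustrated together. In this sketch the `__` separator and the `ToolRegistry` class are hypothetical; the text above only states that a prefix exists and that late connects should not rewrite what was already announced.

```python
from typing import List, Set

SEPARATOR = "__"  # hypothetical; the plan only says "namespace prefix"


def namespaced(server: str, tool: str) -> str:
    """Prefix a tool name with its server so equal names cannot collide."""
    return f"{server}{SEPARATOR}{tool}"


class ToolRegistry:
    """Tracks announced tools so a late connect yields a delta, not a rewrite."""

    def __init__(self) -> None:
        self._announced: Set[str] = set()

    def connect(self, server: str, tools: List[str]) -> List[str]:
        """Return only the names not yet announced, in the server's order."""
        delta: List[str] = []
        for tool in tools:
            name = namespaced(server, tool)
            if name not in self._announced:
                self._announced.add(name)
                delta.append(name)
        return delta
```

Appending only the delta leaves the earlier part of the prompt byte-identical, which is what keeps a prefix cache valid when a server connects late.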

R11

Autonomous Mode

Sleep tool (60–3600s), cost guardrails ($10 default cap), terminal focus awareness, /stop-auto command. Explicit --autonomous opt-in required.

Phase 6b · Opt-in only
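
The sleep bounds and the cost cap lend themselves to a small guard. The sketch below assumes out-of-range sleep requests are clamped rather than rejected, which the plan does not specify, and uses hypothetical names throughout.

```python
MIN_SLEEP_S = 60
MAX_SLEEP_S = 3600
DEFAULT_COST_CAP_USD = 10.0


def clamp_sleep(requested_seconds: float) -> int:
    """Force a requested sleep into the 60-3600 second window."""
    return int(min(MAX_SLEEP_S, max(MIN_SLEEP_S, requested_seconds)))


class CostGuard:
    """Accumulates spend and reports whether the autonomous loop may go on."""

    def __init__(self, cap_usd: float = DEFAULT_COST_CAP_USD) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd

    def may_continue(self) -> bool:
        return self.spent_usd < self.cap_usd
```

Checking `may_continue()` before each autonomous turn means the loop stops once spend reaches the cap, so any overshoot is limited to the cost of a single turn.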

CRITICAL-4

MCP Untrusted Instructions

A project-level MCP config can contain arbitrary commands, so cloning a repository becomes a path to code execution. Fix: user-level config only (~/.subq/settings.json), tag server instructions as untrusted, namespace tools, confirm on first use.

Before Phase 6

R20: Intelligent Commit Tool

Tier 4 · Deprioritized

Agentic git inspection, split commits, hunk-level staging, conventional format.

Agentic Inspection

git diff --stat overview, per-file diffs, hunk-level analysis for semantic understanding.

Split Commits

Unrelated changes → multiple atomic commits ordered by dependency. User confirms before executing.

Hunk-Level Staging

git add -p equivalent via tool interface. Stage specific hunks while leaving others unstaged.

Format Validation

Enforce conventional commit format. Detect filler words. Reject and suggest improvement.
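
Format validation is the easiest of the four pieces to sketch. The type list below follows common Conventional Commits usage and the filler-word list is purely illustrative; neither is taken from the plan.

```python
import re
from typing import List

# Commonly used Conventional Commits types; a project may allow a different set.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "build", "ci", "chore", "revert")
_TYPE_ALT = "|".join(TYPES)
HEADER_RE = re.compile(r"^(?:" + _TYPE_ALT + r")(?:\([\w./-]+\))?!?: \S.*$")
FILLER_WORDS = {"just", "simply", "various", "some", "stuff", "misc"}  # hypothetical


def validate_commit(message: str) -> List[str]:
    """Return a list of problems; an empty list means the header is accepted."""
    problems: List[str] = []
    header = message.splitlines()[0] if message.strip() else ""
    if not HEADER_RE.match(header):
        problems.append("header must look like 'type(scope): description'")
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", header)}
    filler = sorted(words & FILLER_WORDS)
    if filler:
        problems.append("filler words: " + ", ".join(filler))
    return problems
```

A message such as `chore: just some cleanup` passes the format check but is still flagged for filler words, which gives the tool something concrete to reject and improve.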

Deprioritized: Lowest-value feature in the plan. Hunk-level staging via git add -p requires parsing diff output and managing interactive staging through a non-interactive tool interface. The existing git commit workflow covers 90% of commit quality. Defer unless specific user demand.

Agent-Native Parity

17 of 20 planned capabilities lack agent-invocable equivalents. An agent cannot do what a user can do.

Core problem: Today the agent cannot query its own token budget, read its session tree, search memory, or explicitly request verification. Four missing tools block true autonomy.

session_tree

Agent has zero visibility into session history. Cannot inspect parent/child relationships or navigate branching conversations.

memory read/search/write/pin

Memory is currently passive: entries are written by extraction and injected into the prompt. The agent cannot search, filter, or pin memories during execution.

harness meta-tool

Agent cannot query its own token budget, remaining context, or configuration. Flies blind on resource constraints.

request_verification

Agent cannot explicitly request verification of its own work. Verification is external-only, never self-initiated.