SubQ Code

Beyond Claude Code

20 requirements across 4 tiers. 6 phases over 2 weeks. Not just catching up—surpassing the best coding agent harness in production.

Deep Dives

Phase Timeline

Six phases over 2 weeks. Each phase ships independently with its own acceptance criteria.

Phase 0 Spike (Day 1): Write a one-off test script that sends a cache_control-marked request through OpenRouter to each provider (Anthropic, DeepSeek, OpenAI) and checks response headers for cache hit indicators. Must pass before Phase 1 begins.
1
Foundations
Days 1–2
R1 R12 R8
2
Extension
Days 3–4
R2→R18 R17 R13
3
Orchestration
Days 5–6
R3→R19 R16
4
Prompt Intel
Day 7
R4 R7
5
Memory
Days 8–9
R5→R14 R15
6
Ship It
Day 10
R6 R10 R9 R11

Requirement Tiers

Tier 1 — Architectural Foundations

R1

Prompt Cache Boundary

Split static (~4000 tokens, cacheable) from dynamic (per-turn) prompt. API applies cache_control to static prefix.

Phase 1>60% hit rate

R2 → R18

Hooks → Extension API

Composable lifecycle hooks evolving into full Extension API with 30+ events, tool/command/renderer registration.

Phase 2Supersession

R3 → R19

Subagents → Task Swarm

Background task pool evolving into 100-parallel isolated agents with git worktree/fuse-overlay backends.

Phase 3Supersession

Cache the static prefix, hook the lifecycle, swarm the tasks. Everything else is refinement.

Tier 2 — Agent Intelligence

R4

Risk-Based Authorization

Three-category risk taxonomy: destructive, hard-to-reverse, visible-to-others. Agent self-assesses per-action.

Phase 4

R5 → R14

Memory → Autonomous Pipeline

Manual MEMORY.md evolving into LLM-powered autonomous extraction with secrets scanning and cross-session consolidation.

Phase 5Supersession

R6

Verification Agent

Adversarial verification subagent that independently checks non-trivial work. Parent cannot self-assign PASS.

Phase 6a

R7

Model-Specific Variants

Per-model-family prompt variants: comment suppression, thoroughness enforcement, false-claims mitigation.

Phase 4

Tier 3 — Operational Excellence

R8

Numeric Length Anchors

≤40 words between tool calls, ≤100 words in final responses. Proven 5–10% output token reduction.

Phase 1Promoted

R9

MCP Integration

Minimal viable: tools/list, tools/call, instructions. Delta-enabled to avoid cache busting on late connects.

Phase 6b

R10

Token Budget Mode

User specifies budget. Display output tokens each turn. Auto-continue if agent stops early with task incomplete.

Phase 6a

R11 • R12

Autonomy & Feature Flags

Autonomous mode with sleep tool, focus awareness, cost guardrails. Feature flags gate all experimental features.

Phase 6bPhase 1

Tier 4 — Beyond Parity (oh-my-pi Innovations)

R13

TTSR — Streamed Rules

Zero-context-cost rules monitoring output stream via regex. Abort, inject <system-interrupt>, retry.

Phase 2Novel

R15

Session Tree

Sessions as tree, not linear. Fork, resume, cross-project. Export to HTML with custom handlers.

Phase 5Novel

R16 • R17

Model Roles & Cross-Agent Rules

Configure default/smol/slow/plan/commit models. Discover rules from 7 agent config formats.

Phase 3Phase 2

R18 • R19 • R20

Extension API • Task Swarm • Smart Commit

Full extension system, 100-parallel isolated tasks, AI-powered conventional commits with hunk-level staging.

Supersession targets

Security Criticals

10 review agents identified 4 critical vulnerabilities requiring remediation before their respective phases ship.

CRITICAL-1

Hook Command Injection

Unsanitized file paths become shell metacharacters. Fix: pass context via stdin as JSON, user-level hooks only.

Before Phase 2

CRITICAL-2

Rule Prompt Injection

Repo ships crafted alwaysApply: true rules. Fix: project-level rules untrusted, user-level only for alwaysApply.

Before Phase 2

CRITICAL-3

Memory Secrets Leakage

Prompt injection writes malicious instructions to persistent memory. Fix: hardcoded extraction prompt, secrets regex.

Before Phase 5

CRITICAL-4

MCP Untrusted Instructions

Project-level MCP config with arbitrary commands. Fix: user-level config only, tag instructions as untrusted.

Before Phase 6

Success Targets

ID
Metric
Target
Phase
S1
Prompt cache hit rate (5+ turn sessions)
>60%
Phase 1
S5
Session cold-start context reduction
≥40%
Phase 5
S7
Output token spend reduction
≥5%
Phase 4
S8
TTSR false-positive rate
<5%
Phase 2
S11
Model role cost reduction (exploration focus)
≥30%
Phase 3

SubQ Advantages to Preserve

Six capabilities where SubQ Code already leads. None are modified by the parity plan.

FFF Search

Cursor pagination, multi_grep, blast_radius analysis. Vastly superior to basic Glob/Grep.

Context Packing

LLM-driven context_pack with query synonyms. Smarter than brute-force Explore agent.

Behavioral Nudges

System reminders engine: bash spiral, plan drift, 5 reactive events.

Quality Measurement

Leverage analysis, prompt ROI, orchestration scoring. Unique to SubQ.

Plan-Driven Workflow

Plannotator with interview depths, TUI review. Structured planning beats reactive work.

Cross-Agent Intelligence

12-agent parser ecosystem. Unique multi-agent parsing capability.

Glossary

Acronyms and shorthand used across this document set.

DCE
Dead Code Elimination
FFF
Full-File Find — SubQ’s cursor-paginated search
FUSE
Filesystem in Userspace
JSONL
JSON Lines — newline-delimited JSON
MCP
Model Context Protocol — tool integration standard
OWASP
Open Web Application Security Project
ROI
Return on Investment
SWE
Software Engineering (as in SWE-bench)
TTSR
Token-Triggered Stream Rules — output-shaping via token patterns
TTL
Time to Live — cache expiration window
TUI
Text User Interface — terminal-based UI
R1–R20
Requirement IDs from the parity matrix (R1 Prompt Cache Boundary through R20 Extension API)