Beyond Parity — SubQ Code Roadmap

Deep Dives

Parity Matrix

Three-way feature comparison: SubQ vs Claude Code vs oh-my-pi

Foundations

Phase 1–2: Prompt caching, hooks, rules, TTSR

Orchestration

Phase 3–4: Subagents, model roles, risk taxonomy

Intelligence

Phase 5–6: Memory, sessions, verification, MCP

Security

4 critical vulnerabilities, remediation roadmap

Economics

Cost optimization, cache savings, model role ROI

Session Insights

25 real sessions analyzed: reminder loops, edit failures, navigation overhead

Phase Timeline

Six phases over 2 weeks. Each phase ships independently with its own acceptance criteria.

Phase 0 Spike (Day 1): Write a one-off test script that sends a cache_control-marked request through OpenRouter to each provider (Anthropic, DeepSeek, OpenAI) and checks response headers for cache hit indicators. Must pass before Phase 1 begins.

Foundations

Days 1–2

R1 R12 R8

Extension

Days 3–4

R2→R18 R17 R13

Orchestration

Days 5–6

R3→R19 R16

Prompt Intel

Day 7

R4 R7

Memory

Days 8–9

R5→R14 R15

Ship It

Day 10

R6 R10 R9 R11

Requirement Tiers

Tier 1 — Architectural Foundations

R1

Prompt Cache Boundary

Split static (~4000 tokens, cacheable) from dynamic (per-turn) prompt. API applies cache_control to static prefix.

Phase 1>60% hit rate

R2 → R18

Hooks → Extension API

Composable lifecycle hooks evolving into full Extension API with 30+ events, tool/command/renderer registration.

Phase 2Supersession

R3 → R19

Subagents → Task Swarm

Background task pool evolving into 100-parallel isolated agents with git worktree/fuse-overlay backends.

Phase 3Supersession

Cache the static prefix, hook the lifecycle, swarm the tasks. Everything else is refinement.

Tier 2 — Agent Intelligence

R4

Risk-Based Authorization

Three-category risk taxonomy: destructive, hard-to-reverse, visible-to-others. Agent self-assesses per-action.

Phase 4

R5 → R14

Memory → Autonomous Pipeline

Manual MEMORY.md evolving into LLM-powered autonomous extraction with secrets scanning and cross-session consolidation.

Phase 5Supersession

R6

Verification Agent

Adversarial verification subagent that independently checks non-trivial work. Parent cannot self-assign PASS.

Phase 6a

R7

Model-Specific Variants

Per-model-family prompt variants: comment suppression, thoroughness enforcement, false-claims mitigation.

Phase 4

Tier 3 — Operational Excellence

R8

Numeric Length Anchors

≤40 words between tool calls, ≤100 words in final responses. Proven 5–10% output token reduction.

Phase 1Promoted

R9

MCP Integration

Minimal viable: tools/list, tools/call, instructions. Delta-enabled to avoid cache busting on late connects.

Phase 6b

R10

Token Budget Mode

User specifies budget. Display output tokens each turn. Auto-continue if agent stops early with task incomplete.

Phase 6a

R11 • R12

Autonomy & Feature Flags

Autonomous mode with sleep tool, focus awareness, cost guardrails. Feature flags gate all experimental features.

Phase 6bPhase 1

Tier 4 — Beyond Parity (oh-my-pi Innovations)

R13

TTSR — Streamed Rules

Zero-context-cost rules monitoring output stream via regex. Abort, inject <system-interrupt>, retry.

Phase 2Novel

R15

Session Tree

Sessions as tree, not linear. Fork, resume, cross-project. Export to HTML with custom handlers.

Phase 5Novel

R16 • R17

Model Roles & Cross-Agent Rules

Configure default/smol/slow/plan/commit models. Discover rules from 7 agent config formats.

Phase 3Phase 2

R18 • R19 • R20

Extension API • Task Swarm • Smart Commit

Full extension system, 100-parallel isolated tasks, AI-powered conventional commits with hunk-level staging.

Supersession targets

Security Criticals

10 review agents identified 4 critical vulnerabilities requiring remediation before their respective phases ship.

CRITICAL-1

Hook Command Injection

Unsanitized file paths become shell metacharacters. Fix: pass context via stdin as JSON, user-level hooks only.

Before Phase 2

CRITICAL-2

Rule Prompt Injection

Repo ships crafted alwaysApply: true rules. Fix: project-level rules untrusted, user-level only for alwaysApply.

Before Phase 2

CRITICAL-3

Memory Secrets Leakage

Prompt injection writes malicious instructions to persistent memory. Fix: hardcoded extraction prompt, secrets regex.

Before Phase 5

CRITICAL-4

MCP Untrusted Instructions

Project-level MCP config with arbitrary commands. Fix: user-level config only, tag instructions as untrusted.

Before Phase 6

Success Targets

Metric

Target

Phase

Prompt cache hit rate (5+ turn sessions)

>60%

Phase 1

Session cold-start context reduction

≥40%

Phase 5

Output token spend reduction

≥5%

Phase 4

TTSR false-positive rate

<5%

Phase 2

S11

Model role cost reduction (exploration focus)

≥30%

Phase 3

SubQ Advantages to Preserve

Six capabilities where SubQ Code already leads. None are modified by the parity plan.

FFF Search

Cursor pagination, multi_grep, blast_radius analysis. Vastly superior to basic Glob/Grep.

Context Packing

LLM-driven context_pack with query synonyms. Smarter than brute-force Explore agent.

Behavioral Nudges

System reminders engine: bash spiral, plan drift, 5 reactive events.

Quality Measurement

Leverage analysis, prompt ROI, orchestration scoring. Unique to SubQ.

Plan-Driven Workflow

Plannotator with interview depths, TUI review. Structured planning beats reactive work.

Cross-Agent Intelligence

12-agent parser ecosystem. Unique multi-agent parsing capability.

Glossary

Acronyms and shorthand used across this document set.

DCE: Dead Code Elimination
FFF: Full-File Find — SubQ’s cursor-paginated search
FUSE: Filesystem in Userspace
JSONL: JSON Lines — newline-delimited JSON
MCP: Model Context Protocol — tool integration standard
OWASP: Open Web Application Security Project
ROI: Return on Investment
SWE: Software Engineering (as in SWE-bench)
TTSR: Token-Triggered Stream Rules — output-shaping via token patterns
TTL: Time to Live — cache expiration window
TUI: Text User Interface — terminal-based UI
R1–R20: Requirement IDs from the parity matrix (R1 Prompt Cache Boundary through R20 Extension API)