Cost Optimization Ranking
Ranked by ROI—implement in this order for maximum cost impact.
Session telemetry currently reports cost: {total: 0}, so no cost optimization can be measured until instrumentation is fixed. No current requirement covers cost telemetry. See Session Insights.
Cache Economics
Prefix-based caching across all three major providers. Universal rule: maximize identical byte prefix length.
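The prefix rule can be illustrated with a small sketch. It does not call any provider API; it only measures how many leading bytes two consecutive requests share, which is the quantity a prefix cache can reuse. The prompt parts (SYSTEM, TOOLS, the two turns) and the helper names are hypothetical, chosen for illustration.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical prompt parts, for illustration only.
SYSTEM = b"You are a coding agent. Follow the project rules.\n"
TOOLS = b"Tools: read, grep, edit.\n"


def build_prompt(volatile: bytes, stable_first: bool) -> bytes:
    """Assemble a request body with the stable content first or last."""
    stable = SYSTEM + TOOLS
    return stable + volatile if stable_first else volatile + stable


turn_1 = b"Time: 10:01. Task: list files.\n"
turn_2 = b"Time: 10:02. Task: open README.\n"

# Stable content first: the whole stable block is shared between turns.
stable_first = common_prefix_len(
    build_prompt(turn_1, True), build_prompt(turn_2, True)
)
# Volatile content first: the shared prefix ends at the first changed byte.
volatile_first = common_prefix_len(
    build_prompt(turn_1, False), build_prompt(turn_2, False)
)

print(stable_first, volatile_first)
```

In this sketch, placing per-turn content (timestamps, task text) after the stable block keeps the shared prefix at least as long as the stable block itself. Placing it first cuts the shared prefix to the few bytes before the first difference.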
Model Role Savings
Route read-only operations to cheaper models. If 60% of exploration turns are routed to smol, the projected reduction in total session cost is 30%.
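The relationship between the routing rate and the session-level saving depends on two inputs the summary does not state: how much of session cost comes from exploration turns, and how much cheaper the smaller model is. The sketch below is a simple linear model under those assumptions. The parameter names and the example values are hypothetical and are not measurements from this project.

```python
def session_cost_reduction(
    exploration_share: float,
    routed_fraction: float,
    price_ratio: float,
) -> float:
    """Fraction of total session cost saved by routing to a cheaper model.

    exploration_share: share of session cost spent on exploration turns
    routed_fraction:   share of exploration turns sent to the cheaper model
    price_ratio:       cheaper model's cost per turn relative to the default

    Assumes every exploration turn costs the same on the default model,
    so a share of turns equals the same share of exploration cost.
    """
    for name, value in (
        ("exploration_share", exploration_share),
        ("routed_fraction", routed_fraction),
        ("price_ratio", price_ratio),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return exploration_share * routed_fraction * (1.0 - price_ratio)


# Hypothetical inputs: exploration is 62.5% of session cost and the
# cheaper model costs 20% of the default per turn.
saving = session_cost_reduction(
    exploration_share=0.625, routed_fraction=0.6, price_ratio=0.2
)
print(f"{saving:.0%}")
```

Under this model, a 60% routing rate yields a 30% saving only when exploration_share * (1 - price_ratio) equals 0.5, so both inputs are worth measuring once cost telemetry is available.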
Success Metrics
13 measurable targets across all 6 phases. Each metric has a defined measurement method.
Resource Requirements
~10 weeks total with 1–2 developers. Phase 6 is parallelizable across 2 developers.
Phase 1 — 1 week
1 developer. Prompt refactoring (R1), feature flag config (R12), length anchors (R8). Low risk, immediate value.
Phase 2 — 2 weeks
1–2 developers. Hook system (R2/R18) + TTSR (R13) require streaming knowledge. Cross-agent rules (R17) parallelizable.
Phase 3 — 2 weeks
1–2 developers. Subagent orchestration (R3/R19) makes this the most complex phase. Model roles (R16) parallelizable.
Phase 4 — 1 week
1 developer. Prompt-only changes: risk taxonomy (R4), model variants (R7). A/B testing methodology needed.
Phase 5 — 2 weeks
1–2 developers. Memory pipeline (R5/R14) + session tree (R15). LLM extraction quality requires iteration.
Phase 6 — 2 weeks
2 developers. Split into 6a (R6+R10, verification+budget) and 6b (R9+R11+R20, MCP+autonomy+commit). Fully parallel.
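As a cross-check on the ~10 week total, the phase durations and staffing ranges listed above can be summed directly. This sketch assumes the six phases run one after another and that only the work inside a phase is parallelized. The developer-week range is derived here and is not a figure stated in the plan.

```python
# (phase, weeks, min developers, max developers), copied from the phase list.
PHASES = [
    ("Phase 1", 1, 1, 1),
    ("Phase 2", 2, 1, 2),
    ("Phase 3", 2, 1, 2),
    ("Phase 4", 1, 1, 1),
    ("Phase 5", 2, 1, 2),
    ("Phase 6", 2, 2, 2),
]

# Calendar time if phases are strictly sequential.
total_weeks = sum(weeks for _, weeks, _, _ in PHASES)

# Effort range, depending on whether the optional second developer is used.
min_dev_weeks = sum(weeks * lo for _, weeks, lo, _ in PHASES)
max_dev_weeks = sum(weeks * hi for _, weeks, _, hi in PHASES)

print(f"{total_weeks} weeks, {min_dev_weeks}-{max_dev_weeks} developer-weeks")
```

Under these assumptions the calendar total matches the 10 weeks quoted under Resource Requirements, with effort between 12 and 18 developer-weeks.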