Midora AI Case Study # 3: Project Context Compression

How raw conversation history hit token limits within weeks and naive truncation was removing the most important early context first. Fixed with a three-tier architecture (permanent core, rolling importance-weighted summary, live window), query-adaptive tier selection that only sent the tiers each query actually needed, and asynchronous summarisation so it never added latency.

External link preview

Google Doc

Open in new tab →