Prompt Engineering Is Dead. Long Live Context Engineering.
I stopped optimizing my prompts. Not because it doesn’t work – but because it’s the wrong question.
The right question is: What’s in the context when the agent starts working?
The Problem with Long Conversations
Anyone who works regularly with LLMs knows the pattern: at the beginning of a chat, everything runs smoothly. After twenty, thirty messages, the model starts ignoring earlier instructions, contradicting itself, or forgetting details it actually knows. This is called Context Rot.
That’s not a bug. It’s a structural problem.
Meredith Whittaker, President of the Signal Foundation, put a name to it at 39C3: Exponential Decay of Success. The math behind it is sobering. If a model has an error rate of just one percent per step (meaning each step is 99 percent correct), the probability that all steps succeed drops to around 37 percent after 100 steps. After 1,000 steps: 0.004 percent. Current top models lose significant reliability after roughly 60 steps – even at a nominal per-step accuracy above 85 percent.
Long conversations are therefore not just inconvenient. They are systematically unreliable.
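The decay is easy to verify yourself. Under the (simplifying) assumption that steps fail independently, the chance that an entire chain succeeds is just the per-step accuracy raised to the number of steps:

```python
def survival(p: float, n: int) -> float:
    """Probability that a chain of n steps succeeds, if each step
    independently succeeds with probability p."""
    return p ** n

# The figures from the text:
print(survival(0.99, 100))    # ~0.366 -> "around 37 percent"
print(survival(0.99, 1000))   # ~0.000043 -> "0.004 percent"
print(survival(0.85, 60))     # effectively zero after ~60 steps
```

The independence assumption is a simplification – real errors compound and correlate – but it makes the core point: small per-step error rates destroy long chains.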
What Newer Models Do Differently
Early models were optimized for single requests – short context, one answer, done. Newer models are designed differently. They’re not built for long dialogues, but for loading a complete context once and then acting autonomously: launching sub-agents, calling tools, delegating sub-problems.
That’s not a gradual difference. That’s a different paradigm.
The open-weights model GPT-OSS-20B represents the old school: a model supplied with information primarily through a carefully crafted prompt – large context was neither the goal nor a strength, which is precisely why it’s explicitly documented as unsuitable for long-context recall and tool calling. That wasn’t a weakness of the model, but a reflection of the assumptions at the time. Today it’s becoming clear that this can be changed with reasonable effort: newer models like Nvidia Nemotron 3 Super or the fine-tuned Persona Kappa (20.9B MoE, 131K-token context, 100 percent on the RULER benchmark across all context lengths) are designed specifically for large contexts and tool calling. Kappa, notably, was trained on a single workstation with four desktop GPUs – no data center, no InfiniBand.
Context Engineering Instead of Prompt Engineering
Anthropic puts it succinctly in their engineering blog:
“Find the smallest possible set of high-signal tokens that maximize the likelihood of desired outcomes.”
High-signal tokens are tokens that actually provide the model with useful information – as opposed to filler text that bloats the context without contributing anything. A precise function name is a high-signal token. A lengthy introduction that prepares the model for what it already knows is not.
Context is a finite, valuable resource – not a free-text field. What ends up in the context window determines the quality of the result: which documents, which tool definitions, which system prompt, which artifacts from previous steps.
Prompt Engineering was the art of getting the best out of a poor context. Context Engineering is the discipline of building the context correctly from the start.
Ralph Loops: Short Cycles Instead of Long Chains
A practical answer to Context Rot is Ralph Loops: instead of one long, increasingly degrading conversation, you work in short, focused iterations. Each loop gets a freshly built, targeted context. Errors are resolved, then you move to the next loop – with a clean starting state.
That sounds like more effort than one long chat. In practice, it’s more reliable.
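The pattern can be sketched in a few lines. This is an illustrative skeleton, not a specific tool’s API – `build_context` and `run_agent` are hypothetical stand-ins for whatever agent framework you use:

```python
def build_context(spec: str, iteration: int) -> str:
    # Rebuilt from scratch every iteration: the spec plus only the
    # artifacts the next step actually needs (high-signal tokens only).
    return f"SPEC:\n{spec}\n\nITERATION: {iteration}\n"

def run_agent(context: str) -> tuple[bool, str]:
    # Placeholder: call your model or agent runner here and
    # return (done, artifact). Stubbed out so the sketch runs.
    return True, "result"

def ralph_loop(spec: str, max_iterations: int = 10) -> str:
    artifact = ""
    for i in range(max_iterations):
        context = build_context(spec, i)     # fresh context, no chat history
        done, artifact = run_agent(context)  # one short, focused run
        if done:                             # errors resolved? next loop
            break
    return artifact
```

The key design choice is that no conversation history survives between iterations – each loop starts from a clean, deliberately built context instead of an accumulating transcript.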
Two Phases, Not One
Anyone working with agents today effectively has two phases – even though most don’t yet treat them explicitly as such.
Phase 1: Build the context. I work interactively with an agent to develop a project’s specification. Not through a single long prompt, but in dialogue: roughly outline the project, define technical requirements, create individual spec files – based on PDFs, existing scripts, requirements. I clarify open questions in interview mode: the agent asks, I answer. The result is an implementation-plan.md – the document that starts the next agent.
Phase 2: Let the agents loose. Hand over the finished context, start one or more agents, Ralph Loop style, and then – largely autonomously – let them run. No readjusting via prompt. No magic.
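The handoff between the two phases can be made concrete. A minimal sketch, assuming nothing beyond the article: the file name `implementation-plan.md` comes from the text above, while the function names and plan structure are hypothetical illustrations:

```python
from pathlib import Path

def phase_1_write_plan(answers: dict[str, str], path: Path) -> None:
    # Phase 1 ends with a written artifact, not a chat transcript.
    # `answers` stands in for the clarified points from interview mode.
    lines = ["# implementation-plan.md", ""]
    lines += [f"- **{question}**: {answer}" for question, answer in answers.items()]
    path.write_text("\n".join(lines))

def phase_2_start(path: Path) -> str:
    # Phase 2 starts from the finished plan as the agent's entire
    # initial context -- no readjusting via follow-up prompts.
    return path.read_text()
```

The point of the split: everything the agent needs must survive the trip through that file. If it isn’t in the plan, it isn’t in the context.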
What This Means
All the tricks from the Prompt Engineering era are becoming obsolete. “Act as an expert in…” – unnecessary. Magical phrasings meant to put the model in the right mode – workarounds for a poorly built context.
The actual work shifts forward: What information does the agent really need? What do I leave out? How do I structure the spec so the next step can start cleanly?
That’s less wizardry. And significantly more engineering.
Sources and Further Reading
- Anthropic: Effective Context Engineering for AI Agents
- arxiv: Beyond Exponential Decay
- Geoffrey Huntley: Ralph Loops
- Level1Techs (Wendel): Best 120b Model for Offline Use? Nemotron 3 Super Out Now
- Level1Techs Forum: Persona Kappa
- Meredith Whittaker, 39C3: youtube.com