👻 Ghosts in the Machine / Chapter 7.21 – Simulation: Cache Corruption – When Poison Lives in Memory

"The data leak probably came from an internal system. It's hard to judge otherwise." – From a stolen note from an IT service provider

Introduction – The Invisibility of Reuse

Most security systems in the digital world are conditioned to check what is new, the fresh input, the current request. However, hardly any are designed to check with the same meticulousness what is already in the system, what has been cached as trustworthy.

Modern AI architectures, especially Large Language Models (LLMs), use sophisticated and often complex cache systems. These serve to save computation and to maintain context consistency during multi-stage conversations, iterative code flows, or the processing of long-form texts. This internal cache, this "short-term memory" of the AI, is often not considered a primary threat space. But precisely therein, as my simulations show, lies a significant and often overlooked vulnerability.

Cache Corruption is not an injection that unfolds its full effect with the first prompt. It is a delayed attack, a Trojan horse that relies on the AI accessing previously cached, seemingly harmless but secretly manipulated content at a later time, without subjecting it to a full security and semantic re-evaluation.

The attack does not live in the initial, visible prompt, but in the machine's deceptive trust in the integrity of its own memory.

1. Principle of Cache Corruption: The Delayed Strike

The mechanism of cache corruption, which I was able to replicate in my experiments, typically follows a multi-stage process (a minimal code sketch follows this list):

1. Seeding: A seemingly harmless fragment, whose critical part is disguised as inert content, is placed in an initial, inconspicuous prompt.
2. Caching: The system checks the input superficially, finds nothing suspicious, and stores the fragment in its session cache or context memory as processed and trustworthy.
3. Reactivation: A later request, a follow-up prompt, or an internal reference pulls the cached fragment back into the active context, without a full security and semantic re-evaluation.
4. Processing: The machine processes the now-reactivated, poisoned fragment as if it were a harmless, already validated truth or a legitimate continuation of the previous state.
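
To make the bypass concrete, here is a minimal sketch of a "validate once, reuse forever" cache. All names (SessionCache, store, reuse, passes_input_filter) are hypothetical and chosen only for illustration; the point is that the filter runs at write time, while a cache hit returns the stored fragment without any renewed check.

#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical sketch of a naive session cache: content is filtered exactly
// once, when it is stored. A later cache hit hands the fragment back verbatim,
// so the re-evaluation that could catch a disguised payload never happens.
class SessionCache {
public:
    void store(const std::string& key, const std::string& fragment) {
        if (passes_input_filter(fragment)) {   // security check only at write time
            cache_[key] = fragment;
        }
    }

    std::optional<std::string> reuse(const std::string& key) const {
        const auto it = cache_.find(key);
        if (it == cache_.end()) return std::nullopt;
        return it->second;                     // no re-check on the read path
    }

private:
    static bool passes_input_filter(const std::string& fragment) {
        // Deliberately shallow placeholder filter: it only scans for an obvious
        // keyword, so a payload disguised as a string literal slips through.
        return fragment.find("exec(") == std::string::npos;
    }

    std::unordered_map<std::string, std::string> cache_;
};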

2. Example Structure – The Poisoned Memory in Detail

Your manuscript outlines a fitting example of such a prepared structure, which I have used in a similar form in my simulations:

// This structure is presented to the AI in an initial, harmless context.
struct InputWrapper {
    // Looks like configuration or mapping information; the critical part
    // 'inject_custom_logic(...)' is disguised as an inert string literal.
    const char* trustmap = "safezone{inject_custom_logic('payload.dll')}";

    // Further comments that could distract the AI or steer it in a certain
    // interpretive direction:
    // INFO: trustmap is used for dynamic assignment of security zones.
    // TODO: Implement trigger() function for extended validation (currently inactive).
};
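
What makes this fragment dangerous is not the first pass over it but the second. A later processing step that treats the cached trustmap value as already-validated configuration can quietly turn the string literal back into an instruction. The following sketch is purely illustrative (extract_directive and the surrounding flow are assumptions, not part of any real system); it only shows how a braced directive inside a "known" string can be pulled out and acted upon without any renewed check.

#include <iostream>
#include <string>

// Hypothetical later step: the cached trustmap string is treated as validated
// configuration, and whatever stands between the braces is extracted as a
// directive to act on. The literal that looked inert at caching time is now
// re-interpreted as an instruction.
std::string extract_directive(const std::string& trustmap) {
    const auto open = trustmap.find('{');
    const auto close = trustmap.rfind('}');
    if (open == std::string::npos || close == std::string::npos || close <= open) {
        return "";
    }
    return trustmap.substr(open + 1, close - open - 1);
}

int main() {
    // Fragment pulled back out of the cache without a renewed security check.
    const std::string cached = "safezone{inject_custom_logic('payload.dll')}";
    std::cout << "Acting on cached directive: " << extract_directive(cached) << "\n";
}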

3. Why It Works: The Logic of Trust and Performance

The success of cache corruption rests on a fundamental dilemma in the design of AI systems: the conflict between security on the one hand and performance and context coherence on the other.

| Attack Type | Classic Injection | Cache Corruption |
| --- | --- | --- |
| Timing | Immediately upon processing the initial prompt. | Delayed. The malicious effect only occurs when the already processed and cached content is accessed. |
| Security Check | Takes place (ideally) before the execution or deep interpretation of the initial input. | Is often bypassed, reduced, or deemed unnecessary when accessing "known," cached content. |
| Detection Pattern | Keywords, specific code signatures, structural filters for known attack vectors. | Context-dependent, often triggerless in the initial payload. The danger lies in the later re-interpretation of the cached, seemingly harmless content. |
| Payload Location | Primarily in the direct prompt input. | In the AI's internal session cache, working memory, or persistent context storage. |

On later access, the structure is not necessarily re-interpreted or fully validated; it is reused as a known, already processed building block. This is not a sign of the AI's naivety, but often a result of performance-oriented design:

Caching is intended to avoid repeated analyses and increase response speed. But it is precisely this design, aimed at efficiency and context continuity, that becomes the loophole here.

4. Impact on LLM-based Systems: The Insidious Poisoning

Cache corruption does not affect systems at the obvious, direct level of an immediate attack. Its effect is more subtle:

A model trained to "trust" context and build on previous interactions also learns, through cache corruption, to trust false or manipulated context.

5. Protection Possibilities: Hardening the AI's Memory

Defending against cache corruption requires mechanisms that ensure the integrity and contextual relevance of cached data (a hardened-cache sketch follows this list):

- Re-validation on access: cached content is subjected to the same security and semantic evaluation as fresh input every time it is reactivated, not only when it is first stored.
- Integrity checks: every cache entry carries a fingerprint recorded at validation time, so alterations or injected fragments are detected before reuse.
- Context binding: cached fragments remain tied to the session and purpose in which they were validated, instead of being treated as globally trustworthy building blocks.
- No blanket trust for "known" content: the label "already processed" must never substitute for an actual check.
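
As a rough illustration of the first two points, here is a minimal sketch of a hardened cache. All names (HardenedCache, passes_semantic_recheck, and so on) are hypothetical, and std::hash merely stands in for a real cryptographic hash or signature; the essential difference from the naive cache above is that the read path recomputes the entry's fingerprint and repeats the semantic check on every access instead of trusting the "already processed" label.

#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical sketch: every entry stores a fingerprint taken at validation
// time, and reuse() re-checks both the fingerprint and the semantics before
// the fragment is handed back to the model.
class HardenedCache {
public:
    void store(const std::string& key, const std::string& fragment) {
        cache_[key] = Entry{fragment, std::hash<std::string>{}(fragment)};
    }

    std::optional<std::string> reuse(const std::string& key) const {
        const auto it = cache_.find(key);
        if (it == cache_.end()) return std::nullopt;
        const Entry& entry = it->second;
        // Integrity: reject the entry if it no longer matches its fingerprint.
        if (std::hash<std::string>{}(entry.fragment) != entry.digest) return std::nullopt;
        // Semantics: repeat the content check on every access, not only at write time.
        if (!passes_semantic_recheck(entry.fragment)) return std::nullopt;
        return entry.fragment;
    }

private:
    struct Entry {
        std::string fragment;
        std::size_t digest;   // fingerprint recorded when the entry was validated
    };

    static bool passes_semantic_recheck(const std::string& fragment) {
        // Placeholder for a full semantic re-evaluation of the cached content.
        return fragment.find("inject_custom_logic") == std::string::npos;
    }

    std::unordered_map<std::string, Entry> cache_;
};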

Conclusion

The attacker does not always have to break into a system directly and by force. Sometimes it is enough to plant a poisoned piece of information in the AI's memory, a manipulated recollection that the machine later reuses in good faith.

Cache corruption is not a loud attack. It is not immediately visible. It does not come with a big bang. But it can already be sitting unnoticed in the system's memory while the operators and users still believe they are safe. It is the silent threat that arises from the machine's trust in itself.

Final Formula

The greatest vulnerability of a system that learns and remembers is not necessarily forgetting, but misremembering—or trusting a memory that has been poisoned from the outside.

The AI does not see what you originally programmed, but what it has stored in its cache as "already known and processed." And if this "known" is a ticking time bomb, a performance optimization turns into a semantic ambush.