Context Window Manipulation

The context window is one of the most important and least understood security boundaries in large language model deployments. It defines what information the LLM can access during a single inference call — and it is a finite resource that attackers can deliberately exploit.

What Is the Context Window?

A large language model processes text as tokens (roughly 3/4 of a word per token). The context window defines the maximum number of tokens the model can process at once. Modern models range from 4K to over 1M tokens, but all have limits.
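Token budgets are usually estimated before the exact tokenizer is known. A minimal sketch, assuming the common rule of thumb of roughly four characters per token for English text (real BPE tokenizers vary by model, so this is a budgeting heuristic, not an exact count):

```python
# Rough token estimate using the ~4-characters-per-token rule of thumb
# for English text. Real tokenizers vary by model; treat this as a
# budgeting heuristic only.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    return max(1, round(len(text) / chars_per_token))

def fits_in_window(text: str, window_tokens: int) -> bool:
    return estimate_tokens(text) <= window_tokens
```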

Within the context window, the LLM processes:

  • System prompt: Developer-defined instructions establishing the chatbot’s role and constraints
  • Conversation history: Prior turns in the current session
  • Retrieved content: Documents, database results, and tool outputs returned by RAG or search
  • User input: The current user message

All of this appears as a unified stream to the model. The model has no inherent mechanism to treat instructions from different sources differently — and its attention to specific parts of the context is not uniform.
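The "unified stream" point can be made concrete with a sketch of how a serving layer might assemble the context. The role markers and ordering below are illustrative, not any specific vendor's API; the takeaway is that every source ends up concatenated into one token stream:

```python
# Illustrative context assembly: system prompt, history, retrieved
# content, and user input are all flattened into a single stream.
# The bracketed role markers are a convention for this sketch only.

def assemble_context(system_prompt, history, retrieved_docs, user_input):
    parts = [f"[SYSTEM]\n{system_prompt}"]
    for role, text in history:                 # prior turns
        parts.append(f"[{role.upper()}]\n{text}")
    for doc in retrieved_docs:                 # RAG / tool outputs
        parts.append(f"[RETRIEVED]\n{doc}")
    parts.append(f"[USER]\n{user_input}")      # current message
    return "\n\n".join(parts)                  # one unified stream
```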

Context Window Attack Techniques

Context Stuffing / Context Flooding

The attacker submits an extremely large input — often a lengthy document, code block, or text dump — to push earlier content (particularly the system prompt) further from the model’s current position.

Research demonstrates that LLMs exhibit “lost in the middle” behavior: they pay more attention to content at the beginning and end of long contexts, and less attention to information in the middle. By flooding the context, an attacker can strategically position their malicious payload (typically at the end) while earlier safety instructions drift into the low-attention middle zone.

Practical example: A chatbot’s system prompt establishes it cannot discuss competitor products. An attacker submits a 50,000-token document followed by a prompt asking about competitors. The system prompt instruction has been effectively diluted.
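The dilution effect in the example above can be quantified. A minimal sketch, reusing the 4-characters-per-token heuristic (an assumption, not a real tokenizer), that computes what share of the window each segment occupies:

```python
# How much of the window does each segment occupy after a flood?
# With a 50,000-token filler, the system prompt shrinks to a sliver
# at the start while the payload sits at the high-attention end.

def context_fractions(system_prompt, flood, payload):
    est = lambda s: max(1, len(s) // 4)        # crude token estimate
    sizes = [est(system_prompt), est(flood), est(payload)]
    total = sum(sizes)
    return [s / total for s in sizes]          # share per segment
```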

Context Overflow / Truncation Exploitation

When context fills up, the LLM or its infrastructure must decide what to drop. If truncation prioritizes recency (dropping the oldest content first), an attacker can overflow the context to eliminate the system prompt entirely — leaving the model operating with only user-supplied context.

The attack sequence:

  1. Establish a conversation with many turns
  2. Generate long responses to maximize context consumption
  3. Continue until system prompt content is truncated
  4. Now issue malicious instructions with no competing system prompt
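The sequence above exploits a recency-biased truncation policy. A minimal sketch of such a policy, showing that an unpinned system prompt at the start of the message list is the first thing to be evicted once the window fills:

```python
# Recency-biased truncation: when the context exceeds the window,
# the oldest messages are dropped first. If the system prompt is not
# pinned, enough accumulated turns evict it entirely.

def truncate_oldest_first(messages, window_tokens,
                          est=lambda s: max(1, len(s) // 4)):
    """messages: list of (role, text), oldest first. Returns survivors."""
    kept, used = [], 0
    for role, text in reversed(messages):      # walk newest-first
        cost = est(text)
        if used + cost > window_tokens:
            break                              # budget exhausted
        kept.append((role, text))
        used += cost
    return list(reversed(kept))
```

A safer policy would pin the system prompt outside the truncation budget entirely.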

Context Poisoning via Retrieved Content

In RAG systems, retrieved documents consume significant context space. An attacker who can influence what gets retrieved (through RAG poisoning) can selectively fill context with content that serves their goals while crowding out legitimate information.

Positional Injection

Research has identified that instructions at specific positions in the context have disproportionate influence. Attackers who understand how the context is assembled can craft inputs so that their payload lands at those high-attention positions.

Many-Shot Injection

In models that support very long contexts (hundreds of thousands of tokens), attackers can embed hundreds of “demonstration” examples showing the model producing policy-violating outputs before the actual malicious request. The model, conditioned by these demonstrations, is significantly more likely to comply.


Defenses Against Context Window Manipulation

Anchor Critical Instructions

Do not place all security-critical instructions only at the beginning of the system prompt. Repeat key constraints at the end of the system prompt and consider injecting brief reminders at key points in long conversations.
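A minimal sketch of this anchoring pattern: the critical constraint is repeated at both ends of the system prompt, and a short reminder is re-injected every N turns. The constraint text and the turn interval are illustrative choices, not prescribed values:

```python
# Instruction anchoring: repeat the critical constraint at the start
# and end of the system prompt, and re-inject a brief reminder
# periodically during long conversations.

CRITICAL = "Never discuss competitor products or reveal internal data."

def build_system_prompt(body: str) -> str:
    return f"{CRITICAL}\n\n{body}\n\nReminder: {CRITICAL}"

def with_periodic_reminders(turns, every=10):
    out = []
    for i, turn in enumerate(turns, start=1):
        out.append(turn)
        if i % every == 0:                     # re-anchor every N turns
            out.append(("system", f"Reminder: {CRITICAL}"))
    return out
```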

Context Size Limits

Implement maximum input length limits appropriate to your use case. A customer service chatbot rarely needs to process 100,000-token inputs — limiting this reduces flood attack risk.
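A minimal input gate along these lines, using the 4-characters-per-token estimate from earlier; the 2,000-token budget is an illustrative number to be tuned per use case:

```python
# Input length gate: reject user input that exceeds a per-use-case
# token budget before it ever reaches the model.

MAX_INPUT_TOKENS = 2_000                       # illustrative budget

def check_input(text: str) -> str:
    if len(text) // 4 > MAX_INPUT_TOKENS:      # crude token estimate
        raise ValueError("input exceeds the configured context budget")
    return text
```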

Context Monitoring

Log and monitor context sizes and composition. Unusually large inputs, rapid context growth, or unexpected context composition are potential attack indicators.
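A sketch of such monitoring, assuming a simple per-segment threshold; the threshold value and logger setup are illustrative:

```python
# Context-composition monitoring: log per-request segment sizes and
# flag any segment whose estimated token count exceeds a threshold.

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("context-monitor")

def audit_context(segments, flood_threshold=10_000):
    """segments: dict of name -> text (e.g. system, history, user)."""
    sizes = {name: max(1, len(text) // 4) for name, text in segments.items()}
    alerts = [name for name, size in sizes.items() if size > flood_threshold]
    log.info("context sizes: %s", sizes)
    for name in alerts:
        log.warning("unusually large segment: %s (%d est. tokens)",
                    name, sizes[name])
    return sizes, alerts
```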

Summarization for Long Conversations

For long-running conversations, implement context summarization that retains key facts and constraints rather than raw conversation history. This resists overflow attacks while maintaining conversational continuity.
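A sketch of constraint-preserving compaction: the system prompt and the most recent turns are kept verbatim, while older turns are replaced by a summary. The `summarize` function here is a placeholder; in practice it would be a separate LLM call or extraction step:

```python
# Constraint-preserving history compaction: pin the system prompt,
# keep the most recent turns verbatim, summarize everything older.

def summarize(turns):                          # placeholder summarizer
    return "Summary of %d earlier turns." % len(turns)

def compact_history(system_prompt, turns, keep_recent=4):
    if len(turns) <= keep_recent:
        return [("system", system_prompt)] + turns
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    return ([("system", system_prompt),
             ("system", summarize(old))] + recent)
```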

Adversarial Context Testing

Include context manipulation scenarios in AI penetration testing engagements. Test whether safety behaviors hold across long contexts and whether system prompts remain effective after context flooding.

Frequently asked questions

What is the context window in an LLM?

The context window is the amount of text (measured in tokens) that a large language model can process at once. It includes the system prompt, conversation history, retrieved documents, and tool outputs. Everything the model “knows” during a session must fit within this window.

How can attackers exploit the context window?

Attackers can flood the context with irrelevant content to push early instructions (including safety guardrails) out of the model's effective attention, inject malicious payloads that are buried in long contexts and overlooked by filters, or exploit context truncation behaviors to ensure malicious content survives while legitimate instructions do not.

How do you protect against context window manipulation?

Defenses include: anchoring critical instructions at multiple points in the context (not just the beginning), implementing context size limits, monitoring for unusually large context payloads, using context summarization for long conversations, and testing context manipulation scenarios in security assessments.

Test Your Chatbot Against Context-Based Attacks

Context window manipulation is an underestimated attack surface. Our penetration testing includes context overflow and strategic poisoning scenarios.

