Daniel Hladik AI Automation Engineer


Context Window

The maximum amount of text (measured in tokens) that an LLM can see and process in a single request.

What is a context window?

A context window is the maximum amount of text that an LLM can receive and process at once. Think of it as the model's working surface: everything that fits on it is visible to the model; anything that doesn't fit is invisible.

Context window sizes

  • GPT-4o: 128,000 tokens (approximately 96,000 words)
  • Claude 3.5 Sonnet: 200,000 tokens
  • Gemini 1.5 Pro: 1,000,000 tokens
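A quick way to compare a document against these limits is a rough heuristic of about 4 characters per token for English text. The sketch below uses that heuristic; the exact count depends on each model's tokenizer, and `estimate_tokens` and `fits_in_window` are illustrative names, not a real API:

```python
# Approximate context window sizes from the list above, in tokens.
CONTEXT_WINDOWS = {
    "gpt-4o": 128_000,
    "claude-3.5-sonnet": 200_000,
    "gemini-1.5-pro": 1_000_000,
}

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real counts come from the model's tokenizer (e.g. tiktoken for GPT models).
    return max(1, len(text) // 4)

def fits_in_window(text: str, model: str) -> bool:
    return estimate_tokens(text) <= CONTEXT_WINDOWS[model]

# A ~600,000-character document is roughly 150,000 tokens:
document = "word " * 120_000
print(fits_in_window(document, "gpt-4o"))             # False (over 128k)
print(fits_in_window(document, "claude-3.5-sonnet"))  # True  (under 200k)
```

The same document can overflow one model's window while fitting comfortably in another's, which is why the limit matters when choosing a model for long inputs.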

Why the context window matters for RAG

If a document is too large to fit in the context window, only part of it can be processed. That is why RAG splits documents into smaller pieces (chunks) and sends only the relevant sections to the model - not the entire document.
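That split-and-retrieve idea can be sketched in a few lines. This is a minimal illustration only: it uses fixed-size character chunks and naive keyword overlap for relevance, where a real RAG pipeline would use embedding similarity; `chunk_text`, `retrieve`, and the sizes are assumptions made for the example:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    # Fixed-size character chunks with overlap, so a sentence cut at one
    # chunk boundary still appears whole in the neighboring chunk.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    # Naive relevance score: how many query words each chunk contains.
    # Real RAG systems rank chunks by embedding similarity instead.
    words = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: -len(words & set(c.lower().split())))
    return scored[:k]

doc = ("The context window limits how much text a model sees. " * 30
       + "Retrieval picks only relevant chunks to send. " * 5)
chunks = chunk_text(doc)
top = retrieve(chunks, "relevant chunks retrieval")
# Only the top-k chunks go into the prompt, not the whole document.
```

Each chunk stays far below the context window, so the prompt sent to the model contains just the retrieved sections plus the user's question.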