Daniel Hladik AI Automation Engineer

Chunking

The process of dividing long text into smaller, logical units (chunks) for efficient processing in RAG systems.

What is chunking?

Chunking is the process of dividing a long text into smaller, logical units called chunks. It is a key step when preparing data for RAG systems.

Why chunking matters

LLMs have a limited context window. Rather than passing the entire document at once, a RAG system passes only the relevant chunks to the model, which saves tokens and can improve response accuracy.

Typical parameters

  • Chunk size: 500–1,000 characters
  • Overlap: 10–20% of the chunk size, to maintain context between adjacent chunks
  • Splitting: By paragraphs, headings, or logical sections
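
To make the chunk size and overlap parameters concrete, here is a minimal sketch of a fixed-size character chunker. The function name `chunk_text` and its defaults (800 characters with a 120-character overlap, i.e. 15%) are assumptions chosen for illustration from the ranges above, not a reference implementation. It splits purely by character position and does not respect paragraphs or headings.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 120) -> list[str]:
    """Split text into character windows of chunk_size that share `overlap` characters.

    Hypothetical helper for illustration: the defaults sit inside the
    500-1,000 character and 10-20% overlap ranges listed above.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no extra chunk made only of overlap is emitted.
        if start + chunk_size >= len(text):
            break
    return chunks


# Usage: a 2,000-character document yields three overlapping chunks.
document = "".join(str(i % 10) for i in range(2000))
pieces = chunk_text(document)
print([len(p) for p in pieces])  # [800, 800, 640]
```

The last 120 characters of each chunk reappear at the start of the next one, which is what preserves context across chunk boundaries. A splitter that follows paragraphs, headings, or logical sections would first cut the text at those boundaries and only fall back to fixed windows for units that are still too long.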