
Reranking is the process of reordering retrieved documents or chunks based on their relevance to a query. It acts as a second filtering step to boost the most useful results and discard less relevant ones.

📊 When Does Reranking Have the Most Impact?

Scenario | Effect of Reranking
Large chunks (>1000 tokens) | ✅ Filters out irrelevant sections within large text blocks.
Small chunks (<400 tokens) | ✅ Helps find the best combinations of relevant smaller pieces.
Many retrieved documents (k > 10) | ✅ Prevents irrelevant results from polluting the output.
Semantically complex queries | ✅ Identifies meaningful matches beyond simple keyword overlaps.

⚙️ How is Reranking Applied?

  1. Initial retrieval (BM25, vector search, or hybrid search) → Fetches the top-k candidate chunks.
  2. Reranking with a model (e.g., Cohere Rerank, or a cross-encoder built on BERT/RoBERTa) → Scores each query-chunk pair and reorders the candidates by that deeper relevance score.
  3. Final selection → Passes only the best-ranked chunks to the LLM for processing.
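
The three steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not a production setup: `initial_retrieval` uses a toy keyword-overlap count as a stand-in for BM25 or vector search, and `toy_relevance` is a hypothetical placeholder for a real reranking model such as a cross-encoder or a hosted rerank API. All function names and the sample chunks are assumptions made for this example.

```python
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    stripped = (t.strip(".,?!()").lower() for t in text.split())
    return [t for t in stripped if t]


def initial_retrieval(query: str, chunks: List[str], k: int) -> List[str]:
    """Step 1: cheap first pass (toy keyword overlap, standing in for BM25/vector search)."""
    q = set(tokenize(query))
    scored = [(len(q & set(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


def toy_relevance(query: str, chunk: str) -> float:
    """Hypothetical stand-in for a reranking model.

    Scores a chunk by the fraction of its tokens that are query terms,
    so long chunks with a few incidental matches score lower.
    """
    q = set(tokenize(query))
    tokens = tokenize(chunk)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in q) / len(tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[float, str]]:
    """Steps 2 and 3: rescore every candidate, reorder, keep the top_n."""
    scored = [(score_fn(query, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


chunks = [
    "Reranking reorders retrieved chunks by relevance to the query.",
    "The query planner picks an index for a table, and chunks of log data "
    "are stored elsewhere on disk.",
    "Bananas are rich in potassium.",
    "A cross-encoder scores each query and chunk pair jointly.",
]
query = "how does reranking score chunks for a query"

candidates = initial_retrieval(query, chunks, k=3)   # step 1: top-k candidates
top_chunks = rerank(query, candidates, toy_relevance, top_n=2)  # steps 2 and 3

for score, chunk in top_chunks:
    print(f"{score:.3f}  {chunk}")
```

In this sample, the first pass ranks the database chunk highest because it shares the most query words, while the second pass moves the chunk that is actually about reranking to the top. In a real pipeline, `score_fn` would be replaced by a call to an actual reranking model; the surrounding retrieve, rescore, and select structure stays the same.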