Subscribe Now

Edit Template

Subscribe Now

Edit Template

Implementing Hybrid Semantic-Lexical Search in RAG


In this article, you will learn how to implement a hybrid search strategy for RAG systems by combining BM25 lexical search with semantic search, fused together using Reciprocal Rank Fusion.

Topics we will cover include:

  • Why hybrid search outperforms either lexical or semantic search alone in retrieval-augmented generation systems.
  • How to implement BM25 lexical search and dense vector semantic search as independent retrieval engines in Python.
  • How to merge both rankings using Reciprocal Rank Fusion (RRF) to produce a final, balanced retrieval result.

Let’s get straight to it.

Implementing Hybrid Semantic-Lexical Search in RAG

Implementing Hybrid Semantic-Lexical Search in RAG

Introduction

Implementing hybrid search strategies is a critical step in building modern RAG (Retrieval-Augmented Generation) systems, especially when shifting from prototype to production-ready solutions.

There is little argument against semantic search — fueled by dense vectors or embeddings, which are numerical representations of text — being incredibly useful at understanding semantics, synonyms, and context. However, lexical, keyword-based search with approaches like BM25 covers a small blind spot neglected by semantic search. Combining the best of both worlds is therefore the perfect recipe to take your RAG system’s retrieval mechanism the extra mile.

Let’s explore how to implement such a hybrid search strategy through a gentle coding example, guiding you through every step of the process!

Note: If you are unfamiliar with RAG systems, you may find the “Understanding RAG” article series remarkably insightful for getting the most out of this read. In particular, I recommend acquiring an understanding of vector databases first through this article.

Step-by-Step Implementation

The first step is to ensure all the necessary external Python libraries are installed, in particular these three:

  • rank_bm25: an implementation of the BM25 lexical search algorithm for information retrieval (BM stands for “Best Matching”).
  • sentence-transformers: provides pre-trained language models for generating text embeddings. In a real setting, you may already have your own vector database containing many document embeddings and not need this, but we will use it here to simulate the construction of a toy vector database and illustrate hybrid search on it.
  • requests: used to fetch the raw dataset package from a public GitHub datasets repository prepared for this example.

With these ingredients at hand, we start by loading the dataset and storing the raw texts in a list (we do so because it is a small dataset).

The hybrid search process is divided into three stages: two of them take place in parallel, or independently from each other. The third is where the fusion of both approaches happens, using a merging method called Reciprocal Rank Fusion (RRF).

Let’s cover lexical search with BM25 first:

The lexical search process has been encapsulated in a function called search_bm25(). This function takes two input arguments: a string containing the user’s query to the RAG system, and the number of top results to retrieve. The rank_bm25 library provides a get_scores() method that computes, for each document — treated as a collection of tokens — a lexical relevance score. We then rank documents by decreasing score, select the top-k, and return them.

Meanwhile, the semantic search engine first uses a sentence transformer model to obtain embedding vectors for the texts and the user query, then applies a vector similarity metric like cosine similarity to rank texts by semantic relevance and retrieve the most relevant k:

Time to put it all together. The two scores calculated for each document cannot simply be added, because they operate on very different numeric scales. Instead, we perform the fusion based on ranks rather than raw similarity or relevance scores. For this, RRF is the gold industry standard for fusing ranking information: it calculates an overall ranking for each document by rewarding those that appear in high positions across both lists. The underlying logic is somewhat similar to that of the harmonic mean operator in statistics.

The overarching hybrid search process is implemented as follows:

Now it’s time to try it all out. Let’s formulate a user query and see what results we get.

The results are not excellent compared to a production RAG system, but bear in mind we tested this on a tiny, nine-document dataset. With that context, the outcome is quite reasonable.

Try modifying the query and replacing it with others related to temples, beaches, mountains, or anything else that comes to mind when thinking about eastern destinations. Can you find a scenario in which both the semantic results and the BM25 results are highly consistent with each other?

Wrapping Up

This article guided you through implementing a hybrid search mechanism for the retrieval stage of RAG systems. Choosing not to rely solely on semantic search is an important consideration when scaling RAG solutions to production environments.

crossroad.joykonark.com

Writer & Blogger

Considered an invitation do introduced sufficient understood instrument it. Of decisively friendship in as collecting at. No affixed be husband ye females brother garrets proceed. Least child who seven happy yet balls young. Discovery sweetness principle discourse shameless bed one excellent. Sentiments of surrounded friendship dispatched connection is he.

Leave a Reply

Your email address will not be published. Required fields are marked *

About Me

Kapil Kumar

Founder & Editor

As a passionate explorer of the intersection between technology, art, and the natural world, I’ve embarked on a journey to unravel the fascinating connections that weave our world together. In my digital haven, you’ll find a blend of insights into cutting-edge technology, the mesmerizing realms of artificial intelligence, the expressive beauty of art.

Popular Articles

  • All Posts
  • AIArt
  • Blog
  • EcoStyle
  • Nature Bytes
  • Technology
  • Travel
  • VogueTech
  • WildTech
Edit Template
As a passionate explorer of the intersection between technology, art, and the natural world, I’ve embarked on a journey to unravel the fascinating connections.
You have been successfully Subscribed! Ops! Something went wrong, please try again.

Quick Links

Home

Features

Terms & Conditions

Privacy Policy

Contact

Recent Posts

  • All Posts
  • AIArt
  • Blog
  • EcoStyle
  • Nature Bytes
  • Technology
  • Travel
  • VogueTech
  • WildTech

Contact Us

© 2024 Created by Shadowbiz

As a passionate explorer of the intersection between technology, art, and the natural world, I’ve embarked on a journey to unravel the fascinating connections.
You have been successfully Subscribed! Ops! Something went wrong, please try again.

Quick Links

Home

Features

Terms & Conditions

Privacy Policy

Contact

Recent Posts

  • All Posts
  • AIArt
  • Blog
  • EcoStyle
  • Nature Bytes
  • Technology
  • Travel
  • VogueTech
  • WildTech

Contact Us

© 2024 Created by Shadowbiz

Fill Your Contact Details

Fill out this form, and we’ll reach out to you through WhatsApp for further communication.

Popup Form