Text and vector search (Hybrid search)

Adjustable word matching and semantics


Hybrid search results for the query "Pavilion DV6-20" with a strong emphasis on word matching. The first three returned results all contain the ID "DV6-20" from the query.


Concept

This endpoint searches by both word matching and similarity in the vector space. A weighting parameter gives you full control over which of the two to emphasize.
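
To make the effect of the weighting parameter concrete, here is a minimal, self-contained sketch. It is a toy model and not the endpoint's actual scoring formula: the additive blend, the word-overlap score, the function names, and the two-dimensional vectors are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def word_match_score(query, text):
    """Fraction of query words that appear in the text (case-insensitive)."""
    query_words = query.lower().split()
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return sum(word in text_words for word in query_words) / len(query_words)


def hybrid_score(query, query_vec, doc_text, doc_vec, traditional_weight):
    """Toy blend (an assumption): vector similarity plus weighted word matching."""
    vector_part = cosine_similarity(query_vec, doc_vec)
    text_part = word_match_score(query, doc_text)
    return vector_part + traditional_weight * text_part


# Made-up documents: (text, vector). The vectors are illustrative placeholders.
docs = {
    "with_id": ("HP Pavilion DV6-20 laptop", [0.96, 0.28]),
    "without_id": ("HP Pavilion laptop", [1.0, 0.0]),
}


def rank(query, query_vec, traditional_weight):
    """Return document names ordered from best to worst toy hybrid score."""
    def score(name):
        text, vec = docs[name]
        return hybrid_score(query, query_vec, text, vec, traditional_weight)
    return sorted(docs, key=score, reverse=True)


query, query_vec = "Pavilion DV6-20", [1.0, 0.0]
print(rank(query, query_vec, traditional_weight=0.0))  # vector similarity only
print(rank(query, query_vec, traditional_weight=0.1))  # word matching also counts
```

In this toy setup, a weight of 0 ranks the document whose vector is closest first, while a weight of 0.1 is enough to move the document containing the exact ID "DV6-20" to the top.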

Sample use-cases

  • Combining traditional text search with semantic search in one search bar by supporting both ID search and vector search (for example, being able to use "white shoe" and "Product JI36D" in the same search bar to reach the same result).
  • Searching for shoes (sneakers, boots, sandals, etc.) but specifically looking for white-colored ones.

Sample code

Sample code using the Relevance AI Python SDK for the hybrid search endpoint is shown below.

from relevanceai import Client

dataset_id = 'ecommerce-demo'

client = Client()

query = "white shoes"
query_vec = client.services.encoders.text(text=query)

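# weight assigned to word matching: queries longer than two words
# use vector search only; shorter queries get a small word-matching weight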
if len(query.split()) > 2:
    traditional_weight = 0
else:
    traditional_weight = 0.075

hybrid_search = client.services.search.hybrid(
    # dataset name
    dataset_id=dataset_id,

    multivector_query=[
        {
            "vector": query_vec["vector"],

            # list of vector fields against which to run the query
            "fields": ["description_default_vector_"],
        }
    ],

    # query text
    text=query,

    # text fields against which to match the query
    fields=["description"],

    # number of returned results
    page_size=5,
    
    # weight given to word matching
    traditional_weight=traditional_weight,
)
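
The word-count rule in the sample above can be factored into a small helper. The sketch below is one possible heuristic, not part of the SDK: the function name, the ID-like pattern, and the 0.1 weight for ID-like queries are assumptions (0.1 is the upper end of the range commonly used for traditional_weight), while 0.075 and 0 come from the sample.

```python
import re

# Hypothetical pattern for ID-like tokens: letters, digits, and hyphens,
# containing at least one letter and one digit (e.g. "DV6-20", "JI36D").
ID_LIKE = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9-]+$")


def choose_traditional_weight(query, id_weight=0.1, short_weight=0.075, long_weight=0.0):
    """Pick a word-matching weight for a query (illustrative heuristic only)."""
    words = query.split()
    # Queries containing a product ID benefit most from exact matching.
    if any(ID_LIKE.match(word) for word in words):
        return id_weight
    # Same rule as the sample: longer queries rely on vector search alone.
    if len(words) > 2:
        return long_weight
    return short_weight


for example in ["Pavilion DV6-20", "white shoes", "comfortable white running shoes"]:
    print(example, "->", choose_traditional_weight(example))
```

The returned value can then be passed as traditional_weight in the hybrid search call.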

Hybrid search gives you both word matching and contextual (vector) search, and the traditional_weight parameter controls how much weight traditional search receives. The weight is normally set to a small value (e.g. 0.025 to 0.1); when exact text matching is important, set it to a higher value.

Like vector search, hybrid search relies on machine learning techniques for vectorizing and similarity detection, so a vectorizer is needed. It is possible to vectorize with multiple models and combine them all in one search (i.e. multivector_query in the request body). This flexibility lets you explore how traditional search and vector search each affect the results.
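
As a sketch of what combining several models could look like, the helper below builds a multivector_query list in the same shape as the sample (one entry per vector, each with "vector" and "fields" keys). The helper name, the second field name, and the placeholder vectors are hypothetical; in practice each vector would come from the encoder that produced the corresponding vector field.

```python
def build_multivector_query(vectors_by_field):
    """Turn {vector_field_name: query_vector} into a multivector_query list."""
    return [
        {"vector": vector, "fields": [field]}
        for field, vector in vectors_by_field.items()
    ]


# Placeholder vectors; real ones must match each field's model and dimension.
multivector_query = build_multivector_query(
    {
        "description_default_vector_": [0.1, 0.2, 0.3],
        # Hypothetical second field produced by a different model.
        "description_other_model_vector_": [0.4, 0.5],
    }
)

for entry in multivector_query:
    print(entry["fields"], len(entry["vector"]))
```

The resulting list can be passed as the multivector_query argument in place of the single-entry list used in the sample.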

