Semantic Search

Automatic word matching and semantics


Semantic search results for the query "Pavilion DV6-20" with a small emphasis on word matching. As can be seen, the results do not focus on "DV6-20", but all of them are HP laptops.

Concept

This endpoint is very similar to hybrid search: it performs the search using both word matching and search in the vector space. The only difference is that traditional_weight is already set to a small value, so word matching has only a minor influence on the ranking.
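To make the effect of a small traditional_weight concrete, here is a minimal sketch of how a word-matching score could be blended with a vector-similarity score. Everything in it is an assumption for illustration: the additive formula, the helper names (word_match_score, blended_score, rank), the weight 0.075, and the made-up vector scores. The endpoint's actual scoring and its preset weight are not described on this page.

```python
def word_match_score(query, text):
    """Toy word matching: fraction of query tokens that appear in the text."""
    query_tokens = query.lower().split()
    text_tokens = set(text.lower().split())
    if not query_tokens:
        return 0.0
    return sum(token in text_tokens for token in query_tokens) / len(query_tokens)


def blended_score(vector_score, match_score, traditional_weight=0.075):
    """Hypothetical blend: vector score plus a small, weighted word-match bonus."""
    return vector_score + traditional_weight * match_score


def rank(query, documents, traditional_weight=0.075):
    """Sort (text, vector_score) pairs by blended score, best first."""
    scored = [
        (text, blended_score(score, word_match_score(query, text), traditional_weight))
        for text, score in documents
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up vector-similarity scores standing in for a real vector search.
documents = [
    ("black sneakers", 0.90),
    ("white sneakers", 0.90),
    ("white paint", 0.20),
]

ranking = [text for text, _ in rank("white shoes", documents)]
print(ranking)
```

With these toy numbers, the word match only breaks the tie between the two sneakers in favor of the white pair, while "white paint" stays last because its vector score is low. That is the behavior a small weight on word matching is meant to produce.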

Sample use-case

  • Searching for shoes (sneakers, boots, sandals, etc.) and preferring white-colored ones.

Sample code

Sample code for the semantic search endpoint using the Relevance AI Python SDK is shown below.

from relevanceai import Client

dataset_id = 'ecommerce-demo'

client = Client()

query = "white shoes"
query_vec = client.services.encoders.text(text=query)

semantic_search = client.services.search.semantic(
    # dataset name
    dataset_id=dataset_id,
    
    multivector_query=[
        {
            "vector": query_vec["vector"],
            # list of vector fields against which to run the query
            "fields": ["description_default_vector_"],
        }
    ],
    
    # text fields against which to match the query
    fields=["description"],
    
    # query text
    text=query,
    
    # number of returned results
    page_size=5,
)

This search is relatively quick and provides both word matching and search in context. Use it when vector search matters more than word matching. Like vector search, semantic search relies on machine learning techniques for vectorizing and similarity detection, so a vectorizer is required. Multiple models can be used for vectorizing and combined in a single search (i.e., through multivector_query in the request body).
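As a rough sketch of combining several models, the helper below builds a multivector_query list with one entry per vector field, in the same shape as the sample code above. The helper name build_multivector_query, the second field name, and the toy vectors are hypothetical. It assumes each model's vectors are stored in their own vector field of the dataset.

```python
def build_multivector_query(vectors_by_field):
    """Build a multivector_query list from a {vector_field: query_vector} mapping."""
    return [
        {"vector": vector, "fields": [field]}
        for field, vector in vectors_by_field.items()
    ]


# Toy vectors standing in for the output of two different encoders.
multivector_query = build_multivector_query(
    {
        "description_default_vector_": [0.1, 0.2, 0.3],
        # hypothetical field holding vectors from a second model
        "description_other_model_vector_": [0.4, 0.5],
    }
)
print(multivector_query)
```

The resulting list could then be passed as the multivector_query argument of the search call, with real encoder output in place of the toy vectors.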

