Multi-vector search with multiple fields

Guide to using multi-vector search with multiple fields

Multi-vector search covers two cases:

  • Vector search with multiple models (e.g. searching with a text vectorizer and an image vectorizer)
  • Vector search across multiple vector fields (e.g. searching across title and description with different weightings)

Both of these can be combined to offer a powerful, flexible search.

📘

Multi-vector search allows us to combine multiple vectors and vector spaces!

Multi-vector search offers a more powerful and more flexible search by combining several vectors across different fields and vectorizers, allowing us to experiment with more combinations of models and configurations.

Multi-vector search with multiple models

In this section, we present a step-by-step guide on how to search with three sets of vectors:

  1. produced by a model trained on pure text data in English (called default in this guide)
  2. produced by a model trained on pure text data from a multi-language dataset (called textmulti in this guide)
  3. produced by a model trained on combined text and image data (called imagetext in this guide).

Step 1. Vectorizing the dataset

Searching with vectors of type X is possible only if the dataset includes data vectorized by model X. So if we want to search against fields such as title and description, we need to vectorize them with each model we intend to search with (default, textmulti, and imagetext in our example). For a full guide on how to create and upload a dataset and how to use vectorizers to update a dataset with vectors, see How to vectorize.
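Before searching, it can help to confirm that every document actually carries the vector fields a given model requires. The snippet below is a minimal sketch in plain Python, not part of the RelevanceAI SDK: the helper missing_vector_fields and the field names (title_default_vector_, title_textmulti_vector_) are hypothetical and chosen only for illustration.

```python
# Illustrative only: the helper and field names are hypothetical,
# not taken from the RelevanceAI SDK or a real dataset.
def missing_vector_fields(documents, required_fields):
    """Return {document _id: [vector fields that are absent or empty]}."""
    missing = {}
    for doc in documents:
        absent = [field for field in required_fields if not doc.get(field)]
        if absent:
            missing[doc["_id"]] = absent
    return missing


docs = [
    {
        "_id": "a",
        "title_default_vector_": [0.1, 0.2],
        "title_textmulti_vector_": [0.3, 0.4],
    },
    # This document was never vectorized with the multi-language model.
    {"_id": "b", "title_default_vector_": [0.5, 0.6]},
]
required = ["title_default_vector_", "title_textmulti_vector_"]

print(missing_vector_fields(docs, required))
```

Any document reported here would need to be vectorized with the missing model before it can be matched by a query from that model.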

Step 2. Vectorizing the query

To search against vectors of type X, the query must be vectorized by the same model X. Sample code showing how to use the vectorizer endpoints is provided below. Keep in mind that RelevanceAI must first be installed and a client object instantiated:

pip install RelevanceAI
from relevanceai import Client 

"""
Running this cell will provide you with 
the link to sign up/login page where you can find your credentials.
Once you have signed up, click on the value under `Authorization token` 
in the API tab
and paste it in the Auth token box that appears below
"""

client = Client()

Then vectorize your text query:

query = "white sneakers"  # query text

# a text vectorizer
query_vec_txt = client.services.encoders.text(text=query)

Step 3. Vector search

Relevance AI provides a variety of vector search endpoints for different use cases; please see guide pages such as Better text Search for more information on each search endpoint.

3.2. Vector search against multiple fields

Another example of multi-vector search in the Relevance AI platform is searching across multiple vector fields, optionally giving each field a different importance through weighting. In the example below, we look for white sneakers in the name and description vector fields. The more important field is given the larger weight under the fields argument.

from relevanceai import Client

client = Client()

query = "white sneakers"  # query text

# a text vectorizer
query_vec_txt = client.services.encoders.text(text=query)

dataset_id = 'ecommerce-sample-dataset'

vector_search = client.services.search.vector(
    # dataset name
    dataset_id=dataset_id,
    # vector fields to search against, with their weights
    multivector_query=[
        {
            "vector": query_vec_txt["vector"],
            "fields": {"name_vector_": 0.6, "description_vector_": 0.3},
            "alias": "text",
        },
    ],
    # number of returned results
    page_size=5,
)
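To build intuition for what the weights under fields do, here is a minimal local sketch. It assumes the per-field scores are cosine similarities combined as a weighted sum; that is an assumption made for illustration, not a description of how Relevance AI computes its scores. The toy vectors and the helper names cosine and weighted_score are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_score(query_vector, document, field_weights):
    """Weighted sum of per-field similarities (assumed scoring, for illustration)."""
    return sum(
        weight * cosine(query_vector, document[field])
        for field, weight in field_weights.items()
        if field in document
    )


query_vector = [1.0, 0.0]  # toy stand-in for the vectorized query
field_weights = {"name_vector_": 0.6, "description_vector_": 0.3}

documents = [
    {"_id": "name-match", "name_vector_": [1.0, 0.0], "description_vector_": [0.0, 1.0]},
    {"_id": "description-match", "name_vector_": [0.0, 1.0], "description_vector_": [1.0, 0.0]},
]

ranked = sorted(
    documents,
    key=lambda doc: weighted_score(query_vector, doc, field_weights),
    reverse=True,
)
for doc in ranked:
    print(doc["_id"], round(weighted_score(query_vector, doc, field_weights), 3))
```

Under this assumed scoring, a document that matches the query only in its name outranks one that matches only in its description, because name_vector_ carries the larger weight.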

Note: Another example of multi-vector search in the Relevance AI platform is how endpoints such as advanced_multistep_chunk combine normal and chunked vectors. For instance, consider a dataset of long item descriptions, each about 8 sentences on average. Only one-tenth of the entries are about footwear, and within those, half of the sentences are about material, with only a few on leather. Unrelated entries can first be filtered out via vector search on the whole description. Then, if each description is broken into 8 chunks (one per sentence), a more fine-grained search for leather shoes is possible using advanced_multistep_chunk search.
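The two-step idea in the note can be sketched locally as follows. This is not the implementation of advanced_multistep_chunk; it is a simplified illustration that assumes cosine similarity, a shortlist built from whole-description vectors, and a second pass over per-sentence chunk vectors. The toy data, the chunks layout, and the function two_stage_chunk_search are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def two_stage_chunk_search(query_vector, documents, top_docs=2):
    """Shortlist by whole-description vector, then rank by best-matching chunk."""
    # Stage 1: keep the documents whose description vector is closest to the query.
    shortlisted = sorted(
        documents,
        key=lambda doc: cosine(query_vector, doc["description_vector_"]),
        reverse=True,
    )[:top_docs]
    # Stage 2: score each shortlisted document by its single best chunk.
    results = []
    for doc in shortlisted:
        best = max(doc["chunks"], key=lambda c: cosine(query_vector, c["vector"]))
        results.append((doc["_id"], best["text"], cosine(query_vector, best["vector"])))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Toy 2-d space: axis 0 roughly "footwear", axis 1 roughly "leather".
query_vector = [1.0, 1.0]  # stand-in for a "leather shoes" query
documents = [
    {"_id": "boots", "description_vector_": [0.9, 0.3], "chunks": [
        {"text": "Made of full-grain leather.", "vector": [0.7, 0.7]},
        {"text": "Ships in two days.", "vector": [0.1, 0.0]},
    ]},
    {"_id": "sneakers", "description_vector_": [0.9, 0.1], "chunks": [
        {"text": "Canvas upper.", "vector": [0.9, 0.1]},
    ]},
    {"_id": "sofa", "description_vector_": [0.0, 0.8], "chunks": [
        {"text": "Leather sofa.", "vector": [0.0, 1.0]},
    ]},
]

results = two_stage_chunk_search(query_vector, documents)
for doc_id, text, score in results:
    print(doc_id, text, round(score, 3))
```

In this toy data the sofa is dropped in the first stage, and the boots rank first in the second stage because one of their sentences matches the query closely even though the description as a whole is a weaker match.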

Filtering based on specific features is a great option to perform a more refined search. We will explain Relevance AI's available filtering options on the following page.

