Data FAQs

Does my team need specific skills to upload data?

You do not need any specific skills, nor do you need to be a programmer, to interact with Relevance AI. Just prepare your data in a valid CSV or JSON format and you can use our no-code platform.
Relevance allows you to upload data in two ways:

  • Upload via the platform (no coding required)
  • Upload via the Python SDK (for programmers)
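Either path starts with data in a valid CSV or JSON format. A minimal sketch of preparing both formats with Python's standard library (the file and field names here are illustrative, not part of Relevance AI):

```python
import csv
import json

# Illustrative records; the field names are made up for this example.
records = [
    {"name": "mug", "description": "a large cup with a handle", "price": 4.99},
    {"name": "plate", "description": "a flat dish for serving food", "price": 2.50},
]

# Write a valid JSON file (a list of objects).
with open("products.json", "w") as f:
    json.dump(records, f, indent=2)

# Write the same records as a valid CSV file.
with open("products.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "description", "price"])
    writer.writeheader()
    writer.writerows(records)
```

Either file can then be uploaded through the platform, or the records can be sent via the Python SDK.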

What size data can Relevance AI handle?

Theoretically, there is no limit to the amount of data that Relevance AI can handle. Your data is stored in the cloud, which scales with demand, so datasets can reach terabytes of stored data.

What types of data can Relevance AI process?

Relevance allows you to work with many data types, both structured and unstructured. Common data types are:

  • textual: any string value, such as names, descriptions, reviews, or email addresses
  • image: Relevance AI needs a URL to access your image data
  • numerical: values composed of digits, such as prices, rankings, and phone numbers
  • date: dates in “yyyy-mm-dd” format can be processed by Relevance AI
  • arrays: arrays are lists of items, such as lists of strings (e.g. lists of keywords) or vectors (i.e. lists of numbers)
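A single record combining each of these field types might look like this (the field names and values are illustrative only):

```python
import json

# One illustrative record covering the common field types listed above.
record = {
    "name": "espresso mug",                      # textual
    "image_url": "https://example.com/mug.png",  # image: a URL Relevance AI can access
    "price": 12.5,                               # numerical
    "added": "2023-01-15",                       # date in yyyy-mm-dd format
    "keywords": ["cup", "coffee", "ceramic"],    # array of strings
    "name_vector_": [1.2, -0.23, -0.32, 0.321],  # array of numbers (a vector)
}

print(json.dumps(record, indent=2))
```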

What are vector fields?

Vector fields are representations of data in another format: lists of numbers, or vectors to be precise. For instance, if your dataset includes a field/column named "description" containing item descriptions in text format, vectorizing each description value produces a corresponding vector. These vectors can be saved in the dataset under a vector field (i.e. Description_vector_ in the table below).

| description | Description_vector_ |
| --- | --- |
| a large cup, typically cylindrical with a handle without a saucer | [1.2, -0.23, -0.32, 0.321, ...] |
| a flat dish, typically circular from which food is eaten or served | [-0.12, 0.223, 0.832, 1.451, ...] |
| a cylindrical container, typically of metal, used for cooking | [-0.162, -1.13, -0.752, -0.911, ...] |
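The mapping from a description to its vector is produced by an embedding model. The toy function below stands in for a real model purely to show the shape of the transformation; the numbers it produces carry no semantic meaning:

```python
import hashlib

def toy_embed(text: str, dims: int = 4) -> list:
    """Stand-in for a real embedding model: deterministically maps text
    to a fixed-length list of floats in [-1, 1]. Illustration only."""
    digest = hashlib.sha256(text.encode()).digest()
    return [round(b / 255 * 2 - 1, 3) for b in digest[:dims]]

descriptions = [
    "a large cup, typically cylindrical with a handle without a saucer",
    "a flat dish, typically circular from which food is eaten or served",
]

# Store each vector alongside its source field, e.g. under "description_vector_".
docs = [{"description": d, "description_vector_": toy_embed(d)} for d in descriptions]
```

In practice the vector field is filled in by whichever vectorizer you run over the text field; only the storage pattern shown here (original field plus `_vector_` field) is the point.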

Can I add to an existing dataset?

Yes, you can upsert to an existing dataset via the Dataset section on your dashboard, or use the dataset upsert function if you prefer coding. Note that you can add new fields, but existing field types and formats must remain consistent.
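Because field types must stay consistent when you upsert, it can help to validate new records against the existing data before uploading. A minimal sketch in plain Python (this is not a Relevance AI API, just the idea):

```python
def infer_schema(records):
    """Map each field name to the Python type of its first observed value."""
    schema = {}
    for rec in records:
        for field, value in rec.items():
            schema.setdefault(field, type(value))
    return schema

def check_consistent(existing, new_records):
    """Return fields in new_records whose type conflicts with existing data.
    New fields are allowed; changed types are not."""
    schema = infer_schema(existing)
    conflicts = []
    for rec in new_records:
        for field, value in rec.items():
            if field in schema and not isinstance(value, schema[field]):
                conflicts.append(field)
    return conflicts

existing = [{"name": "mug", "price": 4.99}]
new = [{"name": "plate", "price": "2.50", "color": "white"}]  # price became a string
print(check_consistent(existing, new))  # → ['price']
```

Here `color` is accepted as a new field, while `price` is flagged because its type changed from float to string.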

Which preprocessing techniques are recommended when working on text?

When working with textual data, it is recommended (though not required) to apply certain preprocessing steps that can improve the analysis results. Common text preprocessing steps are:

  • Stop-word removal: removing frequent but unimportant words (e.g. the, there)
  • Stemming: replacing words with their word stem (e.g. changes or changing become chang-)
  • Lemmatization: replacing words with their common root (e.g. changes or changing become change)
  • Lowercasing: converting all characters to their lowercase form
  • Text cleaning: this step is entirely data specific; common examples are HTML tag, URL, or hashtag removal
  • Splitting into shorter pieces of text: when analyzing text automatically, processing smaller pieces (e.g. a sentence rather than a paragraph) often produces more precise results
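Several of these steps can be sketched with the standard library alone. The stop-word list below is a tiny illustrative sample; real pipelines use larger lists and dedicated NLP libraries for stemming and lemmatization:

```python
import re

STOP_WORDS = {"the", "there", "a", "an", "is", "of"}  # tiny illustrative sample

def preprocess(text: str) -> list:
    text = text.lower()                                # lowercasing
    text = re.sub(r"<[^>]+>", " ", text)               # text cleaning: strip HTML tags
    text = re.sub(r"https?://\S+", " ", text)          # text cleaning: strip URLs
    text = re.sub(r"#\w+", " ", text)                  # text cleaning: strip hashtags
    tokens = re.findall(r"[a-z]+", text)               # simple tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal

print(preprocess("The <b>Mug</b> is a #bestseller, see https://example.com there!"))
# → ['mug', 'see']
```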

Can datasets be joined in Relevance AI?

Relevance provides an endpoint for combining multiple datasets.
The endpoint takes as input the datasets hosted on Relevance AI that you wish to combine, along with the fields you wish to conserve from each. You can also rename fields in the process, and you need to choose a name for the new dataset.
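In plain Python terms, the combine operation looks roughly like this: keep the chosen fields from every record of every input dataset, optionally rename them, and collect the results into a new dataset. This is a local sketch of the idea, not the Relevance AI endpoint itself:

```python
def combine_datasets(datasets, keep_fields, renames=None):
    """datasets: list of datasets, each a list of record dicts.
    keep_fields: field names to conserve from every record.
    renames: optional {old_name: new_name} mapping applied on the way in."""
    renames = renames or {}
    combined = []
    for records in datasets:
        for rec in records:
            kept = {f: rec[f] for f in keep_fields if f in rec}
            combined.append({renames.get(f, f): v for f, v in kept.items()})
    return combined

products = [{"name": "mug", "price": 4.99, "sku": "A1"}]
reviews = [{"name": "mug", "rating": 5, "text": "great"}]
new_dataset = combine_datasets(
    [products, reviews],
    keep_fields=["name", "price", "rating"],
    renames={"name": "product_name"},
)
```

Fields not listed in `keep_fields` (here `sku` and `text`) are dropped, and `name` is renamed to `product_name` in the combined result.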

What is the minimum number of records that we need to generate output?

The workflows you can use to generate results work even on a single sample; however, we recommend a minimum of 200 samples to generate an insightful output.

What is the biggest volume we have done?

At the moment, we serve 3M+ end-users and 100M+ API calls per week, with our largest dataset consisting of 2.5M samples.

Do I need to rerun the analysis every time I add a new data source?

When new entries are added to your dataset or existing data is edited in fields that have been used for any analysis, you will need to rerun the workflows, so that the new items or the updates are included in the analysis.

How much time does it take to upload a dataset?

It depends on the size of the dataset and can vary from a few seconds to several minutes.
