Clustering groups items so that those in the same group (cluster) share meaningful similarities, such as specific features or properties. By identifying patterns in the data, clustering gives the data meaning and supports informed decision-making.
Why can clustering data be beneficial?
Because items in the same cluster share meaningful similarities, clustering is a useful tool for uncovering hidden patterns in the data.
This clustering workflow combines different methods of clustering to group entries in a dataset.
Note: The accuracy of this technique is highly dependent on the data.
Follow the steps in the setup wizard:
- Select the text to be used for clustering
- Select the vector field to be used for clustering
- Select your desired method for clustering (K-means, DBSCAN, or both combined)
Note: K-means clustering is more efficient for large datasets. DBSCAN clustering cannot efficiently handle high-dimensional datasets, since it discards items that are far from the rest of the data points (it treats them as noise).
- Specify the minimum and maximum number of clusters
- Type in a name for the new field that is added to your dataset to store the results
- Execute the workflow
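To make the minimum and maximum cluster settings above more concrete, here is a minimal sketch of one way a tool could pick a cluster count within that range. It is not the workflow's actual implementation: the silhouette score as the selection criterion, the farthest-point seeding, and the names `kmeans`, `silhouette`, and `choose_clusters` are all assumptions made for illustration.

```python
import math


def kmeans(points, k, max_iters=50):
    """Plain k-means; returns one cluster index per point."""
    # Deterministic farthest-point seeding (an assumption, chosen so the
    # sketch is reproducible; real tools often seed randomly).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette score; higher means tighter, better separated clusters."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0  # sentinel: the score is undefined for a single cluster
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            ds = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if ds:
                mean_dist[c] = sum(ds) / len(ds)
        if labels[i] not in mean_dist:
            continue  # a point alone in its cluster scores 0
        a = mean_dist[labels[i]]
        b = min(d for c, d in mean_dist.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_clusters(vectors, min_k, max_k):
    """Try every k in [min_k, max_k] and keep the best-scoring labelling."""
    best = None
    for k in range(min_k, max_k + 1):
        labels = kmeans(vectors, k)
        score = silhouette(vectors, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best[1], best[2]


# Toy 2-D "vector field" with three well-separated groups (made-up data).
vectors = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (5.0, 5.0), (5.1, 5.2), (5.2, 4.9),
           (10.0, 0.0), (10.1, 0.2), (9.9, 0.1)]
best_k, labels = choose_clusters(vectors, min_k=2, max_k=5)
print(best_k, labels)
```

On this toy data the three groups are far apart, so the sketch should settle on three clusters. Real vector fields are much higher-dimensional and rarely this clean, which is one reason the results depend so heavily on the data.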
When the workflow finishes, go to Datasets to see the resulting field added to your dataset.
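As a rough illustration of what the resulting field might look like, the sketch below attaches one cluster label per entry under a user-chosen field name. The document structure, the field name `my_cluster_field`, the label values, and the helper `add_cluster_field` are hypothetical; the actual format stored by the workflow may differ.

```python
from collections import Counter


def add_cluster_field(documents, labels, field_name):
    """Return copies of the documents with the cluster label stored under field_name."""
    if len(documents) != len(labels):
        raise ValueError("need exactly one label per document")
    return [{**doc, field_name: label} for doc, label in zip(documents, labels)]


# Made-up entries and labels, standing in for a dataset and a clustering result.
documents = [
    {"_id": "a", "text": "refund not received"},
    {"_id": "b", "text": "where is my refund"},
    {"_id": "c", "text": "app crashes on login"},
]
labels = [0, 0, 1]

clustered = add_cluster_field(documents, labels, "my_cluster_field")

# Count how many entries landed in each cluster.
cluster_sizes = Counter(doc["my_cluster_field"] for doc in clustered)
print(clustered)
print(cluster_sizes)
```

Counting entries per cluster, as in the last step, is a quick sanity check on the new field: one very large cluster next to many tiny ones can be a sign that the chosen method or cluster range does not suit the data.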