One of the most critical and challenging parts of improving a company is understanding client feedback. It becomes far more challenging as a company scales: thousands of pieces of feedback on a variety of aspects, positive and negative comments, different categories, different requests, and so on.
In this article, we use Relevance AI’s platform to analyze 40,000 Steam customer reviews of the recently released video game Elden Ring.
On platforms such as Steam, people can purchase video games as well as write reviews based on their experience playing the game. Analyzing such reviews gives us a better understanding of the user experience and helps us identify both the sources of complaints and the game features that players enjoy.
The data scraped from Steam normally contains:
- the content of the review (text format)
- the number of replies/comments for each review
- the number of upvotes from other users.
Relevance AI’s Explorer app is a great tool for data visualization and analysis. It offers a variety of features, such as:
- Various embedded analysis tools including different metrics and aggregations that can be set on the whole data or a selected subset
- Various visualization tools including diagrams, tables, word clouds, time series
- Two data views: general and cluster view
To analyze our data in the Explorer app, we follow these steps:
- Prepare our data as a valid CSV/JSON file; have a look at Data FAQs for tips on data preparation.
- Upload our data to Relevance AI’s platform.
- Vectorize our data to benefit from AI and machine learning via Relevance AI’s no-code tool.
- Cluster our data (i.e. automatically group the data based on similarity) using the no-code Relevance clustering tool. Clustering helps reveal hidden patterns in the data; for instance, all reviews with similar content related to errors or unexpected behaviour in the video game will be grouped together.
- Set up the Explorer App.
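As an illustration of the first step, here is a minimal sketch of writing reviews to CSV with Python's standard library. The column names mirror the fields discussed later in this article, while the sample rows and the `to_csv` helper are invented for the example and are not part of Relevance AI's tooling.

```python
import csv
import io

# Hypothetical sample rows; real rows would come from the scraped Steam data.
reviews = [
    {"review": "Amazing game, 10/10", "votes_up": 3,
     "weighted_vote": 0.52, "comment_count": 0},
    {"review": "Constant stutter on PC, even on low settings", "votes_up": 41,
     "weighted_vote": 0.87, "comment_count": 5},
]

# Assumed column names, chosen to match the fields described in this article.
FIELDS = ["review", "votes_up", "weighted_vote", "comment_count"]


def to_csv(rows, fields=FIELDS):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)  # values containing commas are quoted automatically
    return buffer.getvalue()


csv_text = to_csv(reviews)
```

Letting the `csv` module handle quoting matters for review text, which routinely contains commas, quotes, and line breaks that would corrupt a hand-built file.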
Any data in valid CSV or JSON format can be uploaded to Relevance AI within a few clicks. After that, you can explore your data under Datasets as shown in the image below:
As can be seen, the main fields that we will be using for our analysis are:
- The Review field: the content of each review. This is text data that will be vectorized later.
- votes_up, weighted_vote, comment_count fields: numerical data that we can use as metrics. votes_up is the number of upvotes each review has received from other users on the platform, weighted_vote is a score given by Steam to measure the impact of each review, and comment_count is the number of comments in response to each review.
- label_review field: A list of keywords extracted from the review.
To benefit from the magic of AI and machine learning, we will vectorize the reviews (i.e. convert each review into a vector, where vectors represent conceptual relationships). This can be done with Vectorize under Workflows, in the menu on the left.
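To make the idea of vectorizing concrete, the sketch below turns each review into a word-count vector and compares vectors with cosine similarity. This is a deliberately simple stand-in and an assumption on our part: the platform's vectorizer is not described here, and learned embeddings capture meaning beyond shared words, which this toy version cannot do. The `embed` and `cosine` helpers and the sample sentences are invented for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy vectorizer: word counts, standing in for a learned text embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented example reviews: two about stuttering, one about boss fights.
stutter_1 = embed("game stutters on pc")
stutter_2 = embed("pc version stutters badly")
praise = embed("amazing boss fights")

similar = cosine(stutter_1, stutter_2)   # reviews sharing words score higher
unrelated = cosine(stutter_1, praise)    # reviews with no shared words score 0
```

Once reviews are vectors, "similar content" becomes a number that a clustering algorithm can work with.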
Once vectorizing is complete, we can cluster our vectorized data. This can be done by clicking on Cluster under the app on the menu on the left. There are two options:
- The KMeans algorithm, in which we decide on the number of clusters, or
- Community Detection, in which the algorithm finds the optimal number of clusters. In our case, after visually checking the data, we decided to group our data into 240 clusters. Note that we can run as many clustering experiments as we need on the platform.
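For readers curious about what KMeans does with the vectors, here is a minimal sketch of the standard algorithm in plain Python, run on six made-up 2D points with k=2 rather than on real review vectors with 240 clusters. It is not the platform's implementation. For the sake of a reproducible example it seeds the centroids with the first k points, which is a simplification; library implementations typically use smarter seeding such as k-means++.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on equal-length tuples of floats."""
    # Simplified seeding: use the first k points as the initial centroids.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Invented 2D "review vectors" forming two well-separated groups.
vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(vectors, k=2)
```

The key point for choosing between the two options is visible in the signature: KMeans needs `k` up front, whereas Community Detection determines the number of groups itself.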
The Explorer app is composed of four main sections:
- The Name, save, share bar on the top
- Search, sort and filters
- The overall view containing different Metrics and aggregation
- The Data and Cluster view
All these sections and their components are configurable, making data visualization, data analysis, and insight extraction much easier and more straightforward.
For instance, as shown in the image above, the overall view of the clusters provides a summary of all the clusters extracted from our data. In our experiment, it shows that the three largest clusters are gameplay, game, and stuttering.
Further down on the Explorer page, we can see “cluster cards”, where we can explore each individual cluster. Each cluster contains items with similar content, grouped together using vector-based technology. You can add a label to each cluster as well.
When metrics are set and aggregations are defined for each cluster, the following three sections are seen on the cluster cards:
- Metrics at the top with values calculated based on items in the cluster. As we can see from the image below, we have chosen to show the total number of comments and upvotes.
- Fields to view on the left. In this case, we chose to only show the content of the reviews, but we could add more fields to be visualized.
- The section on the right shows an aggregation based on our parameters. In this case, we chose the field label_review to be represented as a word cloud for the entire cluster.
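The three parts of a cluster card boil down to a group-by over the clustered items. The sketch below computes, per cluster, the size, summed metrics, and keyword frequencies (the counts a word cloud would be drawn from). It only illustrates the idea and is not how the Explorer app is implemented; the sample items, the cluster names, and the `sum_upvotes` name are assumptions, while `sum_comments` and `label_review` follow the names used in this article.

```python
from collections import Counter, defaultdict

# Hypothetical clustered reviews; cluster names and values are invented.
items = [
    {"cluster": "stutter", "votes_up": 41, "comment_count": 5,
     "label_review": ["stutter", "pc"]},
    {"cluster": "stutter", "votes_up": 12, "comment_count": 2,
     "label_review": ["stutter", "fps"]},
    {"cluster": "praise", "votes_up": 3, "comment_count": 0,
     "label_review": ["amazing"]},
]


def new_card():
    """Empty summary for one cluster."""
    return {"size": 0, "sum_upvotes": 0, "sum_comments": 0, "keywords": Counter()}


def summarize(items):
    """Per cluster: size, summed metrics, and keyword frequencies."""
    summary = defaultdict(new_card)
    for item in items:
        card = summary[item["cluster"]]
        card["size"] += 1
        card["sum_upvotes"] += item["votes_up"]
        card["sum_comments"] += item["comment_count"]
        card["keywords"].update(item["label_review"])  # word-cloud counts
    return dict(summary)


cards = summarize(items)
```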
The platform offers different visualizations to suit individual preferences.
One of the features of Relevance AI’s Explorer app is the ability to see a summary of all the clusters and their metrics. This provides an easy-to-access overview and serves as an initial analysis tool. Cluster comparison is accessible via the “View cluster comparison” link, which opens a table view of the clusters as shown in the image below.
Now that all our data is grouped into clusters, one way to extract meaningful insights is to sort the clusters by different metrics and study both ends of the ranking (e.g. the clusters with the highest and the lowest average number of upvotes). Playing with different metrics and sort directions helps us see the data from completely different perspectives.
For example, in our experiment, when sorting the data by cluster size in descending order, we can immediately see the content of the biggest clusters, which shows the most common trends in the reviews.
Sorting can also be done based on the defined metrics. The following example shows two of the clusters with the fewest comments based on the sum_comments metric:
We can see that the reviews receiving the fewest reactions are numerical ratings and reviews that praised the game.
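The same kind of sorting is easy to reproduce on exported cluster summaries. The sketch below orders clusters by any metric in either direction; the cluster labels and numbers are invented for illustration and are not the results of our experiment.

```python
# Hypothetical per-cluster summaries; labels and numbers are invented.
clusters = [
    {"label": "performance", "size": 120, "sum_comments": 85},
    {"label": "10/10", "size": 300, "sum_comments": 2},
    {"label": "amazing", "size": 450, "sum_comments": 9},
]


def sort_clusters(clusters, metric, descending=True):
    """Return a new list of clusters ordered by the given metric."""
    return sorted(clusters, key=lambda c: c[metric], reverse=descending)


# Biggest clusters first: the most common themes.
largest_first = sort_clusters(clusters, "size")

# Fewest comments first: the themes that drew the least discussion.
least_discussed = sort_clusters(clusters, "sum_comments", descending=False)
```

Looking at both ends of each ordering is what surfaces the contrast discussed here: the largest clusters are not necessarily the ones that attract the most engagement.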
After looking at the data from different perspectives, we can summarize the following insights:
- The most common reviews about Elden Ring described the game as being “amazing”
- The most common reviews that failed to draw any user attention (no upvotes) stated that the game deserved 10/10 and that it was “great”
- The reviews that drew more user interest were about “Performance” and “stutter”. This shows that users spend more time engaging with reviews that report bugs and describe a problem they are facing themselves.
- Reviews that were mostly ignored were about jokes and fun details about the game (fingers, goats, animals…)