Let’s say you have some data on a company that shows its revenue and marketing spend. You might want to see if there’s a relationship between marketing spend and revenue for specific segments, such as individual stores. In other words, “which stores are worth putting marketing dollars towards?”
Tableau (Revenue and Marketing Spend)
I grabbed some data sets over at SuperDataScience.com (incidentally, they have some amazing courses on data science there, and I learned this topic from one of them).
In the chart to the left, I’ve plotted the SUM of Marketing Spend vs. the SUM of Revenue.
Now, this only produces one measly datapoint. We’ll fix that.
Digging out the Details
Dragging a field like “city” or “store id” from the dataset onto “Detail” in the Marks card will produce a breakdown of revenue vs. marketing spend, with one data point per city or store.
It’s very clear that some stores (or cities, if you used cities) are not responsive to marketing spend. In other words, spending more money on marketing in those cities/stores will not increase revenue.
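To make the breakdown concrete, here is a minimal Python sketch of the aggregation that happens when a field is placed on the detail level: the single grand total is split into one (marketing spend, revenue) point per store. The rows, the store ids, and the `totals_by_store` helper are all made up for illustration; they are not taken from the SuperDataScience data set.

```python
from collections import defaultdict

# Hypothetical rows: (store_id, marketing_spend, revenue). Values are invented.
rows = [
    ("S1", 100.0, 900.0),
    ("S1", 50.0, 600.0),
    ("S2", 200.0, 300.0),
    ("S2", 100.0, 100.0),
    ("S3", 80.0, 700.0),
]

def totals_by_store(rows):
    """Sum marketing spend and revenue per store, giving one point per store."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for store_id, spend, revenue in rows:
        totals[store_id][0] += spend
        totals[store_id][1] += revenue
    return {store: tuple(values) for store, values in totals.items()}

points = totals_by_store(rows)
for store, (spend, revenue) in sorted(points.items()):
    print(store, spend, revenue)
```

Each entry in `points` corresponds to one mark on the scatter plot.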
From Wikipedia, the definition of k-means clustering is:
k–means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster – Wikipedia
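For readers who want to see that definition in action, below is a rough sketch of the standard k-means loop (often called Lloyd’s algorithm) in plain Python. It is only an illustration of the idea in the quote, not Tableau’s actual implementation; the `kmeans` function, the starting centroids, and the sample points are assumptions made up for this example.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its points; repeat until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep a centroid that lost all points
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Made-up (marketing spend, revenue) points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Here k is 2 because two starting centroids are passed in. The result of k-means depends on the starting centroids, so different starting points can produce different groupings.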
By clustering, we will identify groups of stores for further analysis. Here is how to do it in Tableau:
On the left side of the Tableau worksheet is a tab called “Analytics.” Clicking it reveals various statistical analysis elements we can use. In the Model sub-category, there is one called “Cluster.” This will use the k-means algorithm to create cluster groups.
Simply drag “Cluster” onto your figure and it will be applied.
Once applied, the figure will update, and clusters will be formed using the k-means algorithm described above.
Below is an example of the result of applying clustering to this sample data of Marketing Spend vs. Revenue:
The clustering has automatically color-coded the clusters (orange and blue). We also get a popup when “Cluster” is dragged onto the figure. The popup tells us which attributes are used as variables in the calculation (Sum of Marketing Spend and Sum of Revenue).
If you want a deeper analysis (such as the statistical details), right-click the “Clusters” model in the Marks card.
From there, choose “Describe Clusters.” A popup will appear (as seen below) with the relevant information, which can be copied to the clipboard for adding into reports, etc.
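As a rough idea of the kind of per-cluster summary such a description contains, the sketch below computes the number of items and the center (mean marketing spend, mean revenue) of each cluster in plain Python. The `describe_clusters` helper, the points, and the labels are hypothetical, and this does not reproduce the exact contents of Tableau’s dialog.

```python
def describe_clusters(points, labels):
    """Return {label: (count, center)}, where center is the per-axis mean."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: (
            len(members),
            tuple(sum(axis) / len(members) for axis in zip(*members)),
        )
        for label, members in groups.items()
    }

# Invented (marketing spend, revenue) points with already-assigned clusters.
points = [(100.0, 900.0), (120.0, 1100.0), (300.0, 400.0), (500.0, 420.0)]
labels = [0, 0, 1, 1]
summary = describe_clusters(points, labels)
for label, (count, center) in sorted(summary.items()):
    print(label, count, center)
```

Comparing the centers is a quick way to tell which cluster gets more revenue for its marketing spend.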
Tightening Things Up
I modified the colors of the groups a bit (green for those stores/cities where marketing dollars directly boost revenue and red for those stores/cities where they don’t increase revenue at all).
Another change to the final figure was adding a Highlighter. In the example below, I added a highlighter based on State data. Now, if you select a State like Georgia in the highlighter dropdown, it fades out all other data. This allows a deeper dive into the state-level detail of which cities/stores are profitable and which aren’t.