“Data-mining”, “drilling down”, “insight extraction”. You might think this sounds an awful lot like hard work, and in reality it is! Getting insights from a customer data set requires a technically skilled analyst with a lot of patience and a lot of time. The similarities to the lone miner searching for black gold are striking (geddit?). The tools may be different (Excel rather than a pickaxe) but the realities are the same: it is an inefficient way of finding opportunities.

In the 21st century, seismic survey technology helps us to find oil, and now we must look to machine learning to help us find the gold buried in our data. What is truly exciting is that this technology exists today: let me explain how Qubit ML can help uncover the true value in your data.

The aim of opportunity mining is to identify underperforming segments of your users and then value the size of each opportunity, so that you can prioritize where to focus your personalization efforts. With this aim in mind, we can break the job of ‘opportunity mining’ into three nice, neat tasks:

  1. Create a segment of users
  2. Establish whether this group of users is underperforming
  3. If it is deemed to be underperforming, put a value on this opportunity

As you can imagine, getting a human to do this across all the many possible combinations of visitor attributes and behaviors would be an arduous task, to say the least!

Fortunately, this whole process can be automated and made statistically robust with Qubit ML. Now there are some obvious questions to address.

How do we create segments of users?

We traverse a tree of all possible combinations of visitor attributes used in the model and prune away groups that we deem too small, too big or too similar to other groups.
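To make the idea concrete, here is a minimal sketch of that kind of traversal. It is not Qubit ML's implementation: the function name `find_segments`, the thresholds `min_size`, `max_size` and `max_similarity`, and the use of Jaccard overlap as the measure of "too similar" are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Visitor = Dict[str, str]
# A segment is a tuple of (attribute, value) conditions,
# e.g. (("country", "ES"), ("device", "mobile")).
Segment = Tuple[Tuple[str, str], ...]


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Overlap between two groups of visitor indices (1.0 means identical)."""
    return len(a & b) / len(a | b)


def find_segments(
    visitors: List[Visitor],
    attributes: List[str],
    min_size: int,
    max_size: int,
    max_similarity: float,
) -> Dict[Segment, FrozenSet[int]]:
    """Depth-first walk over attribute combinations with pruning (hypothetical rules)."""
    accepted: Dict[Segment, FrozenSet[int]] = {}

    def visit(conditions: Segment, members: FrozenSet[int], start: int) -> None:
        for i in range(start, len(attributes)):
            attr = attributes[i]
            for value in sorted({visitors[m][attr] for m in members}):
                subset = frozenset(m for m in members if visitors[m][attr] == value)
                if len(subset) < min_size:
                    # Too small: adding conditions only shrinks the group,
                    # so the whole branch can be pruned.
                    continue
                segment = conditions + ((attr, value),)
                too_big = len(subset) > max_size
                too_similar = any(
                    jaccard(subset, other) > max_similarity
                    for other in accepted.values()
                )
                if not too_big and not too_similar:
                    accepted[segment] = subset
                # A group that is too big may still have useful children.
                visit(segment, subset, i + 1)

    visit((), frozenset(range(len(visitors))), 0)
    return accepted


# Illustrative data only.
visitors = [
    {"country": "ES", "device": "mobile"},
    {"country": "ES", "device": "mobile"},
    {"country": "ES", "device": "desktop"},
    {"country": "UK", "device": "mobile"},
    {"country": "UK", "device": "desktop"},
    {"country": "UK", "device": "desktop"},
]
segments = find_segments(
    visitors, ["country", "device"], min_size=2, max_size=4, max_similarity=0.8
)
for conditions, members in segments.items():
    print(conditions, sorted(members))
```

Pruning small groups early is what keeps the tree manageable: once a group falls below the minimum size, none of its descendants need to be examined.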

How do we establish whether they are underperforming?

We use revenue per visitor (RPV) as our measure of performance. It is a nice metric because it combines conversion rate and revenue per converter into a single number that can be easily compared across segments of users. On the flip side, it has ugly distributional properties (most visitors spend nothing and a few spend a great deal), which make statistical tests more difficult to perform. Luckily these technical problems can be overcome, and we can use the data we have collected to compare the expected distribution of RPV with what we have actually observed for the segment we have created. If there is a difference between our expectation and the reality, and crucially if this difference passes a stringent statistical test for significance, then and only then are we comfortable labelling this as an ‘opportunity’.
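The exact test Qubit ML uses is not described here. As one illustration of how a skewed metric like RPV can be tested without assuming a tidy distribution, the sketch below uses a simple resampling approach: it takes as the "expected" RPV the RPV of random groups of the same size drawn from all visitors, and asks how often such a group does as badly as the segment. The function name, the resample count and that definition of "expected" are all assumptions.

```python
import random
from typing import Sequence


def underperformance_p_value(
    all_revenue: Sequence[float],
    segment_revenue: Sequence[float],
    n_resamples: int = 2000,
    seed: int = 0,
) -> float:
    """One-sided resampling p-value for 'this segment's RPV is unusually low'.

    all_revenue holds one revenue figure per visitor (0.0 for non-buyers);
    segment_revenue holds the figures for the visitors in the segment.
    """
    rng = random.Random(seed)
    size = len(segment_revenue)
    observed_rpv = sum(segment_revenue) / size

    at_or_below = 0
    for _ in range(n_resamples):
        # A random group of the same size stands in for the expected RPV.
        sample = rng.sample(list(all_revenue), size)
        if sum(sample) / size <= observed_rpv:
            at_or_below += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_or_below + 1) / (n_resamples + 1)


# Illustrative data only: half of all visitors spend 100, half spend nothing.
all_revenue = [0.0] * 50 + [100.0] * 50
weak_segment = [0.0] * 20                   # nobody in this segment buys
typical_segment = [0.0] * 10 + [100.0] * 10

p_weak = underperformance_p_value(all_revenue, weak_segment)
p_typical = underperformance_p_value(all_revenue, typical_segment)
print(p_weak, p_typical)
```

Because many segments are tested at once, a stringent threshold (far below the conventional 0.05) is the sensible choice for deciding which ones count as opportunities.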

How do we value this underperformance?

This is a bit of a trick question. As we’ve used revenue per visitor as our metric for performance, the valuation of the opportunity for the segment of users is simply the difference between the expected and observed revenue per visitor, multiplied by the number of users in the segment.
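As a quick worked example of that arithmetic, with made-up numbers and a hypothetical function name:

```python
def opportunity_value(expected_rpv: float, observed_rpv: float, n_users: int) -> float:
    """Gap between expected and observed RPV, scaled by the size of the segment."""
    return (expected_rpv - observed_rpv) * n_users


# Illustrative figures only: a segment of 10,000 users expected to generate
# 2.50 per visitor but actually generating 2.00 per visitor.
value = opportunity_value(expected_rpv=2.50, observed_rpv=2.00, n_users=10_000)
print(value)  # 5000.0
```

Because every opportunity is expressed in the same currency, opportunities from very different segments can be ranked against each other directly.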

It is important to note that although this is a very useful tool for identifying and prioritizing potential opportunities, not all of the opportunities found may be easy to fix. Qubit ML may well find that your business has an opportunity to improve the performance of your Spanish users; however, it wouldn’t know that your business’s name might be lost in translation. Complex context and domain knowledge are often required to understand the cause of the underperformance. Where the cause is not immediately apparent, we can call on our technically skilled analysts to spend their time investigating opportunities rather than searching for them.

In short, the fun bit of data analysis is understanding and influencing customers; let’s leave the machines to do the boring bits.


If you would like to discuss how Qubit ML can help your business, email us at info@qubit.com.






Will Browne
