Smartly optimizing paperback folders through ML

About ColliShop

ColliShop is the webshop of Colruyt. It offers more than 20,000 non-food products that customers can search and buy online and have delivered at home or to their closest Colruyt shop. ColliShop started in 1984 with an offline catalogue of non-food products. Over time, its assortment grew to include garden furniture, electronics, cooking, bedroom, travel and much more.

The challenge: high cost & non-ecological communication of paperback folders

Every month, ColliShop, the non-food e-commerce arm of Colruyt Group, sends out paperback folders to its customers. At an average of 200k pieces a month, this is a very cost-intensive and non-ecological communication channel.

The objective: Smartly optimizing paperback folders

In a push to optimize the number of folders sent and to determine the target audiences efficiently, we decided to leverage machine learning: more specifically, a clustering algorithm that helps us divide customers into different target groups.

In this customer case study, we cover how customer clustering (leveraging machine learning) helped ColliShop optimize its customer communication through paperback folders. The following sections discuss what it is, how it works, and the actual results.

What is customer clustering

As you might expect, clustering algorithms divide a population into different groups or clusters. The goal of the algorithm is to build groups in which the customers are as homogeneous as possible, while keeping the diversity across groups at a maximum. The algorithm bases its decision on several customer metrics. These range from classic metrics such as recency, buying frequency and revenue, to more specific product features, such as the product price range a customer buys in. The algorithm can be fed with all metrics that you consider relevant for your business case.
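
As an illustration of the classic metrics mentioned above, the sketch below derives recency, frequency and revenue per customer from an order history. The order records, the field layout and the function name `rfm_features` are assumptions made for this example; they do not describe ColliShop's actual data.

```python
from datetime import date

# Hypothetical order history: (customer_id, order_date, order_value_in_euro).
orders = [
    ("c1", date(2019, 9, 1), 40.0),
    ("c1", date(2019, 10, 15), 60.0),
    ("c2", date(2019, 3, 20), 250.0),
]


def rfm_features(orders, today):
    """Return {customer_id: (recency_in_days, frequency, revenue)}."""
    features = {}
    for customer_id, order_date, value in orders:
        days_ago = (today - order_date).days
        # Start from this order's age if the customer has not been seen yet.
        recency, frequency, revenue = features.get(customer_id, (days_ago, 0, 0.0))
        features[customer_id] = (
            min(recency, days_ago),  # days since the most recent order
            frequency + 1,           # number of orders
            revenue + value,         # total amount spent
        )
    return features


features = rfm_features(orders, today=date(2019, 11, 1))
```

Each customer ends up as one row of numbers, which is the form a clustering algorithm expects as input.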

To initiate the clustering algorithm, you only have to define how many clusters you want to obtain. Then the clustering will start its learning process and come up with the ideal clusters for your business case. Afterwards, these clusters can be analyzed, interpreted and described. Finally, various personas can be identified from these clusters.
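
To make the learning process more concrete, here is a minimal sketch of k-means, a common centroid-based clustering algorithm. We assume k-means purely for illustration; the function names and the sample data are hypothetical, and in practice the features would typically be rescaled to comparable ranges first.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(values) / len(points) for values in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen customers
    labels = []
    for _ in range(iterations):
        # Assignment step: every customer goes to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # nothing changed, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
    return centroids, labels


# Hypothetical customers described by (age, share of school-supply purchases).
customers = [(25, 0.10), (27, 0.20), (30, 0.15), (65, 0.60), (68, 0.70), (70, 0.65)]
centroids, labels = kmeans(customers, k=2)
```

The only input besides the data is `k`, the number of clusters. Interpreting what each resulting cluster means remains a job for the business user.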

If you want to know more, you can continue reading about Customer Segmentation.

How is it done

To illustrate our approach to segmentation, consider the following example.

Imagine you only know the age of your customers. From your order history you also know what products they have bought previously. Each product falls into a specific category. Let's say that one of those main categories is school supplies (such as backpacks for children, pens, crayons, etc.). You could easily calculate, for every customer, the percentage of purchases that were school supply products. If we plotted those two data points on a scatterplot, it might look something like the image below. Every dot represents a different customer.

[Figure: scatterplot of customer age against the percentage of purchases that were school supplies; every dot represents a customer.]
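
The percentage of school supply purchases per customer can be computed directly from the order history. The sketch below assumes a simple list of purchase lines; the customer names, the categories and the function name `category_share` are made up for this example.

```python
from collections import defaultdict

# Hypothetical purchase lines: (customer_id, product_category).
purchases = [
    ("anna", "school supplies"),
    ("anna", "school supplies"),
    ("anna", "electronics"),
    ("bert", "garden furniture"),
]


def category_share(purchases, category):
    """Return {customer_id: percentage of purchases in the given category}."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for customer_id, product_category in purchases:
        totals[customer_id] += 1
        if product_category == category:
            matches[customer_id] += 1
    return {c: 100.0 * matches[c] / totals[c] for c in totals}


school_share = category_share(purchases, "school supplies")
```

Combined with the customer's age, this share gives the two coordinates of one dot in the scatterplot.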

A business user could relatively easily see that three big groups can be defined. With some business logic we could argue that these groups are Parents with young children, Grandparents with young grandchildren and Others. If we added further information such as average order value, we might still be able to identify different groups with our own eyes, but it would already be more challenging because the data would need to be visualized in three dimensions. Once we go up to four dimensions, our human brains start to falter, and a similar analysis becomes too hard. A computer, however, can use mathematical models to identify different groups in n dimensions.

We use machine learning to identify clusters of customers with similar behaviour. We could easily add tens of different dimensions. The algorithm allocates every customer to the most suitable cluster and leaves it up to us to define what differentiates each cluster.

For each cluster, a persona can be made based on the characteristics of the customers in that cluster. These characteristics can be written down on a card that describes the persona, much like an ID card. To make the card easier to interpret, we add the characteristics of an average customer. This reference is important for interpreting the cluster correctly. A simplified example of such a card is shown below.

Having a card for each cluster makes the selection of folder recipients much easier for marketeers. They end up with just 20 cards to choose from and pick the best ones until they reach the correct number of folder recipients.

[Figure: simplified example of a persona card.]

Note: The information in this example is completely random
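
The values on such a card could be produced along the following lines: for every feature, the cluster average is placed next to the average over all customers. This is a sketch under assumptions; the function name `persona_card`, the feature names and the sample data are hypothetical.

```python
def persona_card(customers, labels, cluster, feature_names):
    """Compare a cluster's average feature values with the overall average."""
    members = [c for c, label in zip(customers, labels) if label == cluster]
    if not members:
        raise ValueError(f"cluster {cluster} has no customers")
    card = {"cluster": cluster, "size": len(members)}
    for i, name in enumerate(feature_names):
        card[name] = {
            "cluster": sum(c[i] for c in members) / len(members),
            "average customer": sum(c[i] for c in customers) / len(customers),
        }
    return card


# Hypothetical customers: (age, average order value in euro) with cluster labels.
customers = [(30, 40.0), (34, 60.0), (70, 20.0), (66, 40.0)]
labels = [0, 0, 1, 1]
card = persona_card(customers, labels, cluster=0,
                    feature_names=["age", "average order value"])
```

Showing the reference value next to each cluster value makes it clear at a glance whether a cluster is younger, older, or spends more than the average customer.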

Note: This approach uses a centroid model in which each cluster is represented by a single mean vector. Other approaches are valuable as well, but for the sake of simplicity we stick to this model in this article.

What was the impact for ColliShop

As mentioned in our introduction, ColliShop wanted to optimize the number of folders sent to their customers on a monthly basis. Knowing that a proportion of those folders end up unread, they aimed to target only those customers that would in fact be interested in receiving a folder at the time of sending. With the help of element61 & machine learning, ColliShop created around 20 customer clusters to choose from when selecting recipients for a particular folder.
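
Selecting recipients from the clusters can be pictured as a simple loop: pick the most relevant clusters first and stop once the target number of recipients is reached. In the sketch below, the cluster names, the sizes and the `relevance` score are hypothetical; in practice the choice rests on the marketeer's judgment, not on a single number.

```python
def select_clusters(cards, target_recipients):
    """Pick clusters by descending relevance until the target is reached.

    cards: list of (cluster_name, number_of_customers, relevance) tuples.
    Returns (selected_cluster_names, total_recipients).
    """
    selected, total = [], 0
    ranked = sorted(cards, key=lambda card: card[2], reverse=True)
    for name, size, _relevance in ranked:
        if total >= target_recipients:
            break
        selected.append(name)
        total += size
    return selected, total


# Hypothetical clusters for a back-to-school folder.
cards = [
    ("young parents", 40000, 0.9),
    ("grandparents", 30000, 0.7),
    ("gadget fans", 50000, 0.8),
    ("others", 80000, 0.2),
]
selected, total = select_clusters(cards, target_recipients=100000)
```

Because whole clusters are added, the total can overshoot the target. A refinement could take only part of the last cluster.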

By way of illustration, we present an example of a folder that ColliShop sent out in fall 2019. By selecting only the customers from relevant clusters, ColliShop sent out 35% fewer folders. If the conversion rate of the folders had stayed equal, the number of customers that received a folder and bought something would have been expected to drop by 35% as well. However, that number only dropped by 11%, meaning that ColliShop succeeded in excluding more irrelevant recipients than a random selection would have. As a result, the revenue per folder increased by 34%.
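
The arithmetic behind these percentages can be checked in a few lines. The three input figures come from the example above. The derived values are our own calculation and were not reported by ColliShop.

```python
# Reported figures, expressed as ratios relative to an untargeted mailing.
folders_ratio = 1 - 0.35             # 35% fewer folders sent
buyers_ratio = 1 - 0.11              # 11% fewer buying recipients
revenue_per_folder_ratio = 1 + 0.34  # 34% more revenue per folder

# Derived: buyers per folder sent, relative to before (about 1.37).
conversion_ratio = buyers_ratio / folders_ratio

# Derived: total folder revenue implied by the reported figures (about 0.87).
implied_revenue_ratio = folders_ratio * revenue_per_folder_ratio

print(f"Conversion rate per folder: {conversion_ratio - 1:+.0%}")
print(f"Implied total folder revenue: {implied_revenue_ratio - 1:+.0%}")
```

In other words, each folder sent was about 37% more likely to lead to a purchase, while the implied total revenue from the folder fell by roughly 13% against a 35% cut in volume.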

Since their introduction at ColliShop, these clusters have been used in all folder selections, with similar and even better results. Using machine learning does not have to be complex to achieve great results. Customer clustering can be a great start for your company to gain insight into your customer base and cut costs efficiently.

Do you want to know more

If you are interested in clustering or machine learning, or if you just have a question, feel free to contact one of our element61 team members.

Do you want to know more about customer segmentation and other smart applications of data in Marketing:

Continue reading about defining a Data Strategy in our organization
Continue reading about our methodology in building a Machine Learning Proof of Concept