Which aspects need to be considered when implementing recommendation systems? Which algorithms are used, and which recommender is the right tool for which use case? While my colleague Dr. Timo Böhm explains the motivation and principles of recommendation systems in part 1 of this series, this article gives an algorithmic overview and is intended to support decisions on the use of recommendation systems. Part 3 then deals with personalized recommendation systems, machine learning and evaluation.
Selection of recommender systems to use
Once the decision has been made to introduce a recommendation system in a company, the next step is to select the type of recommender based on the business value for that company. In practice, it often makes sense to use not just one, but several recommendation systems with different functionalities. This is frequently done on the websites of e-commerce companies.
Recommendation systems can be classified in different ways. The most intuitive classification distinguishes them by the purpose they serve for the customer.
Important types of recommenders are:
1. Suggestions for supplemental items which are useful for a given product. For example, a matching case could be offered to the buyer of a new smartphone. This type can also increase cross-selling and up-selling.
2. Alternative suggestions for a given item. In particular, this helps customers who have repeatedly looked at products such as washing machines, or services like a mobile phone contract, but have not yet made a purchase. It works by proposing further products similar to the ones already considered.
3. References to products already viewed but not yet bought, or reminders for regular, repeated purchases of consumer products such as food or razor blades.
4. Personalized recommendations which can incorporate a variety of factors and learn customers' preferences from their usage behaviour. For example, if a user usually listens to rock tracks via a music streaming service, songs of this genre are primarily recommended to that user. However, if the user unexpectedly types "Mozart" into the search bar during a visit, the system learns about this new interest and then also proposes classical music, either alternating with rock songs or organized by genre.
The following chart demonstrates recommendation types 1 to 3 for a user who last viewed a smartphone. For the purchase of the phone, recommending a matching cable seems reasonable (1.). However, it also makes sense to present alternative mobile phones (2.). Furthermore, since the user was previously interested in a mixer without actually buying one, it is also advisable to send a reminder or to recommend other mixers (3.). Personalized recommendations will be discussed based on another chart in the next blog post.
Algorithms and their implementation
Besides the algorithm itself, the complexity of the implementation depends heavily on the existing IT infrastructure. As a rule of thumb, most of the effort goes into connecting different data sources and processing the data, as well as integrating the recommender into the live system (e.g. via an API). Some approaches are presented below as examples:
- Products frequently purchased together
For the first type listed above, which is often labeled as products frequently purchased together, methods of market basket analysis and association rules are available as a simple solution which can be implemented with little effort. For example, the Apriori algorithm provides an efficient solution. Because transactions made by the same user at different times can involve completely different items, this algorithm considers data at the session level. Extensions of this approach exist, for example for products that are purchased at a later stage (such as larger sizes of children's clothing).
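As an illustration of association rules at the session level, here is a minimal sketch in Python. It is not a full Apriori implementation: it only mines item pairs, but it uses the same pruning idea of building candidates from frequent single items. The function name `pair_rules`, the thresholds and the sample sessions are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) rules for item pairs."""
    n = len(sessions)
    baskets = [set(s) for s in sessions]  # one basket per session, duplicates removed

    # Pass 1: count single items and keep only the frequent ones (Apriori-style pruning).
    item_counts = Counter(item for basket in baskets for item in basket)
    frequent = {item for item, count in item_counts.items() if count / n >= min_support}

    # Pass 2: count only pairs that consist of frequent items.
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket & frequent), 2):
            pair_counts[pair] += 1

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    # Highest confidence first; ties broken alphabetically for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical sessions, one list of purchased items per session.
sessions = [
    ["smartphone", "case", "cable"],
    ["smartphone", "case"],
    ["smartphone", "cable"],
    ["mixer"],
    ["smartphone", "case", "screen protector"],
]
rules = pair_rules(sessions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as "smartphone -> case" can then be read as: show the case to customers who have the smartphone in their basket. For real catalogues, an established library implementation of Apriori or FP-Growth is likely the better choice.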
- Simple approaches based on sequences and clickstream logs
Simple procedures based on sequences and clickstream data are basically very similar to the first type. If transaction data is insufficient, items similar to a given item can first be recommended via attribute matching (as described in the next point). However, this ignores a number of factors, such as popularity with customers and alternative products of a different type (for example, a Thermomix instead of a pressure cooker). These product alternatives can be identified through a more detailed analysis of the customer's remaining journey. For products frequently purchased together, data about actions (such as purchases stored in an ERP system or play records of videos or songs) is sufficient. In contrast, approaches based on sequences and clickstream data require the complete clickstream log. Processing it often requires more resources, so big data technologies such as Spark can be useful here. Once the items with subsequent actions have been identified, one can use either simple association rule-based methods or more complex machine-learning methods, as discussed in the next blog post.
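The idea of analysing the customer's remaining journey can be sketched as follows: for every item viewed in a session, count which other items were purchased later in the same session. This is a minimal in-memory sketch. The event format `(action, item)`, the function name `alternatives_from_clickstream` and the sample data are assumptions, and a real clickstream log would usually be processed with a distributed engine such as Spark.

```python
from collections import Counter, defaultdict


def alternatives_from_clickstream(sessions):
    """Map each viewed item to a Counter of other items bought later in the same session."""
    bought_after = defaultdict(Counter)
    for events in sessions:
        viewed = []  # distinct items viewed so far in this session, in order
        for action, item in events:
            if action == "view":
                if item not in viewed:
                    viewed.append(item)
            elif action == "purchase":
                # Credit the purchase to every different item viewed before it.
                for earlier in viewed:
                    if earlier != item:
                        bought_after[earlier][item] += 1
    return bought_after


# Hypothetical clickstream: one ordered list of (action, item) events per session.
sessions = [
    [("view", "pressure cooker"), ("view", "thermomix"), ("purchase", "thermomix")],
    [("view", "pressure cooker"), ("view", "thermomix"), ("purchase", "thermomix")],
    [("view", "pressure cooker"), ("view", "rice cooker"), ("purchase", "rice cooker")],
    [("view", "thermomix"), ("purchase", "thermomix")],
]
alternatives = alternatives_from_clickstream(sessions)
print(alternatives["pressure cooker"].most_common(2))
```

Unlike pure attribute matching, this count reflects what customers actually bought instead, so it can surface alternatives of a different product type.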
- Content-based approaches
Content-based methods are often used in addition to the methods that identify products frequently purchased together. They are especially helpful when observations are scarce (data sparsity), for example because of an insufficient number of purchases. Content-based approaches can be implemented, for example, by matching product attributes. If these attributes are not present in the company's product management system, they can often be extracted as text from the product descriptions, for example with information extraction methods from natural language processing. More advanced deep learning approaches (such as RNNs and attention models) also offer potential for this application. Last but not least, there are approaches which consider user attributes in a similar way.
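Attribute matching can be sketched with a simple set-overlap measure. The example below uses the Jaccard similarity, which is one possible choice among several and is not prescribed by the approach itself. The catalogue, its attribute values and the function names `jaccard` and `similar_items` are hypothetical.

```python
def jaccard(a, b):
    """Share of attributes that two items have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(target, catalog, top_n=2):
    """Rank all other catalogue items by attribute overlap with the target item."""
    scores = [
        (other, jaccard(catalog[target], attributes))
        for other, attributes in catalog.items()
        if other != target
    ]
    # Highest similarity first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda s: (-s[1], s[0]))[:top_n]


# Hypothetical product attributes, e.g. taken from a product management system
# or extracted from product descriptions.
catalog = {
    "phone_a": {"smartphone", "android", "128gb", "5g"},
    "phone_b": {"smartphone", "android", "256gb", "5g"},
    "phone_c": {"smartphone", "ios", "128gb"},
    "mixer_x": {"kitchen", "mixer", "600w"},
}
top = similar_items("phone_a", catalog)
print(top)
```

Because the score depends only on item attributes, it also works for new products that have no purchase history yet.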
The types of recommender system discussed so far do not require any knowledge about the current user. This means that they also work for users browsing a website anonymously. Part 3 of our blog series on recommendation systems takes a look at algorithms for personalized recommendation systems.