Collaborative and Content-Aware Approaches to Recommendation
Recommender systems have become extremely common in recent years and are applied in a wide variety of domains. The goal of a recommender system is to generate meaningful recommendations to users for items or products that might interest them. Book suggestions on Amazon and movie suggestions on Netflix are real-world examples.
Recommender systems take the user’s past behavior into account when making recommendations. Past behavior consists of data about the user’s earlier interactions with the website: which items the user searched for, which items they bought, whether they commented on or explicitly rated an item, and so on. The system tries to predict other items of interest to the user based on other users with similar interests, on attributes of the items in the user’s history, or on both.
There are several prediction techniques, distinguished by the data they work on and by the algorithm they use. Every technique has its own strengths and weaknesses. The accuracy and usability of a technique depend largely on the kind of data available in the user profiles, the number of users, and the metadata available for the items.
Approaches to recommendation
The collaborative approach works by identifying similarities between users or between items based on user behavior. User behavior is mostly captured as clicks, ratings, purchases and so on. This approach requires a significant amount of user data, ideally millions of data points or more, to make recommendations with confidence.
The collaborative approach suffers from the cold-start problem, that is, the inability to make recommendations when a user’s profile does not yet contain enough data. However, it has the advantage of not requiring any understanding of the content itself, so its implementation is generic across domains and item types. That is also why most open source and off-the-shelf (OTS) recommendation systems are collaborative in nature.
Collaborative filtering is often reported to give strong results because it is based directly on human behavior. The classic illustration is the often-told story of beer and diapers. As items, beer and diapers have no similarity to each other, yet the story goes that analysis of purchase data showed that shoppers who bought beer also tended to buy diapers.
User Similarity based recommendation
Using the data on user preferences, a user similarity matrix is computed. By applying a threshold on the similarity score and/or retaining only the closest N users, the system arrives at a user neighborhood of at most N users. The preferences of this neighborhood are used as recommendations for the user. This does not take into account item attributes or even the user’s current query.
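The neighborhood idea can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the `ratings` data, the function names, and the choice of cosine similarity are all assumptions made for the example, since the text does not prescribe a particular similarity measure.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}.
ratings = {
    "alice": {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob":   {"book_a": 4, "book_b": 3, "book_d": 5},
    "carol": {"book_b": 1, "book_c": 2, "book_e": 5},
    "dave":  {"book_a": 5, "book_c": 5, "book_d": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts (assumed measure)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def neighborhood(user, ratings, n=2, threshold=0.0):
    """Return up to n (user, similarity) pairs scoring above the threshold."""
    scores = [
        (other, cosine_similarity(ratings[user], ratings[other]))
        for other in ratings
        if other != user
    ]
    scores = [(other, sim) for other, sim in scores if sim > threshold]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:n]

def recommend(user, ratings, n=2, threshold=0.0):
    """Rank items the user has not seen by similarity-weighted neighbor ratings."""
    seen = set(ratings[user])
    totals = {}
    for other, sim in neighborhood(user, ratings, n, threshold):
        for item, rating in ratings[other].items():
            if item not in seen:
                totals[item] = totals.get(item, 0.0) + sim * rating
    return sorted(totals, key=totals.get, reverse=True)

print(recommend("alice", ratings, n=2))
```

With this toy data, the two nearest neighbors of "alice" are "dave" and "bob", and the only item they rated that she has not seen is "book_d". Note that nothing about the items themselves is consulted.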
Item Similarity based recommendation
Items are judged similar if they are frequently bought together or, more generally, frequently appear together in users’ preferences. Item-based recommenders base recommendations not on user similarity but on item similarity. In theory the two are the same approach to the problem, seen from different angles. However, the similarity of two items is relatively fixed, more so than the similarity of two users. Item-based recommenders can therefore use pre-computed similarity values in their computations, which makes them much faster. For large data sets, item-based recommenders are more appropriate.
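The pre-computation step can be illustrated with a small sketch. The `baskets` data and function names are hypothetical, and Jaccard similarity over co-occurrence is just one assumed way to score "frequently appear together"; the text does not name a specific measure.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase baskets: one set of items per user.
baskets = {
    "u1": {"beer", "diapers", "chips"},
    "u2": {"beer", "diapers"},
    "u3": {"beer", "chips"},
    "u4": {"diapers", "wipes"},
}

def build_item_similarity(baskets):
    """Pre-compute Jaccard similarity for every pair of co-occurring items."""
    owners = defaultdict(set)  # item -> users who have it
    for user, items in baskets.items():
        for item in items:
            owners[item].add(user)
    sim = defaultdict(dict)
    for a, b in combinations(sorted(owners), 2):
        both = len(owners[a] & owners[b])
        if both:
            score = both / len(owners[a] | owners[b])
            sim[a][b] = score
            sim[b][a] = score
    return sim

def recommend(user_items, sim, top_n=3):
    """Score unseen items by summed similarity to the items the user has."""
    scores = defaultdict(float)
    for item in user_items:
        for other, score in sim.get(item, {}).items():
            if other not in user_items:
                scores[other] += score
    # Highest score first; item name breaks ties deterministically.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

# Computed once, offline; reused for every request.
item_sim = build_item_similarity(baskets)
print(recommend({"beer"}, item_sim))
```

The expensive part, `build_item_similarity`, runs once and can be refreshed periodically, while `recommend` only performs lookups. That separation is what makes the item-based variant fast at request time.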
Item Aware Approaches
Content-based recommendation systems try to recommend items similar to those a given user has liked in the past. The basic process performed by a content-based recommender consists of matching the attributes of a user profile, in which preferences and interests are stored, against the attributes of a content object (item), in order to recommend new and interesting items to the user. The content-based approach requires deep knowledge of the products. Each item needs to be profiled based on its characteristics. For a very large inventory, this process must be automatic, which can prove difficult depending on the nature of the items. If the items are well structured and a good amount of metadata exists for each item, this part is simplified.
The best-known example of a successful content-based recommender is ‘Pandora’. It maintains a collection of about 400 musical attributes that together describe a song. It took more than 5 years to build the item profiles for its inventory.
There are different types of content-aware recommendation systems, depending on which kind of content similarity they reason about. The most common ones are described below.
Attribute based
All relevant characteristics of all items are identified and recorded. The attributes could be location, price, experience, domain, author, income level, age and so on. The items are filtered based on the preferences recorded in the user profile, and the filtered (and ranked) items are shown to the user as recommendations.
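A filter-then-rank step of this kind might look like the sketch below. The catalog, the profile layout (`required`, `preferred`, `max_price`) and the scoring rule are all hypothetical choices made for illustration; a real system would define its own attributes and ranking.

```python
# Hypothetical catalog of items with recorded attributes.
items = [
    {"id": "c1", "location": "Pune",  "domain": "healthcare", "author": "rao",  "price": 40},
    {"id": "c2", "location": "Delhi", "domain": "healthcare", "author": "rao",  "price": 25},
    {"id": "c3", "location": "Pune",  "domain": "finance",    "author": "iyer", "price": 30},
    {"id": "c4", "location": "Pune",  "domain": "healthcare", "author": "iyer", "price": 90},
    {"id": "c5", "location": "Pune",  "domain": "healthcare", "author": "iyer", "price": 20},
]

# Hypothetical user profile: hard constraints plus soft preferences.
profile = {
    "required": {"domain": "healthcare"},
    "preferred": {"location": "Pune", "author": "rao"},
    "max_price": 50,
}

def recommend(items, profile):
    """Filter on hard constraints, then rank by number of matched preferences."""
    required = profile["required"]
    preferred = profile["preferred"]

    matches = [
        item for item in items
        if item["price"] <= profile["max_price"]
        and all(item.get(key) == value for key, value in required.items())
    ]

    def score(item):
        return sum(1 for key, value in preferred.items() if item.get(key) == value)

    # More matched preferences first; cheaper item breaks ties.
    matches.sort(key=lambda item: (-score(item), item["price"]))
    return [item["id"] for item in matches]

print(recommend(items, profile))
```

Here "c3" is dropped for its domain and "c4" for its price; the remaining items are ordered by how many soft preferences they satisfy.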
Hierarchical classification based
The content is organized into a hierarchical structure. Items that sit at the same place in the hierarchy (for example, medical/nursing/oncology) as the user’s preferred articles are recommended to the user.
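As a rough sketch, the hierarchy can be represented as a category path per item and compared by shared prefix. The `catalog` contents and function names are hypothetical, and the `min_depth` parameter is an added assumption that relaxes "same place" to "same branch down to a given depth".

```python
# Hypothetical catalog: each item is filed under one category path.
catalog = {
    "art_1": "medical/nursing/oncology",
    "art_2": "medical/nursing/oncology",
    "art_3": "medical/nursing/pediatrics",
    "art_4": "medical/surgery",
    "art_5": "legal/contracts",
}

def shared_depth(path_a, path_b):
    """Number of leading hierarchy levels the two paths have in common."""
    depth = 0
    for a, b in zip(path_a.split("/"), path_b.split("/")):
        if a != b:
            break
        depth += 1
    return depth

def recommend(preferred, catalog, min_depth=3):
    """Recommend items sharing at least min_depth levels with a preferred item."""
    scores = {}
    for item, path in catalog.items():
        if item in preferred:
            continue
        best = max(
            (shared_depth(path, catalog[p]) for p in preferred),
            default=0,
        )
        if best >= min_depth:
            scores[item] = best
    # Deepest shared branch first; item id breaks ties.
    return sorted(scores, key=lambda i: (-scores[i], i))

print(recommend({"art_1"}, catalog))               # exact same category only
print(recommend({"art_1"}, catalog, min_depth=2))  # same parent category
```

With `min_depth=3` only articles filed under exactly medical/nursing/oncology qualify; lowering it widens the net to sibling categories.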
Text Similarity based or Content Based
This approach, also referred to as cognitive filtering, recommends items based on a comparison between the content of the items and a user profile. The content of each item is represented as a set of descriptors or terms, typically the words that occur in a document. The user profile is represented with the same terms and is built up by analyzing the content of the items the user has seen.
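The term-based comparison can be sketched with plain Python. The documents and function names below are hypothetical, and TF-IDF weighting with cosine similarity is an assumed (though common) choice; the text only says that items and profiles share a term representation. No stop-word removal is done, so words such as "for" count as terms.

```python
import math
import re
from collections import Counter

# Hypothetical item texts.
documents = {
    "doc_1": "chemotherapy dosing guidelines for oncology nurses",
    "doc_2": "oncology nurses and patient care during chemotherapy",
    "doc_3": "tax planning guidelines for small businesses",
    "doc_4": "pediatric nurses and vaccination schedules",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())

def build_tfidf(documents):
    """Represent each document as a {term: tf * idf} vector."""
    counts = {doc: Counter(tokenize(text)) for doc, text in documents.items()}
    n_docs = len(counts)
    doc_freq = Counter(term for c in counts.values() for term in c)
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    return {
        doc: {term: tf * idf[term] for term, tf in c.items()}
        for doc, c in counts.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_profile(seen, vectors):
    """User profile: sum of the term vectors of the items the user has seen."""
    profile = Counter()
    for doc in seen:
        for term, weight in vectors[doc].items():
            profile[term] += weight
    return profile

def recommend(seen, vectors):
    """Rank unseen documents by similarity to the user profile."""
    profile = build_profile(seen, vectors)
    scores = {
        doc: cosine(profile, vec)
        for doc, vec in vectors.items()
        if doc not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)

vectors = build_tfidf(documents)
print(recommend(["doc_1"], vectors))
```

For a user who has read only "doc_1", the other oncology document ranks first because it shares the most distinctive terms with the profile.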