
Slope One

From Wikipedia, the free encyclopedia


Collaborative filtering is a technique used by recommender systems to combine different users' opinions and tastes in order to produce personalized recommendations. There are at least two classes of collaborative filtering: user-based techniques, which rely on similarity measures between users, and item-based techniques, which compare the ratings that users give to different items. Slope One is a family of item-based collaborative filtering algorithms introduced in Slope One Predictors for Online Rating-Based Collaborative Filtering by Daniel Lemire and Anna Maclachlan. It is arguably the simplest form of non-trivial item-based collaborative filtering: its simplicity makes it easy to implement efficiently, while its accuracy is often on par with more complicated and computationally expensive algorithms.

Item-based collaborative filtering of rated resources

Item-based collaborative filtering[1][2] predicts the ratings on one item based on the ratings on another item, typically using linear regression (f(x) = ax + b). Hence, if there are 1,000 items, there could be up to 1,000,000 linear regressions to be learned, and so up to 2,000,000 regressors. This approach may suffer from severe overfitting[3] unless we select only the pairs of items for which several users have rated both items.
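As a rough sketch of this pairwise-regression approach, one regression can be fitted per ordered pair of items over the users who rated both. The data layout and names below are illustrative assumptions, not taken from the cited papers:

  # Fit one linear regression f(x) = a*x + b per ordered item pair,
  # using only the users who rated both items.
  ratings = {  # user -> {item: rating}; toy data for illustration
      "Alice": {"A": 4, "B": 3},
      "Bob":   {"A": 2, "B": 1},
      "Carol": {"A": 5},
  }

  def fit_pair(i, j, min_support=2):
      """Least-squares fit of rating_j = a * rating_i + b over users who rated both i and j."""
      xs = [r[i] for r in ratings.values() if i in r and j in r]
      ys = [r[j] for r in ratings.values() if i in r and j in r]
      if len(xs) < min_support:      # too few co-ratings: skip the pair to limit overfitting
          return None
      n = len(xs)
      mean_x, mean_y = sum(xs) / n, sum(ys) / n
      var_x = sum((x - mean_x) ** 2 for x in xs)
      if var_x == 0:
          return None
      a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
      b = mean_y - a * mean_x
      return a, b                    # two regressors ("a" and "b") per item pair

  print(fit_pair("A", "B"))          # (1.0, -1.0): rating_B is about rating_A - 1 here

With 1,000 items, this fitting step would be repeated for up to roughly 1,000,000 ordered pairs, which is where the figure of up to 2,000,000 regressors comes from.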

Item-based collaborative filtering of purchase statistics

We are not always given ratings: when users provide only binary data (the item was purchased or not), Slope One and other rating-based algorithms do not apply. Examples of binary item-based collaborative filtering include Amazon's patented item-to-item algorithm[4], which computes the cosine between binary vectors representing the purchases in a user-item matrix.

Being arguably simpler than even Slope One, the Item-to-Item algorithm offers an interesting point of reference. Let us consider an example.

Sample purchase statistics
Customer   Item 1          Item 2          Item 3
John       Bought it       Didn't buy it   Bought it
Mark       Didn't buy it   Bought it       Bought it
Lucy       Didn't buy it   Bought it       Didn't buy it


In this case, the cosine between items 1 and 2 is

0 / (√1 × √2) = 0,

the cosine between items 1 and 3 is

1 / (√1 × √2) ≈ 0.71,

whereas the cosine between items 2 and 3 is

1 / (√2 × √2) = 0.5.

Hence, a user visiting item 1 would receive item 3 as a recommendation, a user visiting item 2 would receive item 3 as a recommendation, and finally, a user visiting item 3 would receive item 1 (and then item 2) as a recommendation. The model uses a single parameter per pair of items (the cosine) to make the recommendation. Hence, if there are n items, up to n² cosines need to be computed and stored.
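The cosines above can be reproduced with a few lines of Python. The dictionary of binary purchase vectors below is simply a transcription of the purchase table; the code is an illustrative sketch, not Amazon's actual implementation:

  from math import sqrt

  # Each item as a binary purchase vector over the customers (John, Mark, Lucy).
  purchases = {
      "item1": (1, 0, 0),
      "item2": (0, 1, 1),
      "item3": (1, 1, 0),
  }

  def cosine(u, v):
      dot = sum(a * b for a, b in zip(u, v))
      return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

  for i, j in (("item1", "item2"), ("item1", "item3"), ("item2", "item3")):
      print(i, j, round(cosine(purchases[i], purchases[j]), 2))
  # item1 item2 0.0   item1 item3 0.71   item2 item3 0.5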

Slope one collaborative filtering for rated resources

To drastically reduce overfitting, improve performance and ease implementation, the Slope One family of easily implemented item-based, rating-based collaborative filtering algorithms was proposed. Essentially, instead of using linear regression from one item's ratings to another item's ratings (f(x) = ax + b), it uses a simpler form of regression with a single free parameter (f(x) = x + b). The free parameter is then simply the average difference between the two items' ratings. It was shown to be much more accurate than linear regression in some instances[3], and it takes half the storage or less.

Example:

  1. Joe gave a 1 to Dion and a 1.5 to Lohan.
  2. Jill gave a 2 to Dion.
  3. How do you think Jill rated Lohan?
  4. The Slope One answer is 2.5 (1.5 - 1 + 2 = 2.5), as sketched in code below.
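In Python, the same calculation is only a couple of lines (a minimal sketch, not a full implementation):

  # Joe's ratings give the only observed difference between Lohan and Dion.
  diff_lohan_minus_dion = 1.5 - 1                    # = 0.5
  jill_lohan_prediction = 2 + diff_lohan_minus_dion  # Jill's Dion rating plus the difference = 2.5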


For a more realistic example, consider the following table.

Sample rating database
Customer   Item 1           Item 2   Item 3
John       5                3        2
Mark       3                4        Didn't rate it
Lucy       Didn't rate it   2        5

In this case, the average difference in ratings between item 1 and item 2 is ((5 - 3) + (3 - 4)) / 2 = 0.5, computed over the two users (John and Mark) who rated both. Hence, on average, item 1 is rated 0.5 above item 2. Similarly, the average difference between item 1 and item 3 is 5 - 2 = 3, since only John rated both. Hence, if we attempt to predict the rating of Lucy for item 1 using her rating for item 2, we get 2 + 0.5 = 2.5. Similarly, if we try to predict her rating for item 1 using her rating of item 3, we get 5 + 3 = 8.


If a user has rated several items, the individual predictions are simply combined using a weighted average, where a good choice for the weight is the number of users who rated both items. In the above example, we would predict the following rating for Lucy on item 1: (2 × 2.5 + 1 × 8) / (2 + 1) ≈ 4.33, since two users (John and Mark) rated both items 1 and 2, while only one user (John) rated both items 1 and 3.
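A minimal Python sketch of this weighted scheme over the sample rating database reproduces the numbers above; the data layout and function names are illustrative, not taken from the paper's implementation:

  # Weighted Slope One sketch: user -> {item: rating}, matching the sample rating database.
  ratings = {
      "John": {"item1": 5, "item2": 3, "item3": 2},
      "Mark": {"item1": 3, "item2": 4},
      "Lucy": {"item2": 2, "item3": 5},
  }

  def deviation(i, j):
      """Average of (rating for i minus rating for j) over users who rated both, plus that count."""
      diffs = [r[i] - r[j] for r in ratings.values() if i in r and j in r]
      return (sum(diffs) / len(diffs), len(diffs)) if diffs else (0.0, 0)

  def predict(user, target):
      """Combine per-item predictions, weighting each by its number of co-ratings."""
      num, den = 0.0, 0
      for j, rating_j in ratings[user].items():
          if j == target:
              continue
          dev, count = deviation(target, j)
          if count:
              num += (rating_j + dev) * count
              den += count
      return num / den if den else None

  print(round(predict("Lucy", "item1"), 2))  # (2 * 2.5 + 1 * 8) / 3 ≈ 4.33

The deviation function computes the 0.5 and 3 figures from the text, and the prediction for Lucy on item 1 is the weighted average of 2.5 (weight 2) and 8 (weight 1).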

Recommender Systems using Slope One

Open Source software implementing Slope One

Python:

Java:

PHP:

Erlang:

Footnotes

  1. ^ Slobodan Vucetic, Zoran Obradovic: Collaborative Filtering Using a Regression-Based Approach. Knowl. Inf. Syst. 7(1): 1-22 (2005)
  2. ^ Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John Riedl: Item-based collaborative filtering recommendation algorithms. WWW 2001: 285-295
  3. ^ a b Daniel Lemire, Anna Maclachlan, Slope One Predictors for Online Rating-Based Collaborative Filtering, In SIAM Data Mining (SDM'05), Newport Beach, California, April 21-23, 2005.
  4. ^ Greg Linden, Brent Smith, Jeremy York, "Amazon.com Recommendations: Item-to-Item Collaborative Filtering," IEEE Internet Computing, vol. 07, no. 1, pp. 76-80, Jan/Feb, 2003
  5. ^ Daniel Lemire, Sean McGrath, "Implementing a Rating-Based Item-to-Item Recommender System in PHP/SQL", Technical Report D-01, January 2005.