RecSys11: OrdRec: an ordinal model for predicting personalized item rating distributions

Recommendation system paper challenge (9/50)

Paper Link

Why this paper?

RecSys’11 Best Paper.

What problem do they solve?

A top-n recommender system with rating feedback

How do others solve this problem?

Most prior work treats ratings as numerical values and predicts them with weighted models (temporal effects, similar users). However, in several common scenarios there is no direct link between the user feedback and numerical values, even though the feedback is richer than a binary “like-vs-dislike” indication.

They argue

purchasing the product > bookmarking or wish-listing > search and browse

so the numerical value of a rating may not mean as much as we expect.

Another scenario is when users are asked to enter their feedback by a comparative ranking of a set of products.

They also argue that user ratings depend on each user's internal scale.

What is their model?

They view user feedback on products as ordinal. OrdRec is a point-wise ordinal approach, which lets it scale linearly with the data size.

An important property of OrdRec is its ability to output a full probability distribution over the scores.

They introduce S − 1 ordered thresholds, one associated with each rating value except the last.

First, a random score z_ui is generated from a normal distribution centered at the internal score; the thresholds then determine which rating level is observed.

They replace the cumulative normal distribution with the logistic function, yielding a cumulative-logit model.
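The thresholding step can be sketched as a cumulative-logit model: the probability that the rating is at most level s is a logistic function of the gap between threshold t_s and the internal score. A minimal sketch (the threshold values and names below are illustrative, not the paper's exact notation):

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rating_distribution(y_ui, thresholds):
    """Full probability distribution over S rating levels for one
    (user, item) pair, given the internal score y_ui and S - 1
    ordered thresholds (cumulative-logit style)."""
    # Cumulative probabilities P(r <= s) for s = 1..S-1; P(r <= S) = 1.
    cum = [_sigmoid(t - y_ui) for t in thresholds] + [1.0]
    # Point probabilities P(r = s) as differences of adjacent cumulatives.
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)
        prev = c
    return probs

# Hypothetical thresholds for a 5-star scale and an internal score of 2.0.
dist = rating_distribution(2.0, [-1.0, 0.5, 1.5, 3.0])
```

Because the thresholds are ordered, the cumulative probabilities are non-decreasing and the resulting point probabilities are a valid distribution.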


What is the Data?

What metric?

RMSE (root mean squared error):

Decent solutions in RMSE terms can carry no personalization power ranking-wise. For example, on the Netflix dataset a predictor explaining only rating biases can achieve a much better RMSE than many other models.

Yet all user-dependent biases play no role when ranking items for a single user, and the item-related biases are not personalized. Thus, such a bias-only predictor yields the same item ranking for all users.

FCP (Fraction of Concordant Pairs):

It is the fraction of item pairs whose predicted order agrees with the order of the observed ratings, a measure that generalizes the well-known AUC metric to non-binary ordered outcomes.
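A minimal sketch of FCP for a single list of ratings, assuming pairs tied in the true ratings are skipped and predicted ties count as discordant (the paper aggregates the counts per user; this sketch scores one list):

```python
from itertools import combinations

def fcp(true_ratings, predicted_scores):
    """Fraction of Concordant Pairs: among item pairs with different
    true ratings, the fraction whose predicted scores agree in order."""
    concordant = discordant = 0
    pairs = combinations(zip(true_ratings, predicted_scores), 2)
    for (t_i, p_i), (t_j, p_j) in pairs:
        if t_i == t_j:
            continue  # pairs tied in the true ratings are excluded
        if (t_i - t_j) * (p_i - p_j) > 0:
            concordant += 1
        else:
            discordant += 1
    return concordant / (concordant + discordant)
```

Perfectly ordered predictions give FCP = 1.0, random scores hover around 0.5, matching the AUC analogy.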

The Result


  1. It can improve user trust in the system and alter user behavior by attaching confidence to predictions
  2. When the system picks among several items with the same expected rating, it can favor the item with higher confidence

Since OrdRec outputs a full distribution, we can easily compute the standard deviation, entropy, or Gini impurity of that distribution as confidence measures.
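For instance, a sketch of these dispersion measures for one predicted distribution over a 1..S star scale (lower dispersion means higher confidence; the function name is illustrative):

```python
import math

def confidence_measures(dist):
    """Standard deviation, entropy, and Gini impurity of a predicted
    rating distribution over levels 1..S (lists of probabilities)."""
    levels = range(1, len(dist) + 1)
    mean = sum(r * p for r, p in zip(levels, dist))
    std = math.sqrt(sum(p * (r - mean) ** 2 for r, p in zip(levels, dist)))
    entropy = -sum(p * math.log(p) for p in dist if p > 0)
    gini = 1.0 - sum(p * p for p in dist)
    return std, entropy, gini

# A peaked distribution signals more confidence than a uniform one.
peaked = [0.05, 0.05, 0.05, 0.80, 0.05]
uniform = [0.2] * 5
```

All three measures rank the peaked distribution as more confident; which one works best in practice is an empirical question.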

They formulate a binary classification task to evaluate the model's confidence in its predictions: a prediction counts as correct if it is within one rating level of the true rating. For example, on the Netflix dataset, if the model's prediction is 3.5 stars and the true rating is 4 stars, the model is within 1 rating level, whereas if the true rating is 5 stars, it is not.
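That correctness label can be sketched as follows, assuming "within one rating level" means an absolute error of at most one star (the helper name is illustrative):

```python
def within_one_level(predicted, true, tol=1.0):
    """Label a prediction as 'correct' when it falls within
    `tol` rating levels of the true rating (assumed definition)."""
    return abs(predicted - true) <= tol

# 3.5 stars vs. a true rating of 4 is within one level; vs. 5 it is not.
```

The confidence measures above then serve as features for predicting this label, and AUC on this task quantifies how informative they are.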

Adding the confidence features derived from OrdRec's output distribution improves the AUC considerably.

What is their contribution?

  • A CF framework treating user ratings as ordinal rather than numerical, thereby being directly applicable to a wider variety of systems.
  • Flexibly associating different semantics to the available scores, depending on the user.
  • Predicting the full probability distribution of the scores rather than a single score.
  • Enhancing and integrating with many known CF methods.
  • New methods and evaluation metrics for assessing confidence in recommendations.

Other related blogs:

Trust-aware recommender systems

Performance of recommender algorithms on top-n recommendation tasks

Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks

Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text

Beyond Clicks: Dwell Time for Personalization

RecSys’15: Context-Aware Event Recommendation in Event-based Social Networks

RecSys’11: Utilizing related products for post-purchase recommendation in e-commerce

Best paper in RecSys:

My Website:

A machine learning engineer in the Bay Area, United States
