KDD 19': Sampling-bias-corrected neural modeling for large corpus item recommendations

Arthur Lee
3 min read · Jan 3, 2022


🤗 Recommendation system paper challenge (29/50)

paper link

🤔 What problem do they solve?

Large-scale item recommendation

Given a query x, we want to recommend items y from a corpus of M items, where we can observe a reward (e.g., watch time) for each pair (x, y)

Label: reward r (watch time)

Training data: triples (query x, item y, reward r)

Model: a two-tower model trained by optimizing a loss over in-batch negatives
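As a concrete sketch, the in-batch softmax loss can be written as follows (NumPy; names, dimensions, and the temperature hyperparameter are illustrative assumptions, with u and v standing for the batch's query and item embeddings produced by the two towers):

```python
import numpy as np

def in_batch_softmax_loss(u, v, rewards, temperature=1.0):
    """In-batch softmax: for each query i, its paired item i is the positive,
    and the other items in the same batch serve as negatives."""
    logits = u @ v.T / temperature                       # [B, B] similarity matrix
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -np.diag(log_probs)                            # NLL of the positives (diagonal)
    return float((rewards * nll).mean())                 # reward-weighted (e.g., watch time)
```

Reusing the batch's own items as negatives avoids a second pass over the corpus, which is what makes this loss attractive at large scale.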

😮 What are the challenges?

The data is heavily skewed: with in-batch negative sampling, popular items are sampled as negatives far more often than rare ones.

The in-batch loss therefore suffers from sampling bias under a skewed data distribution. In this case, popular items are overly penalized.

😎 Solution: Sampling Bias Correction

We can replace the batch-softmax with a corrected version that uses the logits s(x, y_j) − log(p_j), where p_j is the sampling probability of item j, so that popular items are no longer overly penalized.
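A minimal sketch of the corrected loss, assuming the paper's logit correction s(x, y_j) − log p_j with p_j the estimated sampling probability of item j (function and argument names are mine):

```python
import numpy as np

def corrected_in_batch_softmax_loss(u, v, rewards, item_probs, temperature=1.0):
    """Subtract log(p_j) from item j's logit before the softmax. Popular items
    (large p_j) get their logits reduced, compensating for how often they
    appear as in-batch negatives."""
    logits = u @ v.T / temperature - np.log(item_probs)[None, :]
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nll = -np.diag(log_probs)
    return float((rewards * nll).mean())
```

Note that if all p_j were equal, the correction would be a constant shift that the softmax cancels out; the correction only matters when the item distribution is skewed.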

How do we get the whole frequency efficiently?

Naively, we could count item frequencies over the whole training set and store them in a global hash table, or use a Count-Min sketch. But we can do better!

We can estimate the frequency on the fly from the training stream itself.

Instead of counting each item's occurrences directly, they estimate its frequency from the interval between two consecutive hits of that item.

For example, if an item is hit once every 10 steps, we can confidently estimate its sampling probability as 0.1 per step.

However, since the data arrives as a stream, we cannot see everything up front. Much like Bayesian updating (prior vs. posterior), we start with an initial estimate and gradually refine it as more data arrives.

The whole algorithm works as follows.
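A runnable sketch of this hit-interval estimator (the bucket count, hash function, and learning rate α are illustrative assumptions; as in the paper, two arrays indexed by a hashed item id track the last hit step and the estimated gap):

```python
import numpy as np

class StreamingFrequencyEstimator:
    """Estimate each item's per-step sampling probability as 1 / (estimated gap
    between consecutive hits), updated online with a moving average."""
    def __init__(self, num_buckets, alpha=0.1):
        self.alpha = alpha
        self.last_step = np.zeros(num_buckets)   # A: global step of the last hit
        self.gap = np.zeros(num_buckets)         # B: estimated gap between hits

    def update(self, item_ids, step):
        h = np.asarray(item_ids) % len(self.gap)  # simple hash for illustration
        self.gap[h] = (1 - self.alpha) * self.gap[h] \
                      + self.alpha * (step - self.last_step[h])
        self.last_step[h] = step

    def prob(self, item_ids):
        h = np.asarray(item_ids) % len(self.gap)
        return 1.0 / np.maximum(self.gap[h], 1e-12)
```

With a constant learning rate, the gap estimate is a moving average, so early (prior-like) values are gradually washed out as more hits arrive, matching the Bayesian intuition above.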

The basic idea: we only modify the batch-softmax function and keep the other parts of the training pipeline the same.

In this paper, they also analyze how good the estimator is by proving that it is consistent (the estimate converges to the true frequency).

🙃 Other related blogs:

KDD 19': Heterogeneous Graph Neural Network

KDD 19': Applying Deep Learning To Airbnb Search

KDD 18': Real-time Personalization using Embeddings for Search Ranking at Airbnb

KDD 18': Notification Volume Control and Optimization System at Pinterest

KDD 19': PinText: A Multitask Text Embedding System in Pinterest

CVPR19' Complete the Look: Scene-based Complementary Product Recommendation

NAACL’19: Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

NIPS’2017: Attention Is All You Need (Transformer)

KDD’19: Learning a Unified Embedding for Visual Search at Pinterest

BMVC19' Classification is a Strong Baseline for Deep Metric Learning

KDD’18: Graph Convolutional Neural Networks for Web-Scale Recommender Systems

WWW’17: Visual Discovery at Pinterest

🤩 Conference

ICCV: International Conference on Computer Vision

CVPR: Conference on Computer Vision and Pattern Recognition

KDD 2020

Top Conference Paper Challenge:

My Website:



Arthur Lee

A machine learning engineer in the Bay Area, United States