[Eng blog: Pinterest-3] How Pinterest Leverages Realtime User Actions in Recommendation to Boost Homefeed Engagement Volume
🤗 Machine learning Engineer blog challenge (3/100)
Pinterest — 2022
🤔 What problem do they solve?
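They want a signal that captures short-term user interest and can be fed into the existing Pins HomeFeed ranking model. With short-term user interest available to the ML application, the app becomes responsive: it can react quickly to a user's short-term preferences. For example, after a user clicks a Japan-related Pin and refreshes the feed a few seconds later, more Japan-related content shows up.
A short-term signal may also help with the cold-user problem, because a short-term signal relies on only a small amount of information, and a cold user has only a small amount of information to begin with.
Below is a high-level view of how HomeFeed ranks content.
Below is the current ranking model; you can see that they want to add a "user sequence signal".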
😎 Proposed solution: Realtime User Action Sequence
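Their approach is a model that takes the realtime user action sequence as input.
The realtime user action sequence contains three pieces of information: the action (click, skip, like), the item/Pin, and the time. Each Pin/item has an engaged pin embedding, and each action can be turned into an action embedding, so actions do not need to be hard-coded. Conveniently, every Pin in the sequence comes with exactly one action.
Concretely, three inputs (the action embedding sequence, the engaged pin embedding sequence, and the candidate pin embedding) are combined, passed through an encoder, and finally flattened. This is V1.0,
as shown in the figure below.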
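To make the V1.0 data flow concrete, here is a minimal sketch in plain Python. It is not Pinterest's implementation: the function names, the embedding sizes, the random action table, and the use of concatenation to "combine" the three inputs are all assumptions for illustration. The transformer encoder is replaced by a placeholder that returns its input unchanged, and the time information is left out because the summary above does not say how it is used.

```python
import random

ACTION_TYPES = ["click", "skip", "like"]  # action types named in the post


def make_action_table(dim, seed=0):
    """Hypothetical lookup table with one vector per action type.

    In a real model these vectors would be learned; random values stand in here.
    """
    rng = random.Random(seed)
    return {a: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for a in ACTION_TYPES}


def build_encoder_input(actions, engaged_pin_embs, candidate_emb, action_table):
    """Return one row per past action: [action emb | engaged pin emb | candidate emb].

    Concatenation is an assumption; the post only says the three are combined.
    """
    if len(actions) != len(engaged_pin_embs):
        raise ValueError("each engaged Pin needs exactly one action")
    return [
        action_table[action] + pin_emb + candidate_emb
        for action, pin_emb in zip(actions, engaged_pin_embs)
    ]


def identity_encoder(rows):
    """Placeholder for the transformer encoder; returns a copy of its input."""
    return [list(row) for row in rows]


def flatten(rows):
    """Flatten the per-position vectors into a single feature vector."""
    return [value for row in rows for value in row]


def v1_0_features(actions, engaged_pin_embs, candidate_emb, action_table,
                  encoder=identity_encoder):
    """Combine the three inputs, run the encoder, then flatten."""
    rows = build_encoder_input(actions, engaged_pin_embs, candidate_emb, action_table)
    return flatten(encoder(rows))


table = make_action_table(dim=3)
features = v1_0_features(
    actions=["click", "like"],
    engaged_pin_embs=[[0.1, 0.2], [0.3, 0.4]],
    candidate_emb=[0.9, 0.8],
    action_table=table,
)
print(len(features))  # 2 positions * (3 + 2 + 2) values = 14
```

Note that the flattened output grows linearly with the sequence length, which is one reason a later version would want to shrink it before handing it to the rest of the ranking model.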
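V1.1 adds two steps, as shown in the figure below:
- It uses two transformer encoders (one more than V1.0).
- It applies max pooling to the encoder output to capture long-term interest, and also keeps the latest 10 outputs, so only 11 vectors remain. The smaller output is easier to feed into the downstream model (DCN v2).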
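The output compression in V1.1 can be sketched in a few lines of plain Python. This is an illustrative sketch, not the production code: the function name `compress_encoder_output`, the oldest-first ordering of positions, and the toy vectors are assumptions. It takes the per-position encoder outputs, builds one max-pooled vector over the whole sequence, and keeps the latest 10 positions, giving 11 vectors in total.

```python
def compress_encoder_output(encoded, latest_k=10):
    """Reduce encoder output to 1 max-pooled vector plus the latest_k vectors.

    encoded: list of per-position vectors, assumed ordered oldest first.
    """
    if not encoded:
        raise ValueError("encoder output is empty")
    dim = len(encoded[0])
    # Element-wise max over all positions: a summary of the whole sequence.
    max_pooled = [max(row[d] for row in encoded) for d in range(dim)]
    # The most recent positions keep the short-term detail.
    latest = [list(row) for row in encoded[-latest_k:]]
    return [max_pooled] + latest


# Toy encoder output: 100 positions, 2 dimensions each.
encoded = [[float(i), float(-i)] for i in range(100)]
compact = compress_encoder_output(encoded)
print(len(compact))  # 1 max-pooled vector + 10 latest vectors = 11
```

With 100 input positions, the downstream model sees 11 vectors instead of 100, regardless of how long the action sequence is.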
🤔 Results
More Sensitive Model
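Because recent information is added, the model becomes more sensitive (see the figure below). A possible reason is that focusing on a shorter window with fewer actions makes it easier for noise to creep in.
The fix: retrain! But retraining is not a simple thing, especially in production.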
Higher model serving latency
With a more complex signal added, latency also increased.
The fix: GPU!
Positive Feedback(Full Traffic)
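As the traffic share kept increasing, they found that the metrics also improved. The main drivers are that the model is more sensitive to recent actions and is also retrained, which together create a positive feedback loop.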
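🤔 Thoughts
This blog post is quite good.
On one hand, the modeling work is far from trivial: how to train the model, how to do the pooling.
On the other hand, service infra support is just as important. The most direct questions: how do you serve a more complex signal plus model, can the infra support this latency, and is GPU serving easy to set up?
Then there is ML infra: how do you retrain? Is it enough to update only the user-signal part, or does the whole model need to be retrained together? How long does a full retrain take, and is it stable or not? When other engineers are running A/B tests at the same time, do they need to retrain as well?
ML infra can get somewhat complex, and every company's operations are probably different.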
🙃 Other related blogs:
KDD'21: Learning to Embed Categorical Features without Embedding Tables for Recommendation (Google)
KDD'19: Sampling-bias-corrected neural modeling for large corpus item recommendations
KDD'18: Notification Volume Control and Optimization System at Pinterest
CVPR'19: Complete the Look: Scene-based Complementary Product Recommendation
NAACL'19: Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
KDD'19: Learning a Unified Embedding for Visual Search at Pinterest
BMVC'19: Classification is a Strong Baseline for Deep Metric Learning
KDD'18: Graph Convolutional Neural Networks for Web-Scale Recommender Systems
🤩 Conference
ICCV: International Conference on Computer Vision
http://iccv2019.thecvf.com/submission/timeline
CVPR: Conference on Computer Vision and Pattern Recognition
Top Conference Paper Challenge:
https://medium.com/@arthurlee_73761/top-conference-paper-challenge-2d7ca24115c6
My Website: