KDD '21: Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning (Google)

🤗 Recommendation system paper challenge (31/50)

paper link Google KDD 2021

🤔 What problem do they solve?

Existing MTL (multi-task learning) work does not optimize for fairness; fairness has only been optimized in the STL (single-task learning) setting.

So this paper sets out to solve the problem of fairness in MTL.

😎 Contribution

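  • ARFG (average relative fairness gap): for each task, compute the ratio of the FPR gap under MTL to the FPR gap under STL, then average over tasks
  • ARE (average relative error): for each task, compute the ratio of the error rate under MTL to the error rate under STL, then average over tasks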
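
The two metrics can be sketched in a few lines of Python. This is a minimal illustration, assuming the per-task FPR gaps and error rates have already been measured for both the multi-task model and the single-task baselines. The function names and the numbers are hypothetical, not taken from the paper.

```python
def average_relative(mtl_values, stl_values):
    """Average over tasks of the ratio (MTL value / STL value)."""
    if len(mtl_values) != len(stl_values) or not mtl_values:
        raise ValueError("need one MTL value and one STL value per task")
    ratios = [m / s for m, s in zip(mtl_values, stl_values)]
    return sum(ratios) / len(ratios)


def arfg(mtl_fpr_gaps, stl_fpr_gaps):
    """Average relative fairness gap: mean of per-task FPR-gap ratios."""
    return average_relative(mtl_fpr_gaps, stl_fpr_gaps)


def are(mtl_errors, stl_errors):
    """Average relative error: mean of per-task error-rate ratios."""
    return average_relative(mtl_errors, stl_errors)


# Hypothetical measurements for two tasks.
print(arfg([0.04, 0.10], [0.05, 0.08]))  # mean of 0.04/0.05 and 0.10/0.08
print(are([0.12, 0.21], [0.10, 0.20]))   # mean of 0.12/0.10 and 0.21/0.20
```

Under this reading, a value below 1 means the multi-task model does better on average than the single-task baselines on that axis, and a value above 1 means it does worse.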

The setup is similar to standard MTL, except that a fairness loss is optimized as well.
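
One way to write this down, as a sketch rather than the paper's exact notation: assume $T$ tasks, shared parameters $\theta_{sh}$, task-specific parameters $\theta_t$, a task loss $\mathcal{L}_t$, a fairness loss $\mathcal{F}_t$, and weights $w_t$ and $\lambda_t$ (all symbols here are assumptions introduced for illustration).

```latex
\mathcal{L}_{\text{total}}(\theta_{sh}, \theta_1, \dots, \theta_T)
  = \sum_{t=1}^{T} w_t \Big[ \mathcal{L}_t(\theta_{sh}, \theta_t)
  + \lambda_t \, \mathcal{F}_t(\theta_{sh}, \theta_t) \Big]
```

Setting every $\lambda_t = 0$ recovers plain multi-task training, and larger $\lambda_t$ puts more weight on fairness for task $t$.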

How is the fairness loss measured?

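  • minimizing the correlation between group membership and the predictions over negative examples (within each task, predictions on negative samples should be as uncorrelated with group membership as possible)
  • kernel-based distribution matching through Maximum Mean Discrepancy (MMD) over negative examples (within each task, the groups' prediction distributions on negative samples are matched by making the MMD between them as small as possible)
  • minimizing the FPR gap directly in the loss (this is the actual goal)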
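
The three quantities in this list can be illustrated with plain Python. This is an evaluation-style sketch for one task and two groups (coded 0 and 1), not a differentiable training loss and not the paper's implementation. The function names, the 0.5 threshold, and the RBF bandwidth are assumptions made for the example.

```python
import math


def fpr_gap(preds, labels, groups, threshold=0.5):
    """Absolute FPR difference between group 0 and group 1.

    Assumes each group has at least one negative example.
    """
    rates = []
    for g in (0, 1):
        neg = [p for p, y, a in zip(preds, labels, groups) if y == 0 and a == g]
        rates.append(sum(p >= threshold for p in neg) / len(neg))
    return abs(rates[0] - rates[1])


def correlation_on_negatives(preds, labels, groups):
    """Absolute Pearson correlation between group and prediction over negatives."""
    pairs = [(a, p) for p, y, a in zip(preds, labels, groups) if y == 0]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_p = sum(p for _, p in pairs) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_p = sum((p - mean_p) ** 2 for _, p in pairs)
    if var_a == 0 or var_p == 0:
        return 0.0  # correlation is undefined; treat as no dependence
    return abs(cov) / math.sqrt(var_a * var_p)


def mmd_on_negatives(preds, labels, groups, bandwidth=0.5):
    """Squared MMD (RBF kernel, biased estimate) between the two groups'
    predictions on negative examples."""
    x = [p for p, y, a in zip(preds, labels, groups) if y == 0 and a == 0]
    z = [p for p, y, a in zip(preds, labels, groups) if y == 0 and a == 1]

    def kernel(u, v):
        return math.exp(-((u - v) ** 2) / (2 * bandwidth ** 2))

    def mean_kernel(s, t):
        return sum(kernel(u, v) for u in s for v in t) / (len(s) * len(t))

    return mean_kernel(x, x) + mean_kernel(z, z) - 2 * mean_kernel(x, z)


# Hypothetical scores, labels, and group ids for eight examples.
preds = [0.9, 0.2, 0.7, 0.1, 0.8, 0.3, 0.6, 0.4]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
groups = [0, 0, 0, 1, 1, 1, 0, 1]
print(fpr_gap(preds, labels, groups))
print(correlation_on_negatives(preds, labels, groups))
print(mmd_on_negatives(preds, labels, groups))
```

All three shrink toward 0 as the two groups' predictions on negatives become more alike. The thresholded FPR gap is not differentiable, which is one plausible reason to train against the correlation or MMD terms instead.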


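In effect, the method simply adds task-specific parameter training on top of the baseline.

Baseline vs. MTA-F comparison

The only thing that changes is the fairness term F: the shared layers are trained on the data that is common across tasks, while each task-specific layer is trained only on that task's specific data.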

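For example, using the table below, suppose we want the shared-negative-data and the specific-negative-data for task 1:

  • specific-negative-data: examples that are negative only for task 1 and positive for the other task, so this is E
  • shared-negative-data: examples that are negative for task 1, minus the specific-negative-data, so this is (EFGH - E) = FGH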
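
The split described in this example can be sketched as a small helper. This follows the description in this post rather than the paper's code. The function name `split_negatives` and the label layout (one 0/1 label per task for every example) are assumptions.

```python
def split_negatives(labels, task):
    """Split the negatives of `task` into (specific, shared) index lists.

    labels: list of per-example label tuples, one 0/1 label per task.
    specific: negative for `task`, positive for every other task.
    shared:   negative for `task` and for at least one other task.
    """
    specific, shared = [], []
    for i, row in enumerate(labels):
        if row[task] != 0:
            continue  # not a negative example for this task
        others = [y for t, y in enumerate(row) if t != task]
        if all(y == 1 for y in others):
            specific.append(i)
        else:
            shared.append(i)
    return specific, shared


# Hypothetical labels for (task 1, task 2).
labels = [(0, 1), (0, 0), (1, 0), (1, 1), (0, 0)]
specific, shared = split_negatives(labels, task=0)
print(specific)  # [0]
print(shared)    # [1, 4]
```

With such a split, the fairness term for the shared layers would be computed on the `shared` indices and the fairness term for a task's own head on its `specific` indices.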



😮 Background:

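In the early days, deep learning models were trained on a single task. As business requirements and data considerations grew, multi-task learning emerged.

  • Business (given an image, we want to know not only whether the person is male or female, but also whether they are happy)
  • Data (we have task A and task B, plus a large dataset in which some examples may have no label for task A or for task B), so the idea is to feed this data to task A and task B and optimize them jointly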


  • all subgroups receive the same proportion of positive outcomes: this is typically the demographic parity case
  • equal opportunity and equalized odds: equal TPR (true positive rates) and FPR (false positive rates) across different subgroups; this is more practical, and it is the metric this paper uses
  • fairer representation learning
  • single-task learning setting (pre-processing the data embeddings for downstream jobs, post-processing the model's predictions)
  • intervening in the model training process has also been popular, including adding fairness constraints or regularization
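
To make the difference between demographic parity and equalized odds concrete, here is a small sketch that computes the per-group positive rate, TPR, and FPR from binary predictions. The helper name `group_rates` and the data are hypothetical, and the code assumes every group has at least one positive and one negative example.

```python
def group_rates(preds, labels, groups):
    """Return {group: (positive_rate, tpr, fpr)} for binary 0/1 predictions."""
    out = {}
    for g in sorted(set(groups)):
        rows = [(p, y) for p, y, a in zip(preds, labels, groups) if a == g]
        pos = [p for p, y in rows if y == 1]  # predictions on true positives
        neg = [p for p, y in rows if y == 0]  # predictions on true negatives
        out[g] = (
            sum(p for p, _ in rows) / len(rows),  # share predicted positive
            sum(pos) / len(pos),                  # true positive rate
            sum(neg) / len(neg),                  # false positive rate
        )
    return out


# Hypothetical predictions for two groups of four examples each.
preds = [1, 0, 1, 1, 1, 0, 0, 0]
labels = [1, 0, 0, 1, 1, 1, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
rates = group_rates(preds, labels, groups)
print(rates)
print(abs(rates[0][2] - rates[1][2]))  # FPR gap between the two groups
```

Demographic parity compares the first entry across groups. Equal opportunity compares the TPR, and equalized odds compares both the TPR and the FPR.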

shared subset parameters


  • exploits task relatedness with inductive bias learning (learning a shared representation across related tasks is beneficial for harder tasks or tasks with limited training examples); this helps a lot with hard problems and limited data
  • forcing tasks to share model capacity (regularization) -> avoids overfitting and generalizes better
  • providing a compact and efficient form of modeling which enables training and serving multiple prediction quantities for large-scale systems -> a more scalable system at serving time


  • it may make task 1 better but task 2 worse
  • existing methods focus on reducing the task training conflicts and improving the Pareto frontier

As MTL has become more popular, many people have started studying fairness in MTL

  • fairness in multi-task regression models and uses a rank-based non-parametric independence test to improve fairness in ranking
  • multi-task learning enhanced with fairness constraints to jointly learn classifiers that leverage information between sensitive groups


  • for a single task, there is a trade-off between fairness and accuracy
  • accuracy also trades off across tasks

A few key questions

  • How does inductive transfer in multi-task learning implicitly impact fairness?
  • How do we measure the fairness-accuracy trade-off for multi-task learning?
  • Are we able to achieve a better Pareto efficiency in fairness across multiple tasks, by exploiting task relatedness and shared architecture that’s specific to multi-task learning?
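
Pareto efficiency, which the last question refers to, can be illustrated with a small filter over candidate models. This is a generic sketch, not the paper's procedure. Each model is described by a tuple of metrics where lower is better (here a hypothetical error rate and FPR gap; in the multi-task case the tuple could hold one such pair per task).

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    A point q dominates p when q is no worse on every axis and differs from p
    (lower is better on every axis).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(q_i <= p_i for q_i, p_i in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (error rate, FPR gap) pairs for four trained models.
models = [(0.10, 0.08), (0.12, 0.05), (0.11, 0.09), (0.15, 0.05)]
print(pareto_front(models))  # [(0.1, 0.08), (0.12, 0.05)]
```

A method achieves better Pareto efficiency when its front moves toward the origin, meaning lower error at the same fairness gap, or a smaller gap at the same error.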


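MTL keeps getting more popular, and many large companies have started using it. This is true not only in NLP and CV: traditional recommendation systems also consider MTL (click, comment, skip rate), as do news recommendation systems (engagement vs. freshness). From a business standpoint, there are many different objectives to meet.


Both of these trade-offs are challenging. The authors combine them into a single new problem, then propose metrics and a method to address it. This is a modeling paper (it says little about serving or production), and it is a good template for learning about modeling.


I got a lot out of this paper, mainly an understanding of the background of two areas that were new to me, MTL and fairness. The machine learning field really is evolving fast; every few years the world looks different. How to measure fairness is also very interesting.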

🙃 Other related blogs:

KDD 21':Learning to Embed Categorical Features without Embedding Tables for Recommendation (google)

KDD 19': Sampling-bias-corrected neural modeling for large corpus item recommendations

KDD 18': Notification Volume Control and Optimization System at Pinterest

CVPR19' Complete the Look: Scene-based Complementary Product Recommendation

NAACL’19: Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

NIPS’2017: Attention Is All You Need (Transformer)

KDD’19: Learning a Unified Embedding for Visual Search at Pinterest

BMVC19' Classification is a Strong Baseline for Deep Metric Learning

KDD’18: Graph Convolutional Neural Networks for Web-Scale Recommender Systems

🤩 Conference

ICCV: International Conference on Computer Vision

CVPR: Conference on Computer Vision and Pattern Recognition

KDD 2020

Top Conference Paper Challenge:

My Website:



Arthur Lee

A machine learning engineer in the Bay Area, United States