KDD 21':Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning (google)
🤗 Recommendation system paper challenge (31/50)
paper link Google KDD 2021
🤔 What problem do they solve?
Existing work optimizes for fairness only in the STL (single-task learning) setting; MTL (multi-task learning) methods do not optimize for fairness.
So this paper focuses on solving fairness in the MTL setting.
😎 Contribution
Defines new metrics (see the sketch after this list):
- ARFG: for each task, take the ratio of the MTL model's FPR gap to the STL model's FPR gap, then average over tasks
- ARE: for each task, take the ratio of the MTL model's error rate to the STL model's error rate, then average over tasks
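My reading of the two definitions above as a minimal Python sketch; the helper name and the example numbers are made up, not from the paper.

```python
import numpy as np

def average_relative_metric(mtl_values, stl_values):
    # Average over tasks of the per-task ratio (MTL value / STL value).
    # Used with per-task FPR gaps for ARFG and per-task error rates for ARE.
    mtl_values = np.asarray(mtl_values, dtype=float)
    stl_values = np.asarray(stl_values, dtype=float)
    return float(np.mean(mtl_values / stl_values))

# Hypothetical example: ARFG over 3 tasks (numbers are made up).
arfg = average_relative_metric(mtl_values=[0.04, 0.10, 0.02],   # MTL FPR gap per task
                               stl_values=[0.05, 0.08, 0.02])   # STL FPR gap per task
# arfg < 1 would mean MTL is, on average, fairer than the STL baselines.
```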
Baseline method: Per-Task Fairness Treatment
Same as standard MTL, except that a fairness loss is additionally optimized for each task.
How to measure the fairness loss?
- minimizing the correlation between group membership and the predictions over negative examples (for each task, make the scores on negative examples uncorrelated with the sensitive group)
- kernel-based distribution matching through Maximum Mean Discrepancy (MMD) over negative examples (for each task, minimize the MMD between the subgroups' score distributions on negative examples)
- minimizing the FPR gap directly in the loss (the ultimate goal)
Intuition: the closer the subgroups' score distributions on negative examples are for a task, the closer their FPRs will be (a minimal sketch of the MMD variant follows).
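To make the distribution-matching idea concrete, here is a minimal PyTorch sketch of an MMD-based fairness regularizer on negative examples. It assumes binary labels, a single binary sensitive attribute, 1-D score tensors, and an RBF kernel; the function names (`mmd_rbf`, `fairness_loss_mmd`) are mine, not the paper's.

```python
import torch

def mmd_rbf(x, y, sigma=1.0):
    # Biased estimate of the squared MMD between two 1-D score batches, RBF kernel.
    x, y = x.view(-1, 1), y.view(-1, 1)
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def fairness_loss_mmd(scores, labels, groups):
    # MMD between the two groups' predicted scores, computed on negative examples only.
    neg = labels == 0
    s0, s1 = scores[neg & (groups == 0)], scores[neg & (groups == 1)]
    if len(s0) == 0 or len(s1) == 0:          # a mini-batch may miss one group
        return scores.new_zeros(())
    return mmd_rbf(s0, s1)
```

In training, this term would be added to the usual task loss with a weight that controls the fairness-accuracy trade-off.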
Proposed method: Multi-Task-Aware Fairness Treatment (MTA-F)
Essentially the baseline method, but made aware of which parameters are shared and which are task-specific when applying the fairness loss.
Only the fairness treatment changes: for the shared layers we want to apply the fairness loss using data shared across tasks, while for the task-specific layers we only want to use task-specific data.
For example, using the data-partition table in the paper, suppose we look for task1's shared-negative-data and specific-negative-data:
- specific-negative-data: examples that are negative only for task1 and positive for every other task, i.e. cell E
- shared-negative-data: all examples negative for task1 minus the specific-negative-data, i.e. (EFGH − E) = FGH
With these two subsets we can run training: the fairness loss computed on the shared-negative-data updates the shared layers, and the one computed on the specific-negative-data updates task1's own layers (a small sketch of computing these subsets follows).
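A small sketch of how the two subsets could be computed from per-task binary labels; the function name and the dict-of-tensors layout are my assumptions.

```python
import torch

def negative_splits(labels, k):
    # labels: dict {task id: binary label tensor}; returns boolean masks for task k.
    neg_k = labels[k] == 0
    others_all_positive = torch.ones_like(neg_k, dtype=torch.bool)
    for j, y in labels.items():
        if j != k:
            others_all_positive &= (y == 1)
    specific_neg = neg_k & others_all_positive   # e.g. cell E for task1
    shared_neg = neg_k & ~specific_neg           # e.g. EFGH minus E = FGH
    return shared_neg, specific_neg
```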
The detailed algorithm is given in the paper.
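My rough sketch of one such training step, not the paper's pseudocode: it assumes the model's forward pass returns a dict of per-task logits, assumes hypothetical `shared_parameters()` / `head_parameters(k)` helpers, a `batch` dict with inputs, per-task labels, and a binary `group` tensor, and reuses `negative_splits` and `mmd_rbf` from the sketches above.

```python
import torch
import torch.nn.functional as F

def mta_f_step(model, optimizer, batch, lambda_fair=1.0):
    # One MTA-F-style step: task losses update everything; the fairness loss on
    # shared-negative data updates only the shared layers, and the fairness loss on
    # specific-negative data updates only that task's head.
    scores = model(batch["x"])                      # dict: task id -> logits
    optimizer.zero_grad()

    task_loss = sum(F.binary_cross_entropy_with_logits(scores[k], batch["y"][k].float())
                    for k in scores)
    task_loss.backward(retain_graph=True)

    for k in scores:
        shared_neg, specific_neg = negative_splits(batch["y"], k)
        routes = [(shared_neg, list(model.shared_parameters())),
                  (specific_neg, list(model.head_parameters(k)))]
        for mask, params in routes:
            if mask.sum() < 2:
                continue
            g = batch["group"][mask]
            s0, s1 = scores[k][mask][g == 0], scores[k][mask][g == 1]
            if len(s0) == 0 or len(s1) == 0:
                continue
            fair = mmd_rbf(s0, s1)                  # from the fairness-loss sketch
            grads = torch.autograd.grad(lambda_fair * fair, params,
                                        retain_graph=True, allow_unused=True)
            for p, grad in zip(params, grads):
                if grad is not None:
                    p.grad = grad if p.grad is None else p.grad + grad

    optimizer.step()
```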
😮 Background:
In the early days we trained a deep model for a single task, but business requirements and data considerations led us toward multi-task learning:
- business: given an image, we want to know not only whether the person is male or female but also whether they look happy
- data: we have task A and task B plus one large dataset in which some examples may lack labels for task A or for task B, so we may as well feed that data into one model and optimize task A and task B jointly
Fairness metric
Two commonly used families of metrics:
- all subgroups receive the same proportion of positive outcomes: demographic parity
- equal opportunity and equalized odds: equal TPR (true positive rate) and FPR (false positive rate) across subgroups; more practical, and the family this paper uses (via the FPR gap, sketched below)
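For reference, a tiny NumPy sketch of the per-task FPR gap between two subgroups; binary labels, binary predictions, and a binary sensitive attribute are assumed, and the helper name is mine.

```python
import numpy as np

def fpr_gap(y_true, y_pred, group):
    # Absolute difference in false positive rate between two subgroups.
    # y_true, y_pred, group: 1-D NumPy arrays with binary values.
    def fpr(mask):
        neg = (y_true == 0) & mask
        return float(np.mean(y_pred[neg] == 1)) if neg.any() else 0.0
    return abs(fpr(group == 0) - fpr(group == 1))
```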
Fairer representation learning
- learning representations that carry less information about sensitive-group membership, so that models built on top of them are fairer
Fairness mitigation
- mostly studied in the single-task learning setting: pre-processing the data or embeddings used by the downstream job, or post-processing the model's predictions
- intervening in the model training process is also popular, e.g., adding fairness constraints or regularization
Multi-task learning
MTL models share a subset of parameters across tasks (a minimal model sketch appears after the pros and cons below).
pros:
- exploits task relatedness via inductive bias: learning a shared representation across related tasks is beneficial for harder tasks or tasks with limited training examples
- forces tasks to share model capacity, which acts as regularization: less overfitting, better generalization
- provides a compact and efficient form of modeling that enables training and serving multiple prediction quantities in large-scale systems, i.e. a more scalable system at serving time
cons:
- joint training may help one task while hurting another (e.g., task1 improves but task2 degrades)
- existing work addresses this by reducing training conflicts between tasks and improving the Pareto frontier
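To make "shares a subset of parameters" concrete, here is a minimal shared-bottom model sketch; it is not the paper's architecture, and the helper methods simply match the names assumed in the MTA-F sketch above.

```python
import torch.nn as nn

class SharedBottomMTL(nn.Module):
    # A shared trunk plus one small head per task.
    def __init__(self, in_dim, hidden_dim, num_tasks):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, 1) for _ in range(num_tasks)])

    def forward(self, x):
        h = self.shared(x)
        return {k: head(h).squeeze(-1) for k, head in enumerate(self.heads)}

    # Helpers matching the names assumed in the MTA-F sketch above.
    def shared_parameters(self):
        return self.shared.parameters()

    def head_parameters(self, k):
        return self.heads[k].parameters()
```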
Fairness in multi-task learning
As MTL becomes more popular, more and more work studies fairness in MTL:
- fairness in multi-task regression models, using a rank-based non-parametric independence test to improve fairness in ranking
- multi-task learning enhanced with fairness constraints to jointly learn classifiers that leverage information between sensitive groups
Fairness-accuracy trade-off and Pareto fairness
Two kinds of trade-offs:
- the fairness-accuracy trade-off within a single task
- the accuracy trade-offs across tasks
Several key questions:
- How does inductive transfer in multi-task learning implicitly impact fairness?
- How do we measure the fairness-accuracy trade-off for multi-task learning?
- Can we achieve better Pareto efficiency in fairness across multiple tasks by exploiting the task relatedness and shared architecture that are specific to multi-task learning?
Thoughts
MTL is getting more and more popular and is used by many large companies, not only in NLP and CV: traditional recommender systems also consider MTL (click, comment, skip rate), as do news recommenders (engagement vs. freshness). From the business side there are many different objectives to achieve at once.
Fairness is also receiving more and more attention: we want the model to perform comparably across different subgroups so that it is not overly biased toward some groups.
Both trade-offs are challenging. The authors combine them into a new problem, then propose metrics and a method to address it. This is a modeling paper (it does not touch much on serving or production), and a nice template for learning how to frame modeling work.
Closing words
I got a lot out of this paper, mainly background on MTL and fairness, two areas that were new to me. The machine learning field evolves remarkably fast; every few years the world looks different. How to measure fairness is also a very interesting question in itself.
🙃 Other related blogs:
KDD 21':Learning to Embed Categorical Features without Embedding Tables for Recommendation (google)
KDD 19': Sampling-bias-corrected neural modeling for large corpus item recommendations
KDD 18': Notification Volume Control and Optimization System at Pinterest
CVPR19' Complete the Look: Scene-based Complementary Product Recommendation
NAACL’19: Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
NIPS’2017: Attention Is All You Need (Transformer)
KDD’19: Learning a Unified Embedding for Visual Search at Pinterest
BMVC19' Classification is a Strong Baseline for Deep Metric Learning
KDD’18: Graph Convolutional Neural Networks for Web-Scale Recommender Systems
🤩 Conference
ICCV: International Conference on Computer Vision
http://iccv2019.thecvf.com/submission/timeline
CVPR: Conference on Computer Vision and Pattern Recognition
KDD 2020
Top Conference Paper Challenge:
https://medium.com/@arthurlee_73761/top-conference-paper-challenge-2d7ca24115c6
My Website: