Private Shapley Value Computation
In Federated Learning, multiple parties train a single model together in a privacy-friendly way, i.e., their underlying datasets remain hidden from the other participants. Yet, much sensitive information can still be inferred from the individual model updates, so an additional privacy-preserving technique called Secure Aggregation is frequently used, which prevents the server from accessing individual-level information.
On the other hand, contribution-measuring techniques (such as the Shapley value) usually require access to exactly such individual-level information for accurate estimation. Hence, it is an open question to what extent the Shapley value can be approximated from aggregated information alone. More generally, what is the precise connection between the granularity of this information (i.e., the aggregation level) and the accuracy of the corresponding contribution estimates?
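To make the quantity concrete, a minimal sketch of exact Shapley value computation over a toy utility function is given below. The client names and the additive "data quality" utility are illustrative assumptions only; in the actual project, the utility would be the performance of a model trained on a coalition of clients, and the question is what happens when the estimator only sees aggregated updates.

```python
import itertools

def shapley_values(players, utility):
    """Exact Shapley values by averaging marginal contributions
    over all permutations of the players (feasible for small n only)."""
    players = list(players)
    values = {p: 0.0 for p in players}
    perms = list(itertools.permutations(players))
    for perm in perms:
        coalition = set()
        prev = utility(coalition)
        for p in perm:
            coalition.add(p)
            curr = utility(coalition)
            values[p] += curr - prev  # marginal contribution of p
            prev = curr
    return {p: v / len(perms) for p, v in values.items()}

# Hypothetical toy setting: each client's data quality adds linearly
# to model utility, so the Shapley value recovers each client's quality.
quality = {"A": 0.5, "B": 0.3, "C": 0.2}
util = lambda coalition: sum(quality[p] for p in coalition)
print(shapley_values(quality, util))
```

For this additive utility the Shapley value of each client equals its individual quality, and the values sum to the grand-coalition utility (the efficiency axiom), which makes the sketch easy to sanity-check before replacing the utility with real model evaluations.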
The student's task is to get familiar with Contribution Score Computation techniques and to measure empirically how the aggregation level affects the accuracy of the Shapley value approximation.
For similar topics, please visit https://crysys.hu/member/pejo#projects