Testing Data Inference
In machine learning, the underlying data is typically split into a training set and a test set, where the model is trained on the former so that it makes accurate predictions on unseen samples. Yet, besides general patterns, the model may also memorize explicit information about specific training data points, which can lead to privacy leakage. For instance, it is possible to determine whether a particular sample was used for training via a Membership Inference Attack. Such an attack exploits the confidence values of the prediction: a model tends to predict more confidently on samples it was trained on than on samples it has never seen before.
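To make the idea concrete, below is a minimal sketch of the simplest form of this attack, confidence thresholding: a sample is guessed to be a training-set member when the model's top prediction confidence exceeds a threshold. The function name and the threshold value are illustrative assumptions, not part of any specific published attack; state-of-the-art variants (e.g. shadow-model attacks) train an auxiliary classifier on such confidence vectors instead of using a fixed cutoff.

```python
def predict_membership(confidences, threshold=0.9):
    """Guess membership from a model's top prediction confidences.

    A sample is flagged as a likely training-set member when the model's
    confidence on it is at least `threshold` (an illustrative value here;
    in practice it would be calibrated, e.g. on shadow models).
    """
    return [conf >= threshold for conf in confidences]

# Toy example: the model is very confident on the first two samples
# (likely members) and noticeably less sure on the third (likely a
# non-member).
guesses = predict_membership([0.99, 0.95, 0.55])
```

In this toy run, `guesses` would be `[True, True, False]`. The key design point is that the attacker needs only black-box access to the model's output confidences, not its parameters.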
The student's task is to study state-of-the-art privacy attacks, which predominantly target the training data. Inspired by these, the goal is to adapt existing techniques, or devise novel ones, to infer details about the dataset used for testing (if possible).
For similar topics, please visit https://crysys.hu/member/pejo#projects