Robust Federated Learning
In a classical machine learning scenario, data (e.g., text) is fed into a model (e.g., a next-word predictor) for training. In general, the larger the dataset, the better the predictive power of the trained model. However, in many real-world applications no central dataset is available for training; instead, smaller pieces are scattered among various entities (e.g., mobile phones). A model trained on a single small local dataset might be biased or have low accuracy. Hence, it is desirable to train a joint model collaboratively.
Due to various privacy regulations, directly sharing sensitive data (e.g., text messages) is often not feasible. Federated Learning tackles this distributed setup: multiple entities (e.g., mobile devices) jointly train a single machine learning model (e.g., a next-word predictor) on the union of their datasets, while each local dataset stays with its owner and is never shared with other participants.
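To make the setup concrete, the following is a minimal sketch of one aggregation round, assuming the common weighted-averaging scheme (often called Federated Averaging): each participant trains locally and sends only its model parameters, and a server averages them weighted by local dataset size. The function name `federated_average`, the flat list representation of the parameters, and the example numbers are illustrative assumptions, not part of the project description.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Hypothetical helper: models are represented as flat lists of floats.
    Only these parameters reach the server; raw training data does not.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all updates must have the same length")
        for i, value in enumerate(weights):
            # Each client contributes proportionally to its data volume.
            aggregated[i] += value * size / total
    return aggregated


# Two clients: the second holds three times as much data as the first.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this sketch the global model ends up closer to the second client's parameters because that client holds more data. A plain average like this gives every participant direct influence on the result, regardless of whether its update is honest.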
However, some participants could be malicious. Their goals can range from degrading the model's performance to increasing or decreasing the occurrence of chosen strings (e.g., Adidas) in its predictions. Several techniques exist to mitigate the effect of malicious participants in Federated Learning; some operate in the client selection phase, while others operate in the aggregation phase.
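As an illustration of an aggregation-phase defense, the sketch below contrasts two well-known robust statistics, the coordinate-wise median and the coordinate-wise trimmed mean, which limit how far a few extreme updates can pull the result. These are examples of the general idea only, not necessarily the techniques the project will settle on; the function names, the flat list representation of updates, and the sample values are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence


def coordinate_median(updates: Sequence[Sequence[float]]) -> List[float]:
    """Take the median of every parameter across all client updates."""
    if not updates:
        raise ValueError("need at least one update")
    # zip(*updates) groups the i-th parameter of every client together.
    return [median(column) for column in zip(*updates)]


def trimmed_mean(updates: Sequence[Sequence[float]], trim: int) -> List[float]:
    """Drop the `trim` smallest and largest values per parameter, then average."""
    n = len(updates)
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < number of updates")
    result = []
    for column in zip(*updates):
        kept = sorted(column)[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


# Three honest clients and one (hypothetical) attacker sending extreme values.
updates = [
    [1.0, 1.0],
    [1.2, 0.8],
    [0.9, 1.1],
    [100.0, -100.0],  # malicious update
]
robust_median = coordinate_median(updates)
robust_trimmed = trimmed_mean(updates, trim=1)
naive_mean = [sum(column) / len(column) for column in zip(*updates)]
```

Here the naive mean is dragged far away from the honest updates by the single outlier, while both robust aggregates stay close to them. Such rules assume that malicious participants are a minority, and they can discard useful information when honest clients hold very different data, which is one reason combining several defenses is worth studying.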
The student's task is to explore these techniques, experiment with them, and propose an optimal solution that combines several of them.
For similar topics, please visit https://crysys.hu/member/pejo#projects