Trade-offs in Machine Learning

Dr. Pejó Balázs

There are several open problems in ML. A handful of these concern privacy (such as protecting the data used for training) and game theory (such as setting parameters for rational agents). Within this project, the student will become familiar with ML techniques and, depending on the topic (should you choose to accept it), with either privacy-preserving mechanisms or game-theoretic models.

  • Improving Machine Learning by Preclassification: Machine Learning (ML) algorithms perform better on bigger datasets, so using more data is generally a good idea. On the other hand, not all data is created equal: could the model's accuracy be improved by carefully selecting different training data for each phase of the learning?
  • Testing Data Inference: For every ML model, the underlying data is separated into a training set and a test set. While Membership Inference aims to determine whether a particular data point was part of the training set, there are currently no known techniques to indicate whether a data point was in the test set. Is this even possible?
  • Privacy-Security-Accuracy Triangle: There is a clear connection between privacy and accuracy within ML. However, more privacy (e.g., added noise) could also decrease the robustness of the model, making it easier to fool (e.g., via misclassification). Could this trade-off be measured and, based on some incentives, optimized?
  • Privacy-Honesty-Accuracy Trade-off: Privacy protection has an explicit effect on accuracy (e.g., more noise yields a less accurate model). With more privacy also comes a higher chance of cheating, since the actual contribution is increasingly hidden; hence there is an implicit effect as well. How could this relationship be modeled?
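
To make the second topic concrete, below is a minimal sketch of the classic loss-threshold membership-inference attack on the training set. Everything here is an illustrative assumption, not part of the project specification: the two-Gaussian synthetic task, the 1-NN "model" (which memorises its training data, the behaviour such attacks exploit), and the threshold `tau`.

```python
import random

random.seed(0)

# Hypothetical synthetic task: two 1-D Gaussian classes.
def sample(n):
    return [(random.gauss(0, 1), 0) for _ in range(n)] + \
           [(random.gauss(2, 1), 1) for _ in range(n)]

train = sample(50)
test = sample(50)

def loss(point, label):
    # Loss of a 1-NN model: distance to the nearest training example of
    # the same class -- exactly zero for every memorised training point.
    return min(abs(point - x) for x, y in train if y == label)

# Attack: guess "member of the training set" when the loss is tiny.
tau = 1e-9
guesses = [loss(x, y) <= tau for x, y in train + test]
truth = [True] * len(train) + [False] * len(test)
attack_accuracy = sum(g == t for g, t in zip(guesses, truth)) / len(truth)
print(attack_accuracy)
```

The research question in the bullet is whether an analogous signal exists for *test-set* membership, where no such loss gap is available by construction.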
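
The explicit privacy-accuracy effect mentioned in the last two topics can be demonstrated in a few lines. The sketch below is only an assumed toy setting: a nearest-centroid classifier on two 1-D Gaussian classes, whose centroids are released with Laplace noise as a stand-in for a differential-privacy-style mechanism; the noise scales are arbitrary.

```python
import math
import random

random.seed(1)

# Hypothetical setting: two 1-D Gaussian classes, nearest-centroid model.
train = [(random.gauss(0, 1), 0) for _ in range(200)] + \
        [(random.gauss(2, 1), 1) for _ in range(200)]
test = [(random.gauss(0, 1), 0) for _ in range(200)] + \
       [(random.gauss(2, 1), 1) for _ in range(200)]

def centroid(label):
    xs = [x for x, y in train if y == label]
    return sum(xs) / len(xs)

def laplace(scale):
    # Inverse-CDF sampling from Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def accuracy(noise_scale, trials=50):
    # Perturb the released centroids, then measure test accuracy.
    correct = 0
    for _ in range(trials):
        c0 = centroid(0) + laplace(noise_scale)
        c1 = centroid(1) + laplace(noise_scale)
        for x, y in test:
            pred = 0 if abs(x - c0) < abs(x - c1) else 1
            correct += (pred == y)
    return correct / (trials * len(test))

# More noise (more privacy) should cost accuracy.
acc_clean = accuracy(0.0)
acc_noisy = accuracy(3.0)
print(acc_clean, acc_noisy)
```

Measuring the remaining two edges of the triangle, robustness against fooling and the incentive to cheat, and optimizing over all of them together, is exactly the open part of these topics.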