Measuring Robustness through Contribution
This thesis explores the role of data quality in enhancing the safety, trustworthiness, and robustness of machine learning models. Unlike traditional approaches that prioritize accuracy, this research focuses on identifying and quantifying the importance of individual data points to model robustness, particularly against naturally occurring noise. The study will develop and evaluate techniques for assessing how specific samples contribute to model stability and will investigate how the samples that improve robustness differ from those that optimize accuracy. The proposed methods will be validated through extensive experimentation across multiple datasets and model architectures, with the goal of guiding data selection and curation strategies for more reliable AI systems.
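As a concrete illustration of the kind of per-sample contribution measure under study, the sketch below scores each training point by leave-one-out retraining and evaluates every model on a validation set perturbed with Gaussian input noise, used here as a stand-in for naturally occurring noise. This is a minimal sketch under assumed choices, not the thesis's proposed method: the model (logistic regression), the data (synthetic), and the helper names robust_accuracy and loo_robustness_contribution are all illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def robust_accuracy(model, X_val, y_val, noise_scale=0.3, n_draws=10):
    # Average accuracy under repeated Gaussian input perturbations,
    # an illustrative proxy for performance under natural noise.
    scores = []
    for _ in range(n_draws):
        X_noisy = X_val + rng.normal(scale=noise_scale, size=X_val.shape)
        scores.append(model.score(X_noisy, y_val))
    return float(np.mean(scores))

def loo_robustness_contribution(X_train, y_train, X_val, y_val):
    # A point's contribution = drop in noise-perturbed accuracy when it is
    # removed from the training set and the model is retrained.
    base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    base_score = robust_accuracy(base, X_val, y_val)
    contributions = np.empty(len(X_train))
    for i in range(len(X_train)):
        mask = np.arange(len(X_train)) != i
        model = LogisticRegression(max_iter=1000).fit(X_train[mask], y_train[mask])
        contributions[i] = base_score - robust_accuracy(model, X_val, y_val)
    return contributions

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.5, random_state=0)
scores = loo_robustness_contribution(X_tr, y_tr, X_val, y_val)
print("most robustness-critical training samples:", np.argsort(scores)[-5:])

Explicit leave-one-out retraining requires one model fit per training point, so at realistic scale the loop would be replaced by cheaper approximations; the point here is only the shape of the measure: a sample's contribution is the change in noise-perturbed performance when that sample is removed.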