Towards Mitigating Spurious Correlations in Image Classifiers with Simple Yes-no Feedback

Poster Paper, Artificial Intelligence & Human-Computer Interaction Workshop at ICML (AI&HCI), 2023

Seongmin Lee

Ali Payani

Duen Horng (Polo) Chau

Abstract

Modern deep learning models have achieved remarkable performance. However, they often rely on spurious correlations between data and labels that exist only in the training data, resulting in poor generalization performance. We present CRAYON (Correlation Rectification Algorithms by Yes Or No), a set of effective, scalable, and practical solutions for refining models with spurious correlations using simple yes-no feedback on model interpretations. CRAYON addresses key limitations of existing approaches, which rely heavily on costly human intervention. It empowers popular model interpretation techniques to mitigate spurious correlations in two distinct ways: CRAYON-ATTENTION guides saliency maps to focus on relevant image regions, and CRAYON-PRUNING prunes irrelevant neurons to remove their influence. Extensive evaluation on three benchmark image datasets, with comparison against three state-of-the-art methods, demonstrates that our methods effectively mitigate spurious correlations, achieving comparable or even better performance than existing approaches that require more complex feedback.
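To make the pruning idea concrete, the sketch below shows one simple way yes-no feedback could be turned into neuron pruning: neurons answered "no" have their outgoing classifier weights zeroed. This is an illustration only, not the CRAYON implementation, which this page does not describe. The function name `prune_neurons`, the choice to prune by masking the final-layer weights, and the toy numbers are all assumptions.

```python
import numpy as np

def prune_neurons(weights, relevant):
    """Zero the classifier weights attached to neurons marked irrelevant.

    weights:  (num_classes, num_neurons) final-layer weight matrix.
    relevant: one boolean per neuron; True means "yes, relevant".
    Hypothetical helper for illustration, not part of CRAYON.
    """
    mask = np.asarray(relevant, dtype=bool)
    if mask.shape != (weights.shape[1],):
        raise ValueError("one yes/no answer is required per neuron")
    # Broadcasting multiplies every row by the 0/1 mask, so columns for
    # "no" neurons become zero and stop contributing to any class logit.
    return weights * mask

# Toy example (made-up values): 2 classes, 4 penultimate-layer neurons.
weights = np.array([[0.5, -1.0, 2.0, 0.3],
                    [1.5, 0.7, -0.4, 0.9]])
feedback = [True, False, True, False]  # yes, no, yes, no

pruned = prune_neurons(weights, feedback)

# Neurons 1 and 3 fire strongly here, but after pruning they no longer
# influence the logits.
features = np.array([1.0, 10.0, 1.0, 10.0])
logits = pruned @ features
```

Masking weights rather than deleting neurons keeps the layer shapes unchanged, so the pruned matrix can replace the original without touching the rest of the model.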

BibTeX

@inproceedings{lee2023towards,
  title={Towards Mitigating Spurious Correlations in Image Classifiers with Simple Yes-no Feedback},
  author={Lee, Seongmin and Payani, Ali and Chau, Duen Horng (Polo)},
  booktitle={AI \& HCI Workshop at the 40th International Conference on Machine Learning (ICML)},
  year={2023},
}