Don’t be fooled: Avoiding misleading conclusions in machine learning

NYU Center for Data Science
May 10, 2023

A paper by a team of NYU researchers introducing new “distribution-aware” methods was named a notable paper at AISTATS

Feature attribution methods are used in machine learning to help identify which features have the largest impact on a label. While these methods are insightful, a team of NYU researchers demonstrated in a recently published paper that popular “class-dependent” feature attribution methods, such as SHAP, LIME, and Grad-CAM, can sometimes be misleading, making a specific class appear more likely than it actually is. To avoid this pitfall, the team also developed new “distribution-aware” methods.

The paper, “Don’t be fooled: label leakage in explanation methods and the importance of their quantitative evaluation,” was authored by Neil Jethani, an MD/PhD candidate at the NYU Grossman School of Medicine; Adriel Saporta, a PhD candidate at the Courant Institute; and CDS Assistant Professor Rajesh Ranganath. It was recently named a notable paper at the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023, held in person in Valencia, Spain, from April 25 through April 27.

In the paper, the authors encourage the use of distribution-aware methods, which focus on the full distribution of the label across all classes rather than on a single class. They present two baseline distribution-aware methods, SHAP-KL and FastSHAP-KL, which they demonstrate are more effective than other feature attribution methods on three clinical datasets (images, biosignals, and text).
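
To make the distinction concrete, here is a minimal sketch in Python. It is not the authors' implementation: this article does not spell out how SHAP-KL scores features, so the sketch assumes, based on the “KL” in the name, that a distribution-aware score compares the model's full predictive distribution with the distribution it produces from a subset of features, using the Kullback-Leibler divergence. The function names and the probability values are hypothetical and chosen only for illustration.

```python
import numpy as np

def class_dependent_value(p_subset, target_class):
    """Score a feature subset by the probability it gives to ONE chosen class."""
    return float(p_subset[target_class])

def distribution_aware_value(p_full, p_subset, eps=1e-12):
    """Score a feature subset by how closely it preserves the FULL predictive
    distribution: negative KL(p_full || p_subset). Assumed form, see lead-in."""
    p = np.clip(np.asarray(p_full, dtype=float), eps, 1.0)
    q = np.clip(np.asarray(p_subset, dtype=float), eps, 1.0)
    return float(-np.sum(p * np.log(p / q)))

# Hypothetical model outputs over three classes.
p_full = np.array([0.40, 0.35, 0.25])      # prediction from all features
p_subset_a = np.array([0.90, 0.05, 0.05])  # subset A inflates class 0
p_subset_b = np.array([0.42, 0.33, 0.25])  # subset B stays close to p_full

# Class-dependent view for class 0: subset A looks like the better explanation.
cd_a = class_dependent_value(p_subset_a, target_class=0)
cd_b = class_dependent_value(p_subset_b, target_class=0)

# Distribution-aware view: subset B is preferred because it keeps the
# model's overall uncertainty intact instead of overstating class 0.
da_a = distribution_aware_value(p_full, p_subset_a)
da_b = distribution_aware_value(p_full, p_subset_b)

print(f"class-dependent:    A={cd_a:.2f}  B={cd_b:.2f}")
print(f"distribution-aware: A={da_a:.3f}  B={da_b:.3f}")
```

In this toy setup, the class-dependent score favors subset A even though the full model is far from certain about class 0, which mirrors the kind of overstatement described above. The distribution-aware score instead favors subset B, whose prediction stays close to what the model outputs with all features available.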

The authors note that despite the possibility of leading to wrong conclusions, class-dependent feature attribution methods are often used in high-stakes contexts such as healthcare.

“Explaining the complex relationships hidden in healthcare data can help make medical discoveries, but healthcare practitioners need to be careful that they understand what an explanation is actually telling them about their data or model,” said Neil Jethani. “Our work shows that the most popular explanation methods can result in potentially harmful confirmation bias.”

This work represents an important step towards improving the quality of explainability methods for AI.

By Meryl Phair


