Understanding Variation in Breast Cancer Screening Results: AI Vs. Radiologists
The question of whether AI can perform as well as humans, if not better, is often raised in public discussion, whether the topic is self-driving cars or strategies for policing social media. “Differences between human and machine perception in medical diagnosis”, led by CDS PhD student Taro Makino, attempts to answer a similar question for breast cancer screening. The paper observes that radiologists and AI produce different results when analyzing breast cancer medical images and proposes a way to understand this gap in outcomes. Additional collaborators on this project include former CDS-affiliated postdoctoral researcher Stanisław Jastrzębski, CDS Associate Professor of Computer Science and Data Science Kyunghyun Cho, CDS-affiliated professor Krzysztof Geras, and others.
The team’s work seeks to better understand why AI renders different results than human radiologists and how leveraging AI can potentially enhance cancer detection. Specifically, they analyze whether deep neural networks (DNNs) rely on different features than humans, who are less likely to make nonsensical mistakes because their techniques are rooted in medical science. Although DNNs show promise in image-based medical diagnosis, they cannot be “fully trusted since they can fail for reasons unrelated to underlying pathology.” To address this, the team proposes a framework for comparing human and machine perception in medical diagnosis. The framework compares the two in terms of perturbation robustness, meaning how much each one's predictions change when the images are altered, and it performs subgroup analysis to guard against Simpson’s paradox, in which a trend seen in pooled data reverses within subgroups. They demonstrate the framework by analyzing microcalcifications and soft tissue lesions separately. Their findings seem to indicate that, to detect soft tissue lesions, DNNs rely on high-frequency components that radiologists disregard. Moreover, these spurious features tend to be located outside of the image area that radiologists find most concerning.
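To see why the subgroup analysis matters, here is a minimal Python sketch of Simpson’s paradox. The counts are hypothetical and chosen only to produce the effect; they are not taken from the paper. The names `reader_a`, `reader_b`, `subgroup_accuracy`, and `pooled_accuracy` are likewise illustrative assumptions, not part of the authors’ framework.

```python
from fractions import Fraction

# Hypothetical (correct, total) counts per reader and lesion subgroup.
# These numbers are made up for illustration; they are NOT from the paper.
counts = {
    "reader_a": {"microcalcifications": (81, 87), "soft_tissue_lesions": (192, 263)},
    "reader_b": {"microcalcifications": (234, 270), "soft_tissue_lesions": (55, 80)},
}
subgroups = ["microcalcifications", "soft_tissue_lesions"]


def subgroup_accuracy(reader, subgroup):
    """Accuracy of one reader within a single lesion subgroup."""
    correct, total = counts[reader][subgroup]
    return Fraction(correct, total)


def pooled_accuracy(reader):
    """Accuracy of one reader after pooling all subgroups together."""
    correct = sum(c for c, _ in counts[reader].values())
    total = sum(t for _, t in counts[reader].values())
    return Fraction(correct, total)


# Compare the two readers within each subgroup, then on the pooled data.
a_wins_every_subgroup = all(
    subgroup_accuracy("reader_a", s) > subgroup_accuracy("reader_b", s)
    for s in subgroups
)
b_wins_pooled = pooled_accuracy("reader_b") > pooled_accuracy("reader_a")

for s in subgroups:
    print(s, float(subgroup_accuracy("reader_a", s)), float(subgroup_accuracy("reader_b", s)))
print("pooled", float(pooled_accuracy("reader_a")), float(pooled_accuracy("reader_b")))
print("A wins every subgroup:", a_wins_every_subgroup)
print("B wins on pooled data:", b_wins_pooled)
```

In this toy setup `reader_a` is more accurate within each lesion type, yet `reader_b` looks better once the two types are pooled, because the two readers saw different mixes of cases. Comparing within subgroups is one way to avoid drawing the wrong conclusion from pooled figures.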
The project has recently received attention, including features in NYU News and DotMed News. It was supported by grants from the National Science Foundation (HDR-1922658) and the National Institutes of Health (P41EB017183, R21CA225175).
To read the paper in its entirety, please visit the “Differences between…” Nature profile page.
By Ashley C. McDonald