Medical AI Works Better When It’s Simpler, Study Shows

Sep 3, 2025

Most artificial intelligence researchers assume that more sophisticated models perform better than simple ones, especially when adapting systems designed for everyday photographs to medical imaging. CDS PhD student Yanqi Xu and her collaborators discovered the opposite: simplified AI architectures consistently outperform their complex counterparts when detecting abnormalities in medical scans.

Xu’s team studied Detection Transformer (DETR), a popular AI model originally designed to identify objects in natural images such as everyday photographs. The model uses complex engineering strategies (multiple feature extraction layers, multi-scale feature analysis, and iterative refinement during decoding) to handle the challenging characteristics of natural scenes: overlapping objects, varying sizes, different viewing angles, and cluttered backgrounds.

Medical images present entirely different challenges. X-rays, CT scans, and mammograms are captured under standardized protocols, resulting in consistent anatomical positioning and minimal background variation. Objects of interest, such as tumors or lesions, are typically small and few in number, but distinguishing between normal and abnormal tissue requires detecting subtle differences.

“The motivation was to take a step back before we blindly apply models that have been optimized for general settings and think about whether these are really the optimal settings for medical images,” Xu said.

The research, published in Machine Learning for Biomedical Imaging under the title “Understanding differences in applying DETR to natural and medical images,” tested simplified DETR configurations on two medical datasets: the NYU Breast Cancer Screening Dataset for mammography and LUNA16 for lung nodule detection in chest CT scans. Xu collaborated with Krzysztof J. Geras, CDS Visiting Scholar and Adjunct Assistant Professor in the Department of Radiology; Yiqiu Shen, NYU Grossman School of Medicine Assistant Professor and CDS PhD alumnus; Carlos Fernandez-Granda, CDS Associate Professor; and Laura Heacock, Grossman Associate Professor. Xu was also invited to present this work at the Medical Imaging with Deep Learning (MIDL) 2025 conference.

The team systematically reduced the complexity of standard DETR models by using fewer processing layers, replacing multi-scale fusion with single-scale feature maps, and eliminating sophisticated decoding enhancements. These simplified versions achieved comparable or better detection performance while reducing computational requirements by up to 40%.
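As a rough illustration of what this kind of simplification can look like in practice, the sketch below models the three changes as fields on a configuration object. It is not the study's code: `DetrConfig`, `simplify`, every field name, and every default value are hypothetical placeholders chosen for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DetrConfig:
    """Hypothetical DETR-style settings; names and defaults are illustrative only."""
    num_encoder_layers: int = 6        # depth of the feature-processing stack
    num_feature_levels: int = 4        # more than 1 means multi-scale feature fusion
    iterative_refinement: bool = True  # decoder refines its box predictions layer by layer


def simplify(cfg: DetrConfig, encoder_layers: int = 3) -> DetrConfig:
    """Return a reduced-complexity copy of cfg.

    Applies the three simplifications described in the article: fewer
    processing layers, a single-scale feature map, and no iterative
    refinement. The target depth of 3 is an assumption, not a reported value.
    """
    return replace(
        cfg,
        num_encoder_layers=min(cfg.num_encoder_layers, encoder_layers),
        num_feature_levels=1,
        iterative_refinement=False,
    )


if __name__ == "__main__":
    baseline = DetrConfig()
    print("baseline:  ", baseline)
    print("simplified:", simplify(baseline))
```

Keeping the baseline immutable and deriving the simplified variant from it makes it easy to train both under otherwise identical settings and compare them directly.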

The key insight emerged from understanding what makes medical imaging fundamentally different from natural image analysis. In natural photographs, localization — finding where objects are — poses the primary challenge due to diverse object sizes, overlapping elements, and varying perspectives. Medical images make localization relatively straightforward due to standardized positioning and consistent scales.

“For medical images, it’s the opposite,” Xu explained. “Localizing the object is easy. However, the difficulty is determining whether this object is abnormal or not — that’s the more difficult thing to do.”

The simplified models maintained detection performance within 1% of complex versions while substantially reducing training time and computational overhead. For the mammography dataset, the team achieved improvements of up to 9.8% in detection metrics when using appropriately simplified architectures.

The work has broader implications for medical AI development. Rather than automatically applying complex models designed for natural images, researchers should consider the specific characteristics of medical data: high-resolution images, standardized acquisition protocols, small regions of interest, and the fact that classification, not localization, is the harder problem.

The findings suggest that effective medical AI systems may require fundamentally different architectural approaches than those optimized for general computer vision tasks. This research provides a framework for developing more efficient and accurate medical imaging AI by aligning model complexity with task-specific requirements.

By Stephen Thomas

NYU Center for Data Science

Written by NYU Center for Data Science

Official account of the Center for Data Science at NYU, home of the Undergraduate, Master’s, and Ph.D. programs in Data Science.