AI Models Trained on Healthy Subjects Can Detect Disease Severity

NYU Center for Data Science
Aug 14, 2024

Artificial intelligence models trained exclusively on data from healthy individuals can accurately assess the severity of medical conditions they’ve never encountered. This surprising finding comes from a new study, “Quantifying impairment and disease severity using AI models trained on healthy subjects,” published in npj Digital Medicine by CDS PhD student Boyang Yu and her colleagues.

The researchers developed a novel framework called COnfidence-Based chaRacterization of Anomalies (COBRA) that leverages the decreased confidence of AI models when presented with data from impaired or diseased patients. This approach allowed them to quantify deviations from healthy baselines for two distinct medical conditions: stroke-induced upper body impairment and knee osteoarthritis.

“Our idea is to capture normal movement patterns in healthy people,” Yu explained. “When the model encounters an unfamiliar scenario, like a stroke patient accidentally spilling water, it produces a low confidence prediction. This sensitivity allows us to characterize the anomaly, which in this case is the impairment from the stroke.”
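The intuition Yu describes can be illustrated with a minimal sketch. It assumes, for illustration only, that confidence is measured as the model's highest softmax probability for each input and that a recording's score is the mean of those values. The function names and the toy logits are hypothetical and are not taken from the authors' released code, whose exact score definition may differ.

```python
import math


def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_score(logits_per_sample):
    """Mean top-class probability over all samples in one recording.

    Assumption: this stands in for a confidence-based anomaly score.
    A model trained only on healthy subjects should be less confident
    on inputs that deviate from healthy patterns, so a lower value
    suggests a larger deviation.
    """
    if not logits_per_sample:
        raise ValueError("need at least one sample")
    confidences = [max(softmax(logits)) for logits in logits_per_sample]
    return sum(confidences) / len(confidences)


# Toy logits (made up): sharply peaked outputs vs. nearly flat outputs.
healthy_like = [[6.0, 0.5, 0.2], [5.5, 0.1, 0.3]]
impaired_like = [[1.2, 1.0, 0.9], [0.8, 1.1, 1.0]]

print(confidence_score(healthy_like))
print(confidence_score(impaired_like))
```

In this toy case the peaked outputs yield a higher score than the flat ones, which mirrors the idea that unfamiliar inputs produce low-confidence predictions.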

For stroke patients, the team trained AI models to predict upper extremity motions from video and wearable sensor data collected from healthy individuals performing everyday tasks. When applied to data from stroke patients, the models’ confidence scores strongly correlated with the gold-standard Fugl-Meyer Assessment of motor impairment.
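One common way to check this kind of agreement is a correlation coefficient between the per-patient scores and the clinical assessment. The sketch below computes a Pearson correlation in plain Python. The choice of Pearson correlation is an assumption made here for illustration, and the numbers are made-up placeholders, not data or results from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical placeholder values, one pair per patient:
# a confidence-based score and a clinician-assigned motor score.
toy_confidence = [0.42, 0.55, 0.61, 0.74, 0.88]
toy_clinical = [18, 27, 35, 48, 62]

print(pearson(toy_confidence, toy_clinical))
```

A value near 1 would indicate that patients the model is more confident about also tend to receive higher clinical scores.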

Importantly, the COBRA score can be computed in under a minute without expert input, potentially enabling more frequent and accessible monitoring of patient recovery. This contrasts sharply with the Fugl-Meyer Assessment, which requires in-person administration by a trained clinician and takes 30–45 minutes to complete.

To demonstrate the framework’s versatility, the researchers also applied it to knee osteoarthritis severity assessment using magnetic resonance imaging (MRI) scans. They trained an AI model to segment different knee tissues in MRIs of healthy knees, then used it to analyze scans from patients with varying degrees of osteoarthritis. The resulting COBRA scores showed a strong correlation with independent clinical assessments of disease severity.

CDS Interim Director and Associate Professor of Mathematics and Data Science Carlos Fernandez-Granda, the study’s senior author, created a YouTube video explaining the paper’s findings. In it, he highlights the potential of this approach to enable automatic assessment of impairment and disease severity across a wide range of medical conditions.

The study’s implications extend beyond the specific conditions examined. “Even just gathering all the data from healthy people to create a large database can be very helpful,” Yu noted. “Once we know what’s normal, we can detect anomalies.”

This work opens up new possibilities for leveraging AI in healthcare, particularly in scenarios where large-scale databases of patients with varying degrees of impairment or disease severity are challenging to assemble. As the field progresses, population-based data collection efforts focused on healthy individuals could become increasingly valuable for developing powerful, unsupervised foundation models for healthcare applications.

The researchers have open-sourced their models, data, and code, paving the way for further exploration and potential clinical applications of this novel approach to disease assessment. The code and data can be found on the authors’ website for the project.

By Stephen Thomas
