Training machine learning models to benefit the low vision community
A capstone paper by CDS students exploring how AI models can better recognize objects to aid those with blindness was accepted to the 2023 IEEE-EMBC Conference
Artificial intelligence has shown significant potential for developing technologies that benefit people with blindness or low vision. However, object detection models are typically trained on generic data rather than on datasets tailored to the needs of people with blindness. A team of NYU researchers including CDS master's students Tharangini Sankarnarayanan, Lev Paciorkowski, and Khevna Parikh tackled this issue for their capstone project, developing a dataset of objects regularly encountered by those with low vision.
The paper “Training AI to Recognize Objects of Interest to the Blind and Low Vision Community” is scheduled to be published in PubMed and was accepted to the 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEEE-EMBC), held in Sydney from July 24 through 27.
Co-authors on the work include Giles Hamilton-Fletcher, Postdoctoral Fellow at NYU Grossman School of Medicine; Chen Feng, Assistant Professor at NYU Tandon; Diwei Sheng, master's student at NYU Tandon; Todd E. Hudson, Research Assistant Professor at NYU Grossman; John-Ross Rizzo, Ilse Melamid Associate Professor of Rehabilitation Medicine at NYU Grossman; and Kevin C. Chan, Assistant Professor and Director of the Neuroimaging and Visual Science Laboratory at NYU Grossman. The work was additionally supported by the U.S. Department of Defense Vision Research Program and a grant from Research to Prevent Blindness to the NYU Langone Health Department of Ophthalmology.
Drawing on user-centric feedback, the researchers identified thirty-five objects essential to people with blindness. They gathered images of the objects from publicly available datasets and trained a YOLOv5x model to recognize the selected items. When they ran the model, they found it was significantly better at identifying objects such as coffee mugs, knives, forks, and glasses than previous models. The researchers also found that balancing the number of each object type in the training dataset improved the model's ability to detect objects as well as its speed.
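The balance finding can be illustrated with a short sketch. The label list, class names, and helper function below are hypothetical examples, not drawn from the paper's dataset; the idea is simply that one can measure how evenly object classes are represented in a set of annotations before training.

```python
from collections import Counter

def class_balance(labels):
    """Return each class's share of the annotations, to spot imbalance."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical annotation labels for a handful of training images.
annotations = [
    "coffee mug", "knife", "fork", "glasses",
    "coffee mug", "knife", "coffee mug",
]

shares = class_balance(annotations)
for cls, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{cls}: {share:.0%}")
```

A skewed distribution in such a report would suggest collecting more images of the under-represented classes before training.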
“It’s exciting to see the rapid advances in this kind of computer vision technology and its potential for the blind community,” said the CDS authors. “The biggest challenge currently seems to be in acquiring high-quality training datasets that accurately represent the real-world environments that such a model would be deployed into. Going forward we think more attention should be paid to which training images are used and where they come from.”
Along with improving available data for the development of machine learning models that aid those with blindness, the research demonstrates the importance of curating training data for assistive technologies that cater to individual users’ needs.
By Meryl Phair