Skoltech researchers have trained a neural network to search for lung pathologies in X-ray images and generate brief verbal descriptions to accompany them. Such captions are currently written by the physician, a task that tends to take several minutes. According to the creators of the artificial intelligence solution, it cuts this time to about 30 seconds in cases where no considerable revision of the text is required. In most cases the radiologist merely has to confirm the suggested diagnosis — e.g., fibrosis, an enlarged heart, or a suspected malignant tumor — or the absence thereof. The study came out in the Nature Portfolio journal Scientific Reports.
The solution relies on modern machine vision and computational linguistics models, including GPT-3 small — the predecessor of the wildly popular GPT-3.5 and GPT-4 models available via the ChatGPT bot.
“Regular models merely classify, but our neural network leverages advanced machine vision and computational linguistics models to automatically describe X-ray images in words,” one of its creators, Skoltech Research Scientist Oleg Rogov, commented.
The neural network is trained on data composed of image-text pairs. “We compiled our own radiological dictionary to make the model more accurate, specifically where radiological terms and their usage in texts are concerned. Naturally, we also put together a large integrated database of X-ray images for use as training data,” Rogov added, emphasizing that the neural network is only “aware” of those diagnoses that can actually manifest themselves on lung X-rays. The training set was balanced in terms of which diseases are represented.
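The balancing step mentioned above can be illustrated with a minimal pure-Python sketch. The diagnosis labels, image identifiers, and downsampling strategy below are hypothetical stand-ins for illustration only; the article does not specify how the researchers actually balanced their dataset:

```python
import random
from collections import defaultdict

def balance_by_diagnosis(pairs, seed=0):
    """Downsample image-caption pairs so every diagnosis label
    is represented by the same number of examples."""
    by_label = defaultdict(list)
    for image_id, caption, label in pairs:
        by_label[label].append((image_id, caption, label))
    # Match the size of the rarest class.
    target = min(len(items) for items in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced

# Toy image-text pairs with made-up labels: "fibrosis" is over-represented.
pairs = (
    [(f"img{i}", "Signs of fibrosis.", "fibrosis") for i in range(6)]
    + [(f"img{i}", "Enlarged heart shadow.", "cardiomegaly") for i in range(6, 9)]
    + [(f"img{i}", "No pathology detected.", "normal") for i in range(9, 12)]
)
balanced = balance_by_diagnosis(pairs)  # 3 examples per label
```

Downsampling to the rarest class is only one possible choice; oversampling rare classes or reweighting the loss are common alternatives.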
Possibilities for further development of the system include its application to MRI and CT scans, as well as incorporating active learning. The latter refers to models improving their predictions by taking into account what edits human reviewers make. The solution could also be combined with another neural network, which would graphically highlight the areas of interest mentioned in the caption.
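The active-learning idea described above can be sketched as a simple review loop: the model proposes a caption, the radiologist confirms or edits it, and edited pairs are queued as fresh training data. Everything below — the function names, the toy "model" that labels all images as normal, and the "reviewer" — is a hypothetical illustration, not the researchers' actual system:

```python
def active_learning_round(images, suggest, review, retrain_queue):
    """One review round: the model suggests a caption for each image;
    the human reviewer either confirms it or returns an edited version.
    Edited (image, caption) pairs are queued for retraining."""
    confirmed = 0
    for image in images:
        draft = suggest(image)
        final = review(image, draft)
        if final == draft:
            confirmed += 1
        else:
            retrain_queue.append((image, final))
    return confirmed

# Toy stand-ins: the model captions everything as normal;
# the reviewer corrects one image known to show fibrosis.
def suggest(image):
    return "No pathology detected."

def review(image, draft):
    return "Signs of fibrosis." if image == "xr2" else draft

queue = []
confirmed = active_learning_round(["xr1", "xr2", "xr3"], suggest, review, queue)
# confirmed == 2; queue now holds the corrected pair for "xr2"
```

In a real deployment, the queued corrections would periodically be used to fine-tune the captioning model, so the system improves from exactly the edits reviewers already make as part of their workflow.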