AI performs just as well as human doctors in detecting breast cancer

OAK BROOK, Ill. – Artificial intelligence (AI) may soon assist in detecting breast cancer with the same accuracy as a human doctor. Researchers have demonstrated that AI performs comparably to medical professionals when interpreting screening mammograms.

The research, conducted in collaboration with the National Health Service’s Breast Cancer Screening Program (NHSBSP), found that the AI algorithm was marginally better than human professionals at detecting breast cancer in a set of 120 mammogram examinations. Scientists are optimistic that continued research will lead to AI becoming a standard tool in breast cancer screening to support doctors’ diagnoses.

Ideally, mammograms should be interpreted by two separate readers to minimize the chances of false-positive diagnoses. However, due to the scarcity of radiologists, this dual-reading approach is often not feasible. This British study, featured in the Radiological Society of North America (RSNA) journal Radiology, compared a commercially available AI algorithm to human evaluators from the NHS in interpreting mammograms.

The screening process involves a mammographer taking multiple X-rays of each breast to identify signs of breast cancer that might be too small to see or feel. Under the NHS Breast Screening Program, women typically receive their first invitation for mammographic screening between the ages of 50 and 53, with subsequent screenings every three years until age 70.

Nevertheless, screening doesn’t detect every case of breast cancer, and false-positive results can lead to unnecessary further imaging or even biopsies. Research from the University of Nottingham indicates that having two readers interpret each mammogram raises cancer detection rates by six to 15 percent while keeping recall rates low. However, such a strategy is labor-intensive and hard to sustain given a worldwide shortage of qualified readers.

The study, led by Professor Yan Chen, used tests from the Personal Performance in Mammographic Screening (PERFORMS) quality assurance assessment, a tool employed by the NHSBSP. Each PERFORMS test comprises 60 challenging mammogram exams from the NHSBSP, covering benign, normal, and abnormal findings. The scores of the NHS readers were then compared with the AI’s results.

The team drew on two consecutive PERFORMS test sets, 120 screening mammograms in total, and ran the AI algorithm on the same exams the human readers had assessed.

Upon comparing the AI’s test scores with those of 552 human evaluators — including 315 board-certified radiologists, 206 radiographers, and 31 breast clinicians — the team observed negligible differences in performance. The human readers achieved an average sensitivity of 90 percent and specificity of 76 percent, whereas the AI surpassed them slightly with 91 percent sensitivity and 77 percent specificity.
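For readers unfamiliar with the two measures, sensitivity is the share of actual cancer cases a reader correctly flags, and specificity is the share of cancer-free exams a reader correctly clears. The short Python sketch below shows how each is calculated. The class name, function names, and the counts in it are hypothetical and chosen only for illustration; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingCounts:
    """Outcome counts for one reader over a test set (hypothetical structure)."""
    true_positives: int   # cancers correctly flagged
    false_negatives: int  # cancers missed
    true_negatives: int   # cancer-free exams correctly cleared
    false_positives: int  # cancer-free exams wrongly flagged


def sensitivity(counts: ReadingCounts) -> float:
    """Fraction of actual cancer cases that were flagged."""
    return counts.true_positives / (counts.true_positives + counts.false_negatives)


def specificity(counts: ReadingCounts) -> float:
    """Fraction of cancer-free exams that were cleared."""
    return counts.true_negatives / (counts.true_negatives + counts.false_positives)


# Illustrative numbers only: NOT taken from the study.
example = ReadingCounts(
    true_positives=45, false_negatives=5,
    true_negatives=40, false_positives=10,
)

print(f"sensitivity: {sensitivity(example):.0%}")
print(f"specificity: {specificity(example):.0%}")
```

With these made-up counts, the reader flags 45 of 50 cancers and clears 40 of 50 cancer-free exams. A higher sensitivity means fewer missed cancers, while a higher specificity means fewer unnecessary recalls.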

“There is a lot of pressure to deploy AI quickly to solve these problems, but we need to get it right to protect women’s health,” says Yan Chen, Ph.D., professor of digital screening at the University of Nottingham, in a media release.

“The 552 readers in our study represent 68% of readers in the NHSBSP, so this provides a robust performance comparison between human readers and AI,” Prof. Chen continues.

“The results of this study provide strong supporting evidence that AI for breast cancer screening can perform as well as human readers.”

“It’s really important that human readers working in breast cancer screening demonstrate satisfactory performance,” the researcher adds. “The same will be true for AI once it enters clinical practice.”

Prof. Chen did warn, though, that further research is necessary before AI is introduced as a second reader in clinical breast cancer screenings, adding that performance can drift over time and algorithms can be affected by changes in the operating environment.

“I think it is too early to say precisely how we will ultimately use AI in breast screening,” the study author says. “The large prospective clinical trials that are ongoing will tell us more. But no matter how we use AI, the ability to provide ongoing performance monitoring will be crucial to its success.”

“It’s vital that imaging centers have a process in place to provide ongoing monitoring of AI once it becomes part of clinical practice,” Chen concludes. “There are no other studies to date that have compared such a large number of human reader performance in routine quality assurance test sets to AI, so this study may provide a model for assessing AI performance in a real-world setting.”

South West News Service writer James Gamble contributed to this report.
