
AI in Radiology: Transforming Ovarian Tumor Diagnosis with Ultrasound

10/22/2025

A multicenter study published in Insights into Imaging reports that deep learning (DL) models based on ultrasound (US) images can improve the diagnostic accuracy and consistency of ovarian tumor classification among primary care radiologists.

Ovarian tumors, the most common neoplasms of the female reproductive system, are clinically categorized as benign, borderline, or malignant. Accurate preoperative classification is essential for determining appropriate treatment. Existing diagnostic systems such as the Ovarian-Adnexal Reporting and Data System (O-RADS) and the IOTA ADNEX model rely on expert interpretation and have limitations, particularly in cases of borderline ovarian tumors (BOTs), for which clear sonographic criteria are lacking.

To address this, researchers developed a DL-based diagnostic system combining an image-processing model with a large language model (LLM), GPT-4o. The retrospective study included 1,417 US images from 997 women treated at five hospitals in China between 2014 and 2023. Three image-based architectures were evaluated: ResNet-50, VGG-16, and Vision Transformer (ViT).

Among the tested models, ResNet-50 achieved the highest classification accuracy on the external test set: 91.8% for benign tumors, 84.6% for borderline, and 82.6% for malignant. This model outperformed both VGG-16 and ViT in validation testing.
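As a rough illustration of how per-class figures like these can be computed, the sketch below assumes that "accuracy for a class" means the share of cases in that true class that the model labelled correctly. The article does not state the exact definition used in the study, and the function name and sample labels here are hypothetical, not study data.

```python
from collections import Counter

CLASSES = ("benign", "borderline", "malignant")


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """Return, for each true class, the fraction of its cases labelled correctly.

    Assumption: this is per-class recall; the study may define accuracy differently.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Skip classes that never occur in y_true to avoid dividing by zero.
    return {c: correct[c] / totals[c] for c in classes if totals[c] > 0}


# Made-up labels for illustration only.
y_true = ["benign"] * 4 + ["borderline"] * 2 + ["malignant"] * 4
y_pred = [
    "benign", "benign", "benign", "malignant",
    "borderline", "benign",
    "malignant", "malignant", "malignant", "borderline",
]
result = per_class_accuracy(y_true, y_pred)
print(result)
```

Reporting one figure per class, rather than a single overall accuracy, keeps a small class such as borderline tumors from being masked by the larger benign group.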

The study also assessed the impact of combining the visual model with text outputs generated by the LLM. The LLM produced descriptive summaries of tumor features based on model predictions, without providing a diagnostic label. When five primary care radiologists with 3–5 years of experience used the integrated DL system during image review, their diagnostic accuracy on the test set increased from 76.6–79.2% to 90.9–95.9%. Inter-reader agreement, measured using kappa statistics, also improved across all readers.
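For readers unfamiliar with kappa statistics, the sketch below computes Cohen's kappa for one pair of readers: observed agreement corrected for the agreement expected by chance. The article does not say which kappa variant the study used (with five readers, a multi-rater statistic such as Fleiss' kappa is also plausible), so this is an assumption, and the readings shown are invented.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers who labelled the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must label the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: share of cases where both readers gave the same label.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's label frequencies, summed over labels.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both readers used a single identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented readings for illustration only (B = benign, M = malignant).
reader_1 = ["B", "B", "M", "M", "B", "M", "B", "B"]
reader_2 = ["B", "M", "M", "M", "B", "M", "B", "B"]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why kappa is preferred over raw percent agreement when one label dominates.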

Compared with the ADNEX model, which achieved 61.2% accuracy on the test set, the DL model showed higher accuracy (87.8%) using grayscale US images alone. The study also demonstrated the model’s ability to distinguish BOTs—an area where existing systems and radiologists often face diagnostic uncertainty.

The data were collected from five institutions across different regions in China and included US images from multiple vendors. Although the dataset was anonymized and consisted of static images, the authors noted that the external test set was fully disjoint from the training data, supporting the model's generalizability under these study conditions.

Limitations include the lack of Doppler imaging, the retrospective nature of the study, and the geographic homogeneity of the patient population. Additionally, BOTs were excluded from the human-assisted evaluation component due to the absence of standardized diagnostic guidelines.

According to the authors, the DL model may serve as a diagnostic aid for clinicians with varying levels of experience, particularly in classifying US images of ovarian tumors. Further prospective studies are needed to validate performance in real-time clinical environments.
