AI Foundation Models Reshape Ophthalmic Diagnosis

Key Takeaways
- The review covered vision and vision-language foundation models in ophthalmology, with retinal disease diagnosis as the most common application area.
- Strong performance was reported across diabetic retinopathy, age-related macular degeneration, glaucoma, ocular surface tumors, and rare fundus diseases.
- Several models showed few-shot and zero-shot capacity, while translation remained limited by data diversity, bias, interpretability, interoperability, computational demands, and validation needs.
Following PRISMA guidance, the authors searched PubMed, Web of Science, Scopus, and Google Scholar for studies published between January 2020 and July 2025. Ten studies met the inclusion criteria. The review focused on diagnostic performance, clinical potential, interpretability, fairness, deployment barriers, and future directions. The authors contrasted foundation models with traditional ophthalmic AI tools, which are often built for one disease, dataset, or task, and noted that foundation models may connect images, clinical language, and patient data. The review was a broad survey of multimodal ophthalmic AI rather than a test of one platform.
Retinal disease diagnosis was the most common application area, particularly for diabetic retinopathy, age-related macular degeneration, and diabetic macular edema. The review cited RETFound, FLAIR, VisionFM, EyeCLIP, FMUE, MetaGP, MINIM, RETFound-DE, RetiZero, and OSPM as representative models, spanning both vision-only and vision-language approaches. These models were discussed in relation to retinal screening, glaucoma assessment, ocular surface tumor recognition, and identification of rare fundus diseases and other uncommon eye presentations.
Selected results illustrated that range, from common screening targets to rare-disease classification. RETFound achieved an area under the curve (AUC) of 0.94 on EyePACS for diabetic retinopathy detection, while VisionFM reached an AUC of 0.974 for age-related macular degeneration in external validation. RETFound-DE had an AUC of 0.902 on REFUGE-2 for glaucoma, and OSPM produced AUC values of about 0.986 to 0.993 for ocular surface tumors. RetiZero reached 75.6% top-five accuracy across more than 400 rare fundus diseases. Several models also showed few-shot and zero-shot learning capacity, suggesting adaptability to new diagnostic tasks when labeled data are limited.
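For readers less familiar with the two metrics quoted above, the following is a minimal Python sketch of how AUC and top-k accuracy are commonly computed. It is not the evaluation code used by any of the cited models. The function names `roc_auc` and `top_k_accuracy` and the toy numbers are illustrative assumptions, not data from the review.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case scores higher than a
    random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_accuracy(true_classes: Sequence[int],
                   class_scores: Sequence[Sequence[float]],
                   k: int = 5) -> float:
    """Fraction of cases whose true class is among the k highest-scored classes."""
    hits = 0
    for truth, scores in zip(true_classes, class_scores):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_classes)


# Toy, made-up example: 4 eyes, label 1 = disease present.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Toy, made-up example: 2 cases, 3 candidate diagnoses each.
toy_top1 = top_k_accuracy([2, 0], [[0.1, 0.2, 0.7], [0.2, 0.5, 0.3]], k=1)
```

Read this way, an AUC of 0.94 means the model ranks a diseased eye above a healthy one in about 94% of such pairs, and 75.6% top-five accuracy means the correct diagnosis appears among the model's five highest-ranked candidates in roughly three of four cases.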
The review also outlined several barriers to translation, including limited data diversity, algorithmic bias, overfitting, high computational demands, limited interpretability, EHR interoperability challenges, and insufficient clinical validation. Future priorities included larger and more representative datasets, as well as explainable AI tools such as saliency maps, SHAP, and counterfactual reasoning.
The authors also highlighted post-deployment monitoring for fairness and performance drift as part of the development pathway. They further suggested that foundation models may support earlier diagnosis, improved referral decisions, expanded access to specialist-level eye care, and safer, more scalable AI-assisted ophthalmic workflows. Even with strong research performance, the potential of these models remained tied to transparency, careful validation, and deployment readiness.
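As a rough illustration of what post-deployment monitoring for fairness and performance drift could involve, the sketch below computes AUC separately for each patient subgroup and flags subgroups whose AUC has fallen below a baseline by more than a chosen tolerance. This is one simple possibility under stated assumptions, not a procedure described in the review. The function names, the subgroup labels, the 0.05 tolerance, and all numbers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc(records: Iterable[Tuple[str, int, float]]) -> Dict[str, float]:
    """records holds (subgroup, true_label, model_score) triples."""
    labels: Dict[str, List[int]] = defaultdict(list)
    scores: Dict[str, List[float]] = defaultdict(list)
    for group, label, score in records:
        labels[group].append(label)
        scores[group].append(score)
    return {g: roc_auc(labels[g], scores[g]) for g in labels}


def flag_drift(baseline: Dict[str, float],
               current: Dict[str, float],
               tolerance: float = 0.05) -> List[str]:
    """Subgroups whose AUC dropped by more than `tolerance` versus baseline."""
    return sorted(g for g in baseline
                  if g in current and baseline[g] - current[g] > tolerance)


# Hypothetical validation-time AUCs and a made-up batch of recent predictions.
baseline_auc = {"site_A": 0.95, "site_B": 0.95}
recent = [("site_A", 1, 0.9), ("site_A", 0, 0.2),
          ("site_B", 1, 0.3), ("site_B", 0, 0.6)]
current_auc = subgroup_auc(recent)
drifted = flag_drift(baseline_auc, current_auc)
```

In practice the subgroups might be defined by site, camera type, or demographic attributes, and a flagged subgroup would prompt review rather than an automatic conclusion, since small recent batches can produce noisy AUC estimates.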