Performance of Large Language Models for Structured Recognition and Refractive Prediction
Head-to-Head Evaluation of ChatGPT 4o, GPT-5, and DeepSeek for Structured Extraction, Toric IOL Recommendation, and Refractive Prediction
Jin Yang
100 participants
Aug 1, 2025
OBSERVATIONAL
Conditions
Summary
We conducted a single-center, retrospective observational study to evaluate large language models (ChatGPT 4o, GPT-5, DeepSeek) for automated interpretation of de-identified IOLMaster 700 reports provided as raster images. Models produced structured biometric extraction, toric IOL recommendation, and refractive predictions (sphere, cylinder, axis). Primary outcomes included parameter-level agreement and refractive error metrics; secondary outcomes included decision-support performance for toric IOL selection and agreement on ordered T-codes. No clinical intervention was performed.
Eligibility
Inclusion Criteria
- postoperative corrected distance visual acuity (CDVA) of 0.10 logMAR or better
- an absolute IOL rotational stability of less than 10° at the 1-month follow-up examination
Exclusion Criteria
- incomplete biometric data on the examination report
- a history of previous ocular surgery or ocular trauma
- the occurrence of intraoperative complications, such as an anterior capsular tear or posterior capsular rupture
- the development of significant postoperative complications, including but not limited to severe intraocular infection or inadequate pupillary dilation
NCT07183891