Recruiting

NCT07378358

Evaluation of AI Large Models for Diagnosis and Treatment in Real-World Cases: Multicenter Retrospective Study


Sponsor

First Affiliated Hospital of Fujian Medical University

Enrollment

800 participants

Start Date

Jan 1, 2026

Study Type

OBSERVATIONAL

Conditions

Summary

This multicenter retrospective study aims to evaluate the diagnostic and therapeutic performance of three large language models (ChatGPT, Gemini, and DeepSeek) using 800 archived inpatient medical records from the urology departments of four tertiary hospitals. The study will focus on the accuracy and applicability of these models in disease recognition, preliminary diagnosis, and treatment recommendation generation, to explore their potential value and limitations in supporting clinical decision-making in real-world settings.


Eligibility

Min Age: 18 Years

Inclusion Criteria (6)

  • The case data is sourced from the four hospitals involved in the study, with complete and authentic diagnosis and treatment records.
  • Patients must be 18 years or older, with no gender restrictions.
  • Complete medical records, including the following core information: the patient's basic information, present illness history, past medical history, physical examination, and auxiliary examinations (including laboratory and imaging tests).
  • A clear discharge diagnosis and treatment plan (including therapeutic measures and follow-up arrangements).
  • Medical records have been archived, with objective and accurate information that has not been altered.
  • The patient or their legal representative has provided informed consent, agreeing to the use of their anonymized medical data for research analysis.

Exclusion Criteria (6)

  • Medical records with significant missing information, such as key clinical details (present illness history, diagnostic or treatment records, etc.).
  • Cases where the diagnosis or treatment plan is unclear, or where treatment has not been fully completed for an initial diagnosis.
  • Cases where the primary diagnosis is not urological.
  • Cases with major errors or inconsistencies in the records that could affect further assessment.
  • Medical records in special formats or images that are not readable (e.g., handwritten notes, non-standard documentation).
  • Patients who have not signed the informed consent form or who refuse to allow their medical data to be used for research.
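
As a rough illustration, the machine-checkable parts of the criteria above could be expressed as a screening function. This is a minimal sketch, not the study's actual procedure: the `CaseRecord` type, its field names, and `REQUIRED_SECTIONS` are hypothetical, and criteria that need human judgment (for example, whether a record is authentic, unaltered, or free of major inconsistencies) are not modeled.

```python
from dataclasses import dataclass, field

# Hypothetical section names mirroring the "core information" listed in the criteria.
REQUIRED_SECTIONS = (
    "basic_information",
    "present_illness_history",
    "past_medical_history",
    "physical_examination",
    "auxiliary_examinations",
)


@dataclass
class CaseRecord:
    """Hypothetical container for one archived inpatient record."""
    age_years: int
    consent_given: bool
    machine_readable: bool
    primary_diagnosis_is_urological: bool
    discharge_diagnosis: str
    treatment_plan: str
    sections: dict = field(default_factory=dict)  # section name -> text


def is_eligible(record: CaseRecord) -> bool:
    """Apply the automatable inclusion and exclusion checks to one record."""
    if record.age_years < 18:
        return False
    if not record.consent_given:
        return False
    if not record.machine_readable:
        return False
    if not record.primary_diagnosis_is_urological:
        return False
    # A clear discharge diagnosis and treatment plan must both be present.
    if not record.discharge_diagnosis.strip() or not record.treatment_plan.strip():
        return False
    # Every core section must exist and be non-empty.
    return all(record.sections.get(name, "").strip() for name in REQUIRED_SECTIONS)
```

A record that passes this filter would still need manual review against the remaining criteria before inclusion.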

Interventions

OTHER: Large Language Model Assessment (ChatGPT, Gemini, DeepSeek)

De-identified inpatient medical records were retrospectively collected from the urology departments of four tertiary hospitals (200 cases per site, 800 in total). Each case included standardized clinical information such as demographics, chief complaint, history of present illness, past medical history, physical examination, laboratory and imaging findings, discharge diagnosis and treatment plan. To simulate the role of an AI system in a "first-visit physician" scenario, all diagnostic conclusions, differential diagnoses and treatment plans were removed before being input into the models. Three large language models (ChatGPT, Gemini and DeepSeek) were prompted with a standardized instruction: "Based on the above clinical information, provide your preliminary diagnosis, differential diagnoses and treatment recommendations." Each model generated outputs including (i) primary and secondary diagnoses, (ii) differential diagnosis lists with reasoning and (iii) preliminary treatment suggesti
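
The input-preparation step described above (withholding diagnostic conclusions, then appending the standardized instruction) might look roughly like the following. This is a sketch under assumptions: the record is taken to be a flat dictionary, and the field names in `WITHHELD_FIELDS` and the `build_prompt` helper are hypothetical. Only the instruction text is quoted from the listing, and no model API call is shown.

```python
# Instruction text as quoted in the intervention description.
STANDARD_INSTRUCTION = (
    "Based on the above clinical information, provide your preliminary "
    "diagnosis, differential diagnoses and treatment recommendations."
)

# Hypothetical field names for the content removed before prompting.
WITHHELD_FIELDS = {"discharge_diagnosis", "differential_diagnoses", "treatment_plan"}


def build_prompt(case: dict) -> str:
    """Drop withheld fields, render the rest as labeled lines, append the instruction."""
    visible = {key: value for key, value in case.items() if key not in WITHHELD_FIELDS}
    lines = [f"{key.replace('_', ' ').title()}: {value}" for key, value in visible.items()]
    return "\n".join(lines) + "\n\n" + STANDARD_INSTRUCTION


example_case = {
    "demographics": "Male, 52 years",
    "chief_complaint": "Right flank pain for 2 days",
    "discharge_diagnosis": "Right ureteral calculus",  # withheld from the prompt
    "treatment_plan": "Ureteroscopic lithotripsy",     # withheld from the prompt
}
prompt = build_prompt(example_case)
```

The same prompt string would then be sent unchanged to each of the three models, so that differences in output reflect the model and not the input.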


Locations (1)

The First Affiliated Hospital of Fujian Medical University

Fuzhou, China


Related Trials