Srihith Narahari, BS
University of Florida College of Medicine
Medical students are increasingly turning to artificial intelligence (AI) to support their education, and they report using AI models to answer questions and explain difficult concepts. However, the ability of large language models (LLMs) to answer questions posed by medical students has not been sufficiently explored. Assessing the accuracy and helpfulness of LLM responses to medical student questions can clarify the capacity for AI to serve as a tool in undergraduate medical education. We performed the study during the Cardiovascular System block at a US-based medical program. Students submitted questions relating to course material, which were entered into the LLM OpenEvidence. The LLM-generated responses were then evaluated by faculty and students for accuracy and helpfulness, and students were also asked to give qualitative feedback on the responses. Students and faculty agreed that most LLM responses were relatively accurate and helpful. Qualitative feedback revealed that some students found the responses too detailed or difficult to understand, especially without a corresponding figure. Overall, we concluded that LLMs are effective in answering questions posed by medical students, generating responses that are accurate and helpful.
