Published in Vol 11 (2023)
Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/49995.
Journals
- Infante A, Gaudino S, Orsini F, Del Ciello A, Gullì C, Merlino B, Natale L, Iezzi R, Sala E. Large language models (LLMs) in the evaluation of emergency radiology reports: performance of ChatGPT-4, Perplexity, and Bard. Clinical Radiology 2024;79(2):102
- Ćirković A, Katz T. Exploring the Potential of ChatGPT-4 in Predicting Refractive Surgery Categorizations: Comparative Study. JMIR Formative Research 2023;7:e51798
- Sallam M, Barakat M, Sallam M. A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review. Interactive Journal of Medical Research 2024;13:e54704
- Reis F, Lenz C. Performance of Artificial Intelligence (AI)-Powered Chatbots in the Assessment of Medical Case Reports: Qualitative Insights From Simulated Scenarios. Cureus 2024
- Wang L, Chen X, Deng X, Wen H, You M, Liu W, Li Q, Li J. Prompt engineering in consistency and reliability with the evidence-based guideline for LLMs. npj Digital Medicine 2024;7(1)
- Xue Z, Zhang Y, Gan W, Wang H, She G, Zheng X. Quality and Dependability of ChatGPT and DingXiangYuan Forums for Remote Orthopedic Consultations: Comparative Analysis. Journal of Medical Internet Research 2024;26:e50882
- Jindal A, Brandao-de-Resende C, Neo Y, Melo M, Day A. Enhancing Ophthalmic Triage: identification of new clinical features to support healthcare professionals in triage. Eye 2024;38(13):2536
- Sheikh M, Barreto E, Miao J, Thongprayoon C, Gregoire J, Dreesman B, Erickson S, Craici I, Cheungpasitporn W. Evaluating ChatGPT's efficacy in assessing the safety of non-prescription medications and supplements in patients with kidney disease. Digital Health 2024;10
- Frosolini A, Catarzi L, Benedetti S, Latini L, Chisci G, Franz L, Gennaro P, Gabriele G. The Role of Large Language Models (LLMs) in Providing Triage for Maxillofacial Trauma Cases: A Preliminary Study. Diagnostics 2024;14(8):839
- Andreadis K, Newman D, Twan C, Shunk A, Mann D, Stevens E. Mixed methods assessment of the influence of demographics on medical advice of ChatGPT. Journal of the American Medical Informatics Association 2024;31(9):2002
- Harada Y, Sakamoto T, Sugimoto S, Shimizu T. Longitudinal Changes in Diagnostic Accuracy of a Differential Diagnosis List Developed by an AI-Based Symptom Checker: Retrospective Observational Study. JMIR Formative Research 2024;8:e53985
- Pressman S, Borna S, Gomez-Cabello C, Haider S, Forte A. AI in Hand Surgery: Assessing Large Language Models in the Classification and Management of Hand Injuries. Journal of Clinical Medicine 2024;13(10):2832
- Yazaki M, Maki S, Furuya T, Inoue K, Nagai K, Nagashima Y, Maruyama J, Toki Y, Kitagawa K, Iwata S, Kitamura T, Gushiken S, Noguchi Y, Inoue M, Shiga Y, Inage K, Orita S, Nakada T, Ohtori S. Emergency Patient Triage Improvement through a Retrieval-Augmented Generation Enhanced Large-Scale Language Model. Prehospital Emergency Care 2025;29(3):203
- Hoppe J, Auer M, Strüven A, Massberg S, Stremmel C. ChatGPT With GPT-4 Outperforms Emergency Department Physicians in Diagnostic Accuracy: Retrospective Analysis. Journal of Medical Internet Research 2024;26:e56110
- Silverman A, Shung D, Stidham R, Kochhar G, Iacucci M. How Artificial Intelligence Will Transform Clinical Care, Research, and Trials for Inflammatory Bowel Disease. Clinical Gastroenterology and Hepatology 2025;23(3):428
- Scott I, Miller T, Crock C. Using conversant artificial intelligence to improve diagnostic reasoning: ready for prime time? Medical Journal of Australia 2024;221(5):240
- Tong L, Zhang C, Liu R, Yang J, Sun Z. Comparative performance analysis of large language models: ChatGPT-3.5, ChatGPT-4 and Google Gemini in glucocorticoid-induced osteoporosis. Journal of Orthopaedic Surgery and Research 2024;19(1)
- Ghilzai U, Fiedler B, Ghali A, Singh A, Cass B, Young A, Ahmed A. ChatGPT provides acceptable responses to patient questions regarding common shoulder pathology. Shoulder & Elbow 2024
- Colakca C, Ergın M, Ozensoy H, Sener A, Guru S, Ozhasenekler A. Emergency department triaging using ChatGPT based on emergency severity index principles: a cross-sectional study. Scientific Reports 2024;14(1)
- Bedi S, Liu Y, Orr-Ewing L, Dash D, Koyejo S, Callahan A, Fries J, Wornow M, Swaminathan A, Lehmann L, Hong H, Kashyap M, Chaurasia A, Shah N, Singh K, Tazbaz T, Milstein A, Pfeffer M, Shah N. Testing and Evaluation of Health Care Applications of Large Language Models. JAMA 2025;333(4):319
- Wu A. Chatting together: Using AI chatbots to improve diagnostic excellence. Journal of Patient Safety and Risk Management 2024;29(5):222
- Hayat J, Lari M, AlHerz M, Lari A. The Utility and Limitations of Artificial Intelligence-Powered Chatbots in Healthcare. Cureus 2024
- Ho C, Tian T, Ayers A, Aaron R, Phillips V, Wolf R, Mathioudakis N, Dai T, Klonoff D. Qualitative metrics from the biomedical literature for evaluating large language models in clinical decision-making: a narrative review. BMC Medical Informatics and Decision Making 2024;24(1)
- Jin H, Kim E. Performance of GPT-3.5 and GPT-4 on the Korean Pharmacist Licensing Examination: Comparison Study. JMIR Medical Education 2024;10:e57451
- Cano-Besquet S, Rice-Canetto T, Abou-El-Hassan H, Alarcon S, Zimmerman J, Issagholian L, Salomon N, Rojas I, Dhahbi J, Neeki M. ChatGPT4’s diagnostic accuracy in inpatient neurology: A retrospective cohort study. Heliyon 2024;10(24):e40964
- Brochu B, Mirsky N, Thaller S. Evaluating ChatGPT’s efficacy in addressing common patient questions in plastic surgery consultations. Artificial Intelligence Surgery 2024;4(4):411
- Arslan B, Nuhoglu C, Satici M, Altinbilek E. Evaluating LLM-based generative AI tools in emergency triage: A comparative study of ChatGPT Plus, Copilot Pro, and triage nurses. The American Journal of Emergency Medicine 2025;89:174
- Naved B, Luo Y. Contrasting rule and machine learning based digital self triage systems in the USA. npj Digital Medicine 2024;7(1)
- Oztermeli A. Is ChatGPT a Reliable Tool for Explaining Medical Terms? Cureus 2025
- Vaira L, Lechien J, Abbate V, Gabriele G, Frosolini A, De Vito A, Maniaci A, Mayo‐Yáñez M, Boscolo‐Rizzo P, Saibene A, Maglitto F, Salzano G, Califano G, Troise S, Chiesa‐Estomba C, De Riu G. Enhancing AI Chatbot Responses in Health Care: The SMART Prompt Structure in Head and Neck Surgery. OTO Open 2025;9(1)
- Kareemi H, Yadav K, Price C, Bobrovitz N, Meehan A, Li H, Goel G, Masood S, Grant L, Ben‐Yakov M, Michalowski W, Vaillancourt C. Artificial intelligence–based clinical decision support in the emergency department: A scoping review. Academic Emergency Medicine 2025;32(4):386
- Huo B, Boyle A, Marfo N, Tangamornsuksan W, Steen J, McKechnie T, Lee Y, Mayol J, Antoniou S, Thirunavukarasu A, Sanger S, Ramji K, Guyatt G. Large Language Models for Chatbot Health Advice Studies. JAMA Network Open 2025;8(2):e2457879
- Yun H, Bickmore T. Online Health Information–Seeking in the Era of Large Language Models: Cross-Sectional Web-Based Survey Study. Journal of Medical Internet Research 2025;27:e68560
- Langmann E, Henking T, Joos S, Klemmt M, Müller R, Preiser C, Ranisch R, Koch R, Rieger M, Wetzel A, Wiesing U, Ehni H. Handlungsempfehlungen zum Einsatz von Symptom-Checker-Apps im Gesundheitskontext – basierend auf den Ergebnissen aus dem Projekt CHECK.APP. Ethik in der Medizin 2025
- Tanaka C, Kinoshita T, Okada Y, Satoh K, Homma Y, Suzuki K, Yokobori S, Oda J, Otomo Y, Tagami T. Medical validity and layperson interpretation of emergency visit recommendations by the GPT model: A cross‐sectional study. Acute Medicine & Surgery 2025;12(1)
- Suga T, Uehara O, Abiko Y, Toyofuku A. Evaluating Large Language Models for Burning Mouth Syndrome Diagnosis. Journal of Pain Research 2025;18:1387
- Kopka M, von Kalckreuth N, Feufel M. Accuracy of online symptom assessment applications, large language models, and laypeople for self-triage decisions. npj Digital Medicine 2025;8(1)
- Takita H, Kabata D, Walston S, Tatekawa H, Saito K, Tsujimoto Y, Miki Y, Ueda D. A systematic review and meta-analysis of diagnostic performance comparison between generative AI and physicians. npj Digital Medicine 2025;8(1)
- Morjaria L, Gandhi B, Haider N, Mellon M, Sibbald M. Applications of Generative Artificial Intelligence in Electronic Medical Records: A Scoping Review. Information 2025;16(4):284
- Akbasli I, Birbilen A, Teksam O. Leveraging large language models to mimic domain expert labeling in unstructured text-based electronic healthcare records in non-english languages. BMC Medical Informatics and Decision Making 2025;25(1)
- Schmieding M, Kopka M, Bolanaki M, Napierala H, Altendorf M, Kuschick D, Piper S, Scatturin L, Schmidt K, Schorr C, Thissen A, Wäscher C, Heintze C, Möckel M, Balzer F, Slagman A. Impact of a Symptom Checker App on Patient-Physician Interaction Among Self-Referred Walk-In Patients in the Emergency Department: Multicenter, Parallel-Group, Randomized, Controlled Trial. Journal of Medical Internet Research 2025;27:e64028
- Meyer N, Meyer J. A Practical Guide to the Utilization of ChatGPT in the Emergency Department: A Systematic Review of Current Applications, Future Directions, and Limitations. Cureus 2025
- Alanazi H. Role of artificial intelligence in advancing immunology. Immunologic Research 2025;73(1)
- Zou Y, Ye R, Gao Y, Zhou J, Li Y, Chen W, Zha F, Wang Y. Comparison of triage performance among DRP tool, ChatGPT, and outpatient rehabilitation doctors. Scientific Reports 2025;15(1)
- Shan G, Chen X, Wang C, Liu L, Gu Y, Jiang H, Shi T. Comparing Diagnostic Accuracy of Clinical Professionals and Large Language Models: Systematic Review and Meta-Analysis. JMIR Medical Informatics 2025;13:e64963
- Giuffrè M, You K, Pang Z, Kresevic S, Chung S, Chen R, Ko Y, Chan C, Saarinen T, Ajcevic M, Crocè L, Garcia-Tsao G, Gralnek I, Sung J, Barkun A, Laine L, Sekhon J, Stadie B, Shung D. Expert of Experts Verification and Alignment (EVAL) Framework for Large Language Models Safety in Gastroenterology. npj Digital Medicine 2025;8(1)
Conference Proceedings
- Yun H, Bickmore T. Framing Health Information: The Impact of Search Methods and Source Types on User Trust and Satisfaction in the Age of LLMs. Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems