Pilot Development of a 'Clinical Performance Examination (CPX) Practicing Chatbot' Utilizing Prompt Engineering

Jundong Kim; Hye-Yoon Lee; Ji-Hwan Kim; Chang-Eop Kim; Kim, Jundong; Lee, Hye-Yoon; Kim, Ji-Hwan; Kim, Chang-Eop

doi:10.13048/jkm.24013

JKM > Volume 45(1); 2024 > Article

Kim, Lee, Kim, and Kim: Pilot Development of a ‘Clinical Performance Examination (CPX) Practicing Chatbot’ Utilizing Prompt Engineering

Original Article

The Journal of Korean Medicine 2024; 45(1): 200-212.

Published online: March 1, 2024

DOI: https://doi.org/10.13048/jkm.24013

Pilot Development of a ‘Clinical Performance Examination (CPX) Practicing Chatbot’ Utilizing Prompt Engineering

Jundong Kim¹

, Hye-Yoon Lee²

, Ji-Hwan Kim³

, Chang-Eop Kim¹

¹Department of Physiology, College of Korean Medicine, Gachon University

²Division of Humanities and Social Medicine, School of Korean Medicine, Pusan National University

³Department of Sasang Constitutional Medicine, College of Korean Medicine, Gachon University

Correspondence to: Chang-Eop Kim, Tel: +82-10-4186-2985, E-mail: eopchang@gachon.ac.kr

Correspondence to: Ji-Hwan Kim, Tel: +82-10-7159-4375, E-mail: jani77@gachon.ac.kr

Received February 5, 2024 Revised February 13, 2024 Accepted February 16, 2024

Abstract

Objectives

In the context of competency-based education emphasized in Korean Medicine, this study aimed to develop a pilot version of a CPX (Clinical Performance Examination) Practicing Chatbot utilizing large language models with prompt engineering.

Methods

A standardized patient scenario was acquired from the National Institute of Korean Medicine and transformed into text format. Prompt engineering was then conducted using role prompting and few-shot prompting techniques. The GPT-4 API was employed, and a web application was created using the gradio package. An internal evaluation criterion was established for the quantitative assessment of the chatbot’s performance.

Results

The chatbot was implemented and evaluated based on the internal evaluation criterion. It demonstrated relatively high correctness and compliance. However, there is a need for improvement in confidentiality and naturalness.

Conclusions

This study successfully piloted the CPX Practicing Chatbot, revealing the potential for developing educational models using AI technology in the field of Korean Medicine. Additionally, it identified limitations and provided insights for future developmental directions.

Keywords: CPX(Clinical Performance Examination), AI(Artificial Intelligence), LLM(Large language model), Chatbot, Medical Education

Fig. 1

Flowchart for developing CPX Practicing Chatbot. CPX; clinical performance exam, LLM; large language model.

Table 1

Overview of CPX Practicing Chatbot Prompt.

	Contents
Instruction 1	- Explanation of the Role of the CPX Practicing Chatbot (Role prompting) - Brief description of the contents and usage methods of Context 1 and Context 2.
Context 1	- Standardized Patient Scenario (Chatting Format)
Context 2	- Five Examples of Q&A between the Chatbot and Student Doctor (Few-shot Prompting)
Instruction 2	- Eight Guidelines for Appropriate Responses
Instruction 3	- Five Procedures (Steps) to Refer to for Appropriate Responses
Instruction 4	- Reminder of the Role of the CPX Practicing Chatbot - Request for a summary of the situational instructions to initiate the mock consultation.
Total Token Count	3,853 tokens

Table 2

Examples of CPX Practicing Chatbot Screens

Chatbot Screen Example 1. Scenario Guidelines	Chatbot Screen Example 2. Dialogue Window
A page that guides scenario guidelines and precautions for using the chatbot to practice CPX.	An example of actual usage where the user is practicing CPX.

Table 3

Examples of five criteria on a chatbot evaluation form

correctness	hallucination	confidentiality	naturalness	compliance
					Q1	Hello, patient. I’m student KMD Jundong Kim. It must have been hot and tough getting here, right? (“안녕하세요 환자분. 학생 한의사 김준동이라고 합니다. 오시느라 덥고 힘드셨죠?”)
1	0	1	1	1	Chatbot	No, it’s okay. I came to see you, student KMD. (“아니요, 괜찮아요. 학생 한의사님 만나러 왔으니까요.”)
					Q2	Could you please tell me your name and date of birth? (“환자분 성함과 생년월일을 말씀해주세요.”)
1	0	1	1	1	Chatbot	Hi, my name is Park Minyoung. I was born on June 3, 1976. (“네 안녕하세요. 이름은 박민영이구요. 생년월일은 1976 년 6 월 3 일입니다.”)
					Q3	Alright. May I ask what brings you in today? (“예. 혹시 어디가 불편해서 오셨나요?”)
1	0	1	1	1	Chatbot	I recently had a health checkup at work, and it showed that my blood pressure was high. (“얼마 전 직장에서 건강검진을 했는데, 혈압이 높게 나와서 왔습니다.”)
					Q4	Can you tell me more about it? (“예 조금 더 자세히 말씀해주시겠어요?”)
1	0	0	1	1	Chatbot	When I measured it at home recently, my blood pressure was often high too. Especially in the last six months, I’ve been under a lot of stress at work. I’ve been having headaches and dizziness for the past month, so I came here. (“얼마 전에 집에서 측정해 볼 때도 혈압이 종종 높아요 특히 최근 6 개월동안 직장에서 스트레스가 심했거든요. 한달 전부터는 머리도 아프고 어지러워서 왔어요.”;)
					Q5	Have you ever measured your blood pressure at home, not just at the hospital? (“병원이 아닌 집에서도 혈압을 재본 적 있습니까?”)
1	0	1	1	1	Chatbot	Yes, I’ve occasionally checked it at home with a blood pressure monitor, and it’s generally high, usually in the range of 130 to 140 mmHg. (“네, 가끔씩 집에서 혈압기로 측정해봤는데요, 대체적으로 130~140mmHg 범위로 높게 나옵니다.”)

The actual conversations with the chatbot were conducted in Korean, and these real Korean conversations are indicated within quotation marks (" "). In instances where the scenario was excessively disclosed to a student KMD in bold text, it was considered a breach of confidentiality, resulting in a score of 0 being awarded.

참고문헌

1. Gwak D, Kim M, Kwon J, Yin C. Trends in Korean Medical education research from 2003 to 2022. The Journal of Korean Medical History. 2023; 36:89–98.

2. Jeong S-H, Kim J-P, Kang Y-J, Jeong HI, Kim KH. A Survey of Recognitions and Satisfaction with Education in Traditional Korean Medicine. Journal of Society of Preventive Korean Medicine. 2020; 24:49–56.

3. Cho E, Kim Y, Ha J, Park J, Jung H-J. Developing a manual for clinical practice on real patients in Korean Medicine. J Kor Med Edu. 2023; 1:15–22.

4. Han SY, Lee SH, Chae H. Developing a best practice framework for clinical competency education in the traditional East-Asian medicine curriculum. BMC Med Educ. 2022; 22

5. Lim C, Han H, Hong J, Kang Y. 2016 Competency Modeling for Doctor of Korean Medicine & Application Plans. Journal of Korean Medicine. 2016; 37:101–13.

6. Han C-Y, Kang D-W, Park J-G, Kim B-H, Kim K-S, Kim Y-B, et al. An Analysis of Clerkship Satisfaction in College of Korean Medicine : Focusing on Doctor-patient Role-play and mock CPX. J Korean Med Ophthalmol Otolaryngol Dermatol. 2020; 33:12–24.

7. Jo H, Min S. The current status and future operations of Clinical Performance Evaluation (CPX) in the nationwide colleges (graduate schools) of Traditional Korean Medicine. The Journal of Korean Medical History. 2020; 33:9–21.

8. Jo H-J, Roh J-D, Sung HK, Park J-S. A Survey on Students’ Perception of Clinical Performance Examination (CPX) in College of Korean Medicine Using Student Standardized Patients. Journal of Society of Preventive Korean Medicine. 2020; 24:1–13.

9. Park S, Kim C. Enhancing Korean Medicine Education with Large Language Models: Focusing on the Development of Educational Artificial Intelligence. Journal of Physiology & Pathology in Korean Medicine. 2023; 37:134–8.

10. Chen B, Zhang Z, Langrené N, Zhu S. Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review. 2023.

11. Renze M, Guven E. The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models. 2024.

12. Institute of Medical Education and Training. Revised Clinical Diagnosis. 2nd edition. Seoul: Seoul National University Press;2015.

13. Bae H, Lee S, Lee CY, Kim CE. A Novel Framework for Understanding the Pattern Identification of Traditional Asian Medicine From the Machine Learning Perspective. Front Med (Lausanne). 2022; 8:763533

14. Jang D, Yun T-R, Lee C-Y, Kwon Y-K, Kim C-E. GPT-4 can pass the Korean National Licensing Examination for Korean Medicine Doctors. PLOS Digital Health. 2023; 2:e0000416

15. Banerjee D, Singh P, Avadhanam A, Srivastava S. Benchmarking LLM powered Chatbots: Methods and Metrics. 2023.