News Release

Can artificial intelligence match medical interview assessments by clinicians?

Researchers report that artificial intelligence-based scoring of interview transcripts is comparable to clinicians’ scores while reducing evaluation time

Peer-Reviewed Publication

Juntendo University Research Promotion Center

Artificial Intelligence- vs Human-Based Assessment of Medical Interview Transcripts

Image: Artificial intelligence can evaluate medical interview transcripts with accuracy comparable to expert clinicians while enabling faster and more scalable feedback in medical training


Credit: Professor Toshio Naito from Department of General Medicine, Juntendo University Faculty of Medicine, Japan

Clinical interviewing is one of the most important skills physicians develop during their training. It forms the foundation for accurate diagnosis and effective patient care. However, evaluating these skills is often time-intensive, requiring repeated observations and detailed feedback from experienced clinicians. As medical education continues to expand, this growing assessment burden has become a significant challenge. The incorporation of generative artificial intelligence (AI) has the potential to significantly improve the assessment of interviewing skills; however, its efficiency compared to standard evaluation systems is not well understood.

To fill this gap, researchers from Japan explored whether AI could ease this assessment burden by evaluating medical interview transcripts. Their findings were published on February 17, 2026, in Volume 12 of the journal JMIR Medical Education. The research team, led by Dr. Hiromizu Takahashi (corresponding author) and Professor Toshio Naito, both from the Department of General Medicine, Juntendo University Faculty of Medicine, Japan, examined whether AI-based assessment (ABA) could match traditional human-based assessment (HBA).

“Our central message is that AI may help make medical training fairer, faster, and more scalable,” explains Prof. Naito.

To compare ABA with HBA, the researchers designed a cross-sectional validation study using a virtual patient system. Seven participants, including medical students, resident physicians, and attending physicians, conducted clinical interviews with an AI-simulated patient presenting with bilateral leg weakness. These conversations were automatically recorded and converted into transcripts. The transcripts were then evaluated using the Master Interview Rating Scale, a standardized tool that assesses various aspects of clinical communication, such as information gathering, organization, and empathy. For the ABA approach, two AI models, GPT-o1 Pro and GPT-5 Pro, assessed the transcripts. For the HBA approach, five experienced clinical instructors independently evaluated the same transcripts.

According to the researchers, ABA showed strong agreement with clinician evaluations, with only minimal differences in scores. At the same time, AI demonstrated greater consistency across repeated evaluations. Importantly, the use of AI also reduced the time required to assess each transcript by more than half, highlighting its potential to ease the workload of educators. “Rather than replacing teachers, this research suggests a practical ‘AI-first, faculty-verified’ model in which AI handles the first pass and educators focus their time on coaching, judgment, and high-stakes decisions,” says Dr. Takahashi.
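For readers who want a concrete sense of what "agreement" and "consistency" can mean in a comparison like this, here is a minimal Python sketch. It is not the analysis used in the study: the function names are hypothetical, the choice of mean paired difference and standard deviation is an assumption made for illustration, and the numbers are placeholders rather than study data.

```python
from statistics import mean, stdev


def mean_score_difference(ai_scores, human_scores):
    """Average of (AI score - human score) over paired transcripts.

    A value near zero suggests the two raters score similarly on average.
    """
    if len(ai_scores) != len(human_scores):
        raise ValueError("Each transcript needs one AI score and one human score")
    return mean(a - h for a, h in zip(ai_scores, human_scores))


def repeat_spread(repeated_scores):
    """Sample standard deviation of repeated scores for one transcript.

    A smaller spread means the rater is more consistent across repeats.
    """
    return stdev(repeated_scores)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT data from the study.
    ai = [4.0, 3.5, 4.5]
    human = [4.0, 3.0, 4.5]
    print("Mean AI-minus-human difference:", mean_score_difference(ai, human))
    print("Spread across repeated AI runs:", repeat_spread([4.0, 4.0, 4.5]))
```

In this sketch, a mean difference close to zero corresponds to "minimal differences in scores," and a small spread across repeated runs corresponds to "greater consistency." A real validation study would typically use more formal agreement statistics.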

These results have important implications for medical education. In many training programs, delays in feedback can limit opportunities for students to improve their communication skills. By providing rapid and consistent evaluations, AI could make repeated practice more accessible, particularly in settings with limited faculty resources. “Students could interview an AI-simulated patient and receive feedback almost immediately instead of waiting days or weeks,” Prof. Naito adds, highlighting the potential for more timely learning experiences.

At the same time, the researchers emphasize that AI should be used with care. While AI performed well, the study was based on a small number of participants and a single clinical scenario. In addition, transcript-based evaluation cannot capture nonverbal cues, tone, or cultural nuances that are often important in real-world patient interactions. Prof. Naito and Dr. Takahashi caution, “AI should be used with human oversight, because text-only scoring can miss nuances such as tone, nonverbal communication, and cultural context.”

Overall, this study highlights the growing role of AI in medical education. By combining the speed and consistency of AI with the expertise and judgment of clinicians, it may be possible to create more efficient and scalable training systems. As the demand for high-quality medical education continues to rise, such approaches could help ensure that future clinicians receive the best training while reducing the burden on educators.

 

***

 

Reference
DOI: 10.2196/81673

 

Authors: Hiromizu Takahashi1, Kiyoshi Shikino2, Takeshi Kondo3,4, Yuji Yamada5, Yoshitaka Tomoda6, Minoru Kishi7, Yuki Aiyama8, Sho Nagai9, Akiko Enomoto9, Yoshinori Tokushima10, Takahiro Shinohara11, Fumiaki Sano1, Takeshi Matsuura12, Rikiya Watanabe13, and Toshio Naito1

 

Affiliations
1Department of General Medicine, Faculty of Medicine, Juntendo University, Tokyo, Japan

2Department of Community-Oriented Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan

3Center for Postgraduate Clinical Training and Career Development, Nagoya University Hospital, Nagoya, Japan

4The School of Health Professions Education, Maastricht University, Maastricht, The Netherlands

5Brookdale Department of Geriatrics and Palliative Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States

6Department of General Internal Medicine, Itabashi Chuo Medical Center, Tokyo, Japan

7Department of Internal Medicine, Nishiwaki Municipal Hospital, Hyogo, Japan

8Anesthesiology and Critical Care Medicine, Tenri Hospital, Nara, Japan

9Department of Nursing, School of Nursing, University of Human Environments, Aichi, Japan

10Department of General Medicine, Saga University Hospital, Saga, Japan

11Department of General Medicine, Graduate School of Medical and Dental Sciences, Institute of Science Tokyo, Tokyo, Japan

12Department of General Medicine, Bibai City Hospital, Hokkaido, Japan

13Department of General Internal Medicine, Kita-Harima Medical Center, Hyogo, Japan

 

About Professor Toshio Naito
Dr. Toshio Naito, MD, PhD, MBA, is a Professor in the Department of General Medicine at Juntendo University Faculty of Medicine, Tokyo, Japan. With over 30 years of clinical and academic experience, his research focuses on general medicine, infectious diseases, HIV, and medical education. He has authored 112 original articles and 4 review articles, achieving an h-index of 23 and 1,799 citations. His contributions have significantly advanced both clinical practice and medical training.

