News Release 14-Apr-2026

Can artificial intelligence match medical interview assessments by clinicians?

Researchers report that artificial intelligence-based scoring of interview transcripts is comparable to clinicians’ scores while reducing evaluation time

Peer-Reviewed Publication

Juntendo University Research Promotion Center

Image: Artificial Intelligence- vs Human-Based Assessment of Medical Interview Transcripts
**Artificial intelligence can evaluate medical interview transcripts with accuracy comparable to expert clinicians while enabling faster and more scalable feedback in medical training**

Credit: Professor Toshio Naito from Department of General Medicine, Juntendo University Faculty of Medicine, Japan

Clinical interviewing is one of the most important skills physicians develop during their training. It forms the foundation for accurate diagnosis and effective patient care. However, evaluating these skills is often time-intensive, requiring repeated observations and detailed feedback from experienced clinicians. As medical education continues to expand, this growing assessment burden has become a significant challenge. Generative artificial intelligence (AI) has the potential to substantially improve the assessment of interviewing skills; however, its efficiency relative to standard evaluation systems is not well understood.

To fill this gap, researchers from Japan explored whether AI could help address this issue by evaluating medical interview transcripts. Their findings were published on February 17, 2026, in Volume 12 of the journal JMIR Medical Education. The research team, led by Dr. Hiromizu Takahashi (corresponding author) and Professor Toshio Naito, both from the Department of General Medicine, Juntendo University Faculty of Medicine, Japan, examined whether AI-based assessment (ABA) could match traditional human-based assessment (HBA).

“Our central message is that AI may help make medical training fairer, faster, and more scalable,” explains Prof. Naito.

To compare the ABA and HBA systems, the researchers designed a cross-sectional validation study using a virtual patient system. Seven participants, including medical students, resident physicians, and attending physicians, conducted clinical interviews with an AI-simulated patient presenting with bilateral leg weakness. These conversations were automatically recorded and converted into transcripts. The transcripts were then evaluated using the Master Interview Rating Scale, a standardized tool that assesses various aspects of clinical communication, such as information gathering, organization, and empathy. For the ABA system, two AI models, GPT-o1 Pro and GPT-5 Pro, were used to assess the transcripts. For the HBA approach, five experienced clinical instructors independently evaluated the same transcripts.

According to the researchers, ABA showed strong agreement with clinician evaluations, with only minimal differences in scores. At the same time, AI demonstrated greater consistency across repeated evaluations. Importantly, the use of AI also reduced the time required to assess each transcript by more than half, highlighting its potential to ease the workload of educators. “Rather than replacing teachers, this research suggests a practical ‘AI-first, faculty-verified’ model in which AI handles the first pass and educators focus their time on coaching, judgment, and high-stakes decisions,” says Dr. Takahashi.

These results have important implications for medical education. In many training programs, delays in feedback can limit opportunities for students to improve their communication skills. By providing rapid and consistent evaluations, AI could make repeated practice more accessible, particularly in settings with limited faculty resources. “Students could interview an AI-simulated patient and receive feedback almost immediately instead of waiting days or weeks,” Prof. Naito adds, highlighting the potential for more timely learning experiences.

At the same time, the researchers emphasize that AI should be used with care. Although AI performed well, the study involved a small number of participants and a single clinical scenario. In addition, transcript-based evaluation cannot capture nonverbal cues, tone, or cultural nuances that are often important in real-world patient interactions. Prof. Naito and Dr. Takahashi caution, “AI should be used with human oversight, because text-only scoring can miss nuances such as tone, nonverbal communication, and cultural context.”

Overall, this study highlights the growing role of AI in medical education. By combining the speed and consistency of AI with the expertise and judgment of clinicians, it may be possible to create more efficient and scalable training systems. As the demand for high-quality medical education continues to rise, such approaches could help ensure that future clinicians receive the best training while reducing the burden on educators.

***

Reference
DOI: 10.2196/81673

Authors: Hiromizu Takahashi¹, Kiyoshi Shikino², Takeshi Kondo³,⁴, Yuji Yamada⁵, Yoshitaka Tomoda⁶, Minoru Kishi⁷, Yuki Aiyama⁸, Sho Nagai⁹, Akiko Enomoto⁹, Yoshinori Tokushima¹⁰, Takahiro Shinohara¹¹, Fumiaki Sano¹, Takeshi Matsuura¹², Rikiya Watanabe¹³, and Toshio Naito¹

Affiliations
¹Department of General Medicine, Faculty of Medicine, Juntendo University, Tokyo, Japan

²Department of Community-Oriented Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan

³Center for Postgraduate Clinical Training and Career Development, Nagoya University Hospital, Nagoya, Japan

⁴The School of Health Professions Education, Maastricht University, Maastricht, The Netherlands

⁵Brookdale Department of Geriatrics and Palliative Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, United States

⁶Department of General Internal Medicine, Itabashi Chuo Medical Center, Tokyo, Japan

⁷Department of Internal Medicine, Nishiwaki Municipal Hospital, Hyogo, Japan

⁸Anesthesiology and Critical Care Medicine, Tenri Hospital, Nara, Japan

⁹Department of Nursing, School of Nursing, University of Human Environments, Aichi, Japan

¹⁰Department of General Medicine, Saga University Hospital, Saga, Japan

¹¹Department of General Medicine, Graduate School of Medical and Dental Sciences, Institute of Science Tokyo, Tokyo, Japan

¹²Department of General Medicine, Bibai City Hospital, Hokkaido, Japan

¹³Department of General Internal Medicine, Kita-Harima Medical Center, Hyogo, Japan

About Professor Toshio Naito
Dr. Toshio Naito, MD, PhD, MBA, is a Professor in the Department of General Medicine at Juntendo University Faculty of Medicine, Tokyo, Japan. With over 30 years of clinical and academic experience, his research focuses on general medicine, infectious diseases, HIV, and medical education. He has authored 112 original articles and 4 review articles, achieving an h-index of 23 and 1,799 citations. His contributions have significantly advanced both clinical practice and medical training.

Journal

JMIR Medical Education

DOI

10.2196/81673

Method of Research

Observational study

Subject of Research

People

Article Title

AI- vs Human-Based Assessment of Medical Interview Transcripts in a Generative AI–Simulated Patient System: Cross-Sectional Validation Study

Article Publication Date

17-Feb-2026
