PolyU-led research reveals that sensory and motor inputs help large language models represent complex concepts (IMAGE)
Caption
a,b, Spearman correlations between human-generated and LLM-generated ratings for all analysed words. The x axis shows the Spearman correlation coefficients between the aggregated word ratings generated by the LLMs (GPT-3.5 and GPT-4 in a; PaLM and Gemini in b) and the corresponding human ratings. The y axis lists the dimensions being evaluated, grouped into non-sensorimotor, sensory and motor dimensions. Error bars depict 95% confidence intervals, estimated by bootstrap resampling (1,000 resamples) of the aggregated word ratings from human participants and LLMs. The central point marks the estimated correlation coefficient, which lies between the lower and upper confidence bounds. c, Radar plots showing the aggregated ratings of humans, ChatGPT models (GPT-3.5 and GPT-4) and Google LLMs (PaLM and Gemini) on each dimension for two individual concepts: ‘flower’ (a concrete word) and ‘justice’ (an abstract word). The numbers along the radial axis denote the rating ranges for these dimensions. Additional examples are provided in Supplementary Figs. 2 and 3.
Credit
© 2025 Research and Innovation Office, The Hong Kong Polytechnic University. All Rights Reserved.
Usage Restrictions
nil
License
Original content