Emotion recognition based on body movements is vital in human-computer interaction. However, existing emotion recognition methods predominantly focus on improving classification accuracy and rarely provide textual explanations that justify their classifications. In this paper, we propose an Emotion-Action Interpreter powered by a Large Language Model (EAI-LLM), which not only recognizes emotions but also generates textual explanations by treating 3D body movement data as unique input tokens within large language models (LLMs). Specifically, we propose a multi-granularity skeleton tokenizer designed for LLMs that separately extracts spatio-temporal tokens and semantic tokens from the skeleton data. This approach allows LLMs to generate more nuanced classification descriptions while maintaining robust classification performance. Furthermore, we treat the skeleton sequence as a distinct language and propose a unified skeleton token module. This module leverages the extensive background knowledge and language-processing capabilities of LLMs to address the challenges of joint training on heterogeneous datasets, thereby significantly improving recognition accuracy on individual datasets. Experimental results demonstrate that our model achieves recognition accuracy comparable to that of existing methods. More importantly, with the support of the background knowledge embedded in LLMs, our model can generate detailed emotion descriptions from its classification results, even when trained on a limited amount of labeled skeleton data.
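To make the multi-granularity tokenization idea concrete, the sketch below illustrates one plausible reading of it: a skeleton sequence of shape (frames, joints, 3) is mapped to a handful of spatio-temporal tokens (one per short temporal window) plus a single semantic token (a global summary of the whole motion), which could then be fed to an LLM alongside text tokens. All names, the window size, and the random linear projection standing in for a learned embedding are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def tokenize_skeleton(seq, window=4, embed_dim=8, rng=None):
    """Hypothetical multi-granularity skeleton tokenizer sketch.

    seq: (T, J, 3) array of 3D joint positions over T frames and J joints.
    Returns a (n_windows + 1, embed_dim) token matrix:
    spatio-temporal tokens (one per window) followed by one semantic token.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    T, J, C = seq.shape
    # Stand-in for a learned linear projection into the LLM's token space.
    proj = rng.standard_normal((J * C, embed_dim))

    # Spatio-temporal tokens: mean-pool each window of frames, then project.
    n_win = T // window
    windows = seq[: n_win * window].reshape(n_win, window, J * C).mean(axis=1)
    st_tokens = windows @ proj                                    # (n_win, embed_dim)

    # Semantic token: global average over all frames, then project.
    sem_token = (seq.reshape(T, J * C).mean(axis=0) @ proj)[None]  # (1, embed_dim)

    return np.concatenate([st_tokens, sem_token], axis=0)

# Example: 16 frames of a 17-joint skeleton yield 4 spatio-temporal
# tokens plus 1 semantic token.
tokens = tokenize_skeleton(np.zeros((16, 17, 3)))
print(tokens.shape)  # (5, 8)
```

In a real system the pooling and projection would be learned modules (e.g. graph or transformer encoders), but the two-granularity output, fine-grained temporal tokens plus a coarse semantic token, is the structural point the abstract describes.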