We introduce Interactive Intelligence, a novel paradigm for digital humans capable of personality-aligned expression, adaptive interaction, and self-evolution. To realize this, we present Mio (Multimodal Interactive Omni-Avatar), an end-to-end framework composed of five specialized modules: Thinker, Talker, Face Animator, Body Animator, and Renderer. This unified architecture integrates cognitive reasoning with real-time multimodal embodiment to enable fluid, consistent interaction. Furthermore, we establish a new benchmark to rigorously evaluate the capabilities of Interactive Intelligence. Extensive experiments demonstrate that our framework outperforms state-of-the-art methods across all evaluated dimensions. Together, these contributions move digital humans beyond superficial imitation toward intelligent interaction.