Creating compelling 3D character animations typically requires either expert use of professional software or expensive motion capture systems and skilled actors. We present DancingBox, a lightweight, vision-based system that makes motion capture accessible to novices by reimagining the process as digital puppetry. Instead of tracking precise human motions, DancingBox uses a single webcam to capture the approximate movements of everyday objects manipulated by users. These coarse proxy motions are then refined into realistic character animations by a generative motion model that is conditioned on bounding-box representations and draws on human motion priors learned from large-scale datasets. To overcome the lack of paired proxy-animation data, we synthesize training pairs by converting existing motion capture sequences into proxy representations. A user study demonstrates that DancingBox enables intuitive and creative character animation using diverse proxies, from plush toys to bananas, lowering the barrier to entry for novice animators.
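The idea of synthesizing training pairs from existing motion capture data can be illustrated with a minimal sketch. The abstract does not specify how sequences are converted into proxy representations, so everything below is an assumption made for illustration: the joint grouping `PART_JOINTS`, the use of axis-aligned boxes, the `padding` parameter, and the function names `frame_to_boxes` and `make_training_pair` are all hypothetical and are not taken from DancingBox.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner) of an axis-aligned box

# Hypothetical grouping of joint indices into coarse body parts.
# A real skeleton would define its own joints and groups.
PART_JOINTS: Dict[str, List[int]] = {
    "torso": [0, 1, 2],
    "left_arm": [3, 4],
    "right_arm": [5, 6],
}


def frame_to_boxes(
    frame: Sequence[Point],
    part_joints: Dict[str, List[int]] = PART_JOINTS,
    padding: float = 0.05,
) -> Dict[str, Box]:
    """Reduce one frame of 3D joint positions to one padded box per body part."""
    boxes: Dict[str, Box] = {}
    for name, indices in part_joints.items():
        points = [frame[i] for i in indices]
        # Per-axis minimum and maximum over the joints in this part.
        lo = tuple(min(p[a] for p in points) - padding for a in range(3))
        hi = tuple(max(p[a] for p in points) + padding for a in range(3))
        boxes[name] = (lo, hi)
    return boxes


def make_training_pair(
    motion: Sequence[Sequence[Point]],
    padding: float = 0.05,
) -> Tuple[List[Dict[str, Box]], Sequence[Sequence[Point]]]:
    """Pair a coarse box sequence (model input) with the original motion (target)."""
    proxy = [frame_to_boxes(frame, padding=padding) for frame in motion]
    return proxy, motion
```

Under these assumptions, each motion capture clip yields one (proxy, motion) pair: the box sequence stands in for what a webcam might observe from a manipulated object, and the original joint sequence is the target the generative model would learn to recover.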