Meeting online is becoming the new normal. Creating an immersive experience for online meetings is necessary for more diverse and seamless environments. Efficient photorealistic rendering of 3D human dynamics is at the core of immersive meetings. Current popular applications achieve real-time conferencing but fall short of delivering photorealistic human dynamics, either because they are confined to 2D space or because they rely on avatars that lack realistic interactions between participants. Recent advances in neural rendering, such as Neural Radiance Fields (NeRF), offer the potential for greater realism in metaverse meetings. However, the slow rendering speed of NeRF poses challenges for real-time conferencing. We envision a pipeline for a future extended reality metaverse conferencing system that leverages monocular video acquisition and free-viewpoint synthesis to improve data and hardware efficiency. Toward an immersive conferencing experience, we explore an accelerated NeRF-based free-viewpoint synthesis algorithm that renders photorealistic human dynamics more efficiently. We show that our algorithm achieves comparable rendering quality while performing training and inference 44.5% and 213% faster, respectively, than state-of-the-art methods. Our exploration provides a design basis for constructing metaverse conferencing systems that can handle complex application scenarios, including dynamic scene relighting with customized themes and multi-user conferencing that harmonizes real-world people into an extended world.