Dynamic NeRFs for Soccer Scenes

from arxiv, Accepted at the 6th International ACM Workshop on Multimedia Content Analysis in Sports. 8 pages, 9 figures. Project page: https://soccernerfs.isach.be

The long-standing problem of novel view synthesis has many applications, notably in sports broadcasting. Photorealistic novel view synthesis of soccer actions, in particular, is of enormous interest to the broadcast industry. Yet only a few industrial solutions have been proposed, and even fewer that achieve near-broadcast quality of the synthetic replays. Except for their setup of multiple static cameras around the playfield, the best proprietary systems disclose close to no information about their inner workings. Leveraging multiple static cameras for such a task indeed presents a challenge rarely tackled in the literature, for a lack of public datasets: the reconstruction of a large-scale, mostly static environment, with small, fast-moving elements. Recently, the emergence of neural radiance fields has induced stunning progress in many novel view synthesis applications, leveraging deep learning principles to produce photorealistic results in the most challenging settings. In this work, we investigate the feasibility of basing a solution to the task on dynamic NeRFs, i.e., neural models purposed to reconstruct general dynamic content. We compose synthetic soccer environments and conduct multiple experiments using them, identifying key components that help reconstruct soccer scenes with dynamic NeRFs. We show that, although this approach cannot fully meet the quality requirements for the target application, it suggests promising avenues toward a cost-efficient, automatic solution. We also make our work dataset and code publicly available, with the goal to encourage further efforts from the research community on the task of novel view synthesis for dynamic soccer scenes. For code, data, and video results, please see https://soccernerfs.isach.be.

翻译：新颖视角合成这一长期存在的问题具有许多应用，尤其在体育广播领域。足球动作的逼真新颖视角合成对广播行业具有巨大吸引力。然而，目前仅有少数工业解决方案被提出，且能达到接近广播级合成回放质量的方案更是凤毛麟角。除了在场边部署多台静态摄像机的设置外，最优秀的专有系统对其内部运作细节几乎未作披露。利用多台静态摄像机完成此类任务确实面临一个文献中鲜少涉及的挑战（因缺乏公开数据集）：重构一个以静止元素为主、包含微小快速运动物体的大规模场景。近年来，神经辐射场的出现推动了许多新颖视角合成应用的惊人进展，其利用深度学习原理在最具挑战性的环境下生成逼真结果。本研究探讨了基于动态NeRF（即旨在重构通用动态内容的神经模型）构建解决方案的可行性。我们构建了合成的足球环境，并通过多次实验识别出利用动态NeRF重构足球场景的关键组件。研究表明，尽管该方法无法完全满足目标应用的质量要求，但为开发低成本、自动化的解决方案指明了有前景的方向。我们还将研究数据集和代码公开共享，旨在鼓励研究界进一步探索动态足球场景新颖视角合成的任务。相关代码、数据及视频结果请参见https://soccernerfs.isach.be。