Controllable video editing has shown considerable potential across diverse applications, particularly where capturing or re-capturing real-world video is impractical or costly. This paper introduces Place-Anything, a novel and efficient system that inserts any object into any video based solely on a picture or text description of the target object or element. The system comprises three modules: 3D generation, video reconstruction, and 3D target insertion. Together, these modules offer an efficient and effective way to produce and edit high-quality videos by seamlessly inserting realistic objects. Through a user study, we demonstrate that our system can place any object into any video with little effort, using just a photograph of the object. Our demo video is available at https://youtu.be/afXqgLLRnTE, and further details are available on our project page, https://place-anything.github.io.