Due to the widespread adoption of "work-from-home" policies, videoconferencing applications (e.g., Zoom) have become indispensable for remote communication. However, they often lack immersiveness, leading to so-called "Zoom fatigue" and degrading communication efficiency. The recently released Apple Vision Pro, a mobile headset that supports "spatial personas", aims to offer an immersive telepresence experience. In this paper, we conduct a first-of-its-kind, in-depth empirical study of the performance of immersive telepresence with Apple FaceTime, Cisco Webex, Microsoft Teams, and Zoom on Vision Pro. We find that only FaceTime provides a truly immersive experience with spatial personas, whereas the other applications still use 2D personas. Our measurement results reveal that (1) FaceTime delivers semantic data to optimize bandwidth consumption, which is even lower than that of the 2D personas used by the other applications, and (2) it employs visibility-aware optimizations to reduce rendering overhead. However, the scalability of FaceTime remains limited, with a simple server-allocation strategy that can lead to high network delay for users.