Adaptive 360° video streaming for teleoperation faces two coupled challenges: viewport prediction under uncertain gaze patterns and bitrate adaptation over fluctuating wireless channels. While Deep Reinforcement Learning (DRL) methods achieve high Quality of Experience (QoE), their lack of interpretability and dependence on offline training limit deployment in safety-critical systems. We propose OrbitStream, a training-free framework that formulates viewport prediction as a Gravitational Viewport Prediction (GVP) problem, where semantic objects generate potential fields that attract operator gaze, and employs a Saturation-Based Proportional-Derivative (PD) Controller for buffer regulation. On object-rich teleoperation traces, OrbitStream achieves 94.7% zero-shot viewport prediction accuracy without user-specific profiling, approaching trajectory-extrapolation baselines (~98.5%). Across 3,600 Monte Carlo simulations, it ranks second among 12 algorithms (QoE 2.71 vs. BOLA-E's 2.80), outperforming FastMPC (1.84), with 1.01 ms decision latency and minimal rebuffering.
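To make the two mechanisms concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the form of the potential field, the controller gains, or the bitrate ladder, so every name and constant here is an assumption chosen for illustration: `predict_viewport`, `pd_bitrate`, the softened inverse-square attraction, the gains `kp` and `kd`, the 4 s buffer target, and the ±50% saturation limit. The first function treats the gaze as a particle pulled toward semantic objects weighted by a "mass"; the second applies a PD law to the buffer error, clips the control signal, and maps it to a bitrate.

```python
import math


def predict_viewport(gaze, velocity, objects, dt=0.1, g=1.0, eps=0.05):
    """One Euler step of a gaze 'particle' attracted by semantic objects.

    gaze, velocity: (yaw, pitch) in radians and radians per second.
    objects: list of (yaw, pitch, mass); 'mass' is a hypothetical saliency weight.
    """
    acc_yaw = acc_pitch = 0.0
    for obj_yaw, obj_pitch, mass in objects:
        # Shortest yaw difference, wrapped to [-pi, pi) across the 360° seam.
        d_yaw = (obj_yaw - gaze[0] + math.pi) % (2 * math.pi) - math.pi
        d_pitch = obj_pitch - gaze[1]
        # Softened squared distance avoids a singularity at the object centre.
        r2 = d_yaw * d_yaw + d_pitch * d_pitch + eps
        pull = g * mass / (r2 * math.sqrt(r2))  # assumed inverse-square attraction
        acc_yaw += pull * d_yaw
        acc_pitch += pull * d_pitch
    vel_yaw = velocity[0] + acc_yaw * dt
    vel_pitch = velocity[1] + acc_pitch * dt
    yaw = (gaze[0] + vel_yaw * dt + math.pi) % (2 * math.pi) - math.pi
    pitch = max(-math.pi / 2, min(math.pi / 2, gaze[1] + vel_pitch * dt))
    return (yaw, pitch), (vel_yaw, vel_pitch)


def pd_bitrate(buffer_s, prev_buffer_s, throughput, ladder,
               target_s=4.0, kp=0.3, kd=0.1, dt=1.0, limit=0.5):
    """Saturated PD law on buffer level, mapped to a bitrate from the ladder.

    All gains and limits are illustrative assumptions, not values from the paper.
    """
    error = buffer_s - target_s                # positive when the buffer is above target
    d_error = (buffer_s - prev_buffer_s) / dt  # buffer trend since the last decision
    u = kp * error + kd * d_error
    u = max(-limit, min(limit, u))             # saturation bounds the control signal
    budget = throughput * (1.0 + u)            # scale the throughput estimate
    feasible = [b for b in ladder if b <= budget]
    return max(feasible) if feasible else min(ladder)


if __name__ == "__main__":
    gaze, vel = predict_viewport((0.0, 0.0), (0.0, 0.0), [(0.5, 0.0, 1.0)])
    print("predicted gaze:", gaze)
    print("bitrate:", pd_bitrate(2.0, 3.0, 5.0, [1.0, 2.5, 5.0, 8.0]))
```

Both functions are closed-form and stateless apart from the previous buffer level, which is consistent with the training-free, low-latency design the abstract describes. The saturation step keeps a single extreme buffer reading from producing an unbounded bitrate jump.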