Remote teaching has recently become popular because of its convenience and safety, especially under extreme circumstances such as a pandemic. However, online students usually have a poor experience because the views provided by broadcast platforms convey limited information. One potential solution is to show more camera views simultaneously, but this is technically challenging and distracting for viewers. An automatic multi-camera directing/editing system, which selects the most relevant view at each time instant to guide the attention of online students, is therefore urgently needed. However, existing systems mostly make simple assumptions and focus on tracking the position of the speaker rather than the actual lecture semantics, and therefore have limited capacity to deliver an optimal information flow. To this end, this paper proposes an automatic multi-purpose editing system based on lecture semantics, which can both direct multiple video streams for real-time broadcasting and edit an optimal video offline for review purposes. Our system directs the views by semantically analyzing class events while following professional directing rules, mimicking a human director to capture the regions of interest from the viewpoint of onsite students. We conduct both qualitative and quantitative analyses to verify the effectiveness of the proposed system and its components.