Indoor 360° panoramas have two essential properties: (1) they are continuous and seamless in the horizontal direction, and (2) gravity plays an important role in indoor environment design. Leveraging these properties, we present PanelNet, a framework that understands indoor environments using a novel panel representation of 360° images. We represent an equirectangular projection (ERP) as consecutive vertical panels with corresponding 3D panel geometry. To reduce the negative impact of panoramic distortion, we incorporate a panel geometry embedding network that encodes both the local and global geometric features of a panel. To capture the geometric context in room design, we introduce the Local2Global Transformer, which aggregates local information within a panel and panel-wise global context; it greatly improves model performance with low training overhead. Our method outperforms existing methods on indoor 360° depth estimation and shows competitive results against state-of-the-art approaches on the tasks of indoor layout estimation and semantic segmentation.
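The panel representation described above can be illustrated with a minimal NumPy sketch. It is not the PanelNet implementation: the function names, the y-up axis convention, and the `panel_width` and `stride` parameters are assumptions made for illustration. The sketch only shows how an ERP image might be sliced into consecutive vertical panels that wrap around the horizontal seam, and how per-pixel unit-sphere coordinates could serve as the accompanying 3D panel geometry.

```python
import numpy as np


def erp_unit_sphere(height, width):
    """Return (height, width, 3) unit-sphere coordinates for an ERP grid.

    Assumed convention (not taken from the paper): y points up along the
    gravity axis, and longitude increases from left to right.
    """
    # Sample at pixel centres so the left and right edges meet seamlessly.
    lon = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)  # both have shape (height, width)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)


def split_into_panels(erp, panel_width, stride):
    """Slice an (H, W, C) array into vertical panels of shape (N, H, P, C).

    Column indices are taken modulo W, so panels that start near the right
    edge continue from the left edge (horizontal continuity of the panorama).
    """
    width = erp.shape[1]
    starts = np.arange(0, width, stride)
    cols = (starts[:, None] + np.arange(panel_width)[None, :]) % width
    panels = erp[:, cols]  # shape (H, N, P, C)
    return np.moveaxis(panels, 1, 0)


# Hypothetical sizes, chosen only to keep the example small.
image = np.random.rand(64, 128, 3)
geometry = erp_unit_sphere(64, 128)
image_panels = split_into_panels(image, panel_width=32, stride=16)
geometry_panels = split_into_panels(geometry, panel_width=32, stride=16)
```

Because the same slicing is applied to the image and to its geometry grid, each image panel stays aligned with the 3D coordinates of its pixels, which is the kind of input a geometry embedding could consume.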