Autoencoding, which reconstructs an input image through a bottleneck latent representation, is a classic strategy for learning feature representations. It has been shown to be effective as an auxiliary task for semi-supervised learning, but it has become less popular as more sophisticated methods have been proposed in recent years. In this paper, we revisit the idea of using image reconstruction as an auxiliary task and incorporate it into a modern semi-supervised semantic segmentation framework. Surprisingly, we find that this old idea from semi-supervised learning can produce results competitive with state-of-the-art semantic segmentation algorithms. By visualizing the intermediate-layer activations of the image reconstruction module, we show that feature map channels can correlate well with semantic concepts, which explains why joint training with the reconstruction task helps the segmentation task. Motivated by this observation, we further propose a modification to the image reconstruction task that aims to better disentangle object cues from background patterns. Experiments on various datasets show that using reconstruction as an auxiliary loss leads to consistent improvements across datasets and methods. The proposed method yields additional, significant improvements on object-centric segmentation tasks.
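To make the idea of reconstruction as an auxiliary loss concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a per-pixel cross-entropy segmentation loss, a mean-squared-error reconstruction loss, and a hypothetical weighting factor `recon_weight`. All function names are illustrative, and the sketch leaves out whatever unlabeled-data loss the underlying semi-supervised framework uses.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood over labeled pixels.

    probs: list of per-pixel class-probability lists.
    labels: list of ground-truth class ids, one per pixel.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def mse(recon, image):
    """Mean squared error between reconstructed and input pixel values."""
    return sum((r - x) ** 2 for r, x in zip(recon, image)) / len(image)


def joint_loss(probs, labels, recon, image, recon_weight=0.1):
    """Segmentation loss plus a weighted reconstruction term.

    For an unlabeled image, pass labels=None: only the reconstruction
    term contributes. recon_weight=0.1 is an assumed, illustrative value.
    """
    seg = cross_entropy(probs, labels) if labels is not None else 0.0
    return seg + recon_weight * mse(recon, image)
```

Because the reconstruction term needs no annotation, it can be computed on labeled and unlabeled images alike, which is what makes it usable as an auxiliary signal in the semi-supervised setting.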