Image segmentation is a clustering task in which each pixel is assigned a cluster label. Remote sensing data usually consist of multiple bands of spectral images containing semantically meaningful land cover subregions, co-registered with other source data such as LIDAR (LIght Detection And Ranging) data where available. To account for spatial correlation between pixels, the feature vector associated with each pixel may therefore be a vectorized tensor representing the multiple bands over a local patch, as appropriate. Similarly, multiple types of texture features computed from a pixel's local patch help encode local statistical information and spatial variations. Such features can be built without labelling a large amount of ground truth pixel by pixel and then training a supervised model, which is sometimes impractical. In this work, a new semi-supervised segmentation approach is proposed that requires only a small number of labelled pixels. First, an image data matrix is created over all pixels in a high-dimensional feature space. Then t-SNE projects the high-dimensional data onto a 3D embedding. Radial basis functions centred on the labelled data samples serve as input features and are paired with the output class labels; a modified canonical correlation analysis algorithm, referred to as RBF-CCA, then learns the associated projection matrix from the small labelled data set. The associated canonical variables, computed for the full image, are then clustered with the k-means algorithm. The proposed semi-supervised RBF-CCA algorithm has been applied to several remotely sensed multispectral images, demonstrating excellent segmentation results.
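The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices in it are assumptions. The abstract does not say how the CCA is modified, so a plain ridge-regularised CCA stands in for it. The t-SNE step is omitted, so the input `X` is assumed to already be the low-dimensional embedding. The function names, the RBF `width`, the ridge term `reg`, and the farthest-first k-means initialisation are all hypothetical.

```python
import numpy as np


def patch_features(image, radius):
    """Stack each pixel's (2r+1)x(2r+1) neighbourhood over all bands into one row."""
    h, w, _ = image.shape
    padded = np.pad(image, ((radius, radius), (radius, radius), (0, 0)), mode="reflect")
    size = 2 * radius + 1
    shifts = [padded[i:i + h, j:j + w, :] for i in range(size) for j in range(size)]
    return np.concatenate(shifts, axis=2).reshape(h * w, -1)


def rbf_features(X, centres, width):
    """Gaussian RBF response of every row of X to every centre."""
    sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def fit_cca(Phi, Y, n_components, reg):
    """Ridge-regularised CCA; returns the feature mean and input-side projection."""
    mean = Phi.mean(axis=0)
    P, Q = Phi - mean, Y - Y.mean(axis=0)
    n = P.shape[0]
    Cpp = P.T @ P / n + reg * np.eye(P.shape[1])
    Cqq = Q.T @ Q / n + reg * np.eye(Q.shape[1])
    Cpq = P.T @ Q / n
    # Whiten both views with Cholesky factors, then take the SVD of the
    # whitened cross-covariance; left singular vectors give the directions.
    Lp, Lq = np.linalg.cholesky(Cpp), np.linalg.cholesky(Cqq)
    M = np.linalg.solve(Lp, Cpq) @ np.linalg.inv(Lq).T
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    W = np.linalg.solve(Lp.T, U[:, :n_components])
    return mean, W


def kmeans(Z, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(Z[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = Z[labels == j].mean(axis=0)
    return labels


def rbf_cca_segment(X, labelled_idx, y_labelled, n_classes, width=2.0, reg=1e-2):
    """Learn the projection from labelled pixels only, then cluster all pixels."""
    centres = X[labelled_idx]                      # labelled samples act as RBF centres
    Y = np.eye(n_classes)[y_labelled]              # one-hot class targets
    Phi_labelled = rbf_features(centres, centres, width)
    mean, W = fit_cca(Phi_labelled, Y, n_classes - 1, reg)
    Z = (rbf_features(X, centres, width) - mean) @ W   # canonical variables, all pixels
    return Z, kmeans(Z, n_classes)
```

In this sketch, `patch_features` would supply the high-dimensional per-pixel vectors, an embedding step such as t-SNE would reduce them, and `rbf_cca_segment` would then take the embedded pixels together with the indices and classes of the few labelled ones. The cluster indices it returns are arbitrary, so matching them to class names needs a separate step, for example a majority vote over the labelled pixels in each cluster. For a full image, the pairwise distance computation in `rbf_features` would likely need to be done in chunks to keep memory use bounded.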