The advent of satellite-borne machine learning hardware accelerators has enabled on-board processing of payload data using machine learning techniques such as convolutional neural networks (CNNs). A notable example is using a CNN to detect clouds in hyperspectral data captured on Earth observation (EO) missions, so that only clear-sky data is downlinked to conserve bandwidth. However, prior to deployment, a new mission that employs new sensors will not have enough representative data to train a CNN model, while a model trained solely on data from previous missions will underperform when deployed on the new mission. This underperformance stems from the domain gap, i.e., differences in the underlying distributions of the data generated by the different sensors in previous and future missions. In this paper, we address the domain gap problem in the context of on-board hyperspectral cloud detection. Our main contributions are formulating new domain adaptation tasks motivated by a concrete EO mission, developing a novel algorithm for bandwidth-efficient supervised domain adaptation, and demonstrating test-time adaptation algorithms on space-deployable neural network accelerators. Our contributions allow domain adaptation to be achieved with minimal data transmission (e.g., only 1% of the weights in ResNet50), thereby allowing more sophisticated CNN models to be deployed and updated on satellites without being hampered by domain gap and bandwidth limitations.
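To make the bandwidth argument concrete, the following is a minimal sketch of how an update touching only 1% of a model's weights might be packaged for uplink and applied on board. It is not the algorithm proposed in the paper: the abstract does not say which weights are selected or how they are encoded. The selection rule used here (largest-magnitude changes), the index-plus-value encoding, the helper names `make_sparse_update` and `apply_sparse_update`, and the 1,000,000-parameter toy model are all assumptions made for illustration.

```python
import numpy as np


def make_sparse_update(old_w, new_w, fraction=0.01):
    """Ground side: keep only the largest-magnitude weight changes.

    Assumption: top-|delta| selection is an illustrative choice, not the
    paper's method.
    """
    delta = new_w - old_w
    k = max(1, int(round(fraction * delta.size)))
    # Indices of the k entries with the largest absolute change.
    idx = np.argpartition(np.abs(delta), -k)[-k:]
    idx = np.sort(idx).astype(np.uint32)
    return idx, delta[idx].astype(np.float32)


def apply_sparse_update(old_w, idx, values):
    """Satellite side: patch the stored weights with the received update."""
    updated = old_w.copy()
    updated[idx] += values
    return updated


# Hypothetical flattened model with 1,000,000 float32 parameters.
rng = np.random.default_rng(0)
n_params = 1_000_000
old_w = rng.standard_normal(n_params).astype(np.float32)
new_w = old_w + 0.01 * rng.standard_normal(n_params).astype(np.float32)

idx, values = make_sparse_update(old_w, new_w, fraction=0.01)
onboard_w = apply_sparse_update(old_w, idx, values)

full_bytes = old_w.nbytes                   # cost of uplinking every weight
sparse_bytes = idx.nbytes + values.nbytes   # cost of indices plus values
print(f"full update:   {full_bytes} bytes")
print(f"sparse update: {sparse_bytes} bytes")
```

Under these assumptions the sparse update costs 80,000 bytes against 4,000,000 bytes for the full weight vector. The payload is 2% of the full size rather than 1% because each transmitted value also carries a 4-byte index. A scheme that updates a fixed, pre-agreed subset of weights would avoid sending indices at all.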