Deep convolutional neural networks (CNNs) have been shown to predict poverty and development indicators from satellite images with surprising accuracy. This paper presents a first attempt at analyzing the CNN's responses in detail and explaining the basis for its predictions. The CNN model, although trained on relatively low-resolution day- and night-time satellite images, outperforms human subjects who look at high-resolution images in ranking the Wealth Index categories. Multiple explainability experiments performed on the model indicate the importance of object sizes and pixel colors in the image, and provide a visualization of the importance of different structures in the input images. A visualization is also provided of typical images that maximize the network's prediction of Wealth Index, which provides clues about what the CNN prediction is based on.