Neuron labeling is an approach to visualizing the behaviour and response of a given neuron to the patterns that activate it. It extracts information about the features captured by particular neurons in a deep neural network; one way to do this is the encoder-decoder image captioning approach, in which the encoder can be a pretrained CNN-based model and the decoder is an RNN-based model for text generation. Previous work, namely MILAN (Mutual Information-guided Linguistic Annotation of Neurons), visualizes neuron behaviour using a modified Show, Attend, and Tell (SAT) model as the encoder and an LSTM with Bahdanau attention as the decoder. MILAN performs well on short-sequence neuron captioning but not on long-sequence neuron captioning. In this work, we aim to improve on MILAN by using different kinds of attention mechanisms and by adding the results of several attention mechanisms into one, in order to combine their advantages. On our compound dataset, our proposed model obtains higher BLEU and F1-Score, achieving 17.742 and 0.4811, respectively. At the point where the model converges at its peak, it obtains a BLEU of 21.2262 and a BERTScore F1-Score of 0.4870.
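As a rough illustration of what "adding several attention results into one" could look like, the sketch below computes a Bahdanau-style additive attention context and a scaled dot-product attention context over the same encoder features and sums them element-wise. This is a minimal sketch under stated assumptions, not the model described above: the abstract does not say which attention mechanisms are combined, so the choice of scaled dot-product attention as the second mechanism, the function names, and the parameters `W_q`, `W_k`, and `v` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def weighted_sum(weights, values):
    """Context vector: sum_i weights[i] * values[i]."""
    dim = len(values[0])
    return [sum(w * val[i] for w, val in zip(weights, values)) for i in range(dim)]


def additive_attention(query, keys, W_q, W_k, v):
    """Bahdanau-style attention: score_i = v . tanh(W_q q + W_k k_i)."""
    q_proj = matvec(W_q, query)
    scores = []
    for k in keys:
        k_proj = matvec(W_k, k)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(dot(v, hidden))
    weights = softmax(scores)
    return weighted_sum(weights, keys), weights


def scaled_dot_attention(query, keys):
    """Scaled dot-product attention: score_i = (q . k_i) / sqrt(d)."""
    scale = math.sqrt(len(query))
    scores = [dot(query, k) / scale for k in keys]
    weights = softmax(scores)
    return weighted_sum(weights, keys), weights


def combined_context(query, keys, W_q, W_k, v):
    """Assumed fusion rule: element-wise sum of the two context vectors."""
    ctx_add, _ = additive_attention(query, keys, W_q, W_k, v)
    ctx_dot, _ = scaled_dot_attention(query, keys)
    return [a + b for a, b in zip(ctx_add, ctx_dot)]


# Toy usage: a 2-d decoder state attending over three encoder feature vectors.
identity = [[1.0, 0.0], [0.0, 1.0]]
decoder_state = [0.5, -0.5]
encoder_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = combined_context(decoder_state, encoder_features, identity, identity, [1.0, 1.0])
```

In a real decoder the projection matrices would be learned, and the summed context would be fed to the LSTM at each decoding step; summation is only one possible fusion rule, with concatenation or a learned gate as alternatives.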