Neuron analysis provides insight into how knowledge is structured in learned representations and reveals the role of individual neurons in the network. Beyond improving our understanding of models, neuron analysis enables applications such as debiasing, domain adaptation, and architectural search. We present NeuroX, a comprehensive open-source toolkit for neuron analysis of natural language processing models. It implements a range of interpretation methods under a unified API and provides a framework for data processing and evaluation, making it easier for researchers and practitioners to perform neuron analysis. The Python toolkit is available at https://www.github.com/fdalvi/NeuroX. A demo video is available at https://youtu.be/mLhs2YMx4u8.