Transformers and pre-trained models are costly and complex, which raises efficiency concerns for long text classification. Meanwhile, in highly sensitive domains such as healthcare and legal long-text mining, potential distrust of the model, though underrated and underexplored, can be a serious concern. Existing methods generally segment the long text, encode each segment with a pre-trained model, and use attention or RNNs to obtain a long-text representation for classification. In this work, we propose a simple but effective model, Segment-aWare multIdimensional PErceptron (SWIPE), to replace the attention/RNN component in this framework. Unlike prior efforts, SWIPE can effectively learn the label of the entire text with supervised training, while perceiving the labels of the individual segments and estimating their contributions to the long-text label in an unsupervised manner. As a general classifier, SWIPE can be paired with different encoders, and it outperforms SOTA models in both classification accuracy and model efficiency. Notably, SWIPE also offers superior interpretability, making long text classification results transparent.
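To make the segment-then-encode framework described in the abstract concrete, the following is a minimal sketch, not the SWIPE architecture itself, whose details are not given here. It assumes a hypothetical stand-in encoder (`toy_encode`, a hashed bag-of-words in place of a pre-trained model), a hypothetical linear scorer (`weights`), and simple averaging of per-segment class probabilities in place of attention, RNNs, or SWIPE. All function names and parameters are illustrative assumptions.

```python
import hashlib
import math
from typing import List, Tuple


def split_into_segments(text: str, max_words: int = 50) -> List[str]:
    """Split text into consecutive segments of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_encode(segment: str, dim: int = 16) -> List[float]:
    """Hypothetical stand-in for a pre-trained encoder:
    a hashed bag-of-words vector, L2-normalised."""
    vec = [0.0] * dim
    for word in segment.lower().split():
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_long_text(
    text: str, weights: List[List[float]], max_words: int = 50
) -> Tuple[List[float], List[List[float]]]:
    """Score each segment with a linear layer, then average the
    per-segment class probabilities into a document-level prediction.

    weights is a (num_classes x dim) matrix; in practice it would be
    learned, here it is an assumed input.
    Returns (document_probs, per_segment_probs).
    """
    segments = split_into_segments(text, max_words)
    if not segments:
        raise ValueError("text contains no words")
    dim = len(weights[0])
    seg_probs = []
    for seg in segments:
        emb = toy_encode(seg, dim=dim)
        logits = [sum(w * e for w, e in zip(row, emb)) for row in weights]
        seg_probs.append(softmax(logits))
    num_classes = len(weights)
    doc_probs = [
        sum(p[c] for p in seg_probs) / len(seg_probs) for c in range(num_classes)
    ]
    return doc_probs, seg_probs
```

Because the per-segment probabilities are returned alongside the document-level prediction, one can inspect which segments push the document toward each class. This is only a loose illustration of the kind of segment-level interpretability the abstract refers to, not the mechanism SWIPE uses.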