In this work, we introduce a lightweight discourse connective detection system. Using gradient boosting trained on simple, low-complexity features, the proposed approach avoids the computational demands of current approaches that rely on deep neural networks. Despite its simplicity, our approach achieves competitive results while running substantially faster, even on CPU. Furthermore, its stable performance across two unrelated languages suggests that the system is robust in multilingual settings. The model is designed to support the annotation of discourse relations, particularly in scenarios with limited resources, while minimizing performance loss.