Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection. This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method. Our analysis across various datasets reveals that this incremental method excels at adapting PLMs to EDS, outperforming traditional domain adaptation techniques. These findings highlight the importance of continually updating PLMs to ensure their effectiveness in real-world applications, paving the way for future research into PLM robustness against the natural temporal evolution of language.
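As a rough illustration of the incremental self-training idea named above, the sketch below adapts a classifier one time step at a time: it pseudo-labels each new unlabeled domain with the current model, then refits on those pseudo-labels before moving on. This is a minimal sketch under stated assumptions, not the study's implementation: a toy nearest-centroid classifier on 2-D points stands in for a fine-tuned PLM, and the names `fit_centroids`, `predict`, `incremental_self_train`, `make_domain`, and `accuracy`, as well as the synthetic drifting data, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Model = Dict[int, Point]  # class label -> centroid


def fit_centroids(xs: Sequence[Point], ys: Sequence[int]) -> Model:
    """Toy stand-in for fine-tuning a PLM: one centroid per class."""
    sums: Dict[int, List[float]] = {}
    for (x0, x1), y in zip(xs, ys):
        s = sums.setdefault(y, [0.0, 0.0, 0.0])
        s[0] += x0
        s[1] += x1
        s[2] += 1.0
    return {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items()}


def predict(model: Model, x: Point) -> int:
    """Return the class whose centroid is closest to x."""
    return min(
        model,
        key=lambda y: (x[0] - model[y][0]) ** 2 + (x[1] - model[y][1]) ** 2,
    )


def incremental_self_train(
    src_xs: Sequence[Point],
    src_ys: Sequence[int],
    unlabeled_domains: Sequence[Sequence[Point]],
) -> Model:
    """Adapt step by step through chronologically ordered unlabeled domains."""
    model = fit_centroids(src_xs, src_ys)  # train on the labeled source domain
    for xs in unlabeled_domains:
        pseudo = [predict(model, x) for x in xs]  # pseudo-label with current model
        model = fit_centroids(xs, pseudo)  # refit on the newest domain
    return model


def make_domain(t: int) -> Tuple[List[Point], List[int]]:
    """Synthetic domain at time t: both classes drift by +1 along x per step."""
    offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]
    xs = [(t + dx, dy) for dx, dy in offsets]  # class 0
    xs += [(t + 4 + dx, dy) for dx, dy in offsets]  # class 1
    return xs, [0] * 4 + [1] * 4


def accuracy(model: Model, xs: Sequence[Point], ys: Sequence[int]) -> float:
    return sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    src_xs, src_ys = make_domain(0)
    stream = [make_domain(t)[0] for t in range(1, 6)]  # labels withheld
    final_xs, final_ys = make_domain(5)
    source_only = fit_centroids(src_xs, src_ys)
    adapted = incremental_self_train(src_xs, src_ys, stream)
    print("source-only:", accuracy(source_only, final_xs, final_ys))
    print("incremental:", accuracy(adapted, final_xs, final_ys))
```

In this toy setup the source-only model misclassifies the drifted class, while the incrementally adapted model tracks the shift because each step is small relative to the class separation. A practical variant would likely filter pseudo-labels by confidence and guard against a class disappearing from the pseudo-labels; both are omitted here for brevity.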