Satire detection and sentiment analysis are intensively explored natural language processing (NLP) tasks that aim, respectively, to identify satirical tone in text and to extract sentiments in relation to their targets. For languages with fewer research resources, one way to overcome dataset size limitations is to generate artificial examples through character-level adversarial processes. Such samples have been shown to act as a regularizer, thereby improving model robustness. In this work, we enhance well-known NLP models (i.e., Convolutional Neural Networks, Long Short-Term Memory (LSTM) networks, Bidirectional LSTMs, Gated Recurrent Units (GRUs), and Bidirectional GRUs) with adversarial training and capsule networks. The fine-tuned models are applied to satire detection and sentiment analysis in Romanian. The proposed framework outperforms existing methods on both tasks, achieving up to 99.08% accuracy, which confirms the improvements brought by the capsule layers and adversarial training in NLP approaches.
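To make the idea of character-level artificial examples concrete, the following is a minimal sketch in Python. The abstract does not specify how the adversarial samples are produced, so this uses random character edits (swap, delete, duplicate) as a simpler stand-in rather than a true adversarial procedure; the function name `perturb_chars` and its parameters `rate` and `seed` are hypothetical and chosen only for illustration.

```python
import random


def perturb_chars(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Return a copy of `text` with random character-level edits.

    Assumption: random edits stand in for the adversarial process;
    this is not the method described in the paper.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        # Only letters are edited; spaces, digits, and punctuation are kept.
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(("swap", "delete", "duplicate"))
            if op == "swap" and i + 1 < len(chars) and chars[i + 1].isalpha():
                # Exchange this letter with the following letter.
                out.extend((chars[i + 1], c))
                i += 2
                continue
            if op == "delete":
                # Drop the letter entirely.
                i += 1
                continue
            if op == "duplicate":
                # Emit the letter twice.
                out.extend((c, c))
                i += 1
                continue
            # A swap that is not possible falls through and keeps the letter.
        out.append(c)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    sentence = "Aceasta este o știre satirică."
    # Several noisy variants of one sentence, each from a different seed.
    for s in range(3):
        print(perturb_chars(sentence, rate=0.2, seed=s))
```

Each perturbed variant would keep the label of its source sentence, so a small labeled corpus could be enlarged with noisy copies; whether this matches the augmentation used in the paper is an assumption.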