In this paper, we report our experiments with various strategies for improving code-mixed humour and sarcasm detection. All experiments target the Hindi-English code-mixed scenario, for which we have the requisite linguistic expertise. We experimented with three approaches, namely (i) native sample mixing, (ii) multi-task learning (MTL), and (iii) prompting very large multilingual language models (VMLMs). In native sample mixing, we added monolingual task samples to the code-mixed training sets. In MTL, we relied on native and code-mixed samples of a semantically related task (hate detection, in our case). Finally, in the third approach, we evaluated the efficacy of VMLMs via few-shot in-context prompting. Our key findings are: (i) adding native samples improved both humour detection (raising the F1-score by up to 6.76%) and sarcasm detection (by up to 8.64%), (ii) training MLMs in an MTL framework boosted performance for both humour (F1-score gains of up to 10.67%) and sarcasm (up to 12.35%) detection, and (iii) prompting VMLMs could not outperform the other approaches. Finally, our ablation studies and error analysis identified the cases where our models have yet to improve. We provide our code for reproducibility.
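As a rough illustration of the third approach, the sketch below shows how a few-shot in-context prompt for code-mixed humour detection could be assembled before being sent to a VMLM. This is a minimal, hypothetical construction, not the paper's exact prompt template; the demonstration texts, labels, and function name are placeholders.

```python
# Minimal sketch (hypothetical, not the paper's exact template) of building
# a few-shot in-context prompt for code-mixed humour detection.

def build_few_shot_prompt(examples, query):
    """Concatenate k labelled demonstrations followed by the unlabelled query."""
    lines = ["Classify the text as 'humorous' or 'not humorous'.", ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The model is expected to continue the prompt with the missing label.
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)

# Hypothetical Hindi-English code-mixed demonstrations.
demos = [
    ("Yaar ye joke toh full comedy hai!", "humorous"),
    ("Meeting kal 10 baje hai.", "not humorous"),
]
prompt = build_few_shot_prompt(demos, "Exam ke baad meri shakal dekhne layak thi :D")
print(prompt)
```

The completed prompt would then be passed to the multilingual model, and the generated continuation parsed as the predicted label.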