Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground-truth datasets with moral annotations have been released. However, these datasets differ in data collection method, domain, topics, and annotator instructions, among other factors. Simply aggregating such heterogeneous datasets during training can yield models that generalize poorly. We describe a data fusion framework for training on multiple heterogeneous datasets that improves performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to handle label shift. We show that the proposed framework achieves state-of-the-art performance across different datasets compared with prior work on morality inference.
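The two ingredients named in the abstract, a weighted loss for label shift and an adversarial term for feature alignment, can be illustrated with a small sketch. This is not the authors' implementation: the inverse-frequency weighting, the ramp-up schedule for the adversarial strength, the function names, and the "moral"/"non-moral" labels are all assumptions chosen for illustration, and the sketch computes scalar losses only rather than training a model.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: n_samples / (n_classes * class_count).

    Rare classes get larger weights, which counteracts label shift
    between datasets with different class balances.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs[i] is a dict mapping each class to its predicted probability.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


def adversarial_strength(progress, gamma=10.0):
    """Assumed schedule: ramps from 0 to about 1 as progress goes 0 -> 1."""
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def encoder_objective(task_loss, domain_loss, lam):
    """Loss seen by the feature encoder in domain adversarial training.

    The encoder minimizes the task loss while maximizing the domain
    classifier's loss, hence the minus sign (the effect of a gradient
    reversal layer).
    """
    return task_loss - lam * domain_loss


# Hypothetical toy dataset with an imbalanced label distribution.
labels = ["moral", "moral", "moral", "non-moral"]
weights = inverse_frequency_weights(labels)
uniform = [{"moral": 0.5, "non-moral": 0.5} for _ in labels]
task_loss = weighted_cross_entropy(uniform, labels, weights)
```

In a full training loop the domain classifier would itself be trained to minimize `domain_loss`, while the encoder receives the reversed gradient; the subtraction in `encoder_objective` captures only the encoder's side of that game.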