Recent work has explored the use of personal information, in the form of persona sentences or self-disclosures, to improve the modeling of individual characteristics and the prediction of annotator labels for subjective tasks. The volume of personal information available has historically been limited, so there has been little exploration of which kinds of information are most informative for predicting annotator labels. In this work, we categorize self-disclosures and use them to build annotator models for predicting judgments of social norms. We perform several ablations and analyses to examine how the type of information affects our ability to predict annotation patterns. Contrary to previous work, we find that only a small number of comments related to the original post are needed. Lastly, a more diverse sample of annotator self-disclosures does not lead to the best performance. Sampling from a larger pool of comments without filtering still yields the best performance, suggesting that there is still much to uncover about which information about an annotator is most useful for verdict prediction.