Multi-participant discussions tend to unfold as a tree rather than a chain. Branching may occur for several reasons -- from the asynchronous nature of online platforms to a conscious decision by an interlocutor to disengage from part of the conversation. Predicting branching and understanding why new branches are created is important for many downstream tasks, such as summarization and thread disentanglement, and may help develop online spaces that encourage users to engage in discussions in more meaningful ways. In this work, we define the novel task of branch prediction and propose GLOBS (Global Branching Score), a deep neural network model for predicting branching. We evaluate GLOBS on three large discussion forums from Reddit, where it achieves significant improvements over an array of competitive baselines and demonstrates better transferability. We confirm that structural, temporal, and linguistic features contribute to GLOBS's success, and find that branching is associated with a greater number of conversation participants and tends to occur at earlier levels of the conversation tree. We publicly release GLOBS and our implementation of all baseline models to allow reproducibility and promote further research on this important task.