In professional settings ranging from academic conferences to corporate earnings calls, the ability to anticipate audience questions is essential. Traditional methods rely on manually assessing an audience's background, interests, and subject knowledge, and they often fall short when the audience is large or heterogeneous, leading to imprecision and inefficiency. While NLP has made progress in text-based question generation, its focus remains primarily on academic settings, leaving the challenges of professional domains, especially earnings call conferences, underserved. To address this gap, we introduce the multi-question generation (MQG) task, designed specifically for earnings call contexts. Our methodology involves an extensive collection of earnings call transcripts and a novel annotation technique to classify potential questions. We further introduce a retriever-enhanced strategy to extract relevant information. The core aim is to generate, directly from earnings call content, a spectrum of questions that analysts might pose. Empirical evaluations show that our approach performs strongly in the accuracy, consistency, and perplexity of the generated questions.