Social media platforms are vital for expressing opinions and gauging public sentiment, yet many analytical tools overlook passive users, who mainly consume content without actively engaging. To address this, we introduce UniPoll, a framework that automatically generates polls from social media posts using natural language generation (NLG) techniques. Unlike traditional methods, which struggle with the informal and context-sensitive nature of social media text, UniPoll draws on enriched context from user comments and employs multi-objective optimization to improve poll relevance and engagement. To cope with the inherent noise in social media data, UniPoll incorporates Retrieval-Augmented Generation (RAG) and synthetic data generation for robust performance in real-world scenarios. The framework outperforms existing models, including T5, ChatGLM3, and GPT-3.5, at generating coherent and contextually appropriate question-answer pairs. Evaluated on the Chinese WeiboPolls dataset and the newly introduced English RedditPolls dataset, UniPoll demonstrates superior cross-lingual and cross-platform capabilities, making it a practical tool for boosting user engagement and fostering a more inclusive environment for interaction.