Classification of Consumer Belief Statements From Social Media

Social media offer plenty of information for market research aimed at meeting customer requirements. One way to conduct this research is for a domain expert to gather user-generated content and categorize it into a complex, fine-grained class structure. In many such cases, little data meets complex annotations, and it is not yet fully understood how this can be leveraged successfully for classification. We examine the classification accuracy of expert labels when used with a) many fine-grained classes and b) few abstract classes. For scenario b), we compare abstract class labels given by the domain expert, as a baseline, with those produced by automatic hierarchical clustering. We compare this to another baseline in which the entire class structure is given by a completely unsupervised clustering approach. In doing so, this work can serve as an example of how complex expert annotations are potentially beneficial and how they can best be utilized for opinion mining in highly specific domains. Across a range of techniques and experiments, we find that automated class abstraction approaches, in particular the unsupervised approach, perform remarkably well against the domain expert baseline on text classification tasks. This has the potential to inspire opinion mining applications that support market researchers in practice, as well as fine-grained automated content analysis on a large scale.

