Many tasks in Computational Social Science and Web Content Analysis involve classifying pieces of text based on the claims they contain. State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce. In light of this, we propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task. This methodology involves defining the classes as arbitrarily sophisticated taxonomies of claims and using Natural Language Inference models to obtain the textual entailment between these claims and a corpus of interest. The performance of these models is then boosted by annotating a minimal sample of data points, dynamically selected using the well-established statistical heuristic of Probabilistic Bisection. We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification, and depression-related symptom detection. This approach rivals traditional pre-train/fine-tune approaches while drastically reducing the need for data annotation.
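To make the sampling heuristic concrete, the following is a minimal sketch of Probabilistic Bisection (Horstein's algorithm) in the role the abstract describes: locating an unknown threshold from noisy binary annotations while querying as few points as possible. The oracle, the interval `[0, 1]`, and the response-correctness probability `p` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def probabilistic_bisection(oracle, p=0.8, n_queries=100, grid_size=1000):
    """Horstein's probabilistic bisection: estimate an unknown point x* in [0, 1]
    from binary oracle responses that are correct with known probability p.
    The oracle answers the question "does x* lie to the right of x?"."""
    grid = np.linspace(0.0, 1.0, grid_size)
    density = np.full(grid_size, 1.0 / grid_size)  # uniform prior over x*
    for _ in range(n_queries):
        # Always query at the posterior median (the bisection step).
        cdf = np.cumsum(density)
        x = grid[np.searchsorted(cdf, 0.5)]
        says_right = oracle(x)
        # Multiplicative Bayes update: scale the favoured side by p,
        # the other side by (1 - p), then renormalise.
        right_of_x = grid > x
        if says_right:
            density = np.where(right_of_x, p * density, (1 - p) * density)
        else:
            density = np.where(right_of_x, (1 - p) * density, p * density)
        density /= density.sum()
    cdf = np.cumsum(density)
    return grid[np.searchsorted(cdf, 0.5)]  # posterior median estimate of x*

# Usage: a noisy oracle for a hypothetical true threshold of 0.37,
# answering correctly 80% of the time.
rng = np.random.default_rng(0)
def noisy_oracle(x, target=0.37, p=0.8):
    truth = target > x
    return truth if rng.random() < p else not truth

estimate = probabilistic_bisection(noisy_oracle, p=0.8, n_queries=100)
```

The key property, and the reason it suits annotation budgeting, is that the posterior concentrates geometrically around the true point, so the number of annotator queries grows only logarithmically with the desired precision even under noisy labels.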