A Hierarchical Error Framework for Reliable Automated Coding in Communication Research: Applications to Health and Political Communication

From arXiv. Version 2: enhanced clarification of precision-matching task characteristics and framework applicability conditions. 20 pages, 4 figures, 4 tables. Replication package available at https://doi.org/10.7910/DVN/NDXVLZ

Automated content analysis increasingly supports communication research, yet scaling manual coding into computational pipelines raises concerns about measurement reliability and validity. We introduce a Hierarchical Error Correction (HEC) framework that treats model failures as layered measurement errors (knowledge gaps, reasoning limitations, and complexity constraints) and targets the layers that most affect inference. The framework implements a three-phase methodology: systematic error profiling across hierarchical layers, targeted intervention design matched to dominant error sources, and rigorous validation with statistical testing. We evaluate HEC on health communication (medical specialty classification), political communication (bias detection), and legal tasks, and validate the approach with five diverse large language models. Results show average accuracy gains of 11.2 percentage points (p < .001, McNemar's test) and more stable conclusions through reduced systematic misclassification. Cross-model validation demonstrates consistent improvements (range: +6.8 to +14.6 pp), with effectiveness concentrated in tasks with moderate-to-high baseline accuracy (50-85%). A boundary study reveals diminished returns in very high-baseline (>85%) or precision-matching tasks, establishing applicability limits. We map layered errors to threats to construct and criterion validity and provide a transparent, measurement-first blueprint for diagnosing error profiles, selecting targeted interventions, and reporting reliability/validity evidence alongside accuracy. The blueprint applies to automated coding across communication research and the broader social sciences.
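The abstract reports significance with McNemar's test, which compares two classifiers on the same items by looking only at the items where they disagree. The sketch below is a minimal illustration of the exact (binomial) form of that test, not the authors' implementation: the function name `mcnemar_exact` and the toy correctness flags are hypothetical, and the paper's actual data and test variant are in its replication package.

```python
from math import comb


def mcnemar_exact(baseline_correct, corrected_correct):
    """Exact two-sided McNemar test on paired per-item correctness flags.

    Returns (b, c, p_value), where
      b = items the baseline coded correctly but the corrected pipeline did not,
      c = items the baseline coded incorrectly but the corrected pipeline did.
    """
    if len(baseline_correct) != len(corrected_correct):
        raise ValueError("paired inputs must have equal length")

    pairs = list(zip(baseline_correct, corrected_correct))
    b = sum(1 for base, corr in pairs if base and not corr)
    c = sum(1 for base, corr in pairs if not base and corr)

    n = b + c  # only discordant pairs carry information
    if n == 0:
        return b, c, 1.0

    # Under H0 each discordant pair is equally likely to fall in b or c,
    # so min(b, c) follows Binomial(n, 0.5); double the lower tail.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy data: 20 items coded before and after an intervention.
baseline = [True] * 8 + [True] * 1 + [False] * 9 + [False] * 2
corrected = [True] * 8 + [False] * 1 + [True] * 9 + [False] * 2

b, c, p_value = mcnemar_exact(baseline, corrected)
print(f"discordant pairs: b={b}, c={c}, p={p_value:.4f}")
```

Because the test conditions on discordant pairs, two pipelines with the same overall accuracy gain can yield different p-values depending on how many items flipped in each direction, which is why per-item predictions need to be kept rather than aggregate accuracy alone.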

