Automated content analysis increasingly supports communication research, yet scaling manual coding into computational pipelines raises concerns about measurement reliability and validity. We introduce a Hierarchical Error Correction (HEC) framework that treats model failures as layered measurement errors (knowledge gaps, reasoning limitations, and complexity constraints) and targets the layers that most affect inference. The framework implements a three-phase methodology: systematic error profiling across hierarchical layers, targeted intervention design matched to dominant error sources, and rigorous validation with statistical testing. We evaluate HEC on health communication (medical specialty classification), political communication (bias detection), and legal tasks, and validate the approach with five diverse large language models. Results show average accuracy gains of 11.2 percentage points (p < .001, McNemar's test) and more stable conclusions owing to reduced systematic misclassification. Cross-model validation demonstrates consistent improvements (range: +6.8 to +14.6 pp), with effectiveness concentrated in tasks with moderate-to-high baselines (50-85% accuracy). A boundary study reveals diminishing returns in very-high-baseline (>85%) or precision-matching tasks, establishing applicability limits. We map layered errors to threats to construct and criterion validity and provide a transparent, measurement-first blueprint for diagnosing error profiles, selecting targeted interventions, and reporting reliability/validity evidence alongside accuracy. The framework applies to automated coding across communication research and the broader social sciences.
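The validation phase reports accuracy gains in percentage points together with McNemar's test, which compares two systems on the same items by looking only at the items where they disagree. The following is a minimal sketch of that kind of paired comparison, not the authors' implementation: the function names `mcnemar_exact` and `accuracy_gain_pp`, the choice of the exact (binomial) form of the test, and the 20-item toy data are all assumptions made for illustration and do not reproduce any result from the study.

```python
from math import comb


def mcnemar_exact(baseline_correct, corrected_correct):
    """Exact two-sided McNemar test on paired per-item correctness flags.

    Hypothetical helper: returns (b, c, p_value), where b counts items only
    the baseline got right and c counts items only the corrected system got right.
    """
    if len(baseline_correct) != len(corrected_correct):
        raise ValueError("paired inputs must have the same length")
    pairs = list(zip(baseline_correct, corrected_correct))
    # Discordant pairs: exactly one of the two systems is correct.
    b = sum(1 for base, corr in pairs if base and not corr)
    c = sum(1 for base, corr in pairs if corr and not base)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no disagreements, so no evidence of a difference
    # Under H0 each discordant pair is a fair coin flip: Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


def accuracy_gain_pp(baseline_correct, corrected_correct):
    """Accuracy difference (corrected minus baseline) in percentage points."""
    n = len(baseline_correct)
    return 100.0 * (sum(corrected_correct) - sum(baseline_correct)) / n


# Toy data (assumed, illustrative only): 20 items coded by both systems.
baseline = [True] * 12 + [False] * 8
corrected = [True] * 11 + [False] + [True] * 5 + [False] * 3

b, c, p_value = mcnemar_exact(baseline, corrected)
gain = accuracy_gain_pp(baseline, corrected)
print(f"b={b}, c={c}, p={p_value:.5f}, gain={gain:+.1f} pp")
```

On this toy sample the gain is 20 percentage points, yet the test is far from significant because only six items are discordant. This is one reason to report the paired test alongside the raw accuracy difference.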