This study examines whether Bayesian networks (BNs) can improve forecast accuracy relative to logistic regression and to recalibration and aggregation methods, using data from the Good Judgment Project. Regularized logistic regression models and a baseline recalibrated aggregate were compared with two types of BNs: structure-learned BNs, which include arcs between predictors, and naive BNs. Four predictor variables were examined: absolute difference from the aggregate, forecast value, days prior to question close, and mean standardized Brier score. The recalibrated aggregate achieved the highest accuracy (AUC = 0.985), followed by both types of BNs and then the logistic regression models. The BNs were likely hampered by information lost in the discretization process, and the logistic regression models by violation of the linearity assumption. Future research should explore hybrid approaches that combine BNs with logistic regression, examine additional predictor variables, and account for hierarchical dependencies in the data.