Supervised synthetic CT generation from cone-beam CT (CBCT) requires registered training pairs, yet perfect registration between separately acquired scans is unattainable. The resulting registration bias propagates into trained models and corrupts standard evaluation metrics, so superior benchmark performance may reflect better reproduction of registration artifacts rather than anatomical fidelity. We propose physics-based CBCT simulation, which provides geometrically aligned training pairs by construction, combined with evaluation using geometric alignment metrics computed against the input CBCT rather than the biased ground truth. On two independent pelvic datasets, models trained on synthetic data achieved better geometric alignment (Normalized Mutual Information, NMI: 0.31 vs 0.22) despite lower conventional intensity scores. Intensity metrics correlated inversely with clinical assessment on deformably registered data, whereas NMI consistently predicted observer preference across registration methodologies (rho = 0.31, p < 0.001). Clinical observers preferred synthetic-trained outputs in 87% of cases, indicating that geometric fidelity, not intensity agreement with biased ground truth, aligns with clinical requirements.
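As an illustration of the alignment metric, the sketch below computes Normalized Mutual Information between an input CBCT and a generated synthetic CT from their joint intensity histogram. The abstract does not state which normalization was used; this sketch assumes the symmetric form NMI = 2 I(A;B) / (H(A) + H(B)), which lies in [0, 1]. The function name `normalized_mutual_information` and the `bins` parameter are hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def normalized_mutual_information(a, b, bins=64):
    """Symmetric NMI = 2*I(A;B) / (H(A) + H(B)), assumed normalization.

    `a` and `b` are intensity arrays of identical shape, e.g. the input
    CBCT and the synthetic CT resampled onto the same voxel grid.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of voxels")

    # Joint histogram of paired voxel intensities, normalized to a
    # joint probability table.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)  # marginal distribution of a
    py = pxy.sum(axis=0)  # marginal distribution of b

    def entropy(p):
        p = p[p > 0]  # treat 0 * log(0) as 0
        return -np.sum(p * np.log(p))

    hx, hy, hxy = entropy(px), entropy(py), entropy(pxy.ravel())
    denom = hx + hy
    if denom == 0:
        return 0.0  # both images constant: no shared information to measure
    mutual_info = hx + hy - hxy
    return float(2.0 * mutual_info / denom)
```

Because the score depends only on the statistical dependence between paired intensities, it can be computed against the input CBCT directly and does not require a registered planning CT. The absolute value depends on the bin count and on any masking applied, so scores are comparable only when computed with identical settings.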