Indirect call resolution remains a key challenge in reverse engineering and control-flow graph recovery, especially for stripped or optimized binaries. Static analysis is sound but often over-approximates, producing many false positives, whereas machine-learning approaches can improve precision but may sacrifice completeness and generalization. We present iResolveX, a hybrid multi-layered framework that combines conservative static analysis with learning-based refinement. The first layer applies a conservative value-set analysis (BPA) to ensure high recall. The second layer adds a learning-based soft-signature scorer (iScoreGen) and selective inter-procedural backward analysis with memory inspection (iScoreRefine) to reduce false positives. The final output, p-IndirectCFG, annotates indirect edges with confidence scores, enabling downstream analyses to choose appropriate precision-recall trade-offs. Across SPEC CPU2006 and real-world binaries, iScoreGen reduces predicted targets by 19.2% on average while maintaining BPA-level recall (98.2%). Combined with iScoreRefine, the total reduction reaches 44.3% over BPA with 97.8% recall (a drop of 0.4 percentage points). iResolveX supports both a conservative, recall-preserving configuration and an F1-optimized configuration, and it outperforms state-of-the-art systems.
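To make the idea of confidence-annotated indirect edges concrete, the following is a minimal sketch of how a downstream analysis might pick a precision-recall operating point by thresholding edge scores. It is not the iResolveX implementation: the `ScoredEdge` type, the `select_targets` and `precision_recall` helpers, the assumption that scores lie in [0, 1], and all addresses and score values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredEdge:
    """One candidate indirect edge (hypothetical layout, not the paper's format)."""
    callsite: int   # address of the indirect call instruction
    target: int     # address of a candidate callee
    score: float    # confidence, assumed here to lie in [0, 1]


def select_targets(edges, threshold):
    """Keep edges whose score is at least `threshold`, grouped by call site."""
    kept = {}
    for e in edges:
        if e.score >= threshold:
            kept.setdefault(e.callsite, set()).add(e.target)
    return kept


def precision_recall(kept, truth):
    """Edge-level precision and recall against a ground-truth target map."""
    predicted = {(c, t) for c, ts in kept.items() for t in ts}
    actual = {(c, t) for c, ts in truth.items() for t in ts}
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 1.0
    recall = tp / len(actual) if actual else 1.0
    return precision, recall


# Made-up example data: two call sites with over-approximated target sets.
edges = [
    ScoredEdge(0x401000, 0x402000, 0.95),
    ScoredEdge(0x401000, 0x402100, 0.40),
    ScoredEdge(0x401000, 0x402200, 0.05),
    ScoredEdge(0x401050, 0x402300, 0.30),
    ScoredEdge(0x401050, 0x402400, 0.10),
]
truth = {0x401000: {0x402000, 0x402100}, 0x401050: {0x402300}}

conservative = select_targets(edges, 0.0)   # keep every candidate edge
balanced = select_targets(edges, 0.25)      # drop only low-confidence edges
aggressive = select_targets(edges, 0.5)     # keep only high-confidence edges

for name, kept in [("conservative", conservative),
                   ("balanced", balanced),
                   ("aggressive", aggressive)]:
    p, r = precision_recall(kept, truth)
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

On this toy data, a threshold of zero keeps every candidate edge and so preserves recall at the cost of precision, while a high threshold prunes true targets along with false ones. A recall-preserving or F1-optimized configuration would correspond to different threshold choices under these assumptions.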