Data races are critical issues in multithreaded programs, leading to unpredictable, catastrophic, and difficult-to-diagnose problems. Despite extensive in-house testing, data races often escape into deployed software and manifest in production runs. Existing approaches suffer from either prohibitively high runtime overhead or incomplete detection capability. In this paper, we introduce HardRace, a data race monitor that detects races on the fly with low runtime overhead and high detection capability. HardRace first employs sound static analysis to determine a minimal set of essential memory accesses relevant to data races. It then leverages a hardware trace instruction, Intel PTWRITE, to selectively record only these memory accesses and thread synchronization events during execution with negligible runtime overhead. Given the tracing data, HardRace applies standard data race detection algorithms to report, in a timely manner, potential races that occurred in production runs. The experimental evaluation shows that HardRace outperforms state-of-the-art tools such as ProRace and Kard in both runtime overhead and detection capability: HardRace can detect all kinds of data races in real-world applications while maintaining a negligible overhead of less than 2% on average.
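To make the final stage concrete, the following is a minimal sketch of what a "standard data race detection algorithm" run over a recorded trace might look like. The abstract does not say which algorithm HardRace uses, so this assumes a simplified vector-clock happens-before detector. The event format `(thread, op, target)`, the operation names, and the function `detect_races` are hypothetical and chosen only for illustration. Only lock acquire/release is modeled (no fork/join), and access history is kept coarsely, so the sketch may miss races that a full detector would report.

```python
from collections import defaultdict


def join(a, b):
    """Element-wise maximum of two vector clocks stored as dicts."""
    return {t: max(a.get(t, 0), b.get(t, 0)) for t in a.keys() | b.keys()}


def happens_before(a, b):
    """True if clock a is component-wise <= clock b."""
    return all(v <= b.get(t, 0) for t, v in a.items())


def detect_races(trace):
    """Scan a trace of (thread, op, target) events; op is rd, wr, acq or rel.

    Hypothetical event format, not HardRace's actual trace encoding.
    """
    clocks = {}                 # thread -> current vector clock
    lock_clocks = {}            # lock -> clock at its last release
    last_write = {}             # address -> (thread, clock snapshot)
    reads = defaultdict(list)   # address -> reads since the last write
    races = []
    for tid, op, target in trace:
        vc = clocks.setdefault(tid, {tid: 1})
        if op == "acq":
            # Acquiring a lock imports the ordering of its last release.
            clocks[tid] = join(vc, lock_clocks.get(target, {}))
        elif op == "rel":
            # Publish the current clock, then start a new epoch.
            lock_clocks[target] = dict(vc)
            vc[tid] += 1
        elif op in ("rd", "wr"):
            snap = dict(vc)
            prev = last_write.get(target)
            # Any access conflicts with an unordered earlier write.
            if prev and prev[0] != tid and not happens_before(prev[1], snap):
                races.append((target, prev[0], tid, "wr-" + op))
            if op == "wr":
                # A write also conflicts with unordered earlier reads.
                for rtid, rclock in reads[target]:
                    if rtid != tid and not happens_before(rclock, snap):
                        races.append((target, rtid, tid, "rd-wr"))
                last_write[target] = (tid, snap)
                reads[target] = []
            else:
                reads[target].append((tid, snap))
    return races


# Two unsynchronized writes to the same address are reported as a race.
print(detect_races([("T1", "wr", "x"), ("T2", "wr", "x")]))
```

If both writes are wrapped in acquire/release of the same lock, the second thread's clock absorbs the first thread's release, so the writes become ordered and nothing is reported. Because such a detector only needs the recorded accesses and synchronization events, it can run on the trace and need not run inside the monitored threads.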