Python's Global Interpreter Lock (GIL) prevents Python code from executing on more than one CPU core at a time, even when multiple threads are used. Starting with Python 3.13, however, an experimental build allows the GIL to be disabled. While prior work has examined the speedup implications of disabling it, the effects on energy consumption and hardware utilization have received less attention. This study measures execution time, CPU utilization, memory usage, and energy consumption across four workload categories (NumPy-based workloads, sequential kernels, threaded numerical workloads, and threaded object workloads), comparing the GIL and free-threaded builds of Python 3.14.2. The results highlight a trade-off. For parallelizable workloads operating on independent data, the free-threaded build reduces execution time by up to a factor of 4, with a proportional reduction in energy consumption and effective multi-core utilization, at the cost of increased memory usage. In contrast, sequential workloads do not benefit from removing the GIL and instead show a 13-43% increase in energy consumption. Similarly, workloads in which threads frequently access and modify the same objects show reduced improvements or even degradation due to lock contention. Across all workloads, energy consumption is proportional to execution time, indicating that disabling the GIL does not significantly affect power consumption, even when CPU utilization increases. Memory usage is generally higher in the free-threaded build, and the increase is more visible in virtual memory than in physical memory. It is primarily attributed to per-object locking, additional thread-safety mechanisms in the runtime, and the adoption of a new memory allocator. These findings suggest that Python's free-threaded build is not a universal improvement: developers should evaluate whether their workload can effectively benefit from parallel execution before adopting it.
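As an illustration of the "threaded numerical workload on independent data" category, the following is a minimal sketch and not the benchmark code used in the study. The function names (`gil_enabled`, `sum_of_squares`, `split_range`, `run`) and the problem size are assumptions chosen for the example. Each thread sums squares over its own sub-range, so no objects are shared between threads; on a free-threaded build the wall time would be expected to drop as workers are added, while on a GIL build it should stay roughly flat.

```python
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def gil_enabled() -> bool:
    # sys._is_gil_enabled() is available on CPython 3.13+.
    # Older interpreters lack it and always run with the GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else bool(check())


def sum_of_squares(bounds: tuple[int, int]) -> int:
    # Pure-Python CPU-bound kernel working only on its own range.
    start, stop = bounds
    total = 0
    for i in range(start, stop):
        total += i * i
    return total


def split_range(n: int, parts: int) -> list[tuple[int, int]]:
    # Split [0, n) into `parts` contiguous chunks; the last chunk
    # absorbs the remainder.
    step = n // parts
    return [
        (k * step, n if k == parts - 1 else (k + 1) * step)
        for k in range(parts)
    ]


def run(n: int, workers: int) -> tuple[int, float]:
    # Returns (result, elapsed wall-clock seconds).
    chunks = split_range(n, workers)
    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(sum_of_squares, chunks))
    return total, time.perf_counter() - t0


if __name__ == "__main__":
    print(f"GIL enabled: {gil_enabled()}")
    n = 4_000_000  # hypothetical problem size
    for workers in (1, 2, 4):
        _, elapsed = run(n, workers)
        print(f"{workers} worker(s): {elapsed:.3f} s")
```

Running the same script under both builds and comparing the timings gives a rough picture of the execution-time effect only; measuring energy, CPU utilization, and memory requires external tooling that this sketch does not cover.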