Probabilistic computers built from p-bits offer a promising path for combinatorial optimization, but the dense connectivity required by real-world problems scales poorly in hardware. Here, we address this through graph sparsification with auxiliary copy variables and demonstrate a fully on-chip parallel tempering solver on an FPGA. Targeting MIMO detection, a dense, NP-hard problem central to wireless communications, we fit 15 temperature replicas of a 128-node sparsified system (1,920 p-bits) entirely on-chip and achieve bit error rates significantly below conventional linear detectors. We report complete end-to-end solution times of 4.7 ms per instance, with all loading, sampling, readout, and verification overheads included. ASIC projections in 7 nm technology indicate about 90 MHz operation with less than 200 mW power dissipation, suggesting that massive parallelism across multiple chips could approach the throughput demands of next-generation wireless systems. However, sparsification introduces sensitivity to the copy-constraint strength. Employing Two-Dimensional Parallel Tempering (2D-PT), which exchanges replicas across both temperature and constraint dimensions, we demonstrate over 10X faster convergence without manual parameter tuning. These results establish an on-chip p-bit architecture and a scalable algorithmic framework for dense combinatorial optimization.
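As a rough software illustration of the sampling scheme named above, the following is a minimal sketch of one-dimensional parallel tempering over p-bit (Gibbs) updates on a small Ising model. It is not the authors' FPGA implementation: it omits the copy-variable sparsification and the second (constraint-strength) exchange dimension of 2D-PT. The function names, the energy convention E = -Σ_{i<j} J_ij m_i m_j - Σ_i h_i m_i, and the temperature ladder are all assumptions chosen for illustration.

```python
import math
import random


def energy(J, h, m):
    """Ising energy E = -sum_{i<j} J[i][j] m_i m_j - sum_i h[i] m_i (assumed convention)."""
    n = len(m)
    e = 0.0
    for i in range(n):
        e -= h[i] * m[i]
        for j in range(i + 1, n):
            e -= J[i][j] * m[i] * m[j]
    return e


def pbit_sweep(J, h, m, beta, rng):
    """One sequential sweep of p-bit updates at inverse temperature beta."""
    n = len(m)
    for i in range(n):
        # Local field seen by p-bit i (J is assumed symmetric with zero diagonal).
        local = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
        # p-bit rule: m_i = +1 with probability (1 + tanh(beta * local)) / 2.
        r = rng.uniform(-1.0, 1.0)
        m[i] = 1 if math.tanh(beta * local) > r else -1


def parallel_tempering(J, h, betas, sweeps, seed=0):
    """Run one replica per inverse temperature and swap neighbours after each sweep."""
    rng = random.Random(seed)
    n = len(h)
    replicas = [[rng.choice((-1, 1)) for _ in range(n)] for _ in betas]
    best_e, best_m = float("inf"), None
    for _ in range(sweeps):
        for m, beta in zip(replicas, betas):
            pbit_sweep(J, h, m, beta, rng)
        energies = [energy(J, h, m) for m in replicas]
        # Track the lowest-energy state seen in any replica.
        for e, m in zip(energies, replicas):
            if e < best_e:
                best_e, best_m = e, list(m)
        # Metropolis swap between adjacent temperatures:
        # accept with probability min(1, exp((beta_b - beta_a) * (E_b - E_a))).
        for k in range(len(betas) - 1):
            delta = (betas[k + 1] - betas[k]) * (energies[k + 1] - energies[k])
            if delta >= 0 or rng.random() < math.exp(delta):
                replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
                energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return best_e, best_m


if __name__ == "__main__":
    # Toy 4-spin ferromagnet; the problem sizes in the abstract are far larger.
    n = 4
    J = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
    h = [0] * n
    print(parallel_tempering(J, h, betas=[0.2, 0.5, 1.0, 2.0], sweeps=50))
```

In this sketch every replica holds a full copy of the spin state, which mirrors the accounting in the abstract: 15 replicas of a 128-node system give 15 × 128 = 1,920 p-bits. A 2D-PT variant would, under the same assumptions, arrange replicas on a grid and also attempt swaps along the copy-constraint-strength axis.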