ADMM-FFT is an iterative method with high reconstruction accuracy for laminography, but it suffers from excessive computation time and large memory consumption. We introduce mLR, which employs memoization to replace time-consuming Fast Fourier Transform (FFT) operations, based on the unique observation that similar FFT operations recur across iterations of ADMM-FFT. We introduce a series of techniques that make memoization performance-beneficial and scalable for ADMM-FFT. We also introduce variable offloading to save CPU memory and to scale ADMM-FFT across GPUs within and across nodes. Using mLR, we are able to scale ADMM-FFT to an input problem of 2Kx2Kx2K, the largest input problem that laminography reconstruction with the ADMM-FFT solution has ever handled under limited memory; mLR brings a 52.8% performance improvement on average (up to 65.4%) compared to the original ADMM-FFT.
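To illustrate the basic idea of reusing FFT work across iterations, the following Python sketch memoizes FFT results keyed by an exact digest of the input volume. This is a hypothetical simplification (the class, method names, and the exact-match criterion are assumptions), not the mLR implementation, which targets similar rather than strictly identical FFT operations and must also remain memory-scalable.

```python
import numpy as np

class MemoizedFFT:
    """Illustrative cache of 3-D FFT results keyed by an input digest.

    Hypothetical sketch only: exact-match memoization shown here captures
    the basic idea of skipping repeated FFTs across ADMM-style iterations.
    """

    def __init__(self, max_entries=8):
        self._cache = {}              # digest -> cached FFT result
        self._max_entries = max_entries

    def fftn(self, volume):
        key = hash(volume.tobytes())  # cheap exact-match key (assumption)
        if key in self._cache:
            return self._cache[key]   # reuse a result from a prior iteration
        result = np.fft.fftn(volume)
        if len(self._cache) < self._max_entries:
            self._cache[key] = result
        return result

# Schematic usage inside an iterative loop: the second and third calls
# on the same volume hit the cache instead of recomputing the FFT.
memo = MemoizedFFT()
vol = np.random.rand(64, 64, 64)
for _ in range(3):
    spectrum = memo.fftn(vol)
```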

