Magnetic Resonance Image (MRI) pre-processing is a critical step in neuroimaging analysis. However, the computational cost of MRI pre-processing pipelines is a major bottleneck for large cohort studies and some clinical applications. While High-Performance Computing (HPC) and, more recently, Deep Learning have been adopted to accelerate these computations, both require costly hardware and are not accessible to all researchers. It is therefore important to understand where MRI pre-processing pipelines spend their time in order to improve their performance. Using the Intel VTune profiler, we characterized the bottlenecks of several commonly used MRI pre-processing pipelines from the ANTs, FSL, and FreeSurfer toolboxes. We found that a few functions accounted for most of the CPU time, and that linear interpolation was the largest contributor. Data access was also a substantial bottleneck. We identified a bug in the ITK library that affects the performance of the ANTs pipeline in single precision, and a potential issue with OpenMP scaling in FreeSurfer recon-all. Our results provide a reference for future efforts to optimize MRI pre-processing pipelines.