The Square Kilometre Array (SKA) will operate one of the world's largest continuous scientific data systems, sustaining petascale imaging under strict power envelopes. Current radio-interferometric pipelines typically achieve only 4--14\% of peak hardware utilisation because of memory and I/O bottlenecks, incurring high energy, operational, and carbon costs. The problem is compounded by the absence of the standardised cross-layer metrics and fidelity tolerances needed for principled hardware--software co-design. We present astroCAMP, a reproducible benchmarking and co-design framework for SKA-scale imaging, contributing: (1) a unified metric suite spanning performance, utilisation, memory/data-movement, sustainability, economics, and scientific fidelity; (2) standardised SKA-representative datasets and benchmark configurations for reproducible cross-platform evaluation; (3) a multi-objective co-design formulation linking quality constraints to time-, energy-, carbon-, and cost-to-solution; and (4) a design-space exploration workflow to derive Pareto-optimal operating regions. We evaluate WSClean+IDG on an AMD EPYC 9334 CPU and an NVIDIA H100 GPU, revealing orchestration and synchronisation bottlenecks despite efficient kernels, limited CPU strong scaling, and location-dependent carbon and cost efficiency. We illustrate astroCAMP on heterogeneous CPU--FPGA exploration and call on the SKA community to define quantifiable fidelity thresholds to accelerate principled optimisation for SKA-scale imaging.
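Contributions (3) and (4) can be made concrete with a small sketch. The Python below is not the astroCAMP implementation; it only illustrates the general idea of filtering benchmark runs by a fidelity constraint and then keeping the Pareto-optimal ones across time-, energy-, carbon-, and cost-to-solution. Every name and number in it (the Run fields, the error metric, the grid carbon intensity, the prices, and the four example runs) is a hypothetical placeholder, not a measurement from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark run. All fields are hypothetical placeholders."""
    name: str
    time_s: float           # time-to-solution in seconds
    energy_kwh: float       # energy-to-solution in kWh
    fidelity_err: float     # image error against a reference (lower is better)
    node_cost_per_h: float  # assumed hourly hardware cost


def objectives(run, grid_gco2_per_kwh, price_per_kwh):
    """Return (time, energy, carbon, cost); all four are minimised."""
    carbon_g = run.energy_kwh * grid_gco2_per_kwh
    cost = run.energy_kwh * price_per_kwh + run.time_s / 3600.0 * run.node_cost_per_h
    return (run.time_s, run.energy_kwh, carbon_g, cost)


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(runs, max_err, grid_gco2_per_kwh, price_per_kwh):
    """Keep runs that meet the fidelity constraint and are not dominated."""
    feasible = [r for r in runs if r.fidelity_err <= max_err]
    objs = {r.name: objectives(r, grid_gco2_per_kwh, price_per_kwh) for r in feasible}
    return [
        r for r in feasible
        if not any(dominates(objs[o.name], objs[r.name]) for o in feasible if o is not r)
    ]


# Hypothetical runs, invented purely for illustration.
RUNS = [
    Run("cpu-32t", 1200.0, 0.20, 1e-4, 2.0),
    Run("cpu-lowfreq", 1500.0, 0.06, 1e-4, 2.0),
    Run("gpu-fp32", 300.0, 0.08, 2e-4, 6.0),
    Run("gpu-fp16", 200.0, 0.05, 5e-3, 6.0),
]

if __name__ == "__main__":
    for max_err in (1e-3, 1e-2):
        front = pareto_front(RUNS, max_err, grid_gco2_per_kwh=400.0, price_per_kwh=0.25)
        print(max_err, sorted(r.name for r in front))
```

With these made-up inputs, relaxing the error bound from 1e-3 to 1e-2 admits the reduced-precision run and changes which runs are Pareto-optimal. This is the sense in which an agreed fidelity threshold determines the feasible operating region. The carbon term depends on the grid intensity passed in, so the same runs can rank differently at different sites.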