The latest generation of Timepix-series hybrid pixel detectors enhances particle tracking with high spatial and temporal resolution. However, their high hit-rate capability poses challenges for data processing, particularly in multi-detector configurations or in systems such as Timepix4. Storing every hit and processing it offline is inefficient at such data throughput. To efficiently group partly unsorted pixel hits into clusters for particle event characterization, we explore parallel approaches to online clustering that enable real-time data reduction. Although using multiple CPU cores improved throughput, which scaled linearly with the number of cores, load-balancing issues between processing and I/O led to occasional data loss. We propose a parallel connected component labeling algorithm that uses a union-find structure with path compression and is optimized for zero-suppression data encoding. Our GPU implementation achieved a throughput of up to 300 million hits per second, a two-order-of-magnitude speedup over the CPU-based methods we compared it with, while also freeing CPU resources for I/O handling and reducing data loss.
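To make the core idea concrete, the following is a minimal sequential sketch of connected component labeling over zero-suppressed hits with a union-find structure and path compression. It is not the parallel GPU implementation described above. The hit layout `(x, y, toa)`, the 8-connected neighbourhood, the function names, and the time window `max_dt` (with its default value and unit) are all assumptions made for illustration.

```python
from collections import defaultdict


def find(parent, i):
    """Return the root of element i, compressing the path by halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # point i at its grandparent
        i = parent[i]
    return i


def union(parent, a, b):
    """Merge the sets containing a and b; the smaller root index wins."""
    ra, rb = find(parent, a), find(parent, b)
    if ra < rb:
        parent[rb] = ra
    elif rb < ra:
        parent[ra] = rb


def cluster_hits(hits, max_dt=200.0):
    """Group sparse pixel hits into clusters.

    hits   : list of (x, y, toa) tuples, in any order (assumed layout)
    max_dt : assumed maximum time difference between neighbouring hits
             of one cluster, in the same unit as toa (hypothetical value)
    Returns a sorted list of clusters, each a list of hit indices.
    """
    parent = list(range(len(hits)))
    by_pixel = defaultdict(list)  # (x, y) -> indices of hits seen so far

    for i, (x, y, t) in enumerate(hits):
        # Only occupied pixels are stored, so look up the 3x3
        # neighbourhood in a hash map instead of scanning a dense frame.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in by_pixel.get((x + dx, y + dy), ()):
                    if abs(hits[j][2] - t) <= max_dt:
                        union(parent, i, j)
        by_pixel[(x, y)].append(i)

    clusters = defaultdict(list)
    for i in range(len(hits)):
        clusters[find(parent, i)].append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    example = [(10, 10, 0.0), (11, 10, 5.0), (50, 50, 3.0),
               (11, 11, 8.0), (10, 10, 10000.0)]
    print(cluster_hits(example))
```

Because hits are matched through a hash map keyed by pixel coordinates, the input does not need to be sorted in time, which fits the partly unsorted hit streams mentioned above. A parallel variant would typically replace the sequential `union` with atomic operations on the parent array, but that step is beyond this sketch.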