We describe a new end-to-end experimental data streaming framework designed from the ground up to support new types of applications: AI training, extremely high-rate X-ray time-of-flight analysis, crystal structure determination with distributed processing, and custom data science applications and visualizers yet to be created. Throughout, our design choices merge cloud microservices with traditional HPC batch execution models for security and flexibility. This project makes a unique contribution to the DOE Integrated Research Infrastructure (IRI) landscape. By creating a flexible, API-driven data request service, we address a significant need for high-speed data streaming sources in the X-ray science data analysis community. By combining a data request API, a mutual-authentication web security framework, a job queue system, and a high-rate data buffer, and by complementing existing facility infrastructure, the LCLStreamer framework has prototyped and implemented several new paradigms critical for next-generation experiments.