NVIDIA's CUTLASS library provides a robust and expressive set of methods for describing and manipulating multi-dimensional tensor data on the GPU. These methods are conceptually grounded in the abstract notion of a CuTe layout and a rich algebra of such layouts, including operations such as composition, logical product, and logical division. In this paper, we present a categorical framework for understanding this layout algebra by focusing on a naturally occurring class of tractable layouts. To this end, we define two categories Tuple and Nest whose morphisms give rise to layouts. We define a suite of operations on morphisms in these categories and prove their compatibility with the corresponding layout operations. Moreover, we give a complete characterization of the layouts which arise from our construction. Finally, we provide a Python implementation of our categorical constructions, along with tests that demonstrate alignment with CUTLASS behavior. This implementation can be found at our git repository https://github.com/ColfaxResearch/layout-categories.
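For readers unfamiliar with the underlying object, the following is a minimal sketch of how a flat (non-nested) CuTe-style layout, written shape:stride, can be evaluated as a function from a one-dimensional coordinate to a memory offset. The coordinate is decomposed colexicographically (leftmost mode varies fastest) and each component is multiplied by its stride. The function names `layout_index` and `layout_table` are hypothetical, chosen for illustration only; they are not taken from CUTLASS or from the layout-categories repository, and nested layouts are not handled.

```python
from math import prod


def layout_index(shape, stride, i):
    """Evaluate the flat layout shape:stride at the 1-D coordinate i.

    Illustrative helper (hypothetical name, not a CUTLASS API).
    The coordinate i is split into per-mode coordinates with the
    leftmost mode varying fastest, then dotted with the strides.
    """
    if len(shape) != len(stride):
        raise ValueError("shape and stride must have the same length")
    if not 0 <= i < prod(shape):
        raise IndexError("coordinate out of range for this layout")
    offset = 0
    for extent, step in zip(shape, stride):
        offset += (i % extent) * step  # coordinate in this mode times its stride
        i //= extent                   # move on to the next mode
    return offset


def layout_table(shape, stride):
    """List the offsets of every coordinate of the layout, in order."""
    return [layout_index(shape, stride, i) for i in range(prod(shape))]


if __name__ == "__main__":
    # Column-major 2x3 layout: offsets are contiguous.
    print(layout_table((2, 3), (1, 2)))
    # Row-major 2x3 layout: stepping along the first mode jumps by 3.
    print(layout_table((2, 3), (3, 1)))
```

Under these assumptions, the layout (2,3):(1,2) enumerates offsets 0 through 5 in order, while (2,3):(3,1) visits the same six offsets in a permuted order. Operations such as composition, logical product, and logical division act on layouts viewed as functions of this kind.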