Programs are an increasingly popular representation for visual data, exposing compact, interpretable structure that supports manipulation. Visual programs are usually written in domain-specific languages (DSLs). Finding "good" programs, which expose only meaningful degrees of freedom, requires access to a DSL with a "good" library of functions, both of which are typically authored by domain experts. We present ShapeCoder, the first system capable of taking a dataset of shapes, represented with unstructured primitives, and jointly discovering (i) useful abstraction functions and (ii) programs that use these abstractions to explain the input shapes. The discovered abstractions capture common patterns (both structural and parametric) across the dataset, so that programs rewritten with these abstractions are more compact and expose fewer degrees of freedom. ShapeCoder improves upon previous abstraction discovery methods: it finds better abstractions, for more complex inputs, under less stringent input assumptions. This is principally made possible by two methodological advancements: (a) a shape-to-program recognition network that learns to solve sub-problems and (b) the use of e-graphs, augmented with a conditional rewrite scheme, to determine in a tractable manner when abstractions with complex parametric expressions can be applied. We evaluate ShapeCoder on multiple datasets of 3D shapes, where primitive decompositions are either parsed from manual annotations or produced by an unsupervised cuboid abstraction method. In all domains, ShapeCoder discovers a library of abstractions that capture high-level relationships, remove extraneous degrees of freedom, and achieve better dataset compression than alternative approaches. Finally, we investigate how programs rewritten to use the discovered abstractions are useful for downstream tasks.
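As a toy illustration of what "fewer degrees of freedom" means here, the sketch below rewrites two unstructured cuboid primitives as a single call to an abstraction function. It is not ShapeCoder's DSL or its discovery algorithm; the names `Cuboid` and `reflected_pair`, the six-parameter cuboid layout, and the numeric values are all hypothetical, chosen only to make the parameter counting concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cuboid:
    """Hypothetical primitive: three dimensions plus a 3D position."""
    w: float
    h: float
    d: float
    x: float
    y: float
    z: float


def reflected_pair(w: float, h: float, d: float,
                   x: float, y: float, z: float) -> List[Cuboid]:
    """Hypothetical abstraction: a cuboid and its mirror image across the plane x = 0."""
    return [Cuboid(w, h, d, x, y, z), Cuboid(w, h, d, -x, y, z)]


# Unstructured input: two cuboids, each with 6 independent parameters.
raw = [
    Cuboid(0.1, 0.8, 0.1, 0.4, 0.0, 0.3),
    Cuboid(0.1, 0.8, 0.1, -0.4, 0.0, 0.3),
]

# Rewritten program: one abstraction call explains both primitives.
args = (0.1, 0.8, 0.1, 0.4, 0.0, 0.3)
rewritten = reflected_pair(*args)

raw_dof = 6 * len(raw)    # free parameters in the unstructured form
abstract_dof = len(args)  # free parameters after rewriting

print(rewritten == raw, raw_dof, abstract_dof)
```

The rewritten program reproduces the same geometry while exposing half as many free parameters, and editing `x` now moves both cuboids together, which is the kind of meaningful, constrained manipulation the abstract describes.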