Many CAD learning pipelines discretize Boundary Representations (B-Reps) into triangle meshes, discarding analytic surface structure and topological adjacency and thereby weakening consistent instance-level analysis. We present STEP-Parts, a deterministic CAD-to-supervision toolchain that extracts geometric instance partitions directly from raw STEP B-Reps and transfers them to tessellated carriers through retained source-face correspondence, yielding instance labels and metadata for downstream learning and evaluation. The construction merges adjacent B-Rep faces only when they share the same analytic primitive type and satisfy a near-tangent continuity criterion. On ABC, same-primitive dihedral angles are strongly bimodal, yielding a threshold-insensitive low-angle regime for part extraction. Because the partition is defined on intrinsic B-Rep topology rather than on a particular triangulation, the resulting boundaries remain stable under changes in tessellation. Applied to the DeepCAD subset of ABC, the pipeline processes approximately 180,000 models in under six hours on a consumer CPU. We release code and precomputed labels, and show that STEP-Parts serves both as a tessellation-robust geometric reference and as a useful supervision source in two downstream probes: an implicit reconstruction-segmentation network and a dataset-level point-based backbone.
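The merge rule and label transfer described in the abstract can be illustrated with a minimal sketch. This is not the released STEP-Parts code: the function names `merge_faces` and `transfer_labels`, the input layout (one primitive type per face plus a list of adjacent face pairs with a dihedral angle in degrees, where 0 means tangent), and the 10-degree default threshold are all assumptions made for illustration. The sketch uses a union-find structure so that the partition depends only on face adjacency, not on any triangulation.

```python
from typing import List, Sequence, Tuple


def merge_faces(
    face_types: Sequence[str],
    adjacency: Sequence[Tuple[int, int, float]],
    max_angle_deg: float = 10.0,  # hypothetical threshold, not taken from the paper
) -> List[int]:
    """Group faces into instances; returns one contiguous label per face.

    adjacency holds (face_i, face_j, dihedral_deg) for faces sharing an edge,
    with the angle convention assumed to be 0 degrees for a tangent join.
    """
    parent = list(range(len(face_types)))

    def find(i: int) -> int:
        # Path-halving lookup of the representative face.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, angle in adjacency:
        # Merge only same-primitive neighbours that meet near-tangentially.
        if face_types[i] == face_types[j] and angle <= max_angle_deg:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Attach to the smaller index so the result is deterministic.
                parent[max(ri, rj)] = min(ri, rj)

    # Relabel roots to 0..k-1 in order of first appearance.
    root_to_label = {}
    labels = []
    for i in range(len(face_types)):
        labels.append(root_to_label.setdefault(find(i), len(root_to_label)))
    return labels


def transfer_labels(tri_source_face: Sequence[int], face_labels: Sequence[int]) -> List[int]:
    """Give each triangle the instance label of the B-Rep face it came from."""
    return [face_labels[f] for f in tri_source_face]


# Toy input: five faces, made-up adjacency and angles.
face_types = ["plane", "plane", "cylinder", "cylinder", "plane"]
adjacency = [(0, 1, 2.0), (1, 2, 90.0), (2, 3, 0.5), (3, 4, 1.0), (0, 4, 90.0)]
face_labels = merge_faces(face_types, adjacency)
tri_labels = transfer_labels([0, 0, 1, 2, 3, 3, 4], face_labels)
print(face_labels)  # [0, 0, 1, 1, 2]
print(tri_labels)   # [0, 0, 0, 1, 1, 1, 2]
```

In the toy input, faces 3 and 4 meet almost tangentially but stay separate because their primitive types differ, and faces 0 and 4 share a type but meet at 90 degrees. Because triangles only inherit labels through their source-face index, retessellating the model changes `tri_source_face` but not `face_labels`.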