Many CAD learning pipelines discretize Boundary Representations (B-Reps) into triangle meshes, discarding analytic surface structure and topological adjacency and thereby weakening consistent instance-level analysis. We present STEP-Parts, a deterministic CAD-to-supervision toolchain that extracts geometric instance partitions directly from raw STEP B-Reps and transfers them to tessellated carriers through retained source-face correspondence, yielding instance labels and metadata for downstream learning and evaluation. The construction merges adjacent B-Rep faces only when they share the same analytic primitive type and satisfy a near-tangent continuity criterion. On ABC, same-primitive dihedral angles are strongly bimodal, yielding a threshold-insensitive low-angle regime for part extraction. Because the partition is defined on intrinsic B-Rep topology rather than on a particular triangulation, the resulting boundaries remain stable under changes in tessellation. Applied to the DeepCAD subset of ABC, the pipeline processes approximately 180,000 models in under six hours on a consumer CPU. We release code and precomputed labels, and show that STEP-Parts serves both as a tessellation-robust geometric reference and as a useful supervision source in two downstream probes: an implicit reconstruction-segmentation network and a dataset-level point-based backbone.
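The merge rule and the label transfer described in the abstract can be illustrated with a minimal sketch. It is not the released STEP-Parts code. It assumes the B-Rep has already been reduced to a list of per-face primitive types and a list of face adjacencies, each annotated with a dihedral angle measured as the deviation from tangency in degrees. The names `Adjacency`, `partition_faces`, `transfer_labels`, and the 10-degree default threshold are hypothetical choices for illustration; the abstract does not state a threshold value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjacency:
    """Two B-Rep faces sharing an edge (hypothetical input format)."""
    face_a: int
    face_b: int
    dihedral_deg: float  # assumed convention: 0 means tangent-continuous


def _find(parent, i):
    # Union-find root lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def partition_faces(primitive_types, adjacencies, max_angle_deg=10.0):
    """Merge adjacent faces that share a primitive type and are near-tangent.

    max_angle_deg is an illustrative value, not one taken from the paper.
    Returns one instance label per face, numbered by first appearance.
    """
    parent = list(range(len(primitive_types)))
    for adj in adjacencies:
        same_type = primitive_types[adj.face_a] == primitive_types[adj.face_b]
        near_tangent = adj.dihedral_deg <= max_angle_deg
        if same_type and near_tangent:
            ra, rb = _find(parent, adj.face_a), _find(parent, adj.face_b)
            if ra != rb:
                # Always keep the smaller index as root so output is deterministic.
                parent[max(ra, rb)] = min(ra, rb)
    labels = {}
    return [labels.setdefault(_find(parent, f), len(labels))
            for f in range(len(parent))]


def transfer_labels(face_labels, triangle_source_face):
    """Give each triangle the label of the B-Rep face it was tessellated from."""
    return [face_labels[f] for f in triangle_source_face]


# Toy model: two coplanar-ish planes, a split cylinder, and a cap plane.
types = ["plane", "plane", "cylinder", "cylinder", "plane"]
edges = [
    Adjacency(0, 1, 2.0),   # same type, near-tangent: merged
    Adjacency(1, 2, 0.5),   # near-tangent but different type: kept apart
    Adjacency(2, 3, 0.0),   # same type, tangent: merged
    Adjacency(3, 4, 90.0),  # different type and sharp: kept apart
    Adjacency(0, 4, 90.0),  # same type but sharp: kept apart
]
face_labels = partition_faces(types, edges)
triangle_labels = transfer_labels(face_labels, [0, 0, 1, 2, 3, 3, 4])
```

Because the labels are computed on faces and only looked up per triangle, a finer or coarser tessellation changes the length of `triangle_source_face` but not where the instance boundaries fall, which is the tessellation-stability property the abstract claims.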