Most existing 3D shape datasets and models focus solely on geometry, overlooking the material properties that determine how objects appear. We introduce a two-stage method, based on a large language model (LLM), for inferring material composition directly from 3D point clouds with coarse segmentations. Our key insight is to decouple reasoning about what an object is from reasoning about what it is made of. In the first stage, an LLM predicts the object's semantic category; in the second stage, it assigns a plausible material to each geometric segment, conditioned on the inferred semantics. Both stages operate in a zero-shot manner, without task-specific training. Because existing datasets lack reliable material annotations, we evaluate our method using an LLM-as-a-Judge implemented in DeepEval. Across 1,000 shapes from Fusion/ABS and ShapeNet, our method achieves high semantic and material plausibility. These results demonstrate that language models can serve as general-purpose priors for bridging geometric reasoning and material understanding in 3D data.
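The two-stage decoupling described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LLM` callable interface, the prompt wording, the function names, and the encoding of each segment as a point count plus bounding-box extents are all assumptions made for illustration, since the abstract does not specify how segments are presented to the model.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float, float]
LLM = Callable[[str], str]  # hypothetical interface: prompt text in, reply text out


def summarize_segment(points: List[Point]) -> str:
    """Describe a non-empty segment by point count and bounding-box extents.

    This textual encoding is an assumption, not taken from the paper.
    """
    xs, ys, zs = zip(*points)
    extents = [max(axis) - min(axis) for axis in (xs, ys, zs)]
    return f"{len(points)} points, extents " + " x ".join(f"{e:.2f}" for e in extents)


def describe_shape(segments: Dict[str, List[Point]]) -> str:
    """List every segment of the shape, one per line."""
    return "\n".join(
        f"- {name}: {summarize_segment(pts)}" for name, pts in segments.items()
    )


def infer_semantics(llm: LLM, segments: Dict[str, List[Point]]) -> str:
    """Stage 1: ask what the object is, using geometry only."""
    prompt = (
        "Name the object made of these segments in one or two words.\n"
        + describe_shape(segments)
    )
    return llm(prompt).strip()


def assign_materials(
    llm: LLM, segments: Dict[str, List[Point]], semantics: str
) -> Dict[str, str]:
    """Stage 2: ask for one material per segment, conditioned on stage 1."""
    materials = {}
    for name, pts in segments.items():
        prompt = (
            f"The object is a {semantics}. Name one plausible material "
            f"for segment '{name}' ({summarize_segment(pts)})."
        )
        materials[name] = llm(prompt).strip()
    return materials


def infer_materials(
    llm: LLM, segments: Dict[str, List[Point]]
) -> Tuple[str, Dict[str, str]]:
    """Run both stages zero-shot and return (semantics, per-segment materials)."""
    semantics = infer_semantics(llm, segments)
    return semantics, assign_materials(llm, segments, semantics)
```

Any function that maps a prompt string to a reply string can be passed as `llm`, so the same sketch works with a real model client or with a stub for testing. Keeping the semantic label as an explicit intermediate value is what lets the second stage condition on it.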