Adapting Automotive Aerodynamics Surrogates to New Vehicle Families via Transfer Learning

Deploying scientific machine learning surrogates in industrial CFD workflows requires adapting pretrained models to new vehicle families for which large datasets are unavailable, yet whether the geometric representations learned by a geometry encoder transfer to topologically distinct shapes remains unvalidated. We address this question through leave-one-family-out experiments on a 61.47M-parameter Transformer surrogate (AB-UPT) pretrained on four vehicle families (411 external aerodynamics cases) and adapted to the held-out fifth family with only 20 samples. Three adaptation strategies are compared: Full Fine-Tuning (FFT), Lightweight Fine-Tuning (LFT), and Low-Rank Adaptation (LoRA). The central finding is that pretrained geometry encoders learn transferable representations, but the adaptation mechanism determines whether those representations can be exploited. FFT destabilizes because its 61.47M unconstrained parameters overfit the 20 samples (R² = 0.40); LFT fails because the frozen encoder cannot represent unseen shapes (R² < 0). LoRA resolves both failure modes: rank-constrained adapters injected into all layers regularize the loss landscape while preserving pretrained features, achieving R² = 0.85 ± 0.02 across all five families with 50% lower force RMSE than FFT and 28% lower pointwise field errors. LoRA also outperforms models trained from scratch on 3x more target-family data, eliminating the need for large per-family datasets. These results recast LoRA from a memory-saving convenience into a convergence enabler for geometry transfer: a shared backbone paired with lightweight per-family adapters that can be trained in hours from minimal data.
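
To make the adapter mechanism concrete, the following is a minimal NumPy sketch of a single LoRA-adapted dense layer. It is an illustration under stated assumptions, not the AB-UPT implementation: the class name `LoRALinear`, the layer size (256 x 256), the rank (4), and the scaling factor `alpha` are all hypothetical, since the abstract does not report them. The sketch shows two properties the abstract relies on: the pretrained weight stays frozen, and the trainable update is confined to a low-rank product, so only a small fraction of parameters is exposed to the 20 target samples.

```python
import numpy as np


class LoRALinear:
    """Frozen dense layer W plus a trainable low-rank update (alpha / r) * B @ A.

    Hypothetical illustration; names and sizes are assumptions, not from the paper.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                               # pretrained, kept frozen
        self.A = rng.normal(0.0, 0.01, size=(rank, d_in))  # trainable, small random init
        self.B = np.zeros((d_out, rank))                   # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, d_in); the result has shape (batch, d_out).
        frozen = x @ self.weight.T
        update = self.scale * (x @ self.A.T) @ self.B.T
        return frozen + update

    def trainable_params(self):
        # Only the adapter factors A and B would receive gradient updates.
        return self.A.size + self.B.size


# Hypothetical layer size; a batch of 20 mirrors the 20 adaptation samples.
d_in, d_out = 256, 256
rng = np.random.default_rng(1)
W = rng.normal(size=(d_out, d_in))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(20, d_in))

# Because B starts at zero, the adapted layer initially reproduces the pretrained layer.
assert np.allclose(layer(x), x @ W.T)

# Trainable adapter parameters versus frozen weights for this one layer.
print(layer.trainable_params(), W.size)
```

Zero-initializing `B` means adaptation starts exactly at the pretrained solution, and the update `B @ A` can never exceed rank 4 in this sketch, which is one way to read the abstract's claim that rank constraints regularize adaptation while preserving pretrained features.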
