The design dataset is the backbone of data-driven design. Ideally, the dataset should be fairly distributed in both the shape and property spaces so that the underlying relationship can be explored efficiently. However, classical experimental design focuses on shape diversity and thus yields biased exploration of the property space. Recently developed methods either select a subset from a large existing dataset or rely on assumptions with severe limitations. In this paper, fairness- and uncertainty-aware data generation (FairGen) is proposed to actively detect and generate missing properties starting from a small dataset. At each iteration, its coverage module computes the data coverage to guide the selection of target properties. The uncertainty module ensures that the generative model makes confident, and thus accurate, shape predictions. Integrating the two modules, Bayesian optimization determines the target properties, which are then fed into the generative model to predict the associated shapes. The new designs, whose properties are evaluated by simulation, are added to the design dataset. A case study on an S-slot design dataset demonstrates the efficiency of FairGen in auxetic structural design. Compared with grid and randomized sampling, FairGen increased the coverage score twice as fast and significantly expanded the sampled region in the property space. As a result, generative models trained on FairGen-generated datasets showed consistent and significant reductions in mean absolute error.
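The target-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the grid-based `coverage_score`, the nearest-distance coverage gain, the additive `weight` trade-off, and the caller-supplied `uncertainty` function are all assumptions made for illustration, and plain random candidate search stands in for Bayesian optimization.

```python
import numpy as np

def coverage_score(props, lo, hi, bins=10):
    """Fraction of grid cells in the property box [lo, hi] holding at least one sample.

    Assumed stand-in for a coverage measure; props has shape (n_samples, n_dims).
    """
    hist, _ = np.histogramdd(props, bins=bins, range=list(zip(lo, hi)))
    return float((hist > 0).mean())

def nearest_distance(cand, props):
    """Euclidean distance from each candidate to its nearest existing property."""
    diff = cand[:, None, :] - props[None, :, :]
    return np.linalg.norm(diff, axis=-1).min(axis=1)

def select_target(props, lo, hi, uncertainty, weight=1.0, n_candidates=2000, seed=0):
    """Pick a target property that is poorly covered yet predictable.

    uncertainty: hypothetical callable mapping candidates (n, d) to a
    non-negative penalty (n,), e.g. the predictive spread of a generative model.
    Random search is used here in place of Bayesian optimization.
    """
    rng = np.random.default_rng(seed)
    cand = rng.uniform(lo, hi, size=(n_candidates, len(lo)))
    gain = nearest_distance(cand, props)          # large where data is missing
    score = gain - weight * uncertainty(cand)     # penalize uncertain targets
    return cand[np.argmax(score)]

# Toy usage: two known properties in the unit square.
props = np.array([[0.05, 0.05], [0.95, 0.95]])
lo, hi = [0.0, 0.0], [1.0, 1.0]
# Assumed uncertainty proxy: grows once a target lies more than 0.3 from known data.
unc = lambda c: np.maximum(0.0, nearest_distance(c, props) - 0.3)
target = select_target(props, lo, hi, unc, weight=2.0)
```

In a full loop, the selected `target` would be passed to the generative model to predict a shape, the shape would be simulated, and the resulting (shape, property) pair would be appended to the dataset before the next iteration.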