Accurately predicting the performance of an architecture from a small training sample is an important but difficult task. The core problem is how to analyze the data and train on it without overfitting. When multiple tasks are involved, we should also consider whether their correlation can be exploited and how to produce estimates as quickly as possible. In this track, the super network builds a search space based on ViT-Base, covering depth, num-heads, mlp-ratio, and embed-dim. We first pre-process the data based on our understanding of the problem, which reduces both the complexity of the problem and the probability of overfitting. We then try different kinds of models and different ways of combining them. Finally, we choose a stacking ensemble of models using GP-NAS with cross-validation. Our stacking model ranked 1st in the CVPR 2022 Track 2 Challenge.
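To make the idea of stacking with cross-validation concrete, the following is a minimal sketch and not the method used in this work. It assumes two k-nearest-neighbour regressors as stand-in base models (GP-NAS itself is not reproduced here), a hypothetical ordinal encoding of the four search dimensions (depth, num-heads, mlp-ratio, embed-dim) with three choices each, and synthetic scores. All class and function names are illustrative. The key point it shows is that the meta-learner is fitted only on out-of-fold predictions, so it never sees a base model's prediction for a sample that model was trained on.

```python
import random
from statistics import mean


class KNNRegressor:
    """k-nearest-neighbour regressor on numeric tuples (stand-in base model)."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            # Rank training points by squared Euclidean distance to x.
            order = sorted(
                range(len(self.X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(x, self.X[i])),
            )
            preds.append(mean(self.y[i] for i in order[: self.k]))
        return preds


def kfold(n, k, seed=0):
    """Split indices 0..n-1 into k shuffled, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def out_of_fold(make_model, X, y, folds):
    """Predict every sample with a model that did not train on it."""
    oof = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(held_out, model.predict([X[i] for i in held_out])):
            oof[i] = p
    return oof


def fit_meta_weights(p1, p2, y, ridge=1e-8):
    """Least-squares weights for y ~ w1*p1 + w2*p2 (2x2 normal equations)."""
    a = sum(u * u for u in p1) + ridge
    b = sum(u * v for u, v in zip(p1, p2))
    d = sum(v * v for v in p2) + ridge
    e = sum(u * t for u, t in zip(p1, y))
    f = sum(v * t for v, t in zip(p2, y))
    det = a * d - b * b
    return (e * d - b * f) / det, (a * f - b * e) / det


def stack_predict(X_train, y_train, X_test, n_folds=5):
    """Fit the meta-learner on out-of-fold predictions, then predict X_test."""
    makers = [lambda: KNNRegressor(1), lambda: KNNRegressor(5)]
    folds = kfold(len(X_train), n_folds)
    oof = [out_of_fold(m, X_train, y_train, folds) for m in makers]
    w1, w2 = fit_meta_weights(oof[0], oof[1], y_train)
    # Refit each base model on all training data before predicting.
    full = [m().fit(X_train, y_train).predict(X_test) for m in makers]
    return [w1 * p + w2 * q for p, q in zip(full[0], full[1])], (w1, w2)


# Synthetic demo: each architecture is a tuple of choice indices
# (depth, num_heads, mlp_ratio, embed_dim); the scores are made up.
rng = random.Random(0)
X = [tuple(rng.randrange(3) for _ in range(4)) for _ in range(60)]
y = [
    0.5 + 0.02 * d + 0.01 * h + 0.015 * m + 0.03 * e + rng.gauss(0, 0.005)
    for d, h, m, e in X
]
preds, weights = stack_predict(X[:50], y[:50], X[50:])
print("meta weights:", weights)
print("first stacked predictions:", preds[:3])
```

In practice the base models would be the actual predictors under comparison, and the meta-learner could be any regressor. The two-weight least-squares combiner is used here only to keep the sketch dependency-free.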