Scaling laws are used to plan multi-million-dollar training runs, but fitting those laws can itself cost millions. In modern large-scale workflows, assembling a sufficiently informative set of pilot experiments is already a major budget-allocation problem rather than a routine preprocessing step. We formulate scaling-law fitting as budget-aware sequential experimental design: given a finite pool of runnable experiments with heterogeneous costs, choose which runs to execute so as to maximize extrapolation accuracy in a high-cost target region. We then propose an uncertainty-aware method for sequentially allocating experimental budget toward the runs most useful for target-region extrapolation. Across a diverse benchmark of scaling-law tasks, our method consistently outperforms classical design-based baselines, and often approaches the performance of fitting on the full experimental set while using only about 10% of the total training budget. Our code is available at https://github.com/PlanarG/active-sl.
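To make the problem setup concrete, the following is a minimal sketch of budget-constrained run selection under strong simplifying assumptions. It is not the method proposed here: it assumes a pure log-linear power law (log L = alpha + beta * log N), and it uses a classical greedy rule, variance reduction at the target point per unit cost, which is closer in spirit to the design-based baselines mentioned above. All names (`features`, `target_variance`, `greedy_design`), the ridge term, and the example sizes and costs are hypothetical and chosen only for illustration.

```python
import numpy as np

def features(n):
    # Design row for the assumed law: log L = alpha + beta * log N.
    return np.array([1.0, np.log(n)])

def target_variance(info, x_target):
    # Predictive variance at the target, up to the (unknown) noise scale.
    return float(x_target @ np.linalg.inv(info) @ x_target)

def greedy_design(sizes, costs, target_size, budget, ridge=1e-3):
    """Greedily pick runs by target-variance reduction per unit cost."""
    info = ridge * np.eye(2)              # regularized information matrix
    x_t = features(target_size)
    remaining = list(range(len(sizes)))
    chosen, spent = [], 0.0
    while True:
        base = target_variance(info, x_t)
        best, best_gain = None, 0.0
        for i in remaining:
            if spent + costs[i] > budget:
                continue                  # run i no longer fits the budget
            x = features(sizes[i])
            new_var = target_variance(info + np.outer(x, x), x_t)
            gain = (base - new_var) / costs[i]
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:                  # nothing affordable improves the fit
            break
        x = features(sizes[best])
        info = info + np.outer(x, x)
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)
    return chosen, spent

# Hypothetical pool: four model sizes whose cost grows with size.
sizes = [1e6, 1e7, 1e8, 1e9]
costs = [1.0, 10.0, 100.0, 1000.0]
budget = 0.1 * sum(costs)                 # about 10% of the full-pool cost
chosen, spent = greedy_design(sizes, costs, target_size=1e10, budget=budget)
print(chosen, spent)
```

Because the assumed model is linear in its parameters, the variance criterion does not depend on observed losses, so this selection could be computed up front. A sequential, uncertainty-aware method for nonlinear scaling laws would instead refit after each run and update its choice of the next experiment.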