Application profiling is essential for software optimization tasks such as code layout and memory placement, where optimization decisions depend on program behavior. However, modern applications exhibit significant input-dependent variability, which limits the effectiveness of conventional profiling approaches that rely on a single representative execution. We present Phaedrus, a compiler-assisted deep learning framework that predicts dynamic program behavior across diverse execution instances, with a focus on dynamic function call prediction. The predicted call sequences guide input-specific compiler optimizations, enabling code specialization without requiring program execution. Phaedrus introduces two complementary techniques. Application Behavior Synthesis (Dynamis) is a profile-less approach in which large language models infer dynamic behavior directly from source code and static compiler analysis, bypassing traditional profiling. Application Profile Generalization (Morpheus) employs generative models trained on compressed and augmented Whole Program Path (WPP) function profiles to predict application behavior for unseen inputs. Experimental results show that Phaedrus accurately identifies frequently executed and runtime-dominant hotspot functions, covering 85-99% of total execution time. Using these predictions, Phaedrus enables superior profile-guided optimizations, achieving an average performance improvement of 6% (up to 25%) and a binary size reduction of 5.19% (up to 19%), without executing the target program. Additionally, Phaedrus reduces WPP function profile sizes by up to $10^{7}\times$.
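As a rough illustration of the hotspot coverage figure quoted above, the sketch below computes the fraction of total execution time attributable to a set of predicted hot functions. The function name `hotspot_coverage`, the function names, and the timing values are hypothetical and chosen only for illustration; they are not taken from Phaedrus or its evaluation.

```python
def hotspot_coverage(runtime_by_function, predicted_hot):
    """Return the fraction of total runtime spent in predicted hot functions.

    runtime_by_function: dict mapping function name -> measured time (seconds).
    predicted_hot: set of function names a predictor flagged as hotspots.
    Both inputs are hypothetical placeholders for illustration.
    """
    total = sum(runtime_by_function.values())
    if total == 0:
        return 0.0  # no measured time, so nothing can be covered
    covered = sum(
        t for name, t in runtime_by_function.items() if name in predicted_hot
    )
    return covered / total


# Made-up measurements for a three-function program (assumption, not real data).
measured = {"parse": 1.0, "solve": 7.5, "emit": 1.5}
predicted = {"solve", "emit"}
coverage = hotspot_coverage(measured, predicted)
print(f"coverage = {coverage:.0%}")
```

Under this reading, a coverage of 85-99% means the predicted hotspot set accounts for nearly all of the measured execution time, even though the prediction itself was produced without running the program.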