We propose a theoretical framework for formulating language model decoder algorithms using dynamic programming and information theory. With dynamic programming, we lift the design of decoder algorithms from the logit space to the action-state value function space, and show that decoding algorithms are consequences of optimizing action-state value functions. Each component in the action-state value function space has an information-theoretic interpretation. This lifting and interpretation make explicit what a decoder algorithm optimizes for, which facilitates arbitrating the tradeoffs among sensibleness, diversity, and attribution.
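As a rough illustration of the claim that a decoding rule can follow from optimizing an action-state value function, the sketch below uses a standard textbook result rather than the framework's own construction: over next-token distributions, the maximizer of expected value plus temperature-weighted entropy is a softmax of the values. Treating the model's logits as the values Q(s, a) is an assumption made here for illustration only, and the function names are hypothetical.

```python
import math


def entropy_regularized_policy(q_values, temperature):
    """Distribution pi maximizing sum_a pi(a) * Q(a) + temperature * H(pi).

    The closed-form maximizer is softmax(Q / temperature). If the Q-values
    are taken to be raw logits (an assumption of this sketch), this is
    ordinary temperature sampling.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def objective(policy, q_values, temperature):
    """Expected value plus temperature-weighted Shannon entropy (in nats)."""
    expected_value = sum(p * q for p, q in zip(policy, q_values))
    entropy = -sum(p * math.log(p) for p in policy if p > 0)
    return expected_value + temperature * entropy


def greedy_action(q_values):
    """Index of the largest Q-value: the zero-temperature limit of the policy."""
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In this toy setting, the expected-value term rewards the most likely token and the entropy term rewards spreading probability mass, so the temperature arbitrates between the two. Lowering it toward zero concentrates the distribution on the token that `greedy_action` returns.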