How can we process a piece of recorded music to detect and visualize the onset of each instrument? A simple, interpretable approach is based on partially fixed nonnegative matrix factorization (NMF). Despite its simplicity, partially fixed NMF is challenging to apply because the associated optimization problem is high-dimensional and non-convex. This paper explores two optimization approaches that preserve the nonnegative structure: a multiplicative update rule and projected gradient descent with momentum. Both techniques are derived from the existing literature, but they have not previously been fully developed for partially fixed NMF. Results indicate that, of the two methods, projected gradient descent with momentum achieves higher accuracy and satisfies stronger local convergence guarantees.
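To make the two approaches concrete, the following is a minimal sketch rather than the paper's own implementation. It assumes a squared Frobenius objective 0.5 * ||V - WH||^2, a spectrogram-like matrix `V`, a dictionary `W` whose first `n_fixed` columns are held fixed (for example, known instrument templates), and activations `H`. The function names, the heavy-ball form of momentum, the step sizes, and the default `beta` are all illustrative assumptions.

```python
import numpy as np


def loss(V, W, H):
    """Squared Frobenius reconstruction error (assumed objective)."""
    return 0.5 * np.linalg.norm(V - W @ H) ** 2


def multiplicative_update(V, W, H, n_fixed, n_iter=200, eps=1e-12):
    """Lee-Seung style multiplicative updates; the first n_fixed columns of W stay fixed."""
    W, H = W.copy(), H.copy()
    for _ in range(n_iter):
        # Elementwise ratios keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        ratio = (V @ H.T) / (W @ H @ H.T + eps)
        # Only the free columns of W are updated.
        W[:, n_fixed:] *= ratio[:, n_fixed:]
    return W, H


def projected_gradient_momentum(V, W, H, n_fixed, n_iter=200, beta=0.5):
    """Alternating projected gradient steps with heavy-ball momentum (assumed variant)."""
    W, H = W.copy(), H.copy()
    W_prev, H_prev = W.copy(), H.copy()
    for _ in range(n_iter):
        # Step size 1/L, where L is the spectral norm of W^T W (assumed choice).
        step_H = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        grad_H = W.T @ (W @ H - V)
        # Gradient step plus momentum, then projection onto the nonnegative orthant.
        H_new = np.maximum(H - step_H * grad_H + beta * (H - H_prev), 0.0)
        H_prev, H = H, H_new

        step_W = 1.0 / (np.linalg.norm(H @ H.T, 2) + 1e-12)
        grad_W = (W @ H - V) @ H.T
        W_new = np.maximum(W - step_W * grad_W + beta * (W - W_prev), 0.0)
        # Restore the fixed columns so they never change.
        W_new[:, :n_fixed] = W[:, :n_fixed]
        W_prev, W = W, W_new
    return W, H
```

Both functions take the same arguments, so they can be compared on the same initialization, for instance by calling each with `n_fixed` set to the number of known templates and comparing `loss(V, W, H)` afterwards. The sketch says nothing about which method is more accurate in general; that comparison is the subject of the paper.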