Building on the work of Charton, we train small transformer models to calculate the M\"obius function $\mu(n)$ and the squarefree indicator function $\mu^2(n)$. The models attain nontrivial predictive power. We then iteratively train additional models to understand how these models arrive at their predictions, ultimately finding a theoretical explanation.
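For readers who want the two target functions in concrete form, the following is a minimal reference sketch of $\mu(n)$ and $\mu^2(n)$ using plain trial division. It only illustrates the standard definitions ($\mu(n) = 0$ if a square of a prime divides $n$, and otherwise $(-1)^k$ where $k$ is the number of prime factors); the function names `mobius` and `squarefree_indicator` are illustrative assumptions, and this is not the transformer-based method described here.

```python
def mobius(n: int) -> int:
    """Return the Mobius function mu(n) for n >= 1 via trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                # p^2 divides the original n, so n is not squarefree
                return 0
            # exactly one factor of p: flip the sign
            result = -result
        p += 1
    if n > 1:
        # whatever remains is a single prime factor
        result = -result
    return result


def squarefree_indicator(n: int) -> int:
    """Return mu(n)^2, which is 1 if n is squarefree and 0 otherwise."""
    return mobius(n) ** 2
```

For example, `mobius(30)` evaluates to `-1` because $30 = 2 \cdot 3 \cdot 5$ has three distinct prime factors, while `mobius(12)` evaluates to `0` because $4$ divides $12$.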