We introduce the Neuromodulation Gated Transformer (NGT), a novel architecture that implements neuromodulation in transformers in a simple way, through a multiplicative effect. Compared with baselines, NGT achieves the best average performance on the SuperGLUE benchmark validation sets.
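To make the phrase "multiplicative effect" concrete, the following is a minimal sketch of one way a multiplicative gate can modulate transformer hidden states. It is not the NGT gating block itself, whose details are not given here. The function name `multiplicative_gate`, the sigmoid gate, the parameters `w_gate` and `b_gate`, and the tensor shapes are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def multiplicative_gate(hidden, w_gate, b_gate):
    """Scale each activation by a gate computed from the same hidden states.

    Hypothetical illustration only: the gate form (a single linear map
    followed by a sigmoid) is an assumption, not the published NGT design.
    """
    gate = sigmoid(hidden @ w_gate + b_gate)  # shape: (seq_len, d_model)
    return hidden * gate  # elementwise (multiplicative) modulation


# Toy example with assumed sizes.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
hidden = rng.normal(size=(seq_len, d_model))
w_gate = rng.normal(scale=0.1, size=(d_model, d_model))
b_gate = np.zeros(d_model)

modulated = multiplicative_gate(hidden, w_gate, b_gate)
```

Because a sigmoid gate lies strictly between 0 and 1, this particular choice can only attenuate activations. A gate with a wider range would be needed to amplify them as well.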