This study presents a novel model for invertible sentence embeddings: a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic outputs common to neural machine translation models, our approach uses a regression-based output layer to reconstruct the word vectors of the input sequence. The model achieves high accuracy and trains quickly with the Adam optimizer, a notable result given that RNNs typically require memory units, such as LSTMs, or second-order optimization methods to train well. We incorporate residual connections and introduce a "match drop" technique, in which gradients are computed only for incorrectly reconstructed words. Our approach shows potential for a range of natural language processing applications, particularly neural network-based systems that require high-quality sentence embeddings.
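The "match drop" idea can be illustrated with a minimal NumPy sketch. It is not the implementation used in this study, and it rests on several assumptions: each regressed output vector is decoded to its nearest vocabulary word by cosine similarity, the loss is a squared error averaged over the mismatched positions, and the names `nearest_word_ids` and `match_drop_loss_and_grad` are hypothetical.

```python
import numpy as np

def nearest_word_ids(vectors, embedding_table, eps=1e-12):
    """Decode each vector to the index of its most similar vocabulary vector.

    Cosine similarity is an assumption made for this sketch.
    """
    v = vectors / (np.linalg.norm(vectors, axis=1, keepdims=True) + eps)
    e = embedding_table / (np.linalg.norm(embedding_table, axis=1, keepdims=True) + eps)
    return np.argmax(v @ e.T, axis=1)

def match_drop_loss_and_grad(predicted, target_ids, embedding_table):
    """Squared-error loss whose gradient is zero wherever the decoded word is already correct."""
    targets = embedding_table[target_ids]
    decoded_ids = nearest_word_ids(predicted, embedding_table)
    # 1.0 where the decoded word is wrong, 0.0 where it already matches the target.
    wrong = (decoded_ids != target_ids).astype(predicted.dtype)[:, None]
    diff = (predicted - targets) * wrong
    # Average over mismatched positions only (assumed normalization); avoid dividing by zero.
    n_wrong = max(float(wrong.sum()), 1.0)
    loss = 0.5 * np.sum(diff ** 2) / n_wrong
    grad = diff / n_wrong  # gradient of the loss with respect to `predicted`
    return loss, grad, decoded_ids

# Toy example: a 3-word vocabulary with one-hot word vectors.
embedding_table = np.eye(3)
target_ids = np.array([0, 1, 2])
predicted = np.array([
    [0.9, 0.1, 0.0],  # decodes to word 0: correct, so it is dropped
    [0.8, 0.2, 0.0],  # decodes to word 0 but the target is word 1: kept
    [0.0, 0.0, 1.0],  # decodes to word 2: correct, so it is dropped
])
loss, grad, decoded_ids = match_drop_loss_and_grad(predicted, target_ids, embedding_table)
```

In this toy case only the second position contributes to the loss and receives a non-zero gradient, so the update concentrates on words that are still reconstructed incorrectly.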