This study presents a novel model for invertible sentence embeddings using a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic outputs common to neural machine translation models, our approach employs a regression-based output layer to reconstruct the word vectors of the input sequence. The model achieves high accuracy and fast training with the Adam optimizer, a significant finding given that RNNs typically require memory units, such as LSTMs, or second-order optimization methods to train well. We incorporate residual connections and introduce a "match drop" technique, in which gradients are computed only for words that the model reconstructs incorrectly. Our approach demonstrates potential for various natural language processing applications, particularly in neural network-based systems that require high-quality sentence embeddings.
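The "match drop" idea can be illustrated with a minimal NumPy sketch. The abstract does not specify how a word is judged incorrect or which regression loss is used, so the following are assumptions made only for illustration: predicted vectors are decoded by Euclidean nearest neighbor against an embedding table, the loss is squared error, and the function names `nearest_word_ids` and `match_drop_loss_and_grad` are hypothetical.

```python
import numpy as np

def nearest_word_ids(pred, emb):
    """Decode each predicted vector to the id of the closest embedding (assumed: Euclidean)."""
    # pred: (T, D), emb: (V, D) -> squared distances of shape (T, V)
    d2 = ((pred[:, None, :] - emb[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def match_drop_loss_and_grad(pred, target_ids, emb):
    """Squared-error loss over mismatched positions only, and its gradient w.r.t. pred."""
    target = emb[target_ids]                             # (T, D) target word vectors
    wrong = nearest_word_ids(pred, emb) != target_ids    # (T,) True where decoding is incorrect
    mask = wrong[:, None].astype(pred.dtype)             # zero out already-correct positions
    diff = (pred - target) * mask
    n = max(int(wrong.sum()), 1)                         # avoid division by zero when all match
    loss = (diff ** 2).sum() / n
    grad = 2.0 * diff / n                                # rows for correct words are exactly zero
    return loss, grad, wrong

# Toy vocabulary of three 2-d word vectors and a two-word target sentence.
emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
target_ids = np.array([1, 2])
pred = np.array([[0.9, 0.1],   # decodes to word 1: matches the target
                 [0.6, 0.4]])  # decodes to word 1, but the target is word 2
loss, grad, wrong = match_drop_loss_and_grad(pred, target_ids, emb)
```

In this toy case only the second position contributes to the loss and receives a nonzero gradient, so the update concentrates on the word that is still reconstructed incorrectly.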