Research in natural language processing has demonstrated that the quality of generations from trained autoregressive language models is significantly influenced by the sampling strategy used. In this study, we investigate the impact of different sampling techniques on musical qualities such as diversity and structure. To accomplish this, we train a high-capacity transformer model on a large collection of highly structured Irish folk melodies and analyze the musical qualities of samples generated with distribution truncation sampling techniques. Specifically, we compare nucleus sampling and the recently proposed "typical sampling" against conventional ancestral sampling. We evaluate the effect of these sampling strategies in two scenarios: optimal circumstances with a well-calibrated model, and suboptimal circumstances in which we systematically degrade the model's performance. We assess the generated samples using both objective and subjective evaluations. We find that probability truncation techniques may restrict diversity and structural patterns in optimal circumstances, but may also produce more musical samples in suboptimal circumstances.
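To make the three sampling strategies concrete, the following is a minimal sketch of how each could act on a single next-token distribution. It is not the implementation used in this study: the function names (`nucleus_filter`, `typical_filter`, `sample_token`), the thresholds, and the toy distribution are all assumptions chosen for illustration. Ancestral sampling draws from the unmodified distribution, nucleus sampling keeps the smallest set of most probable tokens whose mass reaches `top_p`, and typical sampling keeps the tokens whose surprisal is closest to the entropy of the distribution until their mass reaches `tau`.

```python
import math
import random


def _renormalize(probs, kept):
    """Zero out every token not in `kept` and rescale the rest to sum to 1."""
    kept = set(kept)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def _take_until_mass(probs, order, threshold):
    """Walk `order`, keeping tokens until their cumulative mass reaches `threshold`."""
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= threshold:
            break
    return kept


def nucleus_filter(probs, top_p=0.9):
    """Nucleus (top-p) truncation: rank tokens by probability, highest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, _take_until_mass(probs, order, top_p))


def typical_filter(probs, tau=0.9):
    """Typical truncation: rank tokens by |surprisal - entropy|, smallest first."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    candidates = [i for i, p in enumerate(probs) if p > 0]
    order = sorted(candidates, key=lambda i: abs(-math.log(probs[i]) - entropy))
    return _renormalize(probs, _take_until_mass(probs, order, tau))


def sample_token(probs, rng=random):
    """Ancestral sampling step: draw one token index from `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy next-token distribution (assumed values, not taken from the model).
toy = [0.4, 0.15, 0.15, 0.15, 0.15]
ancestral = toy                                # no truncation
nucleus = nucleus_filter(toy, top_p=0.5)       # keeps the most probable tokens
typical = typical_filter(toy, tau=0.5)         # may drop the most probable token
token = sample_token(typical, random.Random(0))
```

On this toy distribution the two truncation methods behave differently: nucleus sampling retains the most probable token, whereas typical sampling excludes it, because its surprisal lies further from the entropy than that of the four equally likely alternatives.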