This work describes the participation of the MLLP-VRAIN research group in the shared task of the IWSLT 2026 Simultaneous Speech Translation track. Our submission uses the recently released Parakeet and Qwen 3.5 models to build a robust cascaded system for long-form SimulST driven by adaptive "black-box" policies. We explore relaxations of these policies to achieve better quality-latency trade-offs. Unlike last year, we participate in all language directions. In addition, for the En$\rightarrow${De, It, Zh} directions we also participate in this year's new context track, combining ASR word-boosting with a RAG mechanism over offline pre-translated exemplars to guide generation and enrich our system with domain-specific context. Finally, we provide a detailed latency analysis of our system. Compared to last year, results on the MCIF En$\rightarrow$De test set show a substantial quality improvement of +5.82 XCOMET-XL. Our context track processing further improves performance by +1.03.