Generative translation systems are cultural technologies because they decide how socially meaningful cues are rendered within culturally specific grammatical systems. We study one concrete notion of successful cultural translation: when an English source explicitly encodes gender, an English-to-Hindi translation should preserve the recoverability of that cue unless the source itself is ambiguous. We evaluate this criterion on a 37,345-instance benchmark spanning twelve categories and show that five systems frequently erase gender through ergative and honorific constructions. We then introduce two mechanism-aware inference-time interventions. The first, the Source-Aware Reranker (SAR), prefers candidates that avoid gender-neutralizing syntax. The second, the Phenomenon-Aware Reranker (PAR), preserves gender through targeted lexical marking even when ergative syntax remains. PAR improves target-subset accuracy from 11.07% to 54.47% for GPT-4o-mini and from 15.99% to 49.66% for Sarvam. Human evaluation shows that PAR increases gender preservation from 10.3% to 81.3% but reduces mean fluency from 4.36 to 3.37. These findings place the two interventions on a preservation-fluency frontier rather than supporting a single dominant solution, and show how culturally situated generation can require explicit tradeoffs among fidelity, fluency, and stylistic naturalness.
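To make the reported tradeoff concrete, the short sketch below computes the absolute change for each PAR figure quoted in the abstract. The numbers are copied from the text; the dictionary layout and the names `RESULTS`, `change`, and `DELTAS` are illustrative assumptions and are not part of the evaluation itself.

```python
# Before/after figures for PAR as quoted in the abstract.
# Keys and structure are illustrative only.
RESULTS = {
    "GPT-4o-mini target-subset accuracy (%)": (11.07, 54.47),
    "Sarvam target-subset accuracy (%)": (15.99, 49.66),
    "Human-rated gender preservation (%)": (10.3, 81.3),
    "Human-rated mean fluency": (4.36, 3.37),
}


def change(before: float, after: float) -> float:
    """Absolute change (after minus before), rounded to two decimals."""
    return round(after - before, 2)


# Positive values are gains, negative values are losses.
DELTAS = {name: change(before, after) for name, (before, after) in RESULTS.items()}

if __name__ == "__main__":
    for name, delta in DELTAS.items():
        print(f"{name}: {delta:+.2f}")
```

Read this way, the accuracy and preservation rows are gains in percentage points, while the fluency row is a loss in rating points, which is the tension the abstract describes as a frontier rather than a single dominant solution.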