Since 2006, Reporters Without Borders has listed Turkmenistan as one of the few Internet enemies because of its pervasive Internet censorship and strict control of information. Existing reports of filtering in Turkmenistan rely on a small number of vantage points or test a small number of websites. The country's small population and low Internet adoption make more comprehensive measurement difficult: with only six million people and an Internet penetration rate of 38%, it is hard either to recruit in-country volunteers or to obtain vantage points for remote network measurements at scale. We present the largest measurement study to date of Turkmenistan's Web censorship. To do so, we developed TMC, which tests the blocking status of millions of domains across the three foundational protocols of the Web (DNS, HTTP, and HTTPS). Importantly, TMC does not require access to vantage points in the country. We apply TMC to 15.5M domains; our results reveal that Turkmenistan censors more than 122K domains, using a different blocklist for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules that cause incidental filtering of more than 5.4M domains. Finally, we use Geneva, an open-source censorship evasion tool, to discover five new censorship evasion strategies that can defeat Turkmenistan's censorship at both the transport and application layers. We will publicly release both the data collected by TMC and the code for censorship evasion.
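To make the notion of an over-blocking rule concrete, the following minimal sketch shows how a loosely written pattern can filter domains that were never the intended target. It assumes, purely for illustration, that a rule behaves like an unanchored regular expression applied to the domain name; the rule, the domain names, and the helper `overblocked` are hypothetical and are not taken from the study's data or from TMC.

```python
import re


def overblocked(rule: str, intended: set[str], candidates: list[str]) -> list[str]:
    """Return candidate domains that match `rule` but are not intended targets."""
    pattern = re.compile(rule)
    # re.search matches anywhere in the string, which models an unanchored rule.
    return [d for d in candidates if pattern.search(d) and d not in intended]


# Hypothetical rule: meant to block one domain, but lacks start/end anchors.
rule = r"blocked\.example"
intended = {"blocked.example"}

# Hypothetical test list of domains to probe against the rule.
candidates = [
    "blocked.example",      # the intended target
    "notblocked.example",   # matched because the rule has no start anchor
    "blocked.example.org",  # matched because the rule has no end anchor
    "safe.test",            # does not match
]

collateral = overblocked(rule, intended, candidates)
print(collateral)
```

Under these assumptions, the two unintended matches are the incidental filtering; anchoring the pattern (for example `^blocked\.example$`) would confine the rule to its intended target.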