Towards Grammatical Tagging for the Legal Language of Cybersecurity

Legal language is the language typically used by members of the legal profession, and it may be spoken or written. Recent cybersecurity legislation uses written legal language and therefore inherits its interpretative complications, which stem from the typical abundance of cases and sub-cases and from a general richness of detail. This paper addresses the challenge of an essential interpretation of the legal language of cybersecurity, namely the extraction of the essential Parts of Speech (POS) from legal documents concerning cybersecurity. We address it with a methodology for POS tagging of legal language that leverages state-of-the-art open-source tools for Natural Language Processing (NLP) and uses manual analysis to validate the tools' outcomes. As a result, the methodology is automated and, arguably, general for any legal language following minor tailoring of the preprocessing step. It is demonstrated on the most relevant EU legislation on cybersecurity, the NIS 2 directive, producing the first, albeit essential, structured interpretation of that document. Moreover, our findings indicate that tools such as SpaCy and ClausIE reach their limits on the legal language of the NIS 2.

