Symbolic Regression is a powerful data-driven technique that searches for mathematical expressions explaining the relationship between input variables and a target of interest. Owing to its efficiency and flexibility, Genetic Programming can be seen as the standard search technique for Symbolic Regression. However, the conventional Genetic Programming algorithm requires all data to be stored in a central location, which is not always feasible given growing concerns about data privacy and security. While privacy-preserving techniques have advanced recently and might offer a solution to this problem, their application to Symbolic Regression remains largely unexplored. Furthermore, existing work focuses only on the horizontally partitioned setting, whereas the vertically partitioned setting, another common scenario, has yet to be investigated. Herein, we propose an approach that employs a privacy-preserving technique called Secure Multiparty Computation to enable parties to jointly build Symbolic Regression models in the vertically partitioned setting without revealing their private data. Preliminary experimental results indicate that the proposed method delivers performance comparable to the centralized solution while safeguarding data privacy.