Cloud storage has become the backbone of modern data infrastructure, yet privacy and efficient data retrieval remain significant challenges. Traditional privacy-preserving approaches focus primarily on enhancing database security but do not address the automatic identification of sensitive information before encryption. Automating this identification can dramatically reduce query processing time and avoid the errors of manual identification, thereby reducing potential privacy risks. To address this limitation, this research proposes an intelligent privacy-preserving query optimization framework that integrates Named Entity Recognition (NER) to detect sensitive information in queries and then applies secure data encryption and query optimization techniques to sensitive and non-sensitive data in parallel, thereby enabling efficient database optimization. The framework combines deep learning algorithms and transformer-based models to detect and classify sensitive entities with high precision. Sensitive data is encrypted with the Advanced Encryption Standard (AES) algorithm, with blind indexing to keep search over it secure, whereas non-sensitive data is divided into groups using the K-means algorithm, combined with a rank search for optimization. Among all NER models evaluated, the Deep Belief Network combined with Long Short-Term Memory (DBN-LSTM) delivers the best performance, with an accuracy of 93%, precision of 94%, recall of 93%, and F1 score of 93%. In addition, encrypted search was considerably faster with blind indexing, and non-sensitive data fetching outperformed traditional clustering-based searches. By integrating sensitive data detection, encryption, and query optimization, this work advances the state of privacy-preserving computation in modern cloud infrastructures.
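The abstract does not say how the blind index is constructed. As a minimal sketch of the general idea, assuming the common construction in which a keyed hash (here HMAC-SHA256) of the normalized plaintext is stored next to the AES ciphertext, an equality search becomes a single lookup that needs no decryption. The names `blind_index` and `BlindIndexedStore` are hypothetical, the store is an in-memory stand-in for a database table, and the ciphertexts are treated as opaque bytes assumed to be produced by AES elsewhere.

```python
from __future__ import annotations

import hashlib
import hmac


def blind_index(value: str, index_key: bytes) -> str:
    """Return a keyed, deterministic digest of a normalized value.

    Assumption: HMAC-SHA256 with a key kept separate from the AES key.
    """
    normalized = value.strip().lower()
    return hmac.new(index_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


class BlindIndexedStore:
    """Toy in-memory table mapping a blind index to opaque ciphertexts."""

    def __init__(self, index_key: bytes) -> None:
        self._index_key = index_key
        self._rows: dict[str, list[bytes]] = {}

    def insert(self, plaintext: str, ciphertext: bytes) -> None:
        # The ciphertext is assumed to come from AES elsewhere; it is only stored here.
        key = blind_index(plaintext, self._index_key)
        self._rows.setdefault(key, []).append(ciphertext)

    def search(self, query: str) -> list[bytes]:
        # Equality search: hash the query the same way, then do one dictionary lookup.
        return list(self._rows.get(blind_index(query, self._index_key), []))
```

A caller would insert each sensitive value together with its ciphertext, then call `search` with the plaintext query term. Because the index is deterministic, this sketch supports exact-match lookups only, and equal plaintexts produce equal index values.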