
Journal of Software (软件学报) ISSN 1000-9825, CODEN RUXUEW E-mail: jos@iscas.ac.cn
Journal of Software, 2020, 31(7): 2127-2156 [doi: 10.13328/j.cnki.jos.006052] http://www.jos.org.cn
© Institute of Software, Chinese Academy of Sciences. All rights reserved. Tel: +86-10-62562563
机器学习隐私保护研究综述 (Survey on Privacy Preserving Techniques for Machine Learning)
TAN Zuo-Wen (谭作文), ZHANG Lian-Fu (张连福)
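(Department of Computer Science and Technology, School of Information Management, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013, China)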
Corresponding author: ZHANG Lian-Fu, E-mail: zlf_jx@163.com
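Abstract: Machine learning has become a core technology in fields such as big data, the Internet of Things, and cloud computing. Training machine learning models requires large amounts of data, which are usually collected through crowdsourcing and contain a great deal of private data, including personally identifiable information (such as phone numbers and ID card numbers) and sensitive information (such as financial and medical or health information). How to protect these data efficiently and at low cost is an important problem. This survey introduces machine learning together with its privacy definitions and privacy threats, focuses on the working principles and salient features of the mainstream privacy-preserving techniques for machine learning, and reviews the research results in the field by mechanism: differential privacy, homomorphic encryption, and secure multi-party computation. On this basis, it compares and analyzes the main advantages and disadvantages of the different privacy-preserving mechanisms for machine learning. Finally, it looks ahead to development trends in privacy preservation for machine learning and proposes possible future research directions in the field.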
Keywords: machine learning; privacy preservation; differential privacy; homomorphic encryption; secure multi-party computation
Chinese Library Classification number: TP181
Chinese citation format: 谭作文,张连福.机器学习隐私保护研究综述.软件学报,2020,31(7):2127-2156. http://www.jos.org.cn/1000-9825/6052.htm
English citation format: Tan ZW, Zhang LF. Survey on privacy preserving techniques for machine learning. Ruan Jian Xue Bao/Journal of Software, 2020,31(7):2127-2156 (in Chinese). http://www.jos.org.cn/1000-9825/6052.htm
Survey on Privacy Preserving Techniques for Machine Learning
TAN Zuo-Wen, ZHANG Lian-Fu
(Department of Computer Science and Technology, School of Information Management, Jiangxi University of Finance and Economics,
Nanchang 330013, China)
Abstract: Machine learning has become a core technology in areas such as big data, the Internet of Things, and cloud computing. Training
machine learning models requires large amounts of data, which are often collected through crowdsourcing and contain a great deal
of private data, including personally identifiable information (such as phone numbers and ID numbers) and sensitive information (such as
financial and health care data). Protecting these data efficiently and at low cost is an important problem. This paper first
introduces machine learning, the definitions of privacy in machine learning, and the privacy threats that machine learning faces.
It then elaborates on the working principles and distinguishing features of the mainstream privacy-preserving techniques for
machine learning. Research results in the field are surveyed by mechanism: differential privacy, homomorphic encryption, and secure
multi-party computation. On this basis, the paper compares the main advantages and disadvantages of the different privacy-preserving
mechanisms for machine learning. Finally, it discusses development trends in privacy preservation for machine learning and proposes
possible future research directions in this field.
Key words: machine learning; privacy-preserving; differential privacy; homomorphic encryption; secure multiparty computation
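Funding: National Natural Science Foundation of China (61862028, 61702238); Natural Science Foundation of Jiangxi Province (20181BAB202016); Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ160430);
Youth Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ180288)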
Foundation item: National Natural Science Foundation of China (61862028, 61702238); Natural Science Foundation of Jiangxi
Province, China (20181BAB202016); Science and Technology Project of Provincial Education Department of Jiangxi (GJJ160430);
Young Science and Technology Project of Provincial Education Department of Jiangxi (GJJ180288).
Received: 2019-09-10; Revised: 2020-02-09, 2020-03-20; Accepted: 2020-04-09; Published online by JOS: 2020-04-21