InstaHide: Instance-hiding schemes for private distributed learning

Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


How can multiple distributed entities collaboratively train a shared deep net on their private data while preserving privacy? This paper introduces InstaHide, a simple encryption of training images that can be plugged into an existing distributed deep learning pipeline. The encryption is efficient and has a minor effect on test accuracy. InstaHide encrypts each training image with a "one-time secret key," which consists of mixing a number of randomly chosen images and applying a random pixel-wise mask. Other contributions of this paper include: (a) using a large public dataset (e.g., ImageNet) for mixing during encryption, which improves security; (b) experimental results showing effectiveness in preserving privacy against known attacks with only minor effects on accuracy; (c) theoretical analysis showing that successfully attacking privacy requires attackers to solve a difficult computational problem; (d) demonstrating that Mixup alone is insecure (contrary to recent proposals) by presenting efficient attacks; (e) release of a challenge dataset to encourage new attacks.
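The per-image encryption described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes images are NumPy arrays normalized to [-1, 1], that the mixing coefficients are drawn from a Dirichlet distribution (the paper's exact coefficient scheme may differ), and that the "random pixel-wise mask" is a random sign flip per pixel.

```python
import numpy as np

def instahide_encrypt(private_img, public_imgs, k=4, rng=None):
    """Sketch of InstaHide-style encryption of one training image.

    Mixes the private image with k-1 images drawn at random from a
    public dataset, then applies a random pixel-wise sign mask. The
    chosen public images, mixing weights, and mask together act as a
    "one-time secret key" for this image.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Pick k-1 public images at random to mix with (assumed sampling scheme).
    idx = rng.choice(len(public_imgs), size=k - 1, replace=False)
    imgs = [private_img] + [public_imgs[i] for i in idx]
    # Random mixing coefficients that sum to 1 (assumed distribution).
    lam = rng.dirichlet(np.ones(k))
    mixed = sum(l * img for l, img in zip(lam, imgs))
    # Random pixel-wise sign mask; for images in [-1, 1] the output
    # stays in [-1, 1].
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)
    return mask * mixed
```

A fresh key (public-image choice, weights, and mask) would be drawn for every image in every epoch, which is what makes the key "one-time".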

Original language: American English
Title of host publication: 37th International Conference on Machine Learning, ICML 2020
Editors: Hal Daume, Aarti Singh
Publisher: International Machine Learning Society (IMLS)
Number of pages: 12
ISBN (Electronic): 9781713821120
State: Published - 2020
Event: 37th International Conference on Machine Learning, ICML 2020 - Virtual, Online
Duration: Jul 13 2020 – Jul 18 2020

Publication series

Name: 37th International Conference on Machine Learning, ICML 2020

Conference: 37th International Conference on Machine Learning, ICML 2020
City: Virtual, Online

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software
