Analysis of Lossy Generative Data Compression for Robust Remote Deep Inference

Mathew Williams, Silvija Kokalj-Filipovic, Armani Rodriguez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Networks of wireless sensors, including the Internet of Things (IoT), motivate lossy compression of sensor data to match the available network bandwidth (BW). Hence, sensor data intended for inference by a remote deep learning (RDL) model is likely to be reconstructed with distortion from a compressed representation received by the remote user over a wireless channel. Our focus is a particular class of lossy compression algorithms based on DL models, known as learned compression (LC). The link between information loss and compression rate in LC has not yet been studied in the framework of information theory, nor is it practically associated with any metadata that could describe the type and level of information loss to downstream users. This may make such compression undetectable yet potentially harmful. We study the robustness of an RDL classification model against lossy compression of its input, including robustness under an adversarial attack. We apply different compression methods to MNIST images, such as JPEG and a hierarchical LC, each at different compression ratios. For any lossy reconstruction and its uncompressed original, several techniques for topological feature characterization based on persistent homology are used to highlight the important differences amongst compression approaches that may affect the robust accuracy of a DL classifier trained on the original data. We conclude that LC is preferred in the described context, because it achieves the same accuracy as the originals (with and without an adversarial attack) on a trained DL MNIST classifier while using only 1/4 of the BW. We show that the calculated topological features differ between JPEG and comparable LC reconstructions, with the latter closer to the features of the original, and that the attack induces a distribution shift in those features.
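The rate–distortion trade-off the abstract describes can be sketched in a toy form: quantize an image to fewer gray levels (the lossy step) and then entropy-code it losslessly, so that coarser quantization buys a smaller payload at the cost of reconstruction error. The sketch below is purely illustrative — the synthetic 28×28 "image", the uniform quantizer, and the use of `zlib` as a stand-in entropy coder are all assumptions, not the paper's JPEG or learned-compression codecs:

```python
import random
import zlib

random.seed(0)

# Synthetic stand-in for a 28x28 grayscale MNIST image (bytes 0..255),
# with extra zeros to mimic MNIST's mostly-black background.
img = bytes(random.choice([0] * 8 + list(range(256))) for _ in range(28 * 28))

def lossy_compress(img: bytes, levels: int) -> bytes:
    """Quantize to `levels` gray levels (the lossy step), then
    entropy-code with zlib (a stand-in for a real entropy coder)."""
    step = 256 // levels
    quantized = bytes((b // step) * step for b in img)
    return zlib.compress(quantized, level=9)

def reconstruct(payload: bytes) -> bytes:
    """Decode the payload; the quantization error is not recoverable."""
    return zlib.decompress(payload)

for levels in (256, 16, 4):
    payload = lossy_compress(img, levels)
    rec = reconstruct(payload)
    mse = sum((a - b) ** 2 for a, b in zip(img, rec)) / len(img)
    ratio = len(img) / len(payload)
    print(f"levels={levels:3d}  ratio={ratio:5.2f}x  MSE={mse:7.1f}")
```

With 256 levels the quantizer is the identity and reconstruction is exact; dropping to 4 levels shrinks the payload substantially while introducing distortion, which is the trade the abstract's "1/4 of the BW" result navigates with a learned codec instead of this uniform quantizer.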
Finally, most LC models are generative, meaning that multiple statistically independent compressed representations of a data point can be generated, which opens the possibility of inference error correction at the RDL model. Due to space limitations, we leave this aspect for future work.
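The paper defers this sampling-based error correction to future work; one natural form it could take is majority voting over classifier predictions on several independently sampled reconstructions. The sketch below simulates that idea only — the 0.7 per-sample accuracy, the seeded stand-in classifier, and the voting scheme are illustrative assumptions, not the paper's method:

```python
import random
from collections import Counter

TRUE_LABEL = 7  # hypothetical correct class of the transmitted image

def classify(sample_seed: int) -> int:
    """Stand-in for the RDL classifier applied to one independently
    sampled reconstruction: correct with an assumed probability of 0.7,
    otherwise a uniformly random digit."""
    rng = random.Random(sample_seed)
    return TRUE_LABEL if rng.random() < 0.7 else rng.randrange(10)

def vote_counts(num_samples: int) -> Counter:
    """Classify `num_samples` independent reconstructions of one input."""
    return Counter(classify(seed) for seed in range(num_samples))

def majority_vote(num_samples: int) -> int:
    """Return the most frequent prediction across sampled reconstructions."""
    return vote_counts(num_samples).most_common(1)[0][0]

print("single sample:", majority_vote(1))
print("15-sample vote:", majority_vote(15))
```

A single reconstruction inherits the per-sample error rate, while voting over many independent samples concentrates the decision on the most likely class — the same redundancy argument that motivates generating multiple compressed representations at the sensor.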

Original language: American English
Title of host publication: WiseML 2023 - Proceedings of the 2023 ACM Workshop on Wireless Security and Machine Learning
Publisher: Association for Computing Machinery, Inc
Pages: 33-38
Number of pages: 6
ISBN (Electronic): 9798400701337
DOIs
State: Published - Jun 1 2023
Event: 5th ACM Workshop on Wireless Security and Machine Learning, WiseML 2023 - Guildford, United Kingdom
Duration: Jun 1 2023 → …

Publication series

Name: WiseML 2023 - Proceedings of the 2023 ACM Workshop on Wireless Security and Machine Learning

Conference

Conference: 5th ACM Workshop on Wireless Security and Machine Learning, WiseML 2023
Country/Territory: United Kingdom
City: Guildford
Period: 6/1/23 → …

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
