On the optimal time/space tradeoff for hash tables

Michael A. Bender, Martín Farach-Colton, John Kuszmaul, William Kuszmaul, Mingmou Liu

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

For nearly six decades, the central open question in the study of hash tables has been to determine the optimal achievable tradeoff curve between time and space. State-of-the-art hash tables offer the following guarantee: If keys/values are (logn) bits each, then it is possible to achieve constant-time insertions/deletions/queries while wasting only O(loglogn) bits of space per key when compared to the information-theoretic optimum-this bound has been proven to be optimal for a number of closely related problems (e.g., stable hashing, dynamic retrieval, and dynamically-resized filters). This paper shows that O(loglogn) wasted bits per key is not the end of the line for hashing. In fact, for any k [log∗ n], it is possible to achieve O(k)-time insertions/deletions, O(1)-time queries, and the k-th iterated logarithm O(log(k) n) wasted bits per key (all with high probability in n), while also supporting dynamic resizing as the size of the table changes. We further show that this tradeoff curve is the best achievable by any of a large class of hash tables, including any hash table designed using the current framework for making constant-time hash tables succinct. Our result holds for arbitrarily large keys/values, and in the case where keys/values are very small, we can tighten our bounds to o(1) wasted bits per key. Building on this, we obtain a constant-time dynamic filter that uses n glog"-1 g‰+ n loge + o(n) bits of space for a wide choice of false-positive rates ", resolving a long-standing open problem for the design of dynamic filters.

Original languageEnglish (US)
Title of host publicationSTOC 2022 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing
EditorsStefano Leonardi, Anupam Gupta
PublisherAssociation for Computing Machinery
Pages1284-1297
Number of pages14
ISBN (Electronic)9781450392648
DOIs
StatePublished - Sep 6 2022
Event54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022 - Rome, Italy
Duration: Jun 20 2022Jun 24 2022

Publication series

NameProceedings of the Annual ACM Symposium on Theory of Computing

Conference

Conference54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022
Country/TerritoryItaly
CityRome
Period6/20/226/24/22

ASJC Scopus subject areas

  • Software

Keywords

  • balls and bins
  • data structures
  • hash tables
  • succinct

Cite this