Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections
The data-structures under-pinning collection API (e.g. lists, sets, maps) in the standard libraries of programming languages are used intensively in many applications. The standard libraries of recent Java Virtual Machine (JVM) languages, such as Clojure or Scala, contain scalable and well-performing immutable collection data-structures that are implemented as a Hash-Array Mapped Trie (HAMT). While they already feature efficient lookup, insert, and delete operations, their memory footprints and the runtime performance of iteration and equality checking are less than optimal. This particularly prohibits their application in high-end applications which process larger data sets. In this paper, we propose changes to the HAMT design that increase the overall performance of set and map implementations, without imposing a penalty. The resulting design outperforms Scala’s and Clojure’s data structure implementations in terms of memory footprint and runtime efficiency.