Zig: the Hash Map Types Maze

November 1, 2025 3 分钟阅读

1. Zig’s `HashMap` Hierarchy

If you’ve ever been confused by Zig’s HashMap types, you’re not alone. With 12 different HashMap types in the standard library and some misleading names, it can feel overwhelming. But there’s actually a clear, logical structure once you understand the categorization.

The Big Picture: 12 Hash Maps, 3 Dimensions

Zig’s hash maps vary across three independent dimensions:

Storage
- Unordered
- Ordered
Contexy
- Manual
- Auto
- String
Memory Management
- Managed
- Unmanaged

This gives us: $2 \times 3 \times 2 = 12$ total HashMap types. See std.

Dimension 1: Storage

This is the most fundamental choice, as it affects performance characteristics and whether insertion order is preserved. In standard lib, they are even implemented as individual modules. See hash_map.zig and array_hash_map.zig.

HashMap: The classic hash table implementation.
- Fast lookups, but no order guarantees.
ArrayHashMap: Uses double arrays to store keys and values separately, with a hash table for indexing.
- Preserves insertion order.

Dimension 2: Context (i.e. Key Handling)

In Zig, a Context type, which is essentially a set of the hashing and equality functions (similar to Python’s __hash__ and __eq__ methods, or Java’s hashCode() and equals()), determines how keys are hashed and compared.

Manual Context: `HashMap` / `ArrayHashMap`

The base types where you provide everything yourself. Maximum control, maximum verbosity.

const std = @import("std");

var map = std.HashMap(
    []const u8,                       // key type
    i32,                              // value type
    std.hash_map.StringContext,       // .ctx
    80                                // .max_load_percentage (0 ~ 100)
).init(allocator);

Best when you have custom types needing special hashing or equality logic.

`Auto` Context: `AutoHashMap` / `AutoArrayHashMap`

Uses Zig’s automatic hashing (std.hash.autoHash) and equality comparison (std.meta.eql(a, b)).

var m1 = std.AutoHashMap(u32, []const u8).init(allocator);
var m2 = std.AutoArrayHashMap(u32, []const u8).init(allocator);

Works well with:

Integers (u32, i64, etc.)
Enums
Structs with value semantics (no internal pointers/slices)

Does NOT work with:

Strings ([]const u8) - compares pointers, not content
Slices in general
Structs containing pointers (if you want deep equality)

Important: Despite the name, AutoHashMap is not truly “general purpose” – it’s optimized for value keys only.

`String` Context: `StringHashMap` / `StringArrayHashMap`

Specialized for string keys ([]const u8). Hashes and compares string content, not pointers.

var m1 = std.StringHashMap(i32).init(allocator);
var m2 = std.StringArrayHashMap(i32).init(allocator);

Important: This is the correct choice for string keys, not AutoHashMap.

Dimension 3: Memory Management

Managed (Default)

Stores the allocator inside the hash map structure (as .allocator).

var map = std.AutoHashMap(u32, u32).init(allocator);
map.put(1, 100);        // no allocator parameter needed
map.remove(1);          // no allocator parameter needed
map.deinit();           // uses stored allocator

Pros: More convenient, less parameter passing
Cons: Slightly larger memory footprint (stores allocator pointer)

`Unmanaged`

Does not store the allocator. You must pass it to every method that allocates.

var map = std.AutoHashMapUnmanaged(u32, u32){};
map.put(allocator, 1, 100);      // must pass allocator
map.remove(allocator, 1);        // must pass allocator
map.deinit(allocator);           // must pass allocator

Pros: Smaller memory footprint, more flexible (can use different allocators)
Cons: More verbose, easy to forget allocator parameter

2. Summary

Naming Pattern

The naming follows a pattern like [Context][Storage][Memory], like StringArrayHashMapUnmanaged = String context + Array storage + Unmanaged allocator.

Decision Tree: Which One Should I Use?

Follow this simple decision tree:

1. Do I need insertion order preserved?

No $\Rightarrow$ Use HashMap family
Yes $\Rightarrow$ Use ArrayHashMap family

2. Type of my keys?

Strings $\Rightarrow$ Use String*HashMap
Integers/Enums/Simple structs $\Rightarrow$ Use Auto*HashMap
Custom types needing special hashing $\Rightarrow$ Use base HashMap / ArrayHashMap

3. Do I want the convenience of stored allocator?

Yes (most common) $\Rightarrow$ Use managed version (default)
No (need flexibility/smaller size) $\Rightarrow$ Use *Unmanaged version

Performance Considerations

HashMap (Traditional)

Lookups Fast (O(1) average)
Iteration: Slower, cache-unfriendly
Memory: Moderate overhead per entry

ArrayHashMap (Array-backed)

Lookups: Slightly slower (O(1) but with extra indirection)
Iteration: Very fast, cache-friendly
Memory: More efficient for dense maps

Rule of thumb: If you iterate frequently, use ArrayHashMap. If you mostly do lookups, use HashMap.

X Facebook LinkedIn Bluesky

Zig: the Hash Map Types Maze

1. Zig’s `HashMap` Hierarchy

The Big Picture: 12 Hash Maps, 3 Dimensions

Dimension 1: Storage

Dimension 2: Context (i.e. Key Handling)

Manual Context: `HashMap` / `ArrayHashMap`

`Auto` Context: `AutoHashMap` / `AutoArrayHashMap`

`String` Context: `StringHashMap` / `StringArrayHashMap`

Dimension 3: Memory Management

Managed (Default)

`Unmanaged`

2. Summary

Naming Pattern

Decision Tree: Which One Should I Use?

Performance Considerations

分享

留下评论

猜您还喜欢

Zig: `defer` / `errdefer`

Bootstrapping in Compiler Design / Self-Hosting Compilers

Left recursion is not a problem for LR parsing

Packrat Parsing: Left Recursion Handling with Seed Parsing

1. Zig’s HashMap Hierarchy

The Big Picture: 12 Hash Maps, 3 Dimensions

Dimension 1: Storage

Dimension 2: Context (i.e. Key Handling)

Manual Context: HashMap / ArrayHashMap

Auto Context: AutoHashMap / AutoArrayHashMap

String Context: StringHashMap / StringArrayHashMap

Dimension 3: Memory Management

Managed (Default)

Unmanaged

2. Summary

Naming Pattern

Decision Tree: Which One Should I Use?

Performance Considerations

分享

留下评论

猜您还喜欢

Zig: defer / errdefer

Bootstrapping in Compiler Design / Self-Hosting Compilers

Left recursion is not a problem for LR parsing

Packrat Parsing: Left Recursion Handling with Seed Parsing

1. Zig’s `HashMap` Hierarchy

Manual Context: `HashMap` / `ArrayHashMap`

`Auto` Context: `AutoHashMap` / `AutoArrayHashMap`

`String` Context: `StringHashMap` / `StringArrayHashMap`

`Unmanaged`

Zig: `defer` / `errdefer`