David M. Lloyd: VarHandle fundamentals

David M. Lloyd's third installment in his "New Reflection" series dropped, this time on VarHandle fundamentals, and it's worth your time. The piece does what its title promises: it lays out what VarHandle is, how to acquire one, what the access modes mean, and why this thing exists in the first place as a typed, access-controlled, ordering-aware alternative to sun.misc.Unsafe, which one hoped never to use.

If you've been following the series (and if you haven't, the MethodHandles installment is worth taking a look at), Lloyd's framing should feel familiar: the JDK keeps quietly building safer versions of unsafe access, and VarHandle is the variable-access side of that effort.

The summary, in case you're skimming: VarHandle gives you a handle on a variable the way MethodHandle gives you a handle on a method. The variable can be a field, an array element, or - and this is where things get interesting - a chunk of off-heap memory.

Once you have the handle, you can read or write through it using any of a number of access modes: plain ("just read it"), opaque, acquire/release, and volatile, plus atomic updates such as compare-and-set. That ladder lets you tune memory ordering precisely. You can do this on a field that wasn't declared volatile, which is the bit Lloyd correctly identifies as a small philosophical shift: the call site, not the declaration, gets to choose the ordering strength.
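
To make the "call site chooses the ordering" idea concrete, here is a minimal sketch. The Counter class and its method names are hypothetical, invented for illustration rather than taken from Lloyd's article; only MethodHandles.lookup().findVarHandle and the VarHandle access-mode methods are real JDK API.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical class for illustration; not from Lloyd's article.
class Counter {
    // Deliberately NOT volatile: each call site picks its own ordering.
    private int value;

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                .findVarHandle(Counter.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Same field, four different ordering strengths.
    int plainRead()          { return (int) VALUE.get(this); }
    int acquireRead()        { return (int) VALUE.getAcquire(this); }
    int volatileRead()       { return (int) VALUE.getVolatile(this); }
    void releaseWrite(int v) { VALUE.setRelease(this, v); }

    // Atomic update: succeeds only if the field still holds `expected`.
    boolean cas(int expected, int next) {
        return VALUE.compareAndSet(this, expected, next);
    }
}
```

A writer thread would typically pair releaseWrite with a reader's acquireRead; the field declaration itself says nothing about ordering.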

He walks through acquisition, access modes, and basic usage. It's genuinely good stuff.

But the most striking thing in the piece isn't in the main flow. It's an aside, almost a casual mention, about off-heap access, and that aside deserves a post of its own [1].

The off-heap thing

Consider the case of large binary files: geospatial imagery, scientific instrument output, video, anything where a single asset can hit forty, sixty, a hundred gigabytes, a terabyte. You do not - you really, really do not - want to pull that onto the Java heap to look at it, if you even can. You want to memory-map it, look at the header, find the directory, read what you actually need, and let the OS page in only the bytes you actually touched.

For years, doing this safely in Java has been "an adventure." ImageIO simply can't do it and doesn't pretend to; there are libraries that use native access to do it, but they're often commercial and/or unreliable. ByteBuffer works, but its API is a relic, and its size limits (two gigabytes per buffer, signed-integer offsets) make any file bigger than 2GB an exercise in chunking, at best. sun.misc.Unsafe works, but it's unsafe, and the JDK has been signaling for years that it's going away [2]. Direct ByteBuffer plus reflection plus prayer was the production answer, and the production answer was... err... "unsatisfactory."

Many shops are still on Java 17, where the Foreign Function and Memory (FFM) API existed only as an incubator module, not in a form anyone could ship in good conscience. Now we can leave hopes and dreams behind and make it work for real: the FFM API was finalized in Java 22, it ships in the Java 25 LTS release, and VarHandle is how you actually talk to off-heap memory in a typed, ordered, safe way. No Unsafe. No reflection necessary. Predictability.

What it looks like

Let's say we have... a BigTIFF file on disk. BigTIFF is the variant of TIFF that exists specifically because classic TIFF's 32-bit offsets ran out of room. The format was extended in the 2000s so that GIS folks (and astronomers, and microscopists, and anyone else with a single-file problem bigger than four gigabytes) could keep cramming things into single files past the old ceiling. A BigTIFF can be effectively arbitrarily large, its offsets are 64-bit, and its structure looks like this: a 16-byte header (byte-order mark, version, offset size, a reserved field, and the 64-bit offset of the first directory) followed by Image File Directories (IFDs), each of which ends with a pointer to the next. "IFD0" is the first one. The first eight bytes of any IFD in BigTIFF give you the number of entries in it.
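
As a sketch of that header, the 16 bytes can be described as an FFM struct layout. This assumes the usual BigTIFF header fields (two-byte byte-order mark, version 43, offset size 8, a reserved zero, then the 64-bit offset of IFD0); the class name and the field names are hypothetical labels chosen for this example, not anything the format or the JDK defines.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical helper; field names are labels invented for this sketch.
class BigTiffHeader {
    static StructLayout layout(ByteOrder order) {
        ValueLayout.OfShort u16 = ValueLayout.JAVA_SHORT_UNALIGNED.withOrder(order);
        ValueLayout.OfLong u64 = ValueLayout.JAVA_LONG_UNALIGNED.withOrder(order);
        return MemoryLayout.structLayout(
            u16.withName("byteOrderMark"),   // bytes 0-1: 'II' or 'MM'
            u16.withName("version"),         // bytes 2-3: 43 for BigTIFF
            u16.withName("offsetSize"),      // bytes 4-5: 8
            u16.withName("reserved"),        // bytes 6-7: 0
            u64.withName("firstIfdOffset")); // bytes 8-15
    }

    static short version(MemorySegment header, ByteOrder order) {
        VarHandle h = layout(order).varHandle(PathElement.groupElement("version"));
        return (short) h.get(header, 0L); // 0L = base offset of the struct
    }

    static long firstIfdOffset(MemorySegment header, ByteOrder order) {
        VarHandle h = layout(order)
            .varHandle(PathElement.groupElement("firstIfdOffset"));
        return (long) h.get(header, 0L);
    }
}
```

The layout computes field offsets for you, so nothing in the accessor code hard-codes "2" or "8"; whether that is worth the extra ceremony for a header this small is a judgment call.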

We want to memory-map the file, confirm it's actually a BigTIFF, and read how many entries the first directory has. In Java 25, with FFM and VarHandle, that looks like this:

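import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public static long readIfd0EntryCount(Path file) throws IOException {
    try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ);
         Arena arena = Arena.ofShared()) {
        // Map the whole file; the mapping lives until the arena closes.
        MemorySegment seg =
            fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size(), arena);

        // Bytes 0-1 declare byte order: 'II' is little-endian, 'MM' big.
        byte b0 = seg.get(ValueLayout.JAVA_BYTE, 0);
        byte b1 = seg.get(ValueLayout.JAVA_BYTE, 1);
        ByteOrder order;
        if (b0 == 'I' && b1 == 'I') {
            order = ByteOrder.LITTLE_ENDIAN;
        } else if (b0 == 'M' && b1 == 'M') {
            order = ByteOrder.BIG_ENDIAN;
        } else {
            throw new IllegalArgumentException("Not a TIFF: bad byte-order mark");
        }

        // TIFF makes no alignment guarantees; use the unaligned layouts.
        VarHandle u16 = ValueLayout.JAVA_SHORT_UNALIGNED
            .withOrder(order).varHandle();
        VarHandle u64 = ValueLayout.JAVA_LONG_UNALIGNED
            .withOrder(order).varHandle();

        // Version 43 means BigTIFF; classic TIFF is 42.
        short version = (short) u16.get(seg, 2L);
        if (version != 43) {
            throw new IllegalArgumentException(
                "Not a BigTIFF: version " + version);
        }

        // The 64-bit offset of IFD0 is parked at byte 8.
        long ifd0Offset = (long) u64.get(seg, 8L);

        // First 8 bytes of an IFD in BigTIFF: entry count as a uint64.
        return (long) u64.get(seg, ifd0Offset);
    }
}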

FileChannel.map hands back a MemorySegment representing the entire file as a memory region. The file might be sixty gigabytes (or more); that's fine. Nothing has been read yet. The operating system has been told "this file is now addressable as memory, please page in the bits I ask for." On a modern 64-bit OS, setting up the mapping is cheap; the real I/O cost is paid only when a page is touched.

Arena.ofShared() is the lifetime manager from FFM. When the arena closes, the mapping is released, deterministically, at a known point in the code. Compare that to the old MappedByteBuffer, whose mapping was released only when a cleaner ran at some unspecified time after the buffer became unreachable; that is a real hazard, because memory-mapped buffers can pile up faster than the GC notices them and exhaust address space or native memory. Scoped, explicit lifetime is not a small upgrade.
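
The deterministic-release behavior is easy to observe. The sketch below uses a small native allocation instead of a file mapping (an assumption made to keep it self-contained), and the class and method names are hypothetical; the same scope rules apply to a segment returned by FileChannel.map.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class; uses a plain allocation rather than a mapping.
class ArenaLifetimeDemo {
    /** Returns true if the segment is dead and unusable once the arena closes. */
    static boolean accessAfterCloseFails() {
        MemorySegment seg;
        try (Arena arena = Arena.ofShared()) {
            seg = arena.allocate(ValueLayout.JAVA_LONG);
            seg.set(ValueLayout.JAVA_LONG, 0, 42L);
        } // memory is released right here, not "whenever GC gets to it"

        if (seg.scope().isAlive()) {
            return false;
        }
        try {
            seg.get(ValueLayout.JAVA_LONG, 0);
            return false; // should be unreachable
        } catch (IllegalStateException expected) {
            // Access to a closed scope is rejected instead of reading freed memory.
            return true;
        }
    }
}
```

The important part is the failure mode: a stale reference produces an exception rather than a crash or silent garbage.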

The byte-order check reads straight from the segment with a plain byte layout, no VarHandle ceremony required, because at that point we don't yet know which way to interpret multi-byte values. Once we do, we build two VarHandle references: one for 16-bit reads, one for 64-bit reads. The byte order is baked into the layout, which means we never have to write byte-swapping code by hand. The VarHandle handles it. The same code reads a big-endian TIFF and a little-endian one, whatever the byte order of the machine it runs on.
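
A small sketch of what "baked into the layout" means in practice. It reads from a heap byte array through MemorySegment.ofArray purely for convenience (an assumption of this example; the article's code reads a mapped file), and the class name is hypothetical.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

// Hypothetical demo class.
class ByteOrderDemo {
    /** Reads the first two bytes as a 16-bit value in the given byte order. */
    static short readU16(byte[] bytes, ByteOrder order) {
        VarHandle h = ValueLayout.JAVA_SHORT_UNALIGNED
            .withOrder(order).varHandle();
        // Coordinates are (segment, byte offset); no manual swapping anywhere.
        return (short) h.get(MemorySegment.ofArray(bytes), 0L);
    }
}
```

With the bytes {0x2B, 0x00}, the little-endian handle yields 43 (0x002B) and the big-endian handle yields 0x2B00; the calling code is identical either way.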

The _UNALIGNED variants are not optional here. TIFF tag entries fall on whatever offset the file happens to place them, and 8-byte fields will routinely land on non-8-byte-aligned addresses. A VarHandle built from the aligned layouts would throw IllegalArgumentException at runtime on any such read. The unaligned ones do the right thing.
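
Here is a minimal sketch of the difference, using a small 8-byte-aligned native allocation in place of a mapped file (an assumption made for self-containment). The class and method names are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical demo class.
class AlignmentDemo {
    private static final VarHandle ALIGNED =
        ValueLayout.JAVA_LONG.varHandle();
    private static final VarHandle UNALIGNED =
        ValueLayout.JAVA_LONG_UNALIGNED.varHandle();

    /** True if an aligned 64-bit read at this offset is rejected. */
    static boolean alignedReadRejected(long offset) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(32, 8); // 8-byte-aligned base
            try {
                long ignored = (long) ALIGNED.get(seg, offset);
                return false;
            } catch (IllegalArgumentException misaligned) {
                return true;
            }
        }
    }

    /** Writes and reads back a long at any offset via the unaligned layout. */
    static long unalignedRoundTrip(long offset, long value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(32, 8);
            UNALIGNED.set(seg, offset, value);
            return (long) UNALIGNED.get(seg, offset);
        }
    }
}
```

Calling alignedReadRejected(3) reports a rejection while alignedReadRejected(8) does not, and unalignedRoundTrip(3, v) returns v unchanged.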

And then we read. Three VarHandle reads, three lines, and we have our answer. No copying onto the heap. No Unsafe. No ByteBuffer's two-gigabyte ceiling. The file can be a terabyte and this code is unchanged.

Note that we didn't pick an access mode here. We didn't need to, not for this code. The plain get is fine for reading a file header that isn't going to mutate underneath us. But this is the part of VarHandle that makes off-heap concurrency possible in a way it wasn't before. A lock-free ring buffer in off-heap memory would reach for getAcquire and setRelease. An off-heap concurrent map would reach for compareAndSet. The whole access-mode spectrum Mr. Lloyd documents on field access works identically on off-heap memory, which means lock-free data structures that used to require Unsafe now have a sanctioned home.
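
As a rough illustration of those stronger modes on off-heap memory, the sketch below keeps a single 64-bit cell in native memory and updates it from several threads. The class and method names are hypothetical, and this is a toy counter, not a ring buffer or map. One caveat worth stating: atomic access modes need naturally aligned access, so this uses the aligned JAVA_LONG layout rather than the _UNALIGNED variant from the TIFF example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical demo class: an off-heap counter shared between threads.
class OffHeapCounter {
    // Aligned layout: atomic modes are not available on misaligned access.
    private static final VarHandle CELL = ValueLayout.JAVA_LONG.varHandle();

    /** Increments one off-heap long from several threads; returns the total. */
    static long countTo(int threads, int perThread) {
        // A shared arena is needed because multiple threads touch the segment.
        try (Arena arena = Arena.ofShared()) {
            MemorySegment cell = arena.allocate(ValueLayout.JAVA_LONG); // zeroed
            Thread[] workers = new Thread[threads];
            for (int i = 0; i < threads; i++) {
                workers[i] = new Thread(() -> {
                    for (int n = 0; n < perThread; n++) {
                        CELL.getAndAdd(cell, 0L, 1L); // atomic increment
                    }
                });
                workers[i].start();
            }
            for (Thread t : workers) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
            return (long) CELL.getAcquire(cell, 0L);
        }
    }

    /** Claims the cell by moving it from 0 to 1; only one caller can win. */
    static boolean claim(MemorySegment cell) {
        return CELL.compareAndSet(cell, 0L, 0L, 1L);
    }
}
```

The coordinates are the same (segment, offset) pair used for plain reads; only the method name changes when you want stronger ordering or atomicity.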

Why this is the whole point

Every category of problem that used to require Unsafe to do well in Java is now doable without it.

Memory-mapped file parsers - GIS, image, audio, video, scientific data formats - get a sanctioned API. Off-heap caches, the kind that back high-throughput services and don't want to manage garbage collection on their warm-access data, get a sanctioned API. Native interop, where you're talking to a C library and need to lay out structs in memory the way the library expects, gets a sanctioned API. Lock-free ring buffers, custom concurrent collections, anything where you previously dropped down to Unsafe or other native access because the regular tools didn't fit, all of it now has a real, supported, type-safe path.

VarHandle is the keystone of that path. Without it, a MemorySegment is just bytes. With it, you have typed atomic access with configurable ordering, which is exactly the contract Unsafe used to provide, except now the JDK is on your side instead of glaring at you and yammering about unsanctioned access.

Mr. Lloyd's posts on "the new reflection" are really fantastic reads, with a lot of technical insight, and they're well worth digging into.


  1. To be fair to David, it's possible he actually intended a post on off-heap access; he's very wide-ranging and an incredibly valuable resource on such topics. But he hasn't posted such an article yet.

  2. sun.misc.Unsafe is probably going away, but this is a promise that's been made before. Right now it's what it's always been: the sword of Damocles hanging over Unsafe users' heads.
