Virtual Masonry from First Principles

KnowMore’s gallery renders team photo albums that can run several hundred to a thousand images. virtual-masonry.svelte is 130 lines of Svelte 5 doing two cheap things in sequence — pack tiles into columns, then filter to the visible slice. Both are pure arithmetic; neither touches the DOM until the very end. This post walks the whole component from first principles.

The cost of not virtualizing
What the component does, in one sentence
Two coordinate spaces
The packing pass
- Why this is cheap enough to re-run on every prop change
The visibility filter
- Worked example
- Why overscan exists
The position-absolute payoff
Bridges between two worlds
The reactive graph end to end
When this approach breaks down

The cost of not virtualizing

A team album in a Swiss club can easily reach 800 photos for a single tournament weekend. The naive approach (loop the array, render an <img> per photo) has three compounding problems:

DOM size. 800 <div> wrappers, 800 <img> elements, 800 layout boxes. Even at zero per-element cost, the browser has to walk this tree on every reflow.
Image decode pressure. With loading="lazy", the browser fetches and decodes only what is in or near the viewport, but images already loaded while scrolling can still hold memory for as long as their elements stay in the DOM.
Reactive graph fan-out. Any prop change to the page re-runs every child component’s render function. Cheap in isolation, expensive at 800×.

Virtualization fixes all three by collapsing the rendered count to whatever fits in the viewport plus a small overscan buffer — typically 8 to 15 tiles instead of 800.

What the component does, in one sentence

Given a list of items and a tile-height function, <VirtualMasonry> packs items into a fixed number of columns using a greedy “shortest column wins” algorithm, then absolutely-positions only the tiles that overlap the viewport (plus an overscan margin) inside a container sized to the full layout height.

The full file is short enough to read top to bottom:

<script lang="ts" generics="T">
    import { cn } from '$lib/utils';
    import { browser } from '$app/environment';
    import type { Snippet } from 'svelte';

    type Props = {
        items: T[];
        columns?: number;
        gap?: number;
        getItemHeight: (item: T, columnWidth: number) => number;
        overscan?: number;
        class?: string;
        children?: Snippet<[T, number]>;
    };

    let {
        items,
        columns = 3,
        gap = 8,
        getItemHeight,
        overscan = 600,
        class: className,
        children
    }: Props = $props();

getItemHeight is a pure function the parent supplies. For the gallery, it is (asset, width) => width / asset.aspectRatio — a single divide using metadata fetched from Immich, never a DOM read.
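As a minimal sketch of such a getter: the PhotoAsset type and the photoHeight name below are hypothetical, and the only field assumed is aspectRatio (width divided by height). The square-tile fallback for missing metadata is an assumption of this sketch, not something the gallery is known to do.

```typescript
// Hypothetical asset shape; only aspectRatio (width / height) is assumed.
type PhotoAsset = { id: string; aspectRatio: number };

// Pure height getter: one division, no DOM access.
function photoHeight(asset: PhotoAsset, columnWidth: number): number {
    // Assumption: fall back to a square tile when the ratio is missing or invalid.
    const ratio =
        Number.isFinite(asset.aspectRatio) && asset.aspectRatio > 0 ? asset.aspectRatio : 1;
    return columnWidth / ratio;
}

// A 3:2 landscape photo in a 300px column is 200px tall.
const landscapeHeight = photoHeight({ id: 'a', aspectRatio: 1.5 }, 300);
```

Because the function depends only on its arguments, the packer can call it for every item without triggering layout.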

Two coordinate spaces

The whole component pivots on one coordinate transformation. Two facts that tend to get muddled:

window.scrollY measures in document space, where 0 is the top of <html>. getBoundingClientRect().top is viewport-relative, so it only becomes a document-space offset after window.scrollY is added to it.
The (x, y) positions produced by the masonry packer live in container space, where 0 is the top of the masonry container.

You cannot compare them directly. p.y = 400 and scrollY = 2000 are not on the same number line.

DOCUMENT SPACE                    CONTAINER SPACE
(origin = top of <html>)          (origin = top of masonry div)

  0 ─┬──────────────┐
     │  navbar      │
     │  filters     │
300 ─┼══════════════╡  ← containerOffsetTop       ─── 0
     │              │                                │
     │  ┌────┐      │                                │  ┌────┐
     │  │ A  │      │                                │  │ A  │   y=0
     │  └────┘      │                                │  └────┘
     │  ┌────┐      │                                │  ┌────┐
     │  │ B  │      │                                │  │ B  │   y=400
2000 ┤━━┓ ──── ━━━━ │  ← scrollY (viewport top)   1700│  └────┘
     │  ┃yellow┃    │                                │
     │  ┃ box  ┃    │                                │
2800 ┤━━┛ ──── ━━━━ │  ← scrollY + viewportHeight  2500
     │              │                                │
     │  ┌────┐      │                                │  ┌────┐
     │  │ Z  │      │                                │  │ Z  │
     │  └────┘      │                                │  └────┘

The translation between the two is one subtraction: subtract containerOffsetTop to convert document space to container space, or add it to go the other way. The component does this exactly once — translates the viewport bounds into container space — and then everything downstream stays in container space.
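The translation can be sketched as two one-line helpers. The function names are illustrative rather than taken from the component; the numbers match the diagram above (container starting 300px down the document, viewport scrolled to 2000, 800px tall).

```typescript
// Document space -> container space: subtract the container's document offset.
function toContainerSpace(documentY: number, containerOffsetTop: number): number {
    return documentY - containerOffsetTop;
}

// Container space -> document space: add it back.
function toDocumentSpace(containerY: number, containerOffsetTop: number): number {
    return containerY + containerOffsetTop;
}

const offsetTop = 300;
const viewportTopInContainer = toContainerSpace(2000, offsetTop);          // 1700
const viewportBottomInContainer = toContainerSpace(2000 + 800, offsetTop); // 2500
const tileBInDocument = toDocumentSpace(400, offsetTop);                   // 700
```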

The packing pass

The first $derived.by runs whenever items, columns, gap, getItemHeight, or containerWidth changes:

const layout = $derived.by<{ positions: Position[]; totalHeight: number }>(() => {
    if (!containerWidth || !columns) return { positions: [], totalHeight: 0 };
    const columnWidth = (containerWidth - (columns - 1) * gap) / columns;
    const heights = new Array(columns).fill(0);
    const positions: Position[] = [];
    for (let i = 0; i < items.length; i++) {
        let shortest = 0;
        for (let c = 1; c < columns; c++) {
            if (heights[c] < heights[shortest]) shortest = c;
        }
        const h = getItemHeight(items[i], columnWidth);
        const x = shortest * (columnWidth + gap);
        const y = heights[shortest];
        positions.push({ item: items[i], index: i, x, y, width: columnWidth, height: h });
        heights[shortest] = y + h + gap;
    }
    const tallest = heights.reduce((m, v) => (v > m ? v : m), 0);
    const totalHeight = items.length > 0 ? Math.max(0, tallest - gap) : 0;
    return { positions, totalHeight };
});

The whole algorithm carries one piece of state between iterations: heights, an array of length columns. Each entry is the running Y-coordinate of the next free slot in that column. The inner loop is the standard “argmin” pattern: remember the index of the smallest value seen so far and compare each remaining column against it, one pass instead of two.

For each item, three steps:

Find the column with the smallest accumulated height. That column has the most empty space at the bottom.
Compute height from the parent’s getter, using the divided column width.
Place the tile at (column_offset, heights[shortest]), bump that column’s height by tile_height + gap.

Walk it through with three columns and four photos, all height 100, gap 0:

START
  heights = [0, 0, 0]

ITEM 0 — heights = [0,0,0], shortest=0
  place at column 0, y=0
  heights = [100, 0, 0]

ITEM 1 — heights = [100,0,0], shortest=1
  place at column 1, y=0
  heights = [100, 100, 0]

ITEM 2 — heights = [100,100,0], shortest=2
  place at column 2, y=0
  heights = [100, 100, 100]

ITEM 3 — heights = [100,100,100], shortest=0 (ties go first)
  place at column 0, y=100
  heights = [200, 100, 100]

Resulting layout:

  ┌─────┬─────┬─────┐
  │  0  │  1  │  2  │  y=0
  ├─────┼─────┼─────┤
  │  3  │     │     │  y=100
  └─────┴─────┴─────┘

The same algorithm with one tall photo handles ragged shapes naturally — every new tile fills the lowest valley.
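To make the ragged case concrete, here is a standalone sketch of the same greedy loop outside Svelte. It takes precomputed heights instead of calling getItemHeight, and the packMasonry name and Placed type are illustrative only.

```typescript
type Placed = { index: number; column: number; x: number; y: number; height: number };

// Greedy "shortest column wins" packing over precomputed tile heights.
function packMasonry(
    itemHeights: number[],
    columns: number,
    columnWidth: number,
    gap: number
): Placed[] {
    const heights: number[] = new Array<number>(columns).fill(0);
    const placed: Placed[] = [];
    itemHeights.forEach((height, index) => {
        // Argmin over column heights; strict < means ties go to the leftmost column.
        let shortest = 0;
        for (let c = 1; c < columns; c++) {
            if (heights[c] < heights[shortest]) shortest = c;
        }
        placed.push({
            index,
            column: shortest,
            x: shortest * (columnWidth + gap),
            y: heights[shortest],
            height
        });
        heights[shortest] += height + gap;
    });
    return placed;
}

// Item 0 is tall (300); the others are 100. Gap 0 keeps the numbers readable.
const ragged = packMasonry([300, 100, 100, 100, 100], 3, 200, 0);
// columns: 0, 1, 2, 1, 2   y: 0, 0, 0, 100, 100
```

Items 3 and 4 stack under items 1 and 2 because column 0 is still occupied by the tall tile.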

Why this is cheap enough to re-run on every prop change

The packing pass is O(items × columns). For a 1000-photo album with three columns, that is about 2000 comparisons (the inner loop runs columns - 1 times per item) plus 1000 divides, typically well under a millisecond on current hardware. The pass produces ~1000 small plain objects in an array, which is negligible memory.

There are zero DOM reads. getItemHeight is pure arithmetic on aspect-ratio metadata that Immich delivers at fetch time. If the height function had to measure DOM elements (offsetHeight), the pass would force a synchronous layout per item and tank performance — that is the trap to avoid in any real-world masonry.

The greedy “shortest column wins” heuristic is not an optimal packing. Minimizing the tallest column is the multiway number partitioning problem, which is NP-hard. The greedy version runs in O(items × columns) time, and its tallest column exceeds the optimum by at most about one tile height (plus gap). In a layout tens of thousands of pixels tall, you will never see the difference visually.
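A tiny example shows that greedy can lose to the optimum. The sketch below compares the greedy result with a brute-force search over all column assignments, which is only feasible for a handful of items. Both function names are illustrative, gaps are ignored, and the brute force disregards reading order.

```typescript
// Greedy: each height goes to the currently shortest column. Returns the tallest column.
function greedyTallest(heights: number[], columns: number): number {
    const cols: number[] = new Array<number>(columns).fill(0);
    for (const h of heights) {
        let shortest = 0;
        for (let c = 1; c < columns; c++) {
            if (cols[c] < cols[shortest]) shortest = c;
        }
        cols[shortest] += h;
    }
    return Math.max(...cols);
}

// Brute force: try every assignment (columns^n of them). Tiny inputs only.
function optimalTallest(heights: number[], columns: number): number {
    const cols: number[] = new Array<number>(columns).fill(0);
    let best = Infinity;
    const assign = (i: number): void => {
        if (i === heights.length) {
            best = Math.min(best, Math.max(...cols));
            return;
        }
        for (let c = 0; c < columns; c++) {
            cols[c] += heights[i];
            assign(i + 1);
            cols[c] -= heights[i];
        }
    };
    assign(0);
    return best;
}

const sampleHeights = [3, 3, 2, 2, 2];
const greedyResult = greedyTallest(sampleHeights, 2);   // 7
const optimalResult = optimalTallest(sampleHeights, 2); // 6 ({3,3} and {2,2,2})
```

The difference here is one unit, less than the tallest tile. On a five-item input that looks large; spread over hundreds of tiles it is a negligible fraction of the total height.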

The visibility filter

The second $derived.by runs whenever containerEl, layout.positions, scrollY, containerOffsetTop, viewportHeight, or overscan changes:

const visibleItems = $derived.by<Position[]>(() => {
    if (!containerEl || layout.positions.length === 0) return [];
    const top = scrollY - containerOffsetTop - overscan;
    const bottom = scrollY - containerOffsetTop + viewportHeight + overscan;
    return layout.positions.filter((p) => p.y + p.height >= top && p.y <= bottom);
});

The first two lines are the coordinate translation. scrollY - containerOffsetTop is the document-to-container shift. Subtracting overscan from it gives the top of the visibility window; adding viewportHeight and overscan gives the bottom. Both bounds are in container space.

The filter is a 1D rectangle-overlap test. A tile at (p.y, p.y + p.height) overlaps a window at (top, bottom) if and only if both:

Condition	Meaning
`p.y + p.height >= top`	tile’s bottom edge is at or below the window’s top → not fully above
`p.y <= bottom`	tile’s top edge is at or above the window’s bottom → not fully below

If either fails, the tile is fully outside the window. If both pass, it overlaps. That is the entire visibility logic.

Worked example

Plug in concrete numbers:

scrollY            = 2000
containerOffsetTop =  300
viewportHeight     =  800
overscan           =  600

  top    = 2000 - 300 - 600              = 1100
  bottom = 2000 - 300 + 800 + 600        = 3100

For five hypothetical tiles produced by the packer:

Tile	y	h	y+h	y+h ≥ 1100 ?	y ≤ 3100 ?	Result
A	200	300	500	no	yes	skip
B	900	400	1300	yes	yes	render
C	1800	500	2300	yes	yes	render
D	2700	400	3100	yes	yes	render
E	3500	400	3900	yes	no	skip

Three tiles render out of five. Scale to 1000 tiles and roughly 10 to 20 typically render at any moment. That is the entire performance win, expressed in one filter.
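The worked example can be reproduced as plain functions. The helper names and the Tile type are illustrative; the arithmetic is the same as in the visibleItems derivation.

```typescript
type Tile = { label: string; y: number; height: number };

// Viewport bounds translated into container space, widened by overscan.
function visibleWindow(
    scrollTop: number,
    containerOffsetTop: number,
    viewportHeight: number,
    overscan: number
): { top: number; bottom: number } {
    return {
        top: scrollTop - containerOffsetTop - overscan,
        bottom: scrollTop - containerOffsetTop + viewportHeight + overscan
    };
}

// 1D interval overlap: not fully above and not fully below the window.
function overlaps(tile: Tile, top: number, bottom: number): boolean {
    return tile.y + tile.height >= top && tile.y <= bottom;
}

const exampleTiles: Tile[] = [
    { label: 'A', y: 200, height: 300 },
    { label: 'B', y: 900, height: 400 },
    { label: 'C', y: 1800, height: 500 },
    { label: 'D', y: 2700, height: 400 },
    { label: 'E', y: 3500, height: 400 }
];

const view = visibleWindow(2000, 300, 800, 600); // { top: 1100, bottom: 3100 }
const renderedLabels = exampleTiles
    .filter((t) => overlaps(t, view.top, view.bottom))
    .map((t) => t.label); // ['B', 'C', 'D']
```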

Why overscan exists

With overscan = 0, scrolling becomes flickery. A tile mounts the moment its top edge crosses the viewport boundary, the <img> decodes from scratch, and the user sees a blank box for a frame or two. Setting overscan = 600 adds a 600-pixel buffer on each side, so tiles mount before they are visible and stay rendered after they leave. The cost is rendering viewportHeight + 2 × overscan pixels of layout, roughly 2.5 viewports' worth for an 800-pixel viewport instead of 1, which is still cheap.

The knob to turn for “tiles pop in” is overscan — bump it. The knob for “scrolling lags” is columns (smaller) or overscan (smaller).
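When tuning the knob, it helps to see how much layout a given overscan keeps mounted. A small sketch, with an illustrative function name:

```typescript
// How much of the layout stays mounted for a given viewport and overscan.
function renderedSpan(
    viewportHeight: number,
    overscan: number
): { pixels: number; viewports: number } {
    const pixels = viewportHeight + 2 * overscan; // buffer above and below
    return { pixels, viewports: pixels / viewportHeight };
}

const defaultSpan = renderedSpan(800, 600); // { pixels: 2000, viewports: 2.5 }
const noOverscan = renderedSpan(800, 0);    // { pixels: 800, viewports: 1 }
```

The number of mounted tiles scales roughly with this span, so doubling overscan does not double the DOM size unless the viewport is small.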

The position-absolute payoff

The render side is short:

<div
    bind:this={containerEl}
    class={cn('relative w-full', className)}
    style={`height: ${layout.totalHeight}px;`}
>
    {#each visibleItems as p (p.index)}
        <div
            class="absolute"
            style={`left: ${p.x}px; top: ${p.y}px; width: ${p.width}px; height: ${p.height}px;`}
        >
            {@render children?.(p.item, p.index)}
        </div>
    {/each}
</div>

Two details make the trick work:

The container has its full height. Even though only ten tiles exist in the DOM, the wrapper is sized to layout.totalHeight — say 80,000 pixels for 1000 photos. The browser scrollbar is the correct length, scroll behaves like a normal long page, window.scrollY is honest.
Every tile is absolutely positioned. A tile rendered at top: 1800px does not depend on the existence of any tile above it. Removing tile A from the visible list does not shift B, C, or D.

The mental model that helps people: virtualization does not lie about scroll. The scroll is real. The container’s height is real. What is fake is the assumption that every position inside the container has a DOM node — most positions are simply unrendered void.

What the browser sees                What is actually in the DOM
(the layout box):                    (rendered children):

  ┌──────────────────┐ y=0           ┌──────────────────┐ y=0
  │                  │               │                  │
  │  ░░░░░░░░░░░░░░  │ ← "tiles"     │   (empty)        │
  │  ░░░░░░░░░░░░░░  │   above        │                  │
  │  ░░░░░░░░░░░░░░  │   that         │                  │
  │  ░░░░░░░░░░░░░░  │   the          │                  │
  │  ░░░░░░░░░░░░░░  │   browser      │                  │
  │  ░░░░░░░░░░░░░░  │   "expects"    │                  │
  │  ┌────┐  ┌────┐  │ ← here is     │  ┌────┐  ┌────┐  │ ← real DOM
  │  │ B  │  │ C  │  │   the          │  │ B  │  │ C  │  │   nodes,
  │  └────┘  └────┘  │   viewport     │  └────┘  └────┘  │   absolutely
  │  ┌────┐          │                │  ┌────┐          │   positioned
  │  │ D  │          │                │  │ D  │          │   at their
  │  └────┘          │                │  └────┘          │   real y
  │  ░░░░░░░░░░░░░░  │ ← more         │   (empty)        │   coords
  │  ░░░░░░░░░░░░░░  │   "expected"   │                  │
  │  ░░░░░░░░░░░░░░  │   tiles below  │                  │
  │                  │                │                  │
  └──────────────────┘ y=80000        └──────────────────┘ y=80000
   (height: 80000px)                   (height: 80000px,
                                        only 3 children)

Bridges between two worlds

The reactive computations described so far depend on four browser-sourced values: containerWidth, scrollY, viewportHeight, and containerOffsetTop. Svelte’s reactive graph cannot subscribe to scroll events directly — those are browser events that fire outside the graph. An $effect is the bridge: subscribe to a browser event, write the result into a $state, and from that point on the value lives inside the graph.

The component uses three effects, grouped by lifecycle and source.

Effect 1: container width

$effect(() => {
    if (!containerEl || !browser) return;
    const ro = new ResizeObserver(([entry]) => {
        containerWidth = entry.contentRect.width;
        recomputeOffset();
    });
    ro.observe(containerEl);
    containerWidth = containerEl.clientWidth;
    recomputeOffset();
    return () => ro.disconnect();
});

ResizeObserver watches the element directly, not the window. It fires when a parent flexbox redistributes space, when the sidebar collapses, when a scrollbar appears — all cases where window.innerWidth does not change but the masonry’s box does. The window-resize listener would miss those.

The manual containerWidth = containerEl.clientWidth after ro.observe() is necessary because ResizeObserver delivers its callbacks asynchronously: the first one arrives during a later rendering update, not inside the observe() call. Without the manual read, the masonry would render once with containerWidth = 0 (no positions, empty DOM), then re-render once the observer fires.

The cleanup returns ro.disconnect(). Without it, the observer keeps watching containerEl after unmount, and its callback's closure over component state can keep that state alive longer than necessary.

Effect 2: window scroll and resize

$effect(() => {
    if (!browser) return;
    let rafId: number | null = null;
    function onScroll() {
        if (rafId !== null) return;
        rafId = requestAnimationFrame(() => {
            scrollY = window.scrollY;
            rafId = null;
        });
    }
    function onResize() {
        viewportHeight = window.innerHeight;
        recomputeOffset();
    }
    window.addEventListener('scroll', onScroll, { passive: true });
    window.addEventListener('resize', onResize);
    scrollY = window.scrollY;
    viewportHeight = window.innerHeight;
    return () => {
        window.removeEventListener('scroll', onScroll);
        window.removeEventListener('resize', onResize);
        if (rafId !== null) cancelAnimationFrame(rafId);
    };
});

The interesting part is the rAF throttle on scroll. On a 120 Hz trackpad with momentum, scroll events fire over 200 times per second. If every event writes to scrollY, the reactive graph re-runs 200 times per second — recomputing visibleItems, diffing the each block, mutating the DOM. The browser only paints 60 to 120 times per second, so most of that work is thrown away.

The fix is leading-edge frame gating:

scroll events:    │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
                  ├───── frame 1 ──────┤├───── frame 2 ──────┤
rAF callback:     ↑ writes scrollY     ↑ writes scrollY
                  (fresh value)         (fresh value)

The first scroll event in a frame schedules a requestAnimationFrame callback. Every subsequent event in the same frame sees rafId !== null and bails. When the frame fires, the callback reads window.scrollY once (always fresh — the browser updates it synchronously) and writes to the $state. At most one update per frame regardless of input rate.

The { passive: true } flag on the scroll listener declares that the handler will not call preventDefault(). Scroll events are not cancelable in the first place, so here the flag mostly documents intent. It makes a real difference on wheel and touch listeners, where the browser would otherwise wait for the handler to return before scrolling.

Resize does not need rAF throttling — it fires a few times per second during a drag, and recomputeOffset() is cheap.
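The frame gate can be isolated from the DOM by injecting the scheduler. In the sketch below, createFrameGate and the fake scheduler are hypothetical helpers for illustration; in the component the scheduler is requestAnimationFrame and the value read is window.scrollY.

```typescript
type Schedule = (cb: () => void) => number;

// Returns an event handler that writes at most once per scheduled frame.
function createFrameGate(
    schedule: Schedule,
    read: () => number,
    write: (value: number) => void
): () => void {
    let pending: number | null = null;
    return function onEvent(): void {
        if (pending !== null) return; // a frame callback is already queued
        pending = schedule(() => {
            write(read()); // read the freshest value when the frame fires
            pending = null;
        });
    };
}

// Fake scheduler standing in for requestAnimationFrame.
const frameQueue: Array<() => void> = [];
const fakeSchedule: Schedule = (cb) => frameQueue.push(cb);
const flushFrame = (): void => frameQueue.splice(0).forEach((cb) => cb());

let currentY = 0;
const recorded: number[] = [];
const gatedHandler = createFrameGate(fakeSchedule, () => currentY, (v) => recorded.push(v));

// Five events arrive within one frame; only one write happens, with the latest value.
for (let i = 1; i <= 5; i++) {
    currentY = i * 10;
    gatedHandler();
}
flushFrame();
// recorded is now [50]
```

The gate is leading-edge in scheduling but trailing in value: the first event books the frame, and the callback reads whatever the position is when the frame actually runs.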

Effect 3: offset on layout change

$effect(() => {
    void layout.totalHeight;
    recomputeOffset();
});

This effect bridges the other direction: from inside the reactive graph back to a DOM measurement. containerOffsetTop is “how far down the document does the masonry start?” Most of the time it is stable. But:

If the user opens a filter panel above the masonry, everything below shifts down and containerOffsetTop is now wrong.
If pagination loads more photos, the masonry grows downward — its own offset does not change, but a sibling sidebar might shift.

The honest fix would be a MutationObserver watching every ancestor — heavy machinery for a rare event. The pragmatic fix: whenever the masonry’s own layout changes, re-read the offset. The effect depends on layout.totalHeight (the void prefix tells the linter “I am reading this for tracking purposes only, not the value”) and runs whenever the layout grows or columns change.

The comment in the source calls this “cheap belt-and-suspenders.” It is not a correctness guarantee, just a heuristic that catches the common cases.
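The body of recomputeOffset is not shown in this post. A plausible sketch, assuming it converts the container's viewport-relative rect into a document-space offset, reduces to one addition. The documentOffsetTop helper is hypothetical.

```typescript
// Hypothetical helper: the real recomputeOffset body is not shown in the post.
// rectTop is what getBoundingClientRect().top returns (viewport-relative);
// adding the current scroll position turns it into a document-space offset.
function documentOffsetTop(rectTop: number, scrollTop: number): number {
    return rectTop + scrollTop;
}

// Inside the component this might look like (assumption, not the source):
//   containerOffsetTop = documentOffsetTop(
//       containerEl.getBoundingClientRect().top,
//       window.scrollY
//   );

// A container that starts 300px down the document:
const atPageTop = documentOffsetTop(300, 0);         // 300
const afterScrolling = documentOffsetTop(-1700, 2000); // still 300
```

The result is independent of scroll position, which is why the offset only needs re-reading when the surrounding layout changes.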

What `$effect` actually tracks

A subtle Svelte 5 rule that matters here: an $effect re-runs based on what it reads during the synchronous execution of the effect body. Reads inside callbacks that fire later — ResizeObserver callbacks, addEventListener handlers, requestAnimationFrame callbacks — are not tracked.

Effect 1’s body reads containerEl and not much else. The ResizeObserver callback reads and writes containerWidth, but those operations happen on a future tick, after the tracking window has closed. Effect 1 therefore mounts the observer once on first run and tears it down only on unmount. The observer fires forever, writing to containerWidth, which propagates through layout and visibleItems and the DOM — but it does not re-trigger the effect that owns it.

The same applies to Effect 2: its synchronous body reads no $state, so it has zero tracked dependencies. Mount-once, unmount-once.

Only Effect 3 actually re-runs reactively, because it has a synchronous read of layout.totalHeight.

Effect	Synchronous reads	Re-runs when
1 (ResizeObserver)	`containerEl`	mount / unmount only
2 (window listeners)	nothing reactive	mount / unmount only
3 (offset recompute)	`layout.totalHeight`	layout changes

If you internalize one rule about $effect, this is the one: dependencies are reads, not writes, and only synchronous reads count.
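The rule can be demonstrated with a toy dependency tracker. This is a deliberately simplified model written for this illustration, not Svelte's implementation: toySignal and toyEffect are hypothetical, and real runes handle cleanup, batching, and scheduling that this sketch ignores.

```typescript
// Toy model only: subscriptions are recorded while an effect body is running.
let activeEffect: (() => void) | null = null;

function toySignal<T>(initial: T) {
    let value = initial;
    const subscribers = new Set<() => void>();
    return {
        get(): T {
            if (activeEffect) subscribers.add(activeEffect); // tracked read
            return value;
        },
        set(next: T): void {
            value = next;
            [...subscribers].forEach((run) => run());
        }
    };
}

function toyEffect(body: () => void): void {
    const run = (): void => {
        activeEffect = run;
        try {
            body();
        } finally {
            activeEffect = null; // tracking window closes when the body returns
        }
    };
    run();
}

const tracked = toySignal(1);
const untracked = toySignal(1);
const laterCallbacks: Array<() => void> = [];
let runCount = 0;

toyEffect(() => {
    runCount++;
    tracked.get();                                // synchronous read: tracked
    laterCallbacks.push(() => untracked.get());   // read in a later callback: not tracked
});

laterCallbacks.splice(0).forEach((cb) => cb());   // fires after the body has returned
untracked.set(2);
const runsAfterUntrackedWrite = runCount;         // 1: no re-run
tracked.set(2);
const runsAfterTrackedWrite = runCount;           // 2: re-run
```

The deferred read plays the role of a ResizeObserver or requestAnimationFrame callback: it happens after the tracking window has closed, so writing to that signal never re-triggers the effect.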

The reactive graph end to end

Putting all the pieces together:

   Browser events
        │
        ▼  Effect 1            Effect 2                     Effect 3
        │  ResizeObserver      scroll, resize               (layout changed)
        ▼
   $state: containerWidth, scrollY, viewportHeight, containerOffsetTop
        │
        ▼ ($derived.by — packing pass)
   layout: { positions, totalHeight }
        │
        ▼ ($derived.by — visibility filter)
   visibleItems
        │
        ▼ ({#each} on absolute-positioned tiles)
   DOM: ~10 nodes regardless of items.length

Browser → effects → state → layout → visible → DOM. Every arrow is “Svelte recomputes downstream when upstream changes,” automatically, with no manual update() anywhere. The whole component is 130 lines and handles tens of thousands of items at native scroll speed.

When this approach breaks down

Three caveats worth naming, since this technique is not universal:

Items must have known heights. If the height function has to measure the DOM, the packing pass forces synchronous layout per item and the win evaporates. For Immich photos this is a non-issue because the API returns aspect ratios. For arbitrary user content (chat messages with rich embeds, for example) you need a different strategy: measure on first render, cache the result, and reflow when the cache changes — typically using something like TanStack Virtual with measureElement correction.
Scroll must happen on the window or a known element. The component reads window.scrollY directly. Scrolling inside a modal or a sticky pane needs a different scroll source — pass it in as a prop or restructure to use IntersectionObserver instead.
Layout pass scales with item count. Linear is great, but with 100,000 items the pass starts to noticeably block the main thread. At that point you want to chunk the input, render in batches, or move the packing into a worker. The 1000-photo gallery is well under this ceiling.

The two-derived structure — pack everything, filter to a slice — is the soul of the technique. Once you see it, every “render thousands of things” problem starts to look the same: it is a question of where you put the layout pass, the visibility filter, and the bridge effects.

This is the ninth post in the KnowMore series.
