Virtual Masonry from First Principles: Inside KnowMore
Published on May 2, 2026
KnowMore's gallery renders team photo albums that can run from several hundred to a thousand images. virtual-masonry.svelte is 130 lines of Svelte 5 doing two cheap things in sequence: pack tiles into columns, then filter to the visible slice. Both are pure arithmetic; neither touches the DOM until the very end. This post walks the whole component from first principles.
Table of Contents
- The cost of not virtualizing
- What the component does, in one sentence
- Two coordinate spaces
- The packing pass
- The visibility filter
- The position-absolute payoff
- Bridges between two worlds
- The reactive graph end to end
- When this approach breaks down
The cost of not virtualizing
A team album in a Swiss club can easily reach 800 photos for a single tournament weekend. The naive approach (loop the array, render an <img> per photo) has three compounding problems:
- DOM size. 800 <div> wrappers, 800 <img> elements, 800 layout boxes. Even at zero per-element cost, the browser has to walk this tree on every reflow.
- Image decode pressure. With loading="lazy", the browser decodes only what is in or near the viewport, but even queued decodes hold memory.
- Reactive graph fan-out. Any prop change to the page re-runs every child component's render function. Cheap in isolation, expensive at 800×.
Virtualization fixes all three by collapsing the rendered count to whatever fits in the viewport plus a small overscan buffer: typically 8 to 15 tiles instead of 800.
What the component does, in one sentence
Given a list of items and a tile-height function, <VirtualMasonry> packs items into a fixed number of columns using a greedy "shortest column wins" algorithm, then absolutely positions only the tiles that overlap the viewport (plus an overscan margin) inside a container sized to the full layout height.
The full file is short enough to read top to bottom:
<script lang="ts" generics="T">
  import { cn } from '$lib/utils';
  import { browser } from '$app/environment';
  import type { Snippet } from 'svelte';

  type Props = {
    items: T[];
    columns?: number;
    gap?: number;
    getItemHeight: (item: T, columnWidth: number) => number;
    overscan?: number;
    class?: string;
    children?: Snippet<[T, number]>;
  };

  let {
    items,
    columns = 3,
    gap = 8,
    getItemHeight,
    overscan = 600,
    class: className,
    children
  }: Props = $props();

getItemHeight is a pure function the parent supplies. For the gallery, it is (asset, width) => width / asset.aspectRatio: a single divide using metadata fetched from Immich, never a DOM read.
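As a rough illustration of the height getter described above, here is a minimal sketch. The AssetLike type and the fallback ratio of 1 are assumptions made for this example, not part of the component or of the Immich API; only the width / aspectRatio divide comes from the post.

```typescript
// Hypothetical asset shape: only aspectRatio (width / height) is assumed here.
type AssetLike = { aspectRatio: number };

// Assumption: treat missing or invalid metadata as a square tile.
const FALLBACK_ASPECT_RATIO = 1;

// Pure arithmetic: no DOM reads, so it is safe to call once per item per layout pass.
function heightForAsset(asset: AssetLike, columnWidth: number): number {
  const ratio =
    Number.isFinite(asset.aspectRatio) && asset.aspectRatio > 0
      ? asset.aspectRatio
      : FALLBACK_ASPECT_RATIO;
  return columnWidth / ratio;
}

console.log(heightForAsset({ aspectRatio: 1.5 }, 300)); // 200
console.log(heightForAsset({ aspectRatio: 0 }, 300)); // 300 (fallback ratio)
```

The guard is a design choice for the sketch: a zero or NaN ratio would otherwise produce an infinite or NaN height and corrupt every position below it in that column.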
Two coordinate spaces
The whole component pivots on one coordinate transformation. Two facts that tend to get muddled:
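- window.scrollY and getBoundingClientRect().top + window.scrollY measure in document space, where 0 is the top of <html>. (On its own, getBoundingClientRect().top is relative to the viewport, so it needs the scroll offset added.)
- The (x, y) positions produced by the masonry packer live in container space, where 0 is the top of the masonry container.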
You cannot compare them directly. p.y = 400 and scrollY = 2000 are not on the same number line.
DOCUMENT SPACE                                  CONTAINER SPACE
(origin = top of <html>)                        (origin = top of masonry div)

   0 +--------------+
     |   navbar     |
     |   filters    |
 300 +--------------+  <- containerOffsetTop ->     0
     |              |
     |   +----+     |                            +----+
     |   | A  |     |                            | A  |  y=0
     |   +----+     |                            +----+
     |   +----+     |                            +----+
     |   | B  |     |                            | B  |  y=400
2000 +- - - - - - - +  <- scrollY (viewport top)    1700
     |    yellow    |
     |     box      |
2800 +- - - - - - - +  <- scrollY + viewportHeight  2500
     |              |
     |   +----+     |                            +----+
     |   | Z  |     |                            | Z  |
     |   +----+     |                            +----+
The translation between the two is one subtraction: subtract containerOffsetTop to convert document space to container space, or add it to go the other way. The component does this exactly once, translating the viewport bounds into container space, and then everything downstream stays in container space.
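The subtraction described above can be sketched as two small helpers. The function names are hypothetical and exist only for this illustration; the numbers are the ones from the diagram (offset 300, viewport from 2000 to 2800 in document space).

```typescript
// Document space -> container space: subtract where the container starts.
function toContainerSpace(documentY: number, containerOffsetTop: number): number {
  return documentY - containerOffsetTop;
}

// Container space -> document space: add it back.
function toDocumentSpace(containerY: number, containerOffsetTop: number): number {
  return containerY + containerOffsetTop;
}

const offsetTop = 300;
console.log(toContainerSpace(2000, offsetTop)); // 1700 (viewport top)
console.log(toContainerSpace(2800, offsetTop)); // 2500 (viewport bottom)
console.log(toDocumentSpace(400, offsetTop)); // 700 (tile B's top edge in the document)
```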
The packing pass
The first $derived.by runs whenever items, columns, gap, or containerWidth changes:
const layout = $derived.by<{ positions: Position[]; totalHeight: number }>(() => {
  if (!containerWidth || !columns) return { positions: [], totalHeight: 0 };
  const columnWidth = (containerWidth - (columns - 1) * gap) / columns;
  const heights = new Array(columns).fill(0);
  const positions: Position[] = [];
  for (let i = 0; i < items.length; i++) {
    let shortest = 0;
    for (let c = 1; c < columns; c++) {
      if (heights[c] < heights[shortest]) shortest = c;
    }
    const h = getItemHeight(items[i], columnWidth);
    const x = shortest * (columnWidth + gap);
    const y = heights[shortest];
    positions.push({ item: items[i], index: i, x, y, width: columnWidth, height: h });
    heights[shortest] = y + h + gap;
  }
  const tallest = heights.reduce((m, v) => (v > m ? v : m), 0);
  const totalHeight = items.length > 0 ? Math.max(0, tallest - gap) : 0;
  return { positions, totalHeight };
});

The whole algorithm carries one piece of state between iterations: heights, an array of length columns. Each entry is the running Y-coordinate of the next free slot in that column. The inner loop is the standard "argmin" pattern: keep the index of the smallest value seen so far and compare each later column against it, one pass instead of two. Because the comparison is a strict <, ties go to the leftmost column.
For each item, three steps:
- Find the column with the smallest accumulated height. That column has the most empty space at the bottom.
- Compute height from the parent's getter, using the divided column width.
- Place the tile at (column_offset, heights[shortest]), then bump that column's height by tile_height + gap.
Walk it through with three columns and four photos, all height 100, gap 0:
START
  heights = [0, 0, 0]
ITEM 0: heights = [0,0,0], shortest=0
  place at column 0, y=0
  heights = [100, 0, 0]
ITEM 1: heights = [100,0,0], shortest=1
  place at column 1, y=0
  heights = [100, 100, 0]
ITEM 2: heights = [100,100,0], shortest=2
  place at column 2, y=0
  heights = [100, 100, 100]
ITEM 3: heights = [100,100,100], shortest=0 (ties go first)
  place at column 0, y=100
  heights = [200, 100, 100]

Resulting layout:
  +-----+-----+-----+
  |  0  |  1  |  2  |   y=0
  +-----+-----+-----+
  |  3  |     |     |   y=100
  +-----+-----+-----+
The same algorithm with one tall photo handles ragged shapes naturally: every new tile fills the lowest valley.
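To see the ragged case concretely, here is a standalone sketch of the same greedy loop outside Svelte. It takes precomputed heights instead of calling getItemHeight, and the names packGreedy and Placed are made up for this example; the loop body mirrors the packing pass shown earlier.

```typescript
type Placed = { index: number; column: number; x: number; y: number; height: number };

// Greedy "shortest column wins" packing over precomputed tile heights.
function packGreedy(
  itemHeights: number[],
  columns: number,
  columnWidth: number,
  gap: number
): { placed: Placed[]; totalHeight: number } {
  const heights: number[] = new Array<number>(columns).fill(0);
  const placed: Placed[] = [];
  itemHeights.forEach((height, index) => {
    // Argmin over column heights; strict < sends ties to the leftmost column.
    let column = 0;
    for (let c = 1; c < columns; c++) {
      if (heights[c] < heights[column]) column = c;
    }
    const y = heights[column];
    placed.push({ index, column, x: column * (columnWidth + gap), y, height });
    heights[column] = y + height + gap;
  });
  const tallest = Math.max(0, ...heights);
  const totalHeight = itemHeights.length > 0 ? Math.max(0, tallest - gap) : 0;
  return { placed, totalHeight };
}

// One tall photo (300) followed by three short ones (100), three columns, gap 0.
const ragged = packGreedy([300, 100, 100, 100], 3, 100, 0);
console.log(ragged.placed.map((p) => `item ${p.index}: col ${p.column}, y=${p.y}`));
// item 3 lands in column 1 at y=100, not under the tall photo in column 0
console.log(ragged.totalHeight); // 300
```

With four equal heights the fourth item wraps back to column 0, as in the walkthrough; with a tall first item it skips column 0 and takes the first of the two shorter columns.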
Why this is cheap enough to re-run on every prop change
The packing pass is O(items × columns). For a 1000-photo album with three columns, that is at most 3000 comparisons plus 1000 divides, typically well under a millisecond on current hardware. The pass produces ~1000 plain objects in an array; that is negligible memory.
There are zero DOM reads. getItemHeight is pure arithmetic on aspect-ratio metadata that Immich delivers at fetch time. If the height function had to measure DOM elements (offsetHeight), the pass would force a synchronous layout per item and tank performance. That is the trap to avoid in any real-world masonry.
The greedy "shortest column wins" heuristic is not the optimal packing: finding a packing that minimizes the tallest column is the multiway partition problem and is NP-hard. The greedy version stays within a few percent of optimal in practice and runs in time linear in the item count for a fixed number of columns. You will never see the difference visually.
The visibility filter
The second $derived.by runs whenever containerEl, layout.positions, scrollY, containerOffsetTop, viewportHeight, or overscan changes:
const visibleItems = $derived.by<Position[]>(() => {
  if (!containerEl || layout.positions.length === 0) return [];
  const top = scrollY - containerOffsetTop - overscan;
  const bottom = scrollY - containerOffsetTop + viewportHeight + overscan;
  return layout.positions.filter((p) => p.y + p.height >= top && p.y <= bottom);
});

The top and bottom lines are the coordinate translation. scrollY - containerOffsetTop is the document-to-container shift; subtracting overscan gives the top of the visibility window, and adding viewportHeight plus overscan gives its bottom, both in container space.
The filter is a 1D rectangle-overlap test. A tile at (p.y, p.y + p.height) overlaps a window at (top, bottom) if and only if both:
| Condition | Meaning |
|---|---|
| p.y + p.height >= top | tile's bottom edge is at or below the window's top, so it is not fully above |
| p.y <= bottom | tile's top edge is at or above the window's bottom, so it is not fully below |
If either fails, the tile is fully outside the window. If both pass, it overlaps. That is the entire visibility logic.
Worked example
Plug in concrete numbers:
scrollY = 2000
containerOffsetTop = 300
viewportHeight = 800
overscan = 600
top = 2000 - 300 - 600 = 1100
bottom = 2000 - 300 + 800 + 600 = 3100
For five hypothetical tiles produced by the packer:
| Tile | y | h | y+h | y+h ≥ 1100? | y ≤ 3100? | Result |
|---|---|---|---|---|---|---|
| A | 200 | 300 | 500 | no | yes | skip |
| B | 900 | 400 | 1300 | yes | yes | render |
| C | 1800 | 500 | 2300 | yes | yes | render |
| D | 2700 | 400 | 3100 | yes | yes | render |
| E | 3500 | 400 | 3900 | yes | no | skip |
Three tiles render out of five. Scale to 1000 tiles and roughly 10 to 20 typically render at any moment. That is the entire performance win, expressed in one filter.
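The worked example can be replayed as a short standalone sketch. The Tile type and the helper names are assumptions for this illustration; the two comparisons are the same ones the filter uses.

```typescript
type Tile = { name: string; y: number; height: number };
type Range = { top: number; bottom: number };

// Viewport bounds translated into container space, widened by overscan on both sides.
function visibleWindow(
  scrollY: number,
  containerOffsetTop: number,
  viewportHeight: number,
  overscan: number
): Range {
  const top = scrollY - containerOffsetTop - overscan;
  const bottom = scrollY - containerOffsetTop + viewportHeight + overscan;
  return { top, bottom };
}

// 1D overlap: not fully above the window and not fully below it.
function overlaps(tile: Tile, range: Range): boolean {
  return tile.y + tile.height >= range.top && tile.y <= range.bottom;
}

const exampleTiles: Tile[] = [
  { name: 'A', y: 200, height: 300 },
  { name: 'B', y: 900, height: 400 },
  { name: 'C', y: 1800, height: 500 },
  { name: 'D', y: 2700, height: 400 },
  { name: 'E', y: 3500, height: 400 },
];

const exampleWindow = visibleWindow(2000, 300, 800, 600);
const renderedNames = exampleTiles.filter((t) => overlaps(t, exampleWindow)).map((t) => t.name);
console.log(exampleWindow.top, exampleWindow.bottom); // 1100 3100
console.log(renderedNames.join(',')); // B,C,D
```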
Why overscan exists
With overscan = 0, scrolling becomes flickery. A tile mounts only once it overlaps the viewport, the <img> decodes from scratch, and the user sees a blank box for a frame or two. Setting overscan = 600 adds a 600-pixel buffer on each side, so tiles mount before they are visible and stay rendered after they leave. The cost is rendering the viewport height plus 1200 pixels of tiles instead of the viewport alone, about 2.5 viewports' worth at the 800-pixel viewport used in the worked example, which is still cheap.
The knob to turn for "tiles pop in" is overscan: bump it. The knob for "scrolling lags" is columns (fewer) or overscan (smaller).
The position-absolute payoff
The render side is short:
<div
  bind:this={containerEl}
  class={cn('relative w-full', className)}
  style={`height: ${layout.totalHeight}px;`}
>
  {#each visibleItems as p (p.index)}
    <div
      class="absolute"
      style={`left: ${p.x}px; top: ${p.y}px; width: ${p.width}px; height: ${p.height}px;`}
    >
      {@render children?.(p.item, p.index)}
    </div>
  {/each}
</div>

Two details make the trick work:
- The container has its full height. Even though only ten tiles exist in the DOM, the wrapper is sized to layout.totalHeight, say 80,000 pixels for 1000 photos. The browser scrollbar is the correct length, scroll behaves like a normal long page, and window.scrollY is honest.
- Every tile is absolutely positioned. A tile rendered at top: 1800px does not depend on the existence of any tile above it. Removing tile A from the visible list does not shift B, C, or D.
The mental model that helps people: virtualization does not lie about scroll. The scroll is real. The container's height is real. What is fake is the assumption that every position inside the container has a DOM node; most positions are simply unrendered void.
What the browser sees                  What is actually in the DOM
(the layout box):                      (rendered children):

+------------------+ y=0               +------------------+ y=0
|                  |                   |                  |
|  ..............  | <- "tiles"        |     (empty)      |
|  ..............  |    above that     |                  |
|  ..............  |    the browser    |                  |
|  ..............  |    "expects"      |                  |
|  +----+ +----+   | <- here is        |  +----+ +----+   | <- real DOM
|  | B  | | C  |   |    the            |  | B  | | C  |   |    nodes,
|  +----+ +----+   |    viewport       |  +----+ +----+   |    absolutely
|  +----+          |                   |  +----+          |    positioned
|  | D  |          |                   |  | D  |          |    at their
|  +----+          |                   |  +----+          |    real y
|  ..............  | <- more           |     (empty)      |    coords
|  ..............  |    "expected"     |                  |
|  ..............  |    tiles below    |                  |
|                  |                   |                  |
+------------------+ y=80000           +------------------+ y=80000
(height: 80000px)                      (height: 80000px,
                                        only 3 children)
Bridges between two worlds
The reactive computations described so far depend on four browser-sourced values: containerWidth, scrollY, viewportHeight, and containerOffsetTop. Svelte's reactive graph cannot subscribe to scroll events directly, because those are browser events that fire outside the graph. An $effect is the bridge: subscribe to a browser event, write the result into a $state, and from that point on the value lives inside the graph.
The component uses three effects, grouped by lifecycle and source.
Effect 1: container width
$effect(() => {
  if (!containerEl || !browser) return;
  const ro = new ResizeObserver(([entry]) => {
    containerWidth = entry.contentRect.width;
    recomputeOffset();
  });
  ro.observe(containerEl);
  containerWidth = containerEl.clientWidth;
  recomputeOffset();
  return () => ro.disconnect();
});

ResizeObserver watches the element directly, not the window. It fires when a parent flexbox redistributes space, when the sidebar collapses, when a scrollbar appears: all cases where window.innerWidth does not change but the masonry's box does. A window-resize listener would miss those.
The manual containerWidth = containerEl.clientWidth after ro.observe() is necessary because ResizeObserver fires asynchronously: its first callback is delivered during a later rendering update, not synchronously inside observe(). Without the manual read, the masonry would render once with containerWidth = 0 (no positions, empty DOM), then re-render once the observer fires.
The cleanup returns ro.disconnect(). Without it, the observer keeps watching containerEl after unmount, its callback keeps writing to component state, and the element and the closure may be kept alive longer than necessary.
Effect 2: window scroll and resize
$effect(() => {
  if (!browser) return;
  let rafId: number | null = null;
  function onScroll() {
    if (rafId !== null) return;
    rafId = requestAnimationFrame(() => {
      scrollY = window.scrollY;
      rafId = null;
    });
  }
  function onResize() {
    viewportHeight = window.innerHeight;
    recomputeOffset();
  }
  window.addEventListener('scroll', onScroll, { passive: true });
  window.addEventListener('resize', onResize);
  scrollY = window.scrollY;
  viewportHeight = window.innerHeight;
  return () => {
    window.removeEventListener('scroll', onScroll);
    window.removeEventListener('resize', onResize);
    if (rafId !== null) cancelAnimationFrame(rafId);
  };
});

The interesting part is the rAF throttle on scroll. On a 120 Hz trackpad with momentum, scroll events can fire over 200 times per second. If every event writes to scrollY, the reactive graph re-runs just as often: recomputing visibleItems, diffing the each block, mutating the DOM. The browser only paints 60 to 120 times per second, so most of that work is thrown away.
The fix is leading-edge frame gating:
scroll events:  | | | | | | | | | |   | | | | | | | | | |
                |----- frame 1 -----|----- frame 2 -----|
rAF callback:                       ^ writes scrollY    ^ writes scrollY
                                      (fresh value)       (fresh value)
The first scroll event in a frame schedules a requestAnimationFrame callback. Every subsequent event in the same frame sees rafId !== null and bails. When the frame fires, the callback reads window.scrollY once (always fresh, since the browser updates it synchronously) and writes to the $state. At most one update per frame regardless of input rate.
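The gating logic above can be isolated from the browser for illustration. In this sketch the scheduler is injected, so a fake queue stands in for requestAnimationFrame; createFrameGate and the fake scheduler are hypothetical names for this example, not part of the component.

```typescript
// Anything shaped like requestAnimationFrame: takes a callback, returns a handle.
type Schedule = (callback: () => void) => number;

// Returns an event handler that schedules at most one read-and-write per frame.
function createFrameGate(
  schedule: Schedule,
  read: () => number,
  write: (value: number) => void
): () => void {
  let pending: number | null = null;
  return function onEvent(): void {
    if (pending !== null) return; // a callback is already queued for this frame
    pending = schedule(() => {
      pending = null;
      write(read()); // read once, at frame time, so the value is fresh
    });
  };
}

// Fake scheduler: callbacks wait in a queue until the "frame" is flushed by hand.
const frameQueue: Array<() => void> = [];
const fakeSchedule: Schedule = (cb) => frameQueue.push(cb);
const flushFrame = (): void => frameQueue.splice(0).forEach((cb) => cb());

let currentScroll = 0;
const recordedWrites: number[] = [];
const onScrollGated = createFrameGate(
  fakeSchedule,
  () => currentScroll,
  (v) => { recordedWrites.push(v); }
);

// Five scroll events arrive within one frame; only one callback is queued.
for (let i = 1; i <= 5; i++) {
  currentScroll = i * 10;
  onScrollGated();
}
flushFrame();
console.log(recordedWrites); // [ 50 ]
```

In the browser, schedule would be requestAnimationFrame, read would return window.scrollY, and write would assign the $state variable.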
The { passive: true } flag on the scroll listener tells the browser the handler will not call preventDefault(). Scroll events cannot be cancelled in the first place, so here the flag is mostly a statement of intent; it makes a real difference on wheel and touch listeners, where a non-passive handler can force scrolling to wait for the JS to return.
Resize does not need rAF throttling: it fires a few times per second during a drag, and recomputeOffset() is cheap.
Effect 3: offset on layout change
$effect(() => {
  void layout.totalHeight;
  recomputeOffset();
});

This effect bridges the other direction: from inside the reactive graph back to a DOM measurement. containerOffsetTop is "how far down the document does the masonry start?" Most of the time it is stable. But:
- If the user opens a filter panel above the masonry, everything below shifts down and containerOffsetTop is now wrong.
- If pagination loads more photos, the masonry grows downward. Its own offset does not change, but a sibling sidebar might shift.
The honest fix would be a MutationObserver watching every ancestor, which is heavy machinery for a rare event. The pragmatic fix: whenever the masonry's own layout changes, re-read the offset. The effect depends on layout.totalHeight (the void prefix tells the linter "I am reading this for tracking purposes only, not the value") and runs whenever that height changes, for example when the layout grows or the column count changes.
The comment in the source calls this "cheap belt-and-suspenders." It is not a correctness guarantee, just a heuristic that catches the common cases.
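The post does not show the body of recomputeOffset(), so the following is only a guess at the arithmetic it needs. getBoundingClientRect().top is relative to the viewport, so turning it into a document-space offset means adding the current scroll position. The helper name documentOffsetTop is hypothetical.

```typescript
// rectTop: the element's getBoundingClientRect().top (viewport-relative).
// scrollY: window.scrollY at the moment of measurement.
// Assumption: this is the kind of computation recomputeOffset() performs;
// the original function body is not shown in the post.
function documentOffsetTop(rectTop: number, scrollY: number): number {
  return rectTop + scrollY;
}

// Page not scrolled: the container sits 300px below the top of the viewport.
console.log(documentOffsetTop(300, 0)); // 300
// Scrolled 2000px: the container's top is now 1700px above the viewport top.
console.log(documentOffsetTop(-1700, 2000)); // 300
```

The result is the same at any scroll position, which is what makes it usable as containerOffsetTop in the visibility filter.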
What $effect actually tracks
A subtle Svelte 5 rule that matters here: an $effect re-runs based on what it reads during the synchronous execution of the effect body. Reads inside callbacks that fire later (ResizeObserver callbacks, addEventListener handlers, requestAnimationFrame callbacks) are not tracked.
Effect 1's body reads containerEl and not much else. The ResizeObserver callback reads and writes containerWidth, but those operations happen on a future tick, after the tracking window has closed. Effect 1 therefore mounts the observer once on first run and tears it down only on unmount. The observer fires forever, writing to containerWidth, which propagates through layout and visibleItems and the DOM, but it does not re-trigger the effect that owns it.
The same applies to Effect 2: its synchronous body reads no $state, so it has zero tracked dependencies. Mount-once, unmount-once.
Only Effect 3 actually re-runs reactively, because it has a synchronous read of layout.totalHeight.
| Effect | Synchronous reads | Re-runs when |
|---|---|---|
| 1 (ResizeObserver) | containerEl | mount / unmount only |
| 2 (window listeners) | nothing reactive | mount / unmount only |
| 3 (offset recompute) | layout.totalHeight | layout changes |
If you internalize one rule about $effect, this is the one: dependencies are reads, not writes, and only synchronous reads count.
The reactive graph end to end
Putting all the pieces together:
Browser events
      |
      v        Effect 1          Effect 2          Effect 3
      |        ResizeObserver    scroll, resize    (layout changed)
      v
$state: containerWidth, scrollY, viewportHeight, containerOffsetTop
      |
      v   ($derived.by: packing pass)
layout: { positions, totalHeight }
      |
      v   ($derived.by: visibility filter)
visibleItems
      |
      v   ({#each} on absolute-positioned tiles)
DOM: ~10 nodes regardless of items.length
Browser → effects → state → layout → visible → DOM. Every arrow is "Svelte recomputes downstream when upstream changes," automatically, with no manual update() anywhere. The whole component is 130 lines and handles tens of thousands of items at native scroll speed.
When this approach breaks down
Three caveats worth naming, since this technique is not universal:
- Items must have known heights. If the height function has to measure the DOM, the packing pass forces synchronous layout per item and the win evaporates. For Immich photos this is a non-issue because the API returns aspect ratios. For arbitrary user content (chat messages with rich embeds, for example) you need a different strategy: measure on first render, cache the result, and reflow when the cache changes, typically using something like TanStack Virtual with measureElement correction.
- Scroll must happen on the window or a known element. The component reads window.scrollY directly. Scrolling inside a modal or a sticky pane needs a different scroll source: pass it in as a prop or restructure to use IntersectionObserver instead.
- Layout pass scales with item count. Linear is great, but with 100,000 items the pass starts to noticeably block the main thread. At that point you want to chunk the input, render in batches, or move the packing into a worker. The 1000-photo gallery is well under this ceiling.
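One possible shape for the chunking idea in the last caveat, offered as a sketch rather than something the component does: because the packing pass carries only the per-column heights between items, it can stop after any item and resume later from the saved heights. The names ColumnState, Slot, and packChunk are invented for this example.

```typescript
type ColumnState = { heights: number[] };
type Slot = { column: number; y: number };

// Packs one chunk of tile heights, mutating the shared column heights so that
// a later call continues exactly where this one stopped.
function packChunk(state: ColumnState, chunk: number[], gap: number): Slot[] {
  const slots: Slot[] = [];
  for (const height of chunk) {
    let column = 0;
    for (let c = 1; c < state.heights.length; c++) {
      if (state.heights[c] < state.heights[column]) column = c;
    }
    const y = state.heights[column];
    slots.push({ column, y });
    state.heights[column] = y + height + gap;
  }
  return slots;
}

const allHeights = [300, 100, 100, 100, 250, 50];

// One pass over everything.
const oneShot = packChunk({ heights: [0, 0, 0] }, allHeights, 8);

// The same input in two chunks, sharing one state object.
const resumable: ColumnState = { heights: [0, 0, 0] };
const chunked = [
  ...packChunk(resumable, allHeights.slice(0, 3), 8),
  ...packChunk(resumable, allHeights.slice(3), 8),
];

console.log(JSON.stringify(oneShot) === JSON.stringify(chunked)); // true
```

In a real page the chunks would be separated by a yield to the event loop so the main thread can paint between them; that scheduling is left out here.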
The two-derived structure (pack everything, filter to a slice) is the soul of the technique. Once you see it, every "render thousands of things" problem starts to look the same: it is a question of where you put the layout pass, the visibility filter, and the bridge effects.
This is the ninth post in the KnowMore series.