The day the reviews piled up

In the previous piece we built Shlokas’ scheduler up from the forgetting curve: a trimmed SM-2 that, for every verse you know, computes the last cheap day to review it before it slips. The algorithm is honest and the math is clean. And if you follow it to the letter, it will eventually ruin someone’s Tuesday.

This is the story of that Tuesday, and the twenty lines that fixed it.

How the pile-up happens

SM-2 is a per-card algorithm. It looks at one verse, its ease, its last interval, and names one ideal day. It has no idea the verse next to it exists. That is fine for one card and quietly disastrous for a deck, because real practice arrives in clumps.

Picture a motivated morning. You graduate five new ślokas in one session. Each one seeds a couple of recall cards (the verse-from-the-number, the translation, the several directions you’ll be tested in), and SM-2, doing its job perfectly, schedules every one of them for the obvious first interval: tomorrow. Ten cards land on the same day. Do that for a week and the clumps stack: a few dozen cards that happened to graduate together now travel through time in a convoy, coming due on the same days for as long as you grade them alike. The learner opens the app on Tuesday to sixty reviews and on Wednesday to four. The algorithm is right on every individual card, yet the lived experience is lumpy and discouraging.

The naïve fixes are both wrong. Cap the day at twenty and you are now reviewing verses late — pushing them down the forgetting curve, the one thing the whole system exists to prevent. Spread cards arbitrarily and you have thrown away the spacing you carefully computed. We wanted a third thing: smooth the bumps without meaningfully moving any card off its mark.

The opening: a day is wider than a point

The unlock is realising that SM-2’s “perfect day” is not actually a point. The forgetting curve is shallow near the top — around a 24-day interval, reviewing on day 23 or day 25 changes the recall probability by a rounding error. The “right day” is really a small neighbourhood of almost-equally-good days. And inside a neighbourhood, you are free to choose — so choose the least crowded one.
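To put a number on “rounding error”, here is a minimal sketch. It assumes a plain exponential forgetting curve calibrated so that recall is 90% on the scheduled day; both the curve shape and the 90% figure are illustrative assumptions, not values taken from Shlokas, and recallProbability is a hypothetical helper.

```javascript
// Assumption: exponential forgetting curve R(t) = targetRecall^(t / intervalDays),
// so recall equals targetRecall exactly on the scheduled day.
// The curve and the 0.9 default are illustrative, not Shlokas' actual model.
function recallProbability(daysSinceReview, intervalDays, targetRecall = 0.9) {
  return Math.pow(targetRecall, daysSinceReview / intervalDays)
}

const exampleInterval = 24
for (const day of [23, 24, 25]) {
  const pct = (recallProbability(day, exampleInterval) * 100).toFixed(1)
  console.log(`day ${day}: ${pct}% recall`)
}
// day 23: 90.4% recall
// day 24: 90.0% recall
// day 25: 89.6% recall
```

Under these assumptions, moving a 24-day review one day in either direction shifts expected recall by roughly 0.4 percentage points.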

That turns scheduling into a placement problem: for each card, look at a window of days around the SM-2 target, ask how many cards are already due on each, and drop this card on the emptiest. The crowd levels itself, and no card moves further from where the math wanted it than the window allows: a few days for short and medium intervals, a small fraction of the interval for long ones.

The window is lopsided on purpose

Here is the first decision that isn’t obvious. The window is not symmetric. Its late edge is fixed at three days past the target, while its early edge scales with the interval:

// reviewDueAt(): the search window around the SM-2 target day
const kBack = Math.max(2, Math.round(intervalDays * 0.075)) // days earlier: ~7.5% of the interval, minimum 2
const kForward = 3                                          // days later: fixed at 3, whatever the interval

The asymmetry encodes a fact about the curve that the symmetric version would ignore: early is cheap, late is dangerous. Seeing a verse a day before its ideal moment costs you almost nothing; you recall it slightly more easily and the interval barely shortens. Seeing it a day after means it may already have crossed into forgotten, and a lapse is the most expensive thing that can happen to a card: it resets the interval and cuts the ease hardest. So the early side is allowed to grow with the interval and the late side never is. For a 24-day card the review can move up to round(24 × 0.075) = round(1.8) = 2 days earlier if that day is emptier, and at most three days later. Once the interval reaches about 47 days the early reach (four days or more) exceeds the late cap, which stays at three however long the interval gets. We spend cheap days to save expensive ones.
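To see how the two sides of the window behave as intervals grow, the sketch below simply tabulates the constants from the snippet above. searchWindow is a hypothetical helper name used only for this illustration.

```javascript
// Window reach around the SM-2 target day, using the two constants shown above.
// searchWindow is a hypothetical name; only the formulas come from the article.
function searchWindow(intervalDays) {
  const kBack = Math.max(2, Math.round(intervalDays * 0.075)) // days earlier
  const kForward = 3                                          // days later
  return { kBack, kForward }
}

for (const intervalDays of [7, 24, 47, 120, 365]) {
  const { kBack, kForward } = searchWindow(intervalDays)
  console.log(`${intervalDays}-day interval: up to ${kBack} early, up to ${kForward} late`)
}
// 7-day interval: up to 2 early, up to 3 late
// 24-day interval: up to 2 early, up to 3 late
// 47-day interval: up to 4 early, up to 3 late
// 120-day interval: up to 9 early, up to 3 late
// 365-day interval: up to 27 early, up to 3 late
```

The late side never moves, while the early side only starts to outgrow it once 7.5% of the interval rounds to more than three days.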

graph LR
    B2["−2"] --- B1["−1"] --- T["target day"] --- F1["+1"] --- F2["+2"] --- F3["+3"]
    T -.->|"kBack: max(2, ~7.5% of the interval)"| B2
    T -.->|"kForward: capped at 3"| F3

Ties, and the dice we keep deterministic

Within that window the rule is simple — pick the day with the smallest load — until several days are tied at the minimum, which at the start of a fresh deck is most of them. If we always broke ties the same way (say, “earliest wins”), then two sibling cards graduating back-to-back would make the identical choice and re-cluster on the very day we were trying to spread them off. So ties are broken at random — but with two twists.

// chooseDueDay(): least-loaded day in the window, random among ties
let chosen = windowStart, seen = 0
const considerDay = (day) => {
  if ((dayLoadCounts.get(day) ?? 0) !== minLoad) return
  seen += 1
  if (seen === 1 || rng() < 1 / seen) chosen = day   // reservoir sampling
}
if (targetInWindow) considerDay(target)               // target gets seat #1
for (const day of window) if (day !== target) considerDay(day)

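First, it’s reservoir sampling: one pass over the tied days, each replacing the current pick with probability 1/seen, which leaves every tied day equally likely without knowing up front how many there are. Second, the SM-2 target day is offered first, taking “seat #1”, so whenever it is among the least-loaded days it is the starting pick and another day has to win a draw to displace it. It is worth being precise about what that guarantees. A day enters the draw only if its load equals the window minimum, so a busier day can never take the card. But a uniform draw is uniform in any order: when k days tie, the target keeps the card with probability 1/k and a tied neighbour takes it the rest of the time, which is exactly what stops sibling cards from re-clustering. The drift from textbook SM-2 is therefore bounded by the window, and it only happens when it costs nothing in load.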
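The fragment above leans on variables defined elsewhere, so here is a self-contained sketch of the same idea. The function signature, the use of plain integers for days, and the two-pass structure are assumptions made for the example, not the app's actual code.

```javascript
// Least-loaded placement with a reservoir tie-break.
// Assumptions: days are plain integers (e.g. days since some epoch) and
// dayLoadCounts is a Map from day to number of cards already due.
function chooseDueDay(target, kBack, kForward, dayLoadCounts, rng = Math.random) {
  const loadOf = (day) => dayLoadCounts.get(day) ?? 0

  // Pass 1: find the smallest load anywhere in the window.
  let minLoad = Infinity
  for (let day = target - kBack; day <= target + kForward; day++) {
    minLoad = Math.min(minLoad, loadOf(day))
  }

  // Pass 2: reservoir-sample one day among those tied at minLoad.
  let chosen = target
  let seen = 0
  const considerDay = (day) => {
    if (loadOf(day) !== minLoad) return
    seen += 1
    if (seen === 1 || rng() < 1 / seen) chosen = day
  }
  considerDay(target) // the target is offered first
  for (let day = target - kBack; day <= target + kForward; day++) {
    if (day !== target) considerDay(day)
  }
  return chosen
}

// Day 98 is the only empty day in the window 98..103, so it wins outright.
const loads = new Map([[99, 2], [100, 1], [101, 3], [102, 1], [103, 2]])
console.log(chooseDueDay(100, 2, 3, loads)) // 98
```

When several days tie, the outcome depends on the injected rng: a generator that always returns values close to 1 leaves the card on its target, while one that always returns 0 walks it to the last tied day in the window.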

The rng is injected, not global. In production it’s Math.random; in tests it’s a seeded generator, so a behaviour that is random by design is still exactly reproducible when we need to assert on it. Randomness where the user benefits, determinism where the test demands it.
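As an illustration of the injected-rng idea, the sketch below uses mulberry32, a small and widely circulated seeded generator, as a stand-in; the article does not say which seeded generator Shlokas' tests actually use, and pickAmongTies is a hypothetical helper.

```javascript
// mulberry32: a compact seeded PRNG returning floats in [0, 1).
// Used here only as a stand-in for "a seeded generator".
function mulberry32(seed) {
  let state = seed >>> 0
  return () => {
    state = (state + 0x6d2b79f5) >>> 0
    let t = state
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Reservoir-sample one element from a list of tied days using the injected rng.
function pickAmongTies(tiedDays, rng) {
  let chosen = tiedDays[0]
  let seen = 0
  for (const day of tiedDays) {
    seen += 1
    if (seen === 1 || rng() < 1 / seen) chosen = day
  }
  return chosen
}

const tied = [98, 99, 100, 101, 102, 103]
const picksFor = (seed) => {
  const rng = mulberry32(seed)
  return Array.from({ length: 5 }, () => pickAmongTies(tied, rng))
}

// Same seed, same sequence of "random" choices.
console.log(picksFor(42).join() === picksFor(42).join()) // true
```

Because the generator is passed in rather than reached for globally, a test can pin the seed and assert on exact placements, while production code passes Math.random and gets real spread.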

The batch bug hiding in “already due”

The whole thing runs on a day-load map — “how many cards are due on each day” — which the application layer fetches once, as a single indexed GROUP BY over the recall table. Cheap, and correct, with one trap.

When you graduate five verses at once, their cards are placed in a loop inside a single transaction that hasn’t committed yet. Query the database for the load and every card in the batch sees the same pre-batch snapshot — zero on every day — and they all “balance” onto the same empty Tuesday. We’d have rebuilt the pile-up inside the very routine meant to prevent it.

The fix is to treat the map as a running tally, not a fact:

const load = await fetchRecallDayLoadCounts(...)   // one GROUP BY, the snapshot
for (const card of graduating) {
  const due = placeOnLeastLoadedDay(card, load)
  load.set(due, (load.get(due) ?? 0) + 1)          // book the seat in memory
}

Each placement increments the in-memory map before the next card looks, so within the batch they see each other’s bookings even though nothing has touched the database yet. Obvious in hindsight; the kind of bug that only appears under the exact load it was built to handle.
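The difference between a frozen snapshot and a running tally is easy to reproduce in isolation. In this sketch the placement rule is deliberately simplified to “earliest least-loaded day” so the output is deterministic; placeEarliestLeastLoaded and placeBatch are hypothetical names, and the window sizes are hard-coded to the 24-day case.

```javascript
// Simplified stand-in for the real placement: earliest day with the lowest load.
// (The real routine breaks ties at random; this keeps the demo reproducible.)
function placeEarliestLeastLoaded(target, kBack, kForward, load) {
  let best = target - kBack
  for (let day = target - kBack; day <= target + kForward; day++) {
    if ((load.get(day) ?? 0) < (load.get(best) ?? 0)) best = day
  }
  return best
}

// Place a batch of cards against one snapshot, with or without booking seats.
function placeBatch(targets, snapshot, keepTally) {
  const load = new Map(snapshot) // copy so the caller's snapshot is untouched
  const placed = []
  for (const target of targets) {
    const due = placeEarliestLeastLoaded(target, 2, 3, load)
    if (keepTally) load.set(due, (load.get(due) ?? 0) + 1) // book the seat in memory
    placed.push(due)
  }
  return placed
}

const targets = Array(6).fill(100) // six cards, all with the same SM-2 target day
console.log(placeBatch(targets, new Map(), false).join(", ")) // 98, 98, 98, 98, 98, 98
console.log(placeBatch(targets, new Map(), true).join(", "))  // 98, 99, 100, 101, 102, 103
```

Without the tally every card reads the same empty snapshot and lands on the same day; with it, each booking is visible to the next card and the batch spreads across the window.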

A knob, because some people want the math raw

All of this sits behind a single switch — evenOutDailyLoad, on by default, under Settings → Progress. Turn it off and every card lands precisely on its SM-2 day: predictable, and prone to exactly the clumping we started with. Some users — the ones who like a sparse, honest schedule and don’t mind a heavy day — prefer it. Most never touch it, and never see a sixty-card Tuesday either.

The shape of the lesson

The interesting part was never the SM-2 math; that’s settled, forty years old, and well documented. The interesting part is the seam between a correct per-item algorithm and a humane whole-deck experience — and that seam turned out to be a handful of pure functions: a lopsided window that knows early is cheaper than late, a tie-break that adds entropy without losing reproducibility, and a running tally that survives a batch. Twenty-odd lines, sitting between the textbook and the person, making sure the right answer also feels like one.