Building for the kill switch

Russia is a special case: slow internet in the regions, unstable links abroad, and the looming prospect of a “Cheburnet”, a national network that could one day be cut off from the outside world at the flick of a switch. For an app that hands out hundreds of gigabytes of lectures and media, that is not a theoretical worry: the user taps download, the request leaves the country, hits a throttled or degraded cross-border link, and times out before the other end ever answers. A grandmother in a village on 3G should not have to understand BGP routing to hear a lecture.

So we did not build a fallback. We built a closed contour — a complete, self-sufficient copy of the infrastructure that lives entirely inside Russia and needs nothing from the outside to work. The catalogue, the audio, the transcripts, the cover art, the metadata, the chat backend, auth — all of it, mirrored. The idea is simple and a little paranoid: prepare for the worst case before it happens, so that if the borders close, if the cross-border links go dark, if anything at all happens — nothing inside the app even notices. It is already running on infrastructure that never had to cross the border in the first place. We are not reacting to a blackout; we are pre-positioned for one.

A copy on each contour

All of the app’s data lives simultaneously on two contours — a Russian one and an international one. If the external links degrade, or are closed entirely, the Russian contour keeps working as if nothing happened. If you are outside Russia, you go through the international one, with no loss of speed.

In the code it is just two regions, each with its own hosts — global served from AWS, russia from Yandex Object Storage — and every HTTP client reads the active region’s base URL at call time:

const REGIONS = {
  global: {
    name: "Global",
    urlTemplate: "https://…s3.amazonaws.com/{path}",      // CDN-fronted, US + EU
    chatBaseUrl: HOST,
  },
  russia: {
    name: "Russia",
    urlTemplate: "https://…storage.yandexcloud.net/{path}", // in-country, Yandex
    chatBaseUrl: HOST_RU,
  },
} as const;

let activeRegion: keyof typeof REGIONS = "global";

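// Resolved at call time, so a region switch takes effect on the very next request.
const urlFor = (path: string) =>
  // A replacer function keeps any "$" sequences in `path` literal.
  REGIONS[activeRegion].urlTemplate.replace("{path}", () => path);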

Switching is a single act. In settings you choose your region; tick the one you are in, and the app re-routes everything — catalogue, audio, transcripts, auth, chat — from there, no restart needed. No VPN, no proxy, no hand-edited DNS. Just a checkbox that swaps one activeRegion value.
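A minimal sketch of why the switch needs no restart, assuming a small in-memory store behind the checkbox. The names `setActiveRegion`, `onRegionChange` and `resolveUrl`, and the `example.invalid` hosts, are hypothetical and used only for illustration; the real hosts are elided in the snippet above.

```typescript
// Hypothetical hosts for illustration only; the real ones are not shown in the text.
const REGION_URLS = {
  global: "https://global.example.invalid/{path}",
  russia: "https://ru.example.invalid/{path}",
} as const;

type RegionId = keyof typeof REGION_URLS;

let currentRegion: RegionId = "global";
const listeners = new Set<(region: RegionId) => void>();

// Reads the active region on every call, so a switch affects the next request.
function resolveUrl(path: string): string {
  return REGION_URLS[currentRegion].replace("{path}", () => path);
}

// Called by the settings checkbox. Returns false when nothing changed.
function setActiveRegion(next: RegionId): boolean {
  if (next === currentRegion) return false;
  currentRegion = next;
  // Let interested parts of the app react, e.g. refresh the catalogue or reconnect chat.
  listeners.forEach((notify) => notify(next));
  return true;
}

// Subscribe to region changes; the returned function unsubscribes.
function onRegionChange(listener: (region: RegionId) => void): () => void {
  listeners.add(listener);
  return () => {
    listeners.delete(listener);
  };
}
```

Because nothing caches a base URL, requests already in flight finish against the old region and everything issued afterwards goes to the new one.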

The global side has two of its own

“Global” is not one bucket in one city. A single origin in North America serves a listener in Toronto well and a listener in Lisbon badly — the packets have to cross the Atlantic and back for every range request. So the international contour is itself doubled: an origin in the United States (us-east-1, N. Virginia) and an edge in Europe (eu-central-1, Frankfurt), with a CDN steering each request to whichever is closer. A European listener terminates TLS in Frankfurt; a North-American one in Virginia. Same checkbox, same global region — the nearest edge is chosen for you.

The gap is not subtle. Measured from our build server in Europe, against the bare storage endpoints (three samples each, time to first byte):

Endpoint	Region	Connect	TLS handshake	TTFB
`s3.eu-central-1.amazonaws.com`	Europe (FRA)	~33 ms	~75 ms	~110 ms
`storage.yandexcloud.net`	Russia	~75 ms	~155 ms	~250 ms
`s3.us-east-1.amazonaws.com`	USA (IAD)	~125 ms	~255 ms	~385 ms

From a European vantage point the US origin’s first byte arrives roughly 3.5× later than the European edge’s (~385 ms against ~110 ms). That is one vantage point and one moment, and the absolute numbers will differ from yours, but the shape is the lesson: distance to origin is paid on every request, and for a 300 MB lecture pack split into hundreds of range requests, those milliseconds compound into real minutes. A CDN that picks Frankfurt over Virginia for European users is not a nicety; it is the difference between a download that finishes and one the user abandons.
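As a rough worked example of how those milliseconds compound, the sketch below multiplies the measured time to first byte by a request count. It assumes the range requests run one after another and that each pays the full first-byte wait; both are simplifications, and the count of 300 is illustrative, since the text only says “hundreds”.

```typescript
// Seconds spent waiting for first bytes when `requests` sequential range
// requests each wait `ttfbMs` before any data arrives.
// Ignores transfer time, connection reuse and parallelism.
function waitSeconds(requests: number, ttfbMs: number): number {
  return (requests * ttfbMs) / 1000;
}

const requests = 300; // illustrative assumption, not a measured value

const viaFrankfurt = waitSeconds(requests, 110); // 33 s
const viaVirginia = waitSeconds(requests, 385);  // 115.5 s
const extraWait = viaVirginia - viaFrankfurt;    // 82.5 s of pure waiting
```

Under these assumptions the farther origin adds well over a minute of idle time to a single pack before a single extra byte of payload has moved.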

What a latency table from Europe cannot show is the actual pain inside Russia. There the bottleneck is rarely the origin’s first byte — it is packet loss, throttling and resets on the cross-border leg, which turn a clean 250 ms request into a stalled connection that never completes. You cannot CDN your way around a link that is actively being squeezed. The only honest fix is to not cross the border at all — which is exactly what the Russian contour does.

Downloads, retries, and the switch

A region map is only half the story. The thing the user actually feels is the download manager, and its whole job is to survive a flaky link. Three rules: resume, never restart (a dropped connection at 280 MB picks up at 280 MB, not zero); retry with bounded backoff (so a momentary blip self-heals without hammering the server); and fail loudly, with a way out (when retries are exhausted, surface the one action that actually helps — switch region).

async function download(path: string, into: PartialFile, retries = 4) {
  for (let attempt = 0; ; attempt++) {
    try {
      const from = into.size; // bytes already on disk → resume from here
      const res = await fetch(urlFor(path), {
        headers: from ? { Range: `bytes=${from}-` } : {},
      });
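      if (res.status !== 200 && res.status !== 206) throw new HttpError(res.status);
      // A 200 on a resumed request means the server ignored Range and is sending the
      // whole file again; appending that after `from` would corrupt the download.
      // Assumption: PartialFile offers a reset() that truncates it to zero bytes.
      if (from > 0 && res.status === 200) await into.reset();
      await into.append(res.body);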
      return;
    } catch (err) {
      if (attempt >= retries) {
        // out of retries: this is the moment to suggest the other contour
        throw new DownloadFailed(path, { hint: "switch-region", cause: err });
      }
      // 0.5s, 1s, 2s, 4s … capped at 8s, plus jitter to avoid a thundering herd
      await sleep(Math.min(2 ** attempt * 500, 8000) + Math.random() * 250);
    }
  }
}

Range requests are what make resume possible: Yandex Object Storage and S3 both honour them, returning 206 Partial Content, and the same loop runs unchanged on both contours, because urlFor already points at whichever region is active. The bounded exponential backoff (0.5 → 1 → 2 → 4 s with the default four retries, capped at 8 s if you allow more, plus jitter) means a one-second hiccup costs the user almost nothing, while a link that keeps refusing or resetting connections gives up after about 7.5 s of waiting rather than spinning forever. And when it does give up, the error carries a switch-region hint instead of a bare “download failed,” so the UI can offer the one button that fixes it.
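To put a number on “gives up in seconds”, here is a small sketch of the backoff schedule as pure functions. The formula is the one used in the loop above; the function names `baseDelayMs` and `worstCaseSleepMs` are made up for this example.

```typescript
// Delay before the retry that follows failed attempt number `attempt` (0-based),
// without jitter: 500 ms, doubling each time, capped at 8000 ms.
function baseDelayMs(attempt: number): number {
  return Math.min(2 ** attempt * 500, 8000);
}

// Upper bound on the total time spent sleeping across `retries` retries,
// counting the maximum possible jitter on each one.
function worstCaseSleepMs(retries: number, maxJitterMs = 250): number {
  let total = 0;
  for (let attempt = 0; attempt < retries; attempt++) {
    total += baseDelayMs(attempt) + maxJitterMs;
  }
  return total;
}

const schedule = [0, 1, 2, 3, 4, 5].map((attempt) => baseDelayMs(attempt));
// [500, 1000, 2000, 4000, 8000, 8000]

const sleepBudget = worstCaseSleepMs(4); // 8500 ms with the default of 4 retries
```

Two things follow from this. With the default of four retries the 8 s cap is never reached; it only matters if the retry count is raised. And the budget covers sleeping only: each attempt's own duration comes on top, so a connection that stalls without erroring would still need a per-attempt timeout, which the loop above does not show.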

Not 100 ms → 30 ms. Ten minutes → thirty seconds.

That Range header looks like a footnote. It was the whole game.

When we first dug into “downloads are too slow,” the obvious suspect was latency — the metadata and API calls round-tripping to a faraway origin. But the traces said something stranger: the responses came back fast. A list, a track’s metadata, the first bytes — all quick. It was the file itself that crawled. The bug was not in how fast the first byte arrived; it was in what happened after byte one.

The downloader was issuing a plain GET with no Range header and no resume, on Android, iOS and web alike. So the moment a mobile connection hiccuped at 50 MB of an 80 MB track (and mobile connections hiccup constantly), the next attempt did not pick up at 50 MB. It started over at byte zero. On a flaky link that meant downloading the same track two or three times before it ever completed, and a single lecture could sit in that restart-from-zero death spiral for the better part of ten minutes. The response was fast the whole time. The file just never arrived.

Two changes killed it. First, Range/resume (the loop above), so a drop at 50 MB resumes at 50 MB instead of restarting, turning “re-download the whole thing 2–3×” into “fetch the missing tail once.” Second, an edge near the user instead of one origin in Virginia, so each of those range requests is answered by a PoP a few milliseconds away rather than across an ocean. On a real 7.3 MB track, the edge alone took the download from ~1.8 s at ~4 MB/s (S3 direct from Europe) to ~0.38 s at ~19 MB/s on a warm edge: nearly 5× the throughput, with time to first byte cut roughly 10×. That gain repeats on every chunk of a hundred-megabyte pack.
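To see why restart-from-zero hurts so much, here is a rough model of how many megabytes cross the network with and without resume. The function name and the drop points (50 MB and 65 MB into an 80 MB track) are illustrative assumptions, not measurements.

```typescript
// Total megabytes pulled over the network for a file of `sizeMb` when the
// connection drops at the given file offsets (in MB, in increasing order).
// With resume, progress survives each drop; without it, every drop resets to zero.
function transferredMb(sizeMb: number, dropsAtMb: number[], resume: boolean): number {
  let onDisk = 0; // megabytes that survive between attempts
  let total = 0;  // megabytes fetched so far, including wasted ones
  for (const dropAt of dropsAtMb) {
    total += dropAt - onDisk;     // fetched during the attempt that dropped
    onDisk = resume ? dropAt : 0; // a restart throws the progress away
  }
  return total + (sizeMb - onDisk); // the final, successful attempt
}

const withoutResume = transferredMb(80, [50, 65], false); // 195
const withResume = transferredMb(80, [50, 65], true);     // 80
```

Two drops are enough to make the restart strategy move about 2.4 times the file size, which lines up with the “two or three times” seen in practice; with resume the total never exceeds the file size, however many drops occur.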

So the headline number here is not the kind you shave a few milliseconds off. We were not turning 100 ms into 30 ms. We were turning a download that took ten minutes — or never finished at all — into one that finishes in about thirty seconds. Different order of magnitude, different bug, different fix: the slow part was never the answer coming back, it was the file going out.

What it buys, concretely

For a listener inside Russia, the win is not “a bit faster” — it is the difference between works and does not work. Before the contour, a large pack downloaded across the border could stall on the cross-border leg and time out; with the data sitting in Yandex Object Storage inside the country, the request never leaves Russia, so first-byte latency drops from “variable and lossy” to a local fetch in the tens of milliseconds, and the timeout-and-retry churn that used to eat downloads collapses toward zero. The closed contour also means there is no single switch — flicked in Moscow or anywhere else — that can take the app offline for its Russian listeners. That, more than any millisecond, is the point.

The honest cost

We will not pretend otherwise: this doubles the support burden. Every catalogue update, every new lecture, every corrected transcript has to roll out across both contours. Storage is doubled, and so is the upload traffic needed to keep both copies in sync. Architecturally it is more expensive and more fragile to operate than one CDN for the whole world.

But availability matters more. The lectures should open for a grandmother in a village on 3G and for a student in Moscow on gigabit, equally. For a listener in Toronto and a listener in Novosibirsk, equally. If that takes holding two copies of everything — two contours, and within the global one, two regions — then we hold two copies of everything.