Designing feature flags that survive an incident
What our flag system has to do on its worst day — graceful degradation, clear owners, and the three rules we use to keep a runtime switch from becoming the next outage.
Short essays from Nexus Dev engineers and designers — on product engineering, cloud platforms, and shipping AI features that hold up in production.
Short reads from the team — the kind of notes we'd normally keep in a shared doc, published in case they're useful to someone else.
A field guide to retrieval eval loops — what to measure, what to ignore, and when to throw the whole setup out and start over.
Runbooks, IAM sprawl, and the unexciting work that determines whether a migration lands on time or slips two quarters.
Why we ask for a messy export before we open Figma — and what it changes about the first week of a project.
How a six-person rotation stays humane — paging thresholds, runbook hygiene, and post-incident reviews that actually land.
Five component patterns that quietly make our React apps more accessible, with the tradeoffs we've hit along the way.
A pragmatic look at when to host, when to call an API, and the cache strategy that made our production eval bill survivable.
One short digest at the start of each month — new articles, significant corrections, and what we're working on next. No tracking pixels, no third-party trackers.