My Safety Check Killed 100% of Video Generations — Right When Traffic Spiked 3x

Source: DEV Community
You know what's worse than a bug in production? A bug you introduce while fixing another bug, deployed at midnight, right as your biggest traffic spike ever hits.

## The Setup: Our Best Day Ever

I run RepoClip, an AI-powered SaaS that generates promotional videos from GitHub repositories. On March 19, 2026, console.dev featured us in their newsletter, and traffic exploded:

| Day | Active Users | New Users |
|---|---|---|
| March 19 (feature day) | 448 | 445 |
| March 20 (day after) | 154 | 139 |

448 users. For a solo indie SaaS, that felt massive. But there was a problem I wouldn't discover for another 24 hours.

## The "Fix" That Broke Everything

Earlier on March 19, I noticed that some video generations were failing because Remotion Lambda (our video renderer on AWS) couldn't download images from fal.ai's temporary CDN URLs fast enough: the URLs were expiring or timing out. So at midnight (00:20 JST, March 20), I shipped what I thought was a solid improvement: pre-fetch all media from the fal.ai CDN into Supabase Storage before rendering.
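The pre-fetch step can be sketched roughly like this. This is a minimal sketch assuming Node 18+ (built-in `fetch`); the `prefetched/` path scheme is made up for illustration, and the injected `upload` callback stands in for whatever storage client you use (e.g. a Supabase `storage.from(bucket).upload(...)` call), so none of these names are the actual RepoClip code:

```typescript
import { createHash } from "node:crypto";

// Derive a stable storage key from a temporary CDN URL, so retries
// and re-renders reuse the already-copied asset instead of fetching again.
function storagePathFor(cdnUrl: string): string {
  const hash = createHash("sha256").update(cdnUrl).digest("hex").slice(0, 16);
  const ext = new URL(cdnUrl).pathname.split(".").pop() ?? "bin";
  return `prefetched/${hash}.${ext}`; // path scheme is hypothetical
}

// Download each asset while its temporary URL is still valid, then hand the
// bytes to an injected uploader that returns a durable URL for the renderer.
async function prefetchMedia(
  urls: string[],
  upload: (path: string, data: ArrayBuffer) => Promise<string>,
): Promise<Map<string, string>> {
  const durable = new Map<string, string>();
  for (const url of urls) {
    const res = await fetch(url);
    if (!res.ok) throw new Error(`prefetch failed: ${url} (${res.status})`);
    durable.set(url, await upload(storagePathFor(url), await res.arrayBuffer()));
  }
  return durable; // temporary URL -> durable URL, for rewriting render props
}
```

The returned map lets you rewrite the render payload to point at the durable copies before invoking Remotion Lambda, so the render never races the CDN's expiry window.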