Let’s talk about the Ticketmaster (Taylor Swift) debacle and what we can learn from it.
You may remember this incident where Ticketmaster tried to sell tickets for a Taylor Swift concert, and their site went down for hours.
They said that it happened due to the unprecedented demand.
To me, that’s nonsense because this situation could have been easily avoided if they had load tested their systems properly.
But I want to talk about a deeper underlying issue: The first job of a service is to protect itself.
You do that by putting a queue in front of your service, which acts as a buffer between the service and the incoming requests. Suppose your service can handle 500 requests per second. If 50,000 requests arrive at once, then instead of crashing, it serves 500 of them and either queues the other 49,500 or turns them away with an error message.
Had Ticketmaster used this mechanism, it would have kept their service from crashing while ensuring that a portion of the demand was still being served.
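
To make the idea concrete, here is a minimal sketch of such a buffer in Python. It is an illustration, not how Ticketmaster or any real system is built: the class name BoundedQueue, the methods offer and tick, and the queue limit of 10,000 are all made up for this example. Only the 500-per-second capacity and the 50,000 incoming requests come from the scenario above.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical buffer in front of a service: serve, queue, or reject."""

    def __init__(self, rate_per_tick, max_waiting):
        self.rate_per_tick = rate_per_tick  # requests the service handles per tick (one second here)
        self.max_waiting = max_waiting      # assumed queue limit; beyond it, requests are rejected
        self.waiting = deque()

    def offer(self, request):
        """Queue the request, or return False if the queue is already full."""
        if len(self.waiting) >= self.max_waiting:
            return False  # the caller would show a "try again later" message
        self.waiting.append(request)
        return True

    def tick(self):
        """Hand the service at most rate_per_tick requests, oldest first."""
        count = min(self.rate_per_tick, len(self.waiting))
        return [self.waiting.popleft() for _ in range(count)]


# 50,000 requests arrive at a service that can handle 500 per second.
queue = BoundedQueue(rate_per_tick=500, max_waiting=10_000)
rejected = sum(1 for request_id in range(50_000) if not queue.offer(request_id))
served = queue.tick()

print(len(served), len(queue.waiting), rejected)  # 500 9500 40000
```

The service never sees more than 500 requests in a tick, no matter how many arrive. With the assumed limit of 10,000, the 9,500 people still waiting are cleared in 19 more seconds, and everyone else gets a quick error instead of a hung page. Raising the limit trades fewer rejections for longer waits.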
Think of a queue as an escalator. It operates at a consistent pace and can handle a certain amount of demand.
In comparison, an elevator is like a service that does not handle variability in demand very well. During rush hour, elevators tend to get overwhelmed and stop functioning effectively.
So when designing a reliable service, try to create an escalator-like system instead of an elevator-like one.