About Shoreline’s Fleet-Wide Debugging and Repair

At Shoreline, we enable highly targeted fleet-wide debugging and repair.

It allows you to debug across the fleet much as you'd debug an individual box, in about the same amount of time.

You can do many things in this model that you couldn't through dashboards.

For example:

At AWS, a BIOS upgrade once caused a large-scale event.

There was no way we could have had a log file or a dashboard for it.

The only way out was to log into the boxes and find out what the heck was going on.

So I had ~20 people log into boxes by hand, in parallel (which is obviously ridiculous).

But that was the only way back then.

Today, you can use Shoreline to safely run individual commands across a lot of boxes simultaneously, all by yourself.

Each command is executed in a parallel, distributed framework (like everything else we do at Shoreline).
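
To make the idea concrete, here is a minimal sketch in Python of what running one command across many boxes in parallel can look like. It is not Shoreline's implementation or API. It assumes plain key-based ssh access, and the names run_on_host and run_on_fleet, the runner parameter, and the host names are all hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_on_host(host, command, timeout=30):
    """Run one command on one host over ssh (assumed transport).

    Returns a (host, exit_code, output) tuple. A timeout or a failure to
    launch ssh is reported with exit code -1 instead of raising.
    """
    try:
        proc = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", host, command],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return host, proc.returncode, proc.stdout.strip()
    except (subprocess.TimeoutExpired, OSError) as exc:
        return host, -1, str(exc)


def run_on_fleet(hosts, command, runner=run_on_host, max_workers=32):
    """Run the same command on every host concurrently.

    `runner` is injectable so the fan-out logic can be exercised
    without real machines. Returns {host: (exit_code, output)}.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda host: runner(host, command), hosts)
        return {host: (code, out) for host, code, out in results}
```

With something like this, one person could call run_on_fleet(["box-1", "box-2"], "dmidecode -s bios-version") and compare BIOS versions across boxes in one pass, instead of having 20 people log in by hand.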

That’s how our fleet-wide debugging and repair works.

Have you ever done fleet-wide debugging? Could you use this capability?