Blog

Resources and insights

Read the latest stuff we're up to and what we're most excited about.

Loading...
Filtered byAuthor: Kristine Newman
Unleashing the Power of AI in IT Operations
Podcast

Unleashing the Power of AI in IT Operations

In a compelling discussion, Evan and Anurag delve into the intricacies of Shoreline's AI Ops platform for incident response. Anurag, drawing from his experience leading reliable services at AWS, highlights the challenges of maintaining high availability in the face of rapid growth. He emphasizes the role of innovative automations in ensuring consistent service for demanding customers. Anurag suggests the first step in driving reliability for cloud services is understanding the root causes of incidents. He points to Shoreline's free tool designed to aid in this process. The conversation also features a case study of a major Shoreline client managing a 30,000-node fleet across multiple clouds and regions. Anurag shares how the client efficiently handles security checks and issue detections over thousands of instances simultaneously, treating the entire fleet as a single entity. For a deeper dive into this insightful discussion, the full video podcast is available on YouTube and LinkedIn.

Article

5 Ways to Prevent an Outage

The main challenge in preventing outages lies in the inevitable breakdown of various components like disks, nodes, and networks. To mitigate this, companies need to acknowledge human error as an unavoidable factor, especially when numerous commands are manually inputted daily. Investigating how minor errors can cause significant damage and implementing safeguards and redundancies are essential steps to reduce the risks and impacts of potential outages.