At Sumo Logic’s Illuminate conference, Raj Desikavinayagompillai, Head of Cloud Operations at Automation Anywhere, hosted a session about his company’s work with Sumo Logic. During the session, Raj also highlighted Shoreline as a key piece of his company’s strategy to automate and accelerate cloud operations moving forward.
Read on to learn how Automation Anywhere is leveraging Shoreline to optimize production operations and manage cloud reliability incidents with ease. You can also check out his session from Illuminate 2022 here.
A commitment to automation
With a million process automation bots running across the world, Automation Anywhere helps companies transform the way they do business. Like Shoreline, Automation Anywhere believes that automation is an essential tool in modern business that should be used to enhance the work of humans, not replace them. The company’s goal is to use automation to liberate workers from mundane tasks so humans can focus on creative, innovative solutions for the more complex problems that plague their companies.
Naturally — as Automation Anywhere’s name and mission imply — the company is always thinking about what to automate next. The team focused their efforts on cloud operations because, as Automation Anywhere grew, the team encountered a lot of the same problems that we hear from many production operations executives:
- Data and logs continue to grow, while budgets shrink. As environments become larger and more complex, more reliability issues occur. It’s tough to hire enough on-call operators — and afford enough fancy tools — to keep up with demand.
- Operations teams spend too much time in firefighting mode. The influx of reliability issues causes a butterfly effect across the entire team. On-call operators are constantly solving urgent issues instead of stepping back to create innovative solutions to fix root causes. For Automation Anywhere, this was especially evident during spikes in usage — like when a new feature was released.
- When a problem occurs, it’s hard for operations teams to figure out exactly what’s broken. Production operations teams are spending too much time diagnosing issues, which leads to a lengthy timeline to make a final repair. As Raj said, diagnosing issues is particularly hard because it typically requires more than searching for a needle in a haystack. It’s more like searching for a “needle in a needlestack.”
To tackle these issues, Automation Anywhere teamed up with Sumo Logic to monitor data and logs and create notifications that alert the team when something is broken. The automatic notification system saves Automation Anywhere valuable time when an issue occurs, as the team no longer has to rely on stumbling across a problem on their own or hearing about it from an angry customer.
But with a commitment to automation, the team knew they could push themselves further to optimize their operations processes.
First, automate existing runbooks with Shoreline Op Packs
Once alerts and notifications came in from Sumo Logic, the team at Automation Anywhere had to take manual actions to find the root cause of issues, review the company’s runbooks, and implement a remediation sequence. While the remediations don’t take long for Sumo Logic engineers, not everyone on-call is Kubernetes-savvy, so it can take them much longer to work through a runbook. Auto-remediations through Shoreline offer a solution that fixes issues in real time, which not only saves developer time on each incident, but also eliminates the time lost to work disruptions, which often take even longer to recover from.
To optimize production operations, the team wanted to automate as many post-notification actions as possible. Raj saw this as a way to make his most productive employees even more powerful and free up valuable time so they could work on more impactful projects.
To get there, Automation Anywhere used Shoreline’s Op Packs. Our Op Packs are pre-built solutions consisting of Alarms and Actions, as well as the Bots that connect them together for auto-remediation. Once configured, they continuously monitor Automation Anywhere’s environment and automatically fire a series of remediation Actions based on the company’s runbooks whenever issues are detected.
For example, FluentD is a service that sends logs to Sumo Logic so that they can be analyzed for issues. But the service can sometimes stop sending data without warning. That means that metrics for debugging may be missing when they are needed most, and users can lose trust in the system. Shoreline automatically restarts FluentD when necessary, keeping the data flowing, and creating better customer experiences.
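To make the relationship between Alarms, Actions, and Bots more concrete, here is a minimal Python sketch of the pattern applied to the stalled log forwarder case. It is not Shoreline’s Op language or API, and it is not Automation Anywhere’s actual configuration. The class names, the `STALE_AFTER_SECONDS` threshold, and the state fields are all hypothetical, and the "restart" is simulated rather than performed against a real service.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Alarm:
    """Hypothetical alarm: fires when its condition holds for the observed state."""
    name: str
    condition: Callable[[dict], bool]


@dataclass
class Action:
    """Hypothetical action: a remediation step taken from a runbook."""
    name: str
    run: Callable[[dict], None]


@dataclass
class Bot:
    """Connects one alarm to one action and records what it did."""
    alarm: Alarm
    action: Action
    history: List[str] = field(default_factory=list)

    def evaluate(self, state: dict) -> bool:
        # Run the action only when the alarm condition is met.
        if self.alarm.condition(state):
            self.action.run(state)
            self.history.append(self.action.name)
            return True
        return False


STALE_AFTER_SECONDS = 300  # assumed threshold, chosen only for illustration


def logs_are_stale(state: dict) -> bool:
    # The forwarder counts as stalled if it has sent nothing recently.
    return state["now"] - state["last_log_sent_at"] > STALE_AFTER_SECONDS


def restart_forwarder(state: dict) -> None:
    # Stand-in for a real restart; here we only update the simulated state.
    state["restarts"] = state.get("restarts", 0) + 1
    state["last_log_sent_at"] = state["now"]


fluentd_bot = Bot(
    alarm=Alarm("log_forwarder_stalled", logs_are_stale),
    action=Action("restart_log_forwarder", restart_forwarder),
)
```

Calling `fluentd_bot.evaluate(state)` on a schedule mirrors the continuous monitoring described above: nothing happens while logs are flowing, and the restart runs only once the alarm condition is met.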
This automation has had a huge impact on Automation Anywhere. Raj noted that, for just one use case, the process has eliminated one to three manual repairs per day, saving cloud ops engineers an estimated 45 minutes of work every day, not counting time lost due to work disruption.
Building the future with Shoreline
The company plans to invest more heavily in our cloud reliability solution as it continues to optimize production operations processes. The next step is to directly connect Sumo Logic’s monitors to Shoreline’s cloud reliability platform. With that connection, Sumo Logic can instantly detect that something is wrong and then trigger Shoreline to investigate the issue and invoke a remediation script. It’s an end-to-end automation solution that detects and solves issues before an on-call operator is even made aware that an issue has occurred.
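The monitor-to-remediation handoff can be pictured as a small dispatch step. The sketch below is only an illustration in Python: the payload fields (`monitor`, `resource`), the monitor name, and the returned strings are assumptions, not the real Sumo Logic webhook format or Shoreline interface. It shows the general idea of mapping an incoming alert to a registered remediation and falling back to a human when none exists.

```python
import json
from typing import Callable, Dict

# Registry mapping a (hypothetical) monitor name to a remediation function.
REMEDIATIONS: Dict[str, Callable[[dict], str]] = {}


def remediation(monitor_name: str):
    """Decorator that registers a remediation for a given monitor name."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        REMEDIATIONS[monitor_name] = fn
        return fn
    return register


@remediation("log-forwarder-no-data")
def restart_log_forwarder(alert: dict) -> str:
    # A real handler would invoke the remediation script; this one only
    # describes the step it would take.
    return f"restart log forwarder on {alert['resource']}"


def handle_alert(raw_body: str) -> dict:
    """Route an alert payload (JSON text) to its remediation, if one exists."""
    alert = json.loads(raw_body)
    handler = REMEDIATIONS.get(alert.get("monitor"))
    if handler is None:
        # No automation registered: hand the alert to the on-call operator.
        return {"status": "escalate", "detail": "no automated remediation registered"}
    return {"status": "remediated", "detail": handler(alert)}
```

Keeping an explicit escalation path for unknown alerts means automation covers the well-understood runbooks while anything unfamiliar still reaches a person.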
We believe Automation Anywhere is paving the way for the future of production operations. The company’s use of Shoreline highlights the exact goal of our solution — to find a better way to manage incidents, so teams can spend more time innovating.