Automated root cause analysis is finally here
Kubernetes has risen swiftly to become the go-to system for containerised applications at startups and corporations alike. By its very nature, it’s core to many of these companies’ key processes and applications, which makes any problem with Kubernetes a potential nightmare for developers. As a consultant, I’ve seen clients lose their heads when something goes wrong with their setup.
Every company should have fallbacks and ensure maximum resilience for such important infrastructure. Komodor recently revealed their new Workflows feature just before KubeCon 2021, and the potential impact on developers looks incredible. The idea is to democratise the troubleshooting of Kubernetes problems.
Other companies have claimed to do this in the past, but Komodor seems to be the first to really deliver. They only came out of stealth six months ago and have already been named a Gartner Cool Vendor for 2021. Before you make any big plans about how you’re going to manage Kubernetes in the future, it’s worth learning more about Workflows and deciding if it’s right for you.
How it works
There’s one important thing to note right at the start: Komodor Workflows isn’t going to take over your systems and implement fixes that might break them. Instead, it monitors your system and offers you solutions when something goes wrong. You still have all the power.
For those unfamiliar with the existing Komodor product, it’s essentially a way to track everything across your K8s system. You can zoom into specific namespaces or clusters, for example, and see a full timeline including deployments and health changes. This gives you the context you need to troubleshoot effectively, with the information organised for you.
The new Workflows feature builds upon this. It uses the signals created by the application to detect Kubernetes issues and takes them through a preconfigured workflow. The workflow runs through the normal checks a developer would otherwise have to do to find the root cause, and uses the results to give detailed instructions for what needs to be done.
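To make that idea concrete, here’s a minimal sketch in Python of what “running the normal checks in order” could look like. This is not Komodor’s code or API: the Finding class, the check functions, the run_workflow helper and the shape of the pod dictionary are all made up for illustration. Only the status reasons (ImagePullBackOff, ErrImagePull, OOMKilled, CrashLoopBackOff) are real Kubernetes values.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Finding:
    check: str       # name of the check that fired
    evidence: str    # what was observed, kept so a human can validate it
    suggestion: str  # the step a person should carry out next


# A check looks at a snapshot of a pod's state and returns a Finding or None.
# The snapshot is a plain dict here; its keys are hypothetical.
Check = Callable[[dict], Optional[Finding]]


def check_image_pull(pod: dict) -> Optional[Finding]:
    reason = pod.get("waiting_reason")
    if reason in ("ImagePullBackOff", "ErrImagePull"):
        return Finding("image_pull", f"container is waiting: {reason}",
                       "Verify the image name, tag and registry credentials.")
    return None


def check_oom_killed(pod: dict) -> Optional[Finding]:
    if pod.get("last_terminated_reason") == "OOMKilled":
        return Finding("oom_killed", "last container exit was OOMKilled",
                       "Raise the memory limit or investigate memory usage.")
    return None


DEFAULT_CHECKS: List[Check] = [check_image_pull, check_oom_killed]


def run_workflow(pod: dict, checks: List[Check]) -> List[Finding]:
    """Run every check in order and collect the ones that fire."""
    findings = []
    for check in checks:
        finding = check(pod)
        if finding is not None:
            findings.append(finding)
    return findings


if __name__ == "__main__":
    # Hypothetical snapshot of an unhealthy pod.
    pod = {"waiting_reason": "CrashLoopBackOff",
           "last_terminated_reason": "OOMKilled"}
    for f in run_workflow(pod, DEFAULT_CHECKS):
        print(f"[{f.check}] {f.evidence} -> {f.suggestion}")
```

Note that the sketch only suggests a next step and never changes anything itself, and that each finding carries the evidence behind it, so whoever picks it up can see why the suggestion was made.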
The beauty of this is that when a Kubernetes expert is unavailable, there’s no need to panic. The workflows contain a vast number of troubleshooting steps that even the most experienced developers would struggle to match. There are no issues with privileges either. Rights issues can slow down human troubleshooters who don’t have access to the offending code, but Workflows has this access by design. If you’re a DevOps leader, I feel sorry for you because I know all of these tasks would normally land at your feet, however big or small they are. Workflows may turn out to be your best friend.
Why it is such a game-changer
The checks automated by Komodor’s Workflows would normally take days and demand immediate attention from developers, pulling them away from whatever they were working on. During this time, multiple teams could be affected, dragging down productivity across the board. From a business perspective, this is money down the drain. For a developer, it means even more pressure to deliver the things they were originally working on, as timelines are rarely adjusted for emergencies.
When I first heard about the feature, I was a sceptic. Sure, it would automate things, but would it really save that much time? I was completely wrong: in demos, they were getting solutions to problems in under a second. Yes, it still needs someone to carry out the steps, but that’s usually the easy bit relative to the root cause analysis itself.
Built with organisational structure in mind
The best thing about Komodor’s approach is the full transparency about why different actions are suggested. It means someone senior within the team can always validate the decision and offer quality assurance, so you’ll never be left scratching your head as to why Workflows gave you the answer it did. This makes it a safer investment because you know it can’t become a single point of failure. It can also be a good way for junior developers to increase their understanding of how to troubleshoot.
The killer feature for the future is the ability for companies to create their own playbooks. It means companies can customise their suggested solutions and add steps that are unique to their own businesses. I believe this feature will remove any hesitation from internal tech teams as it becomes a way to stop them from repeating efforts on the more difficult issues.
Workflows is built by developers for developers, and it’s impressive to see the attention to detail in taking into account the reality of how teams work. No company wants to be held hostage by software that no one can understand, and Komodor has gone out of its way to ensure humans always have control. This greatly lowers the risk and should mean more companies begin to use it as part of their K8s best practice processes.