When I first wrote about AIOps in 2017, Gartner was predicting that IT operations (ITOps) personnel were in for a major change over the next few years. Traditional IT management techniques were viewed as unable to cope with digital business transformation. Gartner predicted that there would be significant changes in ITOps procedures and a restructuring of how we manage our IT ecosystems. They called the evolving platform on which these changes would take place “AIOps.”
Changes in IT over the intervening years have proven Gartner correct. Interest in and adoption of AIOps have grown rapidly as organizations have sought to enable innovation, fend off disruptors, and manage the velocity, volume, and variety of digital data that is beyond human scale. This post covers the original and current market drivers of AIOps, along with its components and benefits.
Digital transformation and the road to AIOps
It’s important to understand how digital transformation gave rise to Gartner’s AIOps platform. Digital transformation encompasses DevOps and the adoption of cloud and new technologies like containers. It represents a shift from centralized IT to applications and developers, an increased pace of innovation and deployment, and the acquisition of new digital users, such as machine agents, Internet of Things (IoT) devices, and application programming interfaces (APIs), that organizations previously didn’t need to service. All of these new technologies and users are straining traditional performance and service management strategies and tools to the breaking point. AIOps is the ITOps paradigm shift required to handle these digital transformation issues.
What is AIOps?
AIOps stands for artificial intelligence for IT operations. It refers to multi-layered technology platforms that automate and enhance IT operations through analytics and machine learning (ML). AIOps platforms leverage big data, collecting a variety of data from various IT operations tools and devices in order to automatically spot and react to issues in real time while still providing traditional historical analytics.
Gartner explains how an AIOps platform works by using the diagram in Figure 1. AIOps has two main components: big data and ML. It requires a move away from siloed IT data in order to aggregate observational data (such as that found in monitoring systems and job logs) alongside engagement data (usually found in ticket, incident, and event recording) inside a big data platform. AIOps then implements a comprehensive analytics and ML strategy against the combined IT data. The desired outcome is automation-driven insights that yield continuous improvements and fixes. AIOps can be thought of as continuous integration and deployment (CI/CD) for core IT functions.
AIOps bridges three different IT disciplines—service management (“Engage”), performance management (“Observe”), and automation (“Act”)—to accomplish the goal of continuous insights and improvements. AIOps creates a game plan that recognizes that, within our new accelerated IT environments, there must be a new approach that’s underwritten by advances in big data and ML.
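To make the idea of combining observational and engagement data concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular AIOps product works: the `MetricEvent` and `Ticket` record shapes, the `correlate` function, and the 15-minute window are all hypothetical. It simply pairs each service ticket with the monitoring events for the same service that occurred shortly before the ticket was opened.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MetricEvent:
    """Observational data, e.g. an alert from a monitoring tool (hypothetical shape)."""
    service: str
    timestamp: datetime
    message: str


@dataclass(frozen=True)
class Ticket:
    """Engagement data, e.g. an incident from a service desk (hypothetical shape)."""
    service: str
    opened_at: datetime
    summary: str


def correlate(events, tickets, window=timedelta(minutes=15)):
    """Map each ticket summary to the messages of monitoring events for the
    same service that occurred at most `window` before the ticket was opened."""
    # Index events by service so each ticket only scans relevant events.
    by_service = {}
    for event in events:
        by_service.setdefault(event.service, []).append(event)

    result = {}
    for ticket in tickets:
        candidates = by_service.get(ticket.service, [])
        result[ticket.summary] = [
            e.message
            for e in candidates
            if timedelta(0) <= ticket.opened_at - e.timestamp <= window
        ]
    return result


# Illustrative data only; none of these services or messages come from a real system.
t0 = datetime(2024, 1, 1, 12, 0)
events = [
    MetricEvent("checkout", t0, "p95 latency above threshold"),
    MetricEvent("checkout", t0 - timedelta(hours=2), "disk usage warning"),
    MetricEvent("search", t0, "error rate spike"),
]
tickets = [Ticket("checkout", t0 + timedelta(minutes=5), "Customers cannot pay")]
print(correlate(events, tickets))
```

Once both kinds of data sit in one place, even a simple join like this can surface a likely cause for a ticket. A real platform would presumably do this at far larger scale and with richer matching than service name and time window.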
What’s driving AIOps?
AIOps is the evolution of IT operational analytics (ITOA). It grows out of several trends and needs affecting ITOps, including:
- IT environments exceeding human scale. Traditional approaches to managing IT complexity—offline, manual efforts that require human intervention—don’t work in dynamic, elastic environments. Tracking and managing this complexity through manual, human oversight is no longer possible. ITOps has been exceeding human scale for years and it continues to get worse.
- The amount of data that ITOps needs to retain is exponentially increasing. Performance monitoring is generating exponentially larger numbers of events and alerts. Service ticket volumes experience step-function increases with the introduction of IoT devices, APIs, mobile applications, and digital or machine users. Again, it is simply becoming too complex for manual reporting and analysis.
- Infrastructure problems must be addressed at ever-increasing speeds. As organizations digitize their business, IT becomes the business. The “consumerization” of technology has changed user expectations for all industries. Reactions to IT events—whether real or perceived—need to occur immediately, particularly when an issue impacts user experience.
- More computing power is moving to the edges of the network. The ease with which cloud infrastructure and third-party services can be adopted has empowered line of business (LOB) functions to build their own IT solutions and applications. Control and budget have shifted from the core of IT to the edge. And more computing power (that can be leveraged) is being added from outside core IT.
- Developers have more power and influence but accountability still sits with core IT. As I discuss in my post on application-centric infrastructure, DevOps and Agile are forcing programmers to take on more monitoring responsibility at the application level, but accountability for the overall health of the IT ecosystem and the interaction between applications, services, and infrastructure still remains the province of core IT. ITOps is taking on more responsibility just as its networks are getting more complex.
Acknowledging that ITOps management is exceeding human scale doesn’t mean that machines are replacing humans. It means we need big data, AI/ML, and automation to deal with the new reality. Humans aren’t replaced, but ITOps personnel will need to develop new skills, and new roles will emerge.
The elements of AIOps
I’m going to take a moment here to go through the elements of AIOps as represented in the Gartner diagram above. While I encourage everyone to read the Market Guide, what follows should serve as an adequate grounding in the key pieces of the AIOps puzzle and how they contribute.
- Extensive and diverse IT data: AIOps is predicated on bringing together the diverse data enumerated in the black and blue chevrons, from both IT operations management (ITOM) (metrics, events, etc.) and IT service management (ITSM) (incidents, changes, etc.). This is often referred to as “breaking down data silos”: bringing data together from disparate tools so they can “speak” to each other and accelerate root cause identification or enable automation.
- Aggregated big data platform: At the heart of the platform (in the center of the graphic) is big data. As the data is liberated from siloed tools, it needs to be brought together to support next-level analytics. This needs to occur not just offline, as in a forensic investigation using historical data, but also in real time as data is ingested. See my other post for more on AIOps and big data.
- Machine learning: Big data enables the application of ML to analyze vast quantities of diverse data. This is not possible before the data is brought together, nor is it possible through manual human effort. ML automates existing manual analytics and enables new analytics on new data, all at a scale and speed unavailable without AIOps.
- Observe: This is the evolution of the traditional ITOM domain that integrates development (traces) and other non-ITOM data (topology, business metrics) to enable new modalities of correlation and contextualization. In combination with real-time processing, probable-cause identification becomes simultaneous with issue generation.
- Engage: The evolution of the traditional ITSM domain includes bi-directional communication with ITOM data to support the above analyses and auto-create documentation for audit and compliance/regulatory requirements. AI/ML expresses itself here in cognitive classification plus routing and intelligence at the user touchpoint, e.g., chatbots.
- Act: This is the “final mile” of the AIOps value chain. Automating analysis, workflow, and documentation is for naught if responsibility for action is put back in human hands. Act encompasses the codification of human domain knowledge into the automation and orchestration of remediation and response.
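The last few elements can be sketched end to end in a few lines of Python. This is a deliberately simple stand-in, not a real AIOps implementation: a rolling z-score takes the place of the ML layer, and the `RollingZScoreDetector` class, the `RUNBOOK` table, the `act` function, and the metric and service names are all hypothetical. The detector plays the Observe role, the runbook codifies human domain knowledge for Act, and the fallback hands unknown cases to Engage by opening a ticket.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flags a sample as anomalous when it lies more than `threshold`
    standard deviations from the mean of the recent normal samples."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def observe(self, value):
        is_anomaly = False
        # Wait for a small baseline before judging anything.
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = stdev(self.samples)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # Only learn from normal samples so a spike does not widen the baseline.
        if not is_anomaly:
            self.samples.append(value)
        return is_anomaly


# Codified domain knowledge: which remediation applies to which metric (hypothetical).
RUNBOOK = {
    "queue_depth": lambda service: f"scale out consumers for {service}",
    "disk_used_pct": lambda service: f"rotate logs on {service}",
}


def act(metric, service):
    """Return the automated remediation, or fall back to a ticket (Engage)."""
    handler = RUNBOOK.get(metric)
    if handler is None:
        return f"open ticket for {service}: unexplained {metric} anomaly"
    return handler(service)


# Illustrative readings only: a steady queue depth followed by one spike.
detector = RollingZScoreDetector()
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]
actions = [act("queue_depth", "orders") for r in readings if detector.observe(r)]
print(actions)
```

The point of the sketch is the shape of the loop rather than the statistics: detection feeds directly into a codified response, and a human is only pulled in when no rule applies.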
The future of AIOps
Understanding what is driving AIOps, and how AIOps responds to those drivers, brings us to the current state of the market. As IT moves beyond human scale, IT tooling needs to adapt. But simply reacting defensively is not enough. The organizations that embrace AIOps will see the challenge it is meant to address as an opportunity to grow, evolve, innovate, and disrupt. Here are some ways that AIOps-enabled organizations will transform their business in the next five years.
- Technology becomes more human: Analytics and orchestration enable frictionless experiences, allowing ubiquitous self-service.
- The automation of technology and, hence, business processes: Costs fall, speed increases, and errors decrease, freeing up human capital for higher-level achievement.
- Enterprise ITOps gains DevOps agility: Continuous delivery extends to operations and the business.
- Data becomes currency: The vast wealth of untapped business data is capitalized, unleashing high-value use cases and monetization opportunities.
At BMC, we call this vision of an AIOps-enabled future the Autonomous Digital Enterprise. Our mission is to enable our customers to innovate and differentiate quickly and continuously to deliver customer-driven value. The successful organizations of tomorrow will be the ones embracing intelligent, tech-enabled systems that allow them to thrive while others falter during times of massive change.
Although AIOps is a seismic change for IT operations, it’s not a radical application of analytics and machine learning. A similar ML approach was implemented when stockbrokers moved from manual trading to machine trading. Analytics and ML are used in social media and in applications like Google Maps, Waze, and Yelp, as well as in online marketplaces like Amazon and eBay. These techniques are used reliably and extensively in environments where real-time responses to dynamically changing conditions and user customization are required.
AIOps is the application of tried-and-true technology and processes to ITOps. ITOps personnel are typically slow to adopt new technologies because, out of necessity, our jobs have always been more conservative. It’s the job of ITOps to make sure the lights stay on and provide stability for the infrastructure that supports organizational applications. We’ve passed the tipping point, however, and AIOps adoption is the key indicator for the trajectory of the digital enterprise.