According to Gartner, IT Operations personnel (IT Ops) are in for a major change over the next few years. This change is driven by frustration with traditional IT management techniques, methods that enterprise IT Ops teams see as unable to cope with digital business transformation. Gartner predicts we will see a significant change in current IT Ops procedures and a restructuring in how we manage our IT ecosystems. And the key to these changes is a new platform Gartner calls AIOps.
Over the next few posts, I’ll review AIOps and how it will affect all of us in the near future. Today, I discuss what AIOps is and what’s driving its development.
Digital transformation and the road to AIOps
It’s important to understand how digital transformation gives rise to Gartner’s AIOps platform. Digital transformation encompasses cloud adoption, rapid change, and the implementation of new technologies. It also requires a shift in focus to applications and developers, an increased pace of innovation and deployment, and the acquisition of new digital users, such as machine agents, Internet of Things (IoT) devices, and Application Programming Interfaces (APIs), that organizations didn’t need to service in the past. All these new technologies and users are straining traditional performance and service management strategies and tools to the breaking point.
Gartner uses the name AIOps to describe the IT Ops paradigm shift required to handle these digital transformation issues.
What is AIOps?
AIOps stands for Artificial Intelligence for IT Operations. It refers to multi-layered technology platforms that automate and enhance IT operations by using analytics and machine learning to analyze big data collected from various IT operations tools and devices, in order to automatically spot and react to issues in real time.
Gartner explains how an AIOps platform works by using the diagram in figure 1. AIOps has two main components: Big Data and Machine Learning. It requires a move away from siloed IT data in order to aggregate observational data (such as that found in monitoring systems and job logs) alongside engagement data (usually found in ticket, incident, and event recording) inside a Big Data platform. AIOps then implements a comprehensive Analytics and Machine Learning (ML) strategy against the combined IT data. The desired outcome is continuous insights that yield continuous improvements and fixes, using automation. AIOps can be thought of as Continuous Integration and Deployment (CI/CD) for core IT functions.
Figure 1: Gartner’s visualization of the AIOps platform
AIOps bridges three different IT disciplines (service management, performance management, and automation) to accomplish its goals of continuous insights and improvements. AIOps is both a recognition that our new, accelerated IT environments demand a new approach and a game plan for that approach, one underwritten by advances in big data and machine learning.
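To make the aggregation idea concrete, here is a minimal Python sketch of pulling observational data and engagement data out of their silos into one time-ordered view, then linking each ticket to the monitoring events that preceded it. Everything in it is an assumption made for illustration: the `Record` fields, the `aggregate` and `correlate` helpers, the ten-minute window, and the sample data are hypothetical and do not come from Gartner or from any particular product. A real platform would do this at far larger scale on a big data store.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """One item of IT data (hypothetical schema for this sketch)."""
    timestamp: datetime
    source: str   # tool that produced it, e.g. "monitoring" or "service_desk"
    kind: str     # "observational" or "engagement"
    service: str  # the service the record is about
    message: str


def aggregate(*feeds):
    """Merge the per-tool feeds into a single time-ordered list."""
    merged = [record for feed in feeds for record in feed]
    return sorted(merged, key=lambda r: r.timestamp)


def correlate(records, window=timedelta(minutes=10)):
    """Map each engagement record to the observational records for the
    same service seen in the `window` leading up to it."""
    linked = {}
    for ticket in (r for r in records if r.kind == "engagement"):
        linked[ticket.message] = [
            r for r in records
            if r.kind == "observational"
            and r.service == ticket.service
            and timedelta(0) <= ticket.timestamp - r.timestamp <= window
        ]
    return linked


# Illustrative sample data, not real logs.
t0 = datetime(2024, 1, 1, 12, 0)
monitoring = [
    Record(t0, "monitoring", "observational", "checkout", "CPU 97%"),
    Record(t0 + timedelta(minutes=30), "monitoring", "observational",
           "search", "latency high"),
]
tickets = [
    Record(t0 + timedelta(minutes=5), "service_desk", "engagement",
           "checkout", "TICKET-1: checkout slow"),
]

timeline = aggregate(monitoring, tickets)
linked = correlate(timeline)
for ticket_message, evidence in linked.items():
    print(ticket_message, "<-", [r.message for r in evidence])
```

The point of the sketch is the shape of the data flow, not the matching rule: once both kinds of data share one store and one timeline, analytics can be run across them instead of inside each tool.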
What’s driving AIOps?
AIOps is a new enough IT platform that it doesn’t even have its own Wikipedia page…yet. AIOps is Gartner’s next step in the evolution of IT Operations Analytics (ITOA). It’s growing out of several trends and needs affecting IT Operations, including:
- The difficulty IT Operations has in manually managing their infrastructure. It’s becoming a misnomer to use the term “infrastructure” here, as modern IT environments include managed cloud, unmanaged cloud, third party services, SaaS integrations, mobile, and more. Traditional approaches to managing complexity don’t work in dynamic, elastic environments. Tracking and managing this complexity through manual, human oversight is no longer possible. Current IT Ops technology is already beyond the scope of manual management and it will only get worse in the coming years.
- The amount of data that IT Ops needs to retain is exponentially increasing. Performance monitoring is generating exponentially larger numbers of events and alerts. Service ticket volumes experience step-function increases with the introduction of IoT devices, APIs, mobile applications, and digital or machine users. Again, it is simply becoming too complex for manual reporting and analysis.
- Infrastructure problems must be responded to at ever-increasing speeds. As organizations digitize their business, IT becomes the business. The “consumerization” of technology has changed user expectations for all industries. Reactions to IT events, whether real or perceived, need to occur immediately, particularly when an issue impacts user experience.
- More computing power is moving to the edges of the network. The ease with which cloud infrastructure and third party services can be adopted has empowered line of business (LOB) functions to build their own IT solutions and applications. Control and budget have shifted from the core of IT to the edge. More computing power (that can be taken advantage of) is being added from outside core IT.
- Developers have more power and influence but accountability still sits with core IT. As I talked about in my post on application-centric infrastructure, DevOps and Agile are forcing programmers to take more monitoring responsibility at the application level, but accountability for the overall health of the IT ecosystem and the interaction between applications, services and infrastructure still remains the province of core IT. IT Ops is taking on more responsibility just as their networks are getting more complex.
The elements of AIOps
AIOps acknowledges that the old way of doing IT Ops won’t work in the new world defined by the needs listed above. In the same way that Gartner has defined IT Operations Management (ITOM) and Application Performance Management (APM) as Magic Quadrant markets, Gartner may also build a Magic Quadrant for the AIOps marketplace.
AIOps platforms consist of the following elements, shown in figure 2:
Figure 2: The technologies that make up an AIOps platform
- Extensive and diverse IT data sources, from currently siloed tools and IT disciplines such as events, metrics, logs, job data, tickets, monitoring, etc.
- A big data platform that aggregates IT data for historical analysis and real-time reaction and insights.
- Computation (calculations) and Analytics that enable the system to generate new data and metadata from existing IT data. Calculations and analytics also eliminate noise, identify patterns or trends, isolate probable causes, expose underlying problems, and achieve other IT specific goals.
- Algorithms that leverage IT domain expertise to intelligently apply computation and analytics appropriately and efficiently, as dictated by an organization’s data and its desired outcomes.
- Unsupervised Machine learning that can automatically alter or create new algorithms based on the output of algorithmic analysis and new data introduced into the system.
- Visualization, which presents insights and recommendations to IT Ops in an easily consumable way, to facilitate understanding and action.
- Automation, which uses the outcomes generated by the analytics and machine learning to automatically create and apply a response or improvement for identified issues.
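As a rough illustration of how a few of these elements chain together, the Python sketch below walks through noise reduction, pattern detection, and an automated response. It is a simplified stand-in, not an AIOps implementation: the anomaly check is a plain z-score against historical values rather than unsupervised machine learning, and the function names, the threshold of 3.0, the runbook entries, and the sample data are all hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def reduce_noise(alerts):
    """Analytics step: collapse repeated (host, check) alerts into one
    entry that carries a count."""
    counts = Counter((a["host"], a["check"]) for a in alerts)
    return [{"host": h, "check": c, "count": n} for (h, c), n in counts.items()]


def find_anomalies(history, latest, threshold=3.0):
    """Pattern step: flag metrics whose latest value sits more than
    `threshold` standard deviations from the historical mean."""
    anomalies = []
    for name, values in history.items():
        value = latest.get(name)
        if value is None or len(values) < 2:
            continue  # not enough data to judge this metric
        mu, sigma = mean(values), stdev(values)
        if sigma == 0:
            continue  # flat history, z-score undefined
        z = (value - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((name, z))
    return anomalies


def respond(anomalies, runbook):
    """Automation step: pick an action per anomaly, falling back to a
    human-handled ticket when no automated action is known."""
    return [runbook.get(name, "open_ticket") for name, _ in anomalies]


# Illustrative sample data, not real measurements.
alerts = [
    {"host": "web1", "check": "cpu"},
    {"host": "web1", "check": "cpu"},
    {"host": "db1", "check": "disk"},
]
history = {"cpu_pct": [40, 42, 41, 39, 43], "disk_pct": [70, 71, 70, 72, 71]}
latest = {"cpu_pct": 95, "disk_pct": 71}
runbook = {"cpu_pct": "scale_out"}

deduplicated = reduce_noise(alerts)
anomalies = find_anomalies(history, latest)
actions = respond(anomalies, runbook)
print(deduplicated)
print([name for name, _ in anomalies], actions)
```

In a real platform the fixed threshold and static runbook are exactly the parts that machine learning would be expected to tune or replace as new data arrives.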
As mentioned above, AIOps platforms should encompass the IT disciplines of Performance Management, Service Management, Automation, and Process Improvement, along with technologies such as monitoring, service desk, capacity management, cloud computing, SaaS, mobility, IoT and more.
It needs to be said that although AIOps represents a radical departure for IT Ops, it’s not a radical application of analytics and machine learning. A similar ML approach was implemented when stockbrokers moved from manual trading to machine trading. Analytics and ML are used in social media, in applications like Google Maps, Waze, and Yelp, as well as in online marketplaces like Amazon and eBay. These techniques are used reliably and extensively in environments where real-time responses to dynamically changing conditions and user customization are required.
IT Ops personnel have been slow to adapt to AIOps-like environments because, out of necessity, our jobs have always been more conservative. It’s IT Ops’ job to make sure the lights stay on and to provide stability for the infrastructure that organizational applications ride on. However, due to the trends listed above, more IT Ops shops (especially those in the enterprise) will need to implement AIOps strategies and technologies in the near future.