In this five-part blog series, members of the BMC OnDemand organization will share their perspectives on the five key tenets that guide the way we run BMC software in the cloud for our customers. Nandu Mahadevan, who leads BMC’s OnDemand organization, provides his views on the fourth tenet: Operational Intelligence.
Tenet Four: Operational Intelligence
Operational Intelligence is the critical information that both people and systems need to make the best decisions in an operational environment. For example, when a service has been impacted or a threat is looming, several critical questions arise. They include:
- How many users are currently logged on?
- What data center is this?
- What configuration item is impacted?
- What changes were deployed recently?
- Is the symptom on a single server or on multiple servers?
- Have we seen this happen before?
- What are the versions of the components in use?
- Why is a parameter configured out of standard? Is that a known exception?
- Who worked on this last?
- Are there external integrations like LDAP, local or SAML IDP, ODBC, Reporting, Monitoring, Discovery, Knowledge broker, or federated data?
- Are there any batch jobs active?
- Are there any long-running queries or database blocks currently?
- Who is the Service Delivery Manager?
- What is the customer’s contact information?
- Which logs should be captured for further investigation?
- Which experts should be consulted?
- And more…
The BMC Remedy OnDemand Network Operations Center (NOC) has access to all this information in five minutes or less! Growing this body of knowledge and making it immediately accessible is key to building sophistication and scale. Take a monthly patching process, for instance: it depends on a reliable source of truth for the targets and on the right server credentials to perform the task.
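To make the idea of a single source of truth concrete, here is a minimal sketch of how such a lookup might be modeled. It is an illustration only, not a description of BMC's tooling: the `ConfigItem` record, its fields, the `incident_context` and `patch_targets` helpers, and all of the sample data are assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ConfigItem:
    """One server or component, as a hypothetical inventory record."""
    name: str
    data_center: str
    customer: str
    version: str
    patch_group: str                 # hypothetical label, e.g. "monthly"
    last_changed_by: str
    recent_changes: list = field(default_factory=list)  # (timestamp, description)


def incident_context(items, customer, now, window_days=7):
    """Collect the facts a responder asks for first, for one customer."""
    cutoff = now - timedelta(days=window_days)
    affected = [ci for ci in items if ci.customer == customer]
    return {
        "data_centers": sorted({ci.data_center for ci in affected}),
        "versions": {ci.name: ci.version for ci in affected},
        "last_changed_by": {ci.name: ci.last_changed_by for ci in affected},
        "recent_changes": [
            (ci.name, when, what)
            for ci in affected
            for when, what in ci.recent_changes
            if when >= cutoff
        ],
    }


def patch_targets(items, patch_group):
    """Return the names of the servers that belong to one patching cycle."""
    return sorted(ci.name for ci in items if ci.patch_group == patch_group)


# Invented sample data, for illustration only.
NOW = datetime(2024, 1, 15)
INVENTORY = [
    ConfigItem("app-01", "DC-East", "acme", "9.1.04", "monthly", "alice",
               [(datetime(2024, 1, 12), "hotfix deployed")]),
    ConfigItem("db-01", "DC-East", "acme", "9.1.04", "monthly", "bob",
               [(datetime(2023, 12, 1), "index added")]),
    ConfigItem("app-02", "DC-West", "globex", "9.1.03", "quarterly", "carol"),
]

context = incident_context(INVENTORY, "acme", NOW)
monthly = patch_targets(INVENTORY, "monthly")
```

The point of the sketch is that the same inventory answers both the incident questions (which data center, which versions, what changed recently, who touched it last) and the routine ones (which servers are in this month's patching cycle). Credentials are deliberately left out; in practice they would live in a dedicated secrets store rather than in the inventory itself.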
Back in the day (2007), I took a BMC-led, executive ITIL® course that included an Airport Simulation exercise. Wow, what an impression that made, especially given that my memory rarely goes back that far! One of my key takeaways from that simulation was the concept of Operational Intelligence: realizing business value by harnessing the power of knowledge.
As a core tenet in our operations, we ask the question: “How could we have done this better for the customer?” Many times, the answer adds to our knowledge base or generates specific requirements for our Center of Excellence (COE) team to build and implement. This is one way to enable continuous service improvement (CSI).
One of our customers reported that a particular field within Incident Management was responding slowly. This field was critical to their business process: it identified whether the end user was a VIP. Any delay directly affected SLAs. After about seven hours of troubleshooting, the issue was ultimately resolved by rebuilding statistics on a database table, an activity that took just five minutes. So how do we avoid spending seven hours on a five-minute job next time? We knew a simple knowledge article was not going to be sufficient. In addition to building awareness, we made our operations more intelligent: all databases that need statistics rebuilt now get them automatically. This is also a perfect example of our previous tenet, Proactive Hygiene.
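As a rough illustration of what "automatically" could look like, the sketch below flags tables whose statistics are probably stale so that a scheduled job can rebuild them. It is not the actual BMC implementation: the `TableStats` record, the seven-day and 10 percent thresholds, and the table names are all assumptions, and the rebuild command itself is omitted because it depends on the database engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TableStats:
    """Hypothetical snapshot of one table's statistics metadata."""
    table: str
    row_count: int
    rows_modified: int        # rows changed since statistics were last gathered
    last_analyzed: datetime


def needs_rebuild(stats, now, max_age=timedelta(days=7), max_change_ratio=0.10):
    """Flag a table when its statistics are old or much of its data has changed."""
    if now - stats.last_analyzed > max_age:
        return True
    if stats.row_count == 0:
        # Avoid dividing by zero: any change to an empty table counts as stale.
        return stats.rows_modified > 0
    return stats.rows_modified / stats.row_count > max_change_ratio


def tables_to_rebuild(all_stats, now):
    """Return the tables a scheduled job should rebuild statistics for."""
    return [s.table for s in all_stats if needs_rebuild(s, now)]


# Invented sample data, for illustration only.
NOW = datetime(2024, 1, 15)
SAMPLE = [
    TableStats("incident", 1_000_000, 250_000, datetime(2024, 1, 14)),  # 25% changed
    TableStats("people", 50_000, 100, datetime(2024, 1, 1)),            # 14 days old
    TableStats("sites", 2_000, 10, datetime(2024, 1, 13)),              # fresh
]

stale = tables_to_rebuild(SAMPLE, NOW)
```

Run on a schedule, a check like this turns the lesson from a seven-hour investigation into a routine task, rather than relying on someone remembering a knowledge article.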
In our next blog, we’ll discuss our fifth tenet: Infrastructure Resiliency.
In case you missed them, here are links to the first three tenets for high-performance IT operations.