In this Run and Reinvent podcast I chat with Paul Mercina, Director of Product Management at Park Place Technologies, about how his company is leveraging AIOps to better serve its customers. Park Place is one of the largest third-party hardware maintenance firms in the world. Below is a condensed transcript of our conversation.
Bill Talbot: Tell us a little bit more about your role and Park Place in general.
Paul Mercina: My primary role here at Park Place is to direct all the activities around product management and that mostly focuses on our product roadmap. So, all the things that we support today versus what we’re going to support going forward.
We’re supporting over 16,000 customers in 140 countries. At last count, we were servicing and supporting about 55,000 data centers around the globe. We support and service a wide breadth of equipment, most of it high-end storage, server, and networking equipment. And going back to my role as kind of the manager of the product roadmap, we’re always looking for the next platform that we’re going to be able to maintain and service for our customers.
Bill: What types of environments are you guys monitoring and managing on behalf of your customer base?
Paul: If you look at the stratification of our customer base, we’re in virtually every vertical, and we’re particularly big in manufacturing, healthcare, and legal. A lot of small to medium businesses, but also Fortune 500. We’re in on-prem data centers, and we service cloud service providers, co-los, and hybrid environments. Anywhere you find high-end data center hardware, pretty much regardless of the OEM and the product line, we’re probably supporting it.
Bill: I know you guys, a few years ago were looking for a better sort of IT monitoring and management solution, so what prompted that search and that evaluation and how did you guys end up choosing our BMC TrueSight platform?
Paul: We’ve been in hyper-growth mode ever since I joined about four years ago. One of the first things I noticed coming in, and we talked about the operations team, was the way we were maintaining and supporting equipment. It was pretty much a classic break/fix field service model, where customers would find that they had a failed drive or something in their data center or IT environment, then call or email our support center to open a ticket.
We had, on average, a little over eight touches with that customer for every ticket they opened. And a lot of this was just gathering information like log files, diagnostic information, things that would help us identify the product, really understand its configuration, and data that would help us troubleshoot the root cause.
So, there’s a lot of back and forth. And that’s one of the pain points we initially saw: we’re taking a lot of time from our customers, who have to spend time engaging with us in the process, and at the same time, on the operations side, we’re putting multiple resources on a single ticket trying to get the information we need to solve it.
Having said all that, we were a best-in-class provider and this was the typical model. You reach a point, kind of a plateau, in how much you can improve your performance if you’re driving the same process.
So, in parallel with all that activity, we were doing some level of monitoring. Most of it was on the storage array products that we maintained, and a lot of it relied on the OEMs’ built-in monitoring tools, which were pretty good in their own right. But when you service and maintain as many different OEM product lines and platforms as we do, you find that all of these alert messages are coming in in the form of email.
I mean, everybody has their own format, and there’s a high degree of complexity in these messages, so we found we were missing certain calls that we should have been acting on and acting on ones that probably didn’t need action. And by and large, we really weren’t connected to or monitoring the majority of our install base. I mentioned we were in hyper-growth mode, and still are, and at that rate of growth operations just couldn’t keep operating in that manner. All that back and forth with customers, the highly differentiated messages we got, and the complexity of those messages: it just was not going to scale.
And so that was a big pain point for us. The other one I would mention is that we were always looking for a way to differentiate ourselves. As good as we are and as much as we’re growing, we’re really looking to make an impact in the market, really disrupt the market with something new and different from a value prop standpoint.
So, I think from both an operational efficiency standpoint, relevant to both our customers’ operations and our own internal operations, and from a real market differentiation standpoint, we set out to find technology that could help us support multi-vendor, highly complex IT environments and help us scale our business.
Bill: It sounds like where you’re getting to is trying to move into more of a proactive and predictive type mode and strategy. So, how do you see our TrueSight solution, our AIOps solution, helping you achieve that?
Paul: I think there’s tremendous value in, number one, getting connected to the end-device that we’re supporting. So, that was kind of our first premise. We had to find a tool that could support the highly diverse contract base that we maintain and support, so we had to have something that could connect to that.
And the other key aspect of this was finding the right technology that operated at the hardware level, which is where we live. Our business is hardware maintenance.
So, we had to find technology that could really help us give insights into the configuration of the end-point, as well as any hardware related errors. We looked at a lot of different tools, a lot of good ones, but we found TrueSight from that perspective, from the hardware perspective, really served what we needed at the time and that was hardware monitoring at the component level. We also set out to find something that could normalize those alerts and events that came through the monitoring process.
We really were looking for a uniform message format that gives us details like product serial number, model number. We’re even looking for what device failed and what’s the part number of that device to really help us do a much better job of getting the right part on the first call. That’s really important to our business.
And also, something that really gives us insight into root cause and probable cause and remedial action, that would help us recover from that event and help our customers drive uptime and reduce the time to repair and just overall create a better experience for the customer. And at the same time, really driving operational efficiencies, as you said, inside of our shop.
So, that whole process of troubleshooting and identifying, we kind of refer to that as triage here at Park Place, and we were looking for something that would help us automate the triage process. And this is really one of the key things that TrueSight brought to the table for us.
It normalized those alerts and it basically automated the triage, which means, we don’t have to chase down people at the customer location to gather log files and request information. Because with every alert, and they all come in a very common format, no matter the manufacturer or product line, we get the configuration of that device. We get the part number of what’s failed. We even get a probable cause and remedial action that’s recommended. That really helps speed up the time to repair and ensures we get the right part out there on the first call.
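As a rough illustration of what "normalizing" alerts into one common format can look like, here is a minimal Python sketch. It is not TrueSight's implementation or Park Place's code: the vendor names, the raw field names, and the `NormalizedAlert` and `normalize` names are all hypothetical, chosen only to show differently shaped vendor messages being mapped onto the fields Paul lists (serial number, model number, failed part number, probable cause, remedial action).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedAlert:
    """One common alert shape, regardless of manufacturer (hypothetical)."""
    vendor: str
    serial_number: str
    model_number: str
    failed_part_number: str
    probable_cause: str
    remedial_action: str


# Hypothetical per-vendor mappings: common field name -> vendor's own field name.
FIELD_MAPS = {
    "vendor_a": {
        "serial_number": "SerialNo",
        "model_number": "Model",
        "failed_part_number": "FRU",
        "probable_cause": "Cause",
        "remedial_action": "Action",
    },
    "vendor_b": {
        "serial_number": "sn",
        "model_number": "product_model",
        "failed_part_number": "part",
        "probable_cause": "diagnosis",
        "remedial_action": "fix",
    },
}


def normalize(vendor: str, raw: dict) -> NormalizedAlert:
    """Translate a vendor-specific alert dict into the common format."""
    mapping = FIELD_MAPS[vendor]
    fields = {common: raw[native] for common, native in mapping.items()}
    return NormalizedAlert(vendor=vendor, **fields)


# Two made-up alerts with different layouts end up with identical fields.
alert_a = normalize("vendor_a", {
    "SerialNo": "A123", "Model": "Array-9", "FRU": "PN-001",
    "Cause": "Drive failure", "Action": "Replace drive",
})
alert_b = normalize("vendor_b", {
    "sn": "B456", "product_model": "Server-X", "part": "PN-777",
    "diagnosis": "Fan failure", "fix": "Replace fan",
})
```

Once every alert has the same fields, downstream steps such as triage and parts lookup no longer need to know which manufacturer sent the message.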
The other thing we needed was something we could integrate with our ticketing system because we wanted to automate tickets. So, if we can filter out the alerts to the vital few actionable alerts, the critical events that need our attention, we can auto-generate a ticket, in our system. We can populate that on our website, give customers visibility, and some level of control over what we’re doing for them, since now they’re not having to see the fault and call us about the fault. In many cases, we’ll know about it before they do and they’ll find that we already have a ticket open and are already acting on their problem or maybe the ticket’s closed because we’ve resolved it.
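The filter-then-ticket flow Paul describes could be sketched along these lines in Python. This is an assumption-laden illustration, not Park Place's ticketing integration: the severity labels, the `Alert` and `Ticket` types, and the `open_tickets` function are hypothetical, and the rule of skipping repeat alerts for the same device and part is an added assumption rather than something stated in the conversation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Alert:
    serial_number: str
    failed_part_number: str
    severity: str  # hypothetical labels: "info", "warning", "critical"


@dataclass
class Ticket:
    ticket_id: int
    serial_number: str
    part_to_ship: str
    status: str = "open"


# Only the "vital few" severities trigger a ticket (assumed threshold).
ACTIONABLE = {"critical"}


def open_tickets(alerts):
    """Filter alerts to actionable ones and create one ticket per fault."""
    ids = count(1)
    seen = set()
    tickets = []
    for alert in alerts:
        if alert.severity not in ACTIONABLE:
            continue  # noise: no ticket needed
        key = (alert.serial_number, alert.failed_part_number)
        if key in seen:
            continue  # repeat of a fault we already ticketed (assumption)
        seen.add(key)
        tickets.append(
            Ticket(next(ids), alert.serial_number, alert.failed_part_number)
        )
    return tickets


incoming = [
    Alert("A123", "PN-001", "info"),
    Alert("A123", "PN-001", "critical"),
    Alert("A123", "PN-001", "critical"),  # duplicate of the same fault
    Alert("B456", "PN-777", "critical"),
    Alert("C789", "PN-555", "warning"),
]
tickets = open_tickets(incoming)
```

In this sketch, five incoming alerts produce two tickets, each already carrying the device serial number and the part to ship, which is the information that previously took several customer contacts to gather.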
We’re really trying to automate that end-to-end process of opening the ticket, so that the customer doesn’t have to spend time seeing faults and calling or emailing us. And then there’s that triage process that was creating a lot of that back and forth with the customer. Our goal was to get this down from, as I said, over eight calls or back-and-forth interactions with the customer per ticket to two. One to say, look, we see a fault in your environment, we’re working on the problem. And that second call might be, we fixed your problem.
That’s really what we were trying to accomplish, and this is all starting to come to fruition now that we have this out in the market and in production.
Bill: Any other benefits you guys are starting to realize since you’ve implemented the TrueSight AIOps solution?
Paul: Our first-time fix rate, which I think is only going to get better as we add more volume, is consistently over 90 percent now, whereas before it was usually hovering around 85 to 86 percent. So we’ve made a good, measurable improvement in fixing it right on the first call. That drives uptime. That drives operational efficiency. It differentiates us from the competition, right? We’ve mentioned the touches going from over eight interactions down to two.
The other one I don’t want to overlook is we’re first to market with this in our industry. So, we’re leading the way here. We really haven’t found anyone that we compete with that has an offering like this, one that filters those events down to the vital few, provides that automated triage, and provides that whole proactive model for the customer. We’re also seeing our time to resolve improve. We’ve measured as high as 31 percent improvement in our time to get things back up and running for the customer, which is really significant for the customer.
And because we’re able to more accurately identify a failed component, down to the part number of that component as I mentioned, we can actually make fewer on-site visits because we know exactly which part the customer needs. Many times, if it’s a plug-and-play part, our customers will ask us to send it and they’ll pop it in, versus us sending a field engineer out. So, we’ve actually gained some efficiencies on the dispatch side as well.