Ensuring High Availability for a Nonprofit Healthcare Organization

George Klarmann

Healthcare organizations today are embracing digital technologies to improve health outcomes and keep costs in check through highly efficient business operations. Electronic Medical Records (EMR) systems are a great example. These systems enable medical professionals to not only record and track a patient’s medical history, vital statistics, treatments, and progress but also share that data securely with everyone involved in the patient’s care.

But what happens if the EMR system shuts down unexpectedly due to an infrastructure issue? Physicians, nurses, medical technicians, and others no longer have access to the information they need to treat patients effectively. Lives could literally be at stake. And that’s precisely why high availability of patient-care systems—and business applications as well—is a top priority for IT professionals in the healthcare industry.

Transcendence IT, a BMC partner, recently worked with a regional not-for-profit healthcare organization to enhance its application and infrastructure event management capabilities to better meet the organization’s ultra-high availability requirements. The infrastructure serves the organization’s two dozen acute care hospitals and 100+ clinics. The project focused on modernizing enterprise monitoring and enabling data-driven event management.

The Backstory

The healthcare organization’s IT staff engaged us to address multiple concerns about the trustworthiness of the then-current event monitoring environment, which comprised a patchwork of monitors, event managers, and agents. The operations team had been struggling to keep up with alerts from many disparate sources. AS/400s were monitored by BMC Patrol, while AIX and Linux were monitored by Microsoft System Center Operations Manager (SCOM). The operations staff worried that the EMR system wasn’t receiving an adequate level of scrutiny. In addition, the staff had identified monitoring gaps for Apache, IIS, Tomcat, DB2, Microsoft SQL Server, Microsoft Exchange, and other important software technologies.

After working with the organization’s IT staff to specify requirements, Transcendence IT recommended replacing the event management tool with BMC’s TrueSight Operations Management. This solution offers sophisticated event management capabilities, with the ability to capture alerts from a variety of monitoring tools—including Splunk, SCOM, Lawson, and HP Network Node Manager—and present them in a single pane of glass for better visibility and management.

The biggest challenge the Transcendence IT team faced was the extremely aggressive timeline. Team members needed to deliver and go live with this expansive and comprehensive enterprise monitoring solution before the current event management software came up for renewal. The Transcendence IT team had less than five months to get the job done.

TrueSight in Action

TrueSight is delivering numerous benefits. The solution funnels alerts from thousands of monitoring tools and agents into one place, providing broad visibility into what’s happening in the infrastructure. Proactive monitoring has accelerated response to issues and shortened mean time to resolution (MTTR). In addition, broader coverage, including more comprehensive monitoring of the AS/400s running the EMR system, supports the high availability required for exceptional patient care.

Integration of TrueSight with the organization’s ITSM solution enables automatic generation of incident tickets. The tickets are routed immediately to the right team for prompt handling, allowing the IT staff to step in before problems result in service disruptions.

The integration was tricky. A key requirement was that all alerts, messages, and tickets generated in one data center must be available to the other data center, even if the connection between the two is disrupted. The fault-tolerant approach designed and implemented by our consultants proved highly effective when a domain name system (DNS) issue interrupted communications between the two data centers. While the link was down, all alerts, messages, and tickets were saved and queued. Once communications were reestablished, they were automatically forwarded to the ITSM system, so the two data centers remained in sync, exactly as planned.
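The save-and-queue behavior described above is commonly called store and forward. The following Python sketch illustrates the general idea only; it is not the design our consultants implemented, and the class name `StoreAndForwardQueue`, the `send` callable, and the use of `ConnectionError` to signal a broken link are all hypothetical choices made for this example.

```python
from collections import deque
from typing import Callable, Deque, Dict


class StoreAndForwardQueue:
    """Hypothetical sketch: buffer events while the peer site is unreachable."""

    def __init__(self, send: Callable[[Dict], None]) -> None:
        # `send` is an assumed callable that delivers one event to the other
        # data center and raises ConnectionError when the link is down.
        self._send = send
        self._pending: Deque[Dict] = deque()

    def publish(self, event: Dict) -> None:
        """Queue the event, then try to deliver everything queued so far."""
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Deliver queued events in order, stopping at the first failure.

        Returns the number of events delivered during this call.
        """
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # link still down; keep the event for the next attempt
            self._pending.popleft()  # remove only after a successful send
            delivered += 1
        return delivered

    @property
    def backlog(self) -> int:
        """Number of events still waiting to be forwarded."""
        return len(self._pending)
```

Because an event is removed from the queue only after `send` succeeds, nothing is lost during an outage, and calling `flush()` after the link returns delivers the backlog in its original order. A production version would also need to persist the queue to disk so that it survives a restart.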

Saving Time and Money

TrueSight Operations Management is saving the organization both time and money by streamlining event management and enabling proactive response to issues. Highlights include:

  • 30 percent faster onboarding of new system monitors
  • A $400,000 reduction in monitoring costs
  • A 27 percent improvement in mean time to resolution
  • Avoidance of $1,000,000 in spending for the renewal of the incumbent monitoring license

In addition, TrueSight IT analytics capabilities use machine learning to understand normal behavior and issue alerts only on abnormalities, significantly reducing event noise.
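To make the idea of alerting only on abnormalities concrete, here is a deliberately simple statistical stand-in written in Python. It is not TrueSight's algorithm, which this post does not describe. The function name `is_abnormal`, the three-standard-deviation threshold, and the sample readings are assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def is_abnormal(history: Sequence[float], value: float,
                threshold: float = 3.0) -> bool:
    """Flag `value` when it lies more than `threshold` sample standard
    deviations from the mean of `history` (a basic z-score check).

    This is an illustrative baseline, not a vendor algorithm.
    """
    if len(history) < 2:
        return False  # too little data to define "normal"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical CPU-utilization readings (percent) that form the baseline.
cpu_history = [50, 52, 49, 51, 50, 48, 51]
routine_reading_flagged = is_abnormal(cpu_history, 51)
spike_flagged = is_abnormal(cpu_history, 95)
```

With a baseline hovering around 50 percent, a reading of 51 stays within the normal band and raises no alert, while a jump to 95 is flagged. A fixed threshold such as "alert above 80 percent" would miss a server whose normal load is 85 and would fire constantly on it, which is the kind of event noise a learned baseline avoids.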

The solution has also improved alert handling during routine maintenance. Previously, when servers were brought down for maintenance, the monitoring system would automatically issue superfluous alerts and notifications. Because routine maintenance usually occurs outside of normal business hours, operations staff weren’t on hand to field the alerts. Consequently, the alerts would trigger a phone call to an operations person at home, often in the middle of the night. State law requires that a person responding to an alert receive a minimum of two hours’ pay at overtime rates, even if it takes only minutes to determine that no action is required. TrueSight filters out the maintenance-related alerts, yielding a significant reduction in overtime costs.
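The filtering logic can be pictured as a check against a schedule of planned maintenance windows. The Python sketch below shows that idea under stated assumptions: `MaintenanceWindow`, `Alert`, and `actionable_alerts` are hypothetical names, and the code does not reflect how TrueSight is configured or implemented.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class MaintenanceWindow:
    """A planned outage for one host (hypothetical structure)."""
    host: str
    start: datetime
    end: datetime

    def covers(self, host: str, when: datetime) -> bool:
        # The window applies only to its own host, from start up to
        # (but not including) end.
        return host == self.host and self.start <= when < self.end


@dataclass(frozen=True)
class Alert:
    host: str
    raised_at: datetime
    message: str


def actionable_alerts(alerts: Iterable[Alert],
                      windows: Iterable[MaintenanceWindow]) -> List[Alert]:
    """Return only the alerts raised outside every maintenance window."""
    window_list = list(windows)  # allow any iterable to be scanned repeatedly
    return [
        alert for alert in alerts
        if not any(w.covers(alert.host, alert.raised_at) for w in window_list)
    ]
```

In this sketch, an alert for a server raised during that server's own window is dropped before anyone is paged, while an alert for the same server after the window closes, or for a different server at any time, still goes through.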

Planning for the Future

The IT staff is enthusiastic about the results achieved to date and is looking ahead to the additional benefits TrueSight Operations Management can deliver. The solution’s machine learning and advanced analytics will pay off in the future by helping operations people find and fix problems faster, resulting in reduced risk of downtime, higher efficiency, and lower costs. IT is now positioned to consolidate monitoring and event management tools, reducing the number of vendors involved to simplify the environment and boost efficiency, all while containing costs.

About the author

George Klarmann

I’ve been working on BMC products since 2009, including BMC Helix Discovery, MainView Middleware, TrueSight Operations Management, BMC Helix, and BMC Helix CMDB. With in-depth Global Fortune 1000 experience across multiple industries, I am skilled in the full project life cycle in Business Service Management, Enterprise Application Integration, and Application Development. I solve complex problems with creative, simple solutions and translate business requirements into technical strategies. Through collaboration with all stakeholders, I deliver effective IT solutions.