I had no sooner posted about how Heartbleed is somehow still a problem than there was yet another new vulnerability out there. Reassuringly, the new disclosure from our good friends at Qualys reinforces my point. My post explored why Heartbleed is still a problem more than nine months after it was first disclosed, and I blamed the SecOps Gap. While security teams may have been aware of issues, IT operations either didn’t get the information, were not able to prioritise it correctly, or were simply too underwater to deal with it in a timely manner.
This “new” GHOST vulnerability is another example of the same thing. The original bug in glibc dates back to the year 2000, and has already been fixed once, in 2013. As the Qualys blog post explains:
> Unfortunately, it was not recognized as a security threat; as a result, most stable and **long-term-support distributions** were left exposed
I emphasised that part about “long-term-support distributions” because it’s key to understanding the SecOps gap and these long windows of vulnerability.
Predictability means availability
To take the example I am personally most familiar with: Debian is a Linux distribution known for its stability and predictability. Its maintainers prize those qualities above keeping up with the latest versions of components, let alone following along with the Linux world’s periodic wholesale migrations from one component to another. This, however, became a problem for people using Debian on the desktop, who wanted to keep up with the latest web browsers and so on. In 2004, a new distribution called Ubuntu was created on the basis of Debian. Ubuntu’s maintainers prize being up to date over ultimate stability, providing major releases every six months, with the option of more rapid updates in between. Over time, users expressed a need for more stability – and so the Ubuntu LTS (Long-Term Support) releases were born, guaranteed to be stable and supported for five years.
IT operations people like the predictability of Debian or of Ubuntu LTS. This allows them to have an environment that is at least somewhat uniform so that deploying an application here should work roughly the same – and take about as long – as deploying it there.
Security is the problem here. When a new vulnerability is announced along with its patch, fix, or workaround, the security goal is to deploy that remediation as rapidly as possible: the faster you patch, the sooner the window of vulnerability closes.
The cause of the gap between Security and Operations is that making wholesale changes in a hurry goes against that ethos of predictability and stability. GHOST is a good example: because the bug sits in a component as core to the operating system as glibc, the best advice is a full reboot, not just a restart of individual services.
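To make the reboot advice concrete, here is a quick sketch of how an administrator might check where a glibc-based Linux system stands. It assumes a GNU/Linux host with the usual `ldd` and `/proc` facilities; the 2.18 figure is the upstream glibc release that carried the fix.

```shell
# Check the installed glibc version; the upstream fix shipped in 2.18
ldd --version | head -n1

# After updating the glibc package, any process still mapping the old
# (now deleted) library remains exposed until it is restarted -- which
# is why a full reboot is the simpler, safer option
grep -l 'libc.*(deleted)' /proc/[0-9]*/maps 2>/dev/null | cut -d/ -f3 | sort -u
```

The second command lists process IDs still holding the pre-patch library in memory; in practice that list tends to include init, sshd, and most long-running daemons, which is exactly why patching glibc in place is not enough.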
A reboot of a production server is not undertaken lightly, because by definition it impacts every application running on that system, and all of their users. IT Operations must weigh how urgent the change is against how many people it will affect. If you take down something the sales team needs at the end of the quarter, there will be a baying mob with pitchforks and flaming torches at the door of the data center – so you evaluate each change very carefully indeed.
All sorts of mechanisms exist to mitigate this problem. You can cluster different components, allowing you to do rolling upgrades where you drop a node from the cluster, patch it, return it to the cluster, and continue to the next one, keeping the application available throughout the operation. This sort of setup is complicated and expensive, though, so only core applications generally receive it.
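The rolling-upgrade pattern just described can be sketched as a simple loop. Everything here is illustrative: the node names and the commented-out drain, patch, and undrain commands are placeholders standing in for whatever load balancer and configuration tooling a given shop actually runs, not any particular product’s CLI.

```shell
# Rolling patch across a cluster: take one node out of rotation at a
# time, patch and reboot it, then put it back before moving on.
for node in web1 web2 web3; do
    echo "draining $node from the load balancer"
    # lb-ctl drain "$node"            # placeholder: remove from rotation
    echo "patching and rebooting $node"
    # ssh "$node" 'apt-get -y install libc6 && reboot'
    # until ssh "$node" true; do sleep 5; done   # wait for it to return
    echo "returning $node to service"
    # lb-ctl undrain "$node"          # placeholder: back into rotation
done
```

The application stays available throughout because only one node is ever out of the pool at a time – which is exactly the clustering investment that, as noted above, only core applications generally receive.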
Containerisation is very much in the news lately, but it won’t help here: each container image bundles its own copy of glibc, so every image needs the fix, not just the host. The hope for Docker administrators is that their base image post-dates the glibc fix (which came in version 2.18, for reference). The same sort of thing may happen again in the future, though, so it’s worth bearing in mind that containerisation, while useful, is not a panacea.
So what do we do now?
We at BMC have been working with Qualys to figure out how to close the SecOps gap, enabling security and operations teams to work together seamlessly. This will give everyone what they need: rapid response to emerging security requirements, coupled with control and governance over the process, with the whole effort supported by industrial-grade automation capabilities to implement the final decision. You can find out more at bmc.com/SecOps, and stand by for exciting announcements around this topic in the next few weeks.