After the congratulations and getting-to-know-you introductions are settled, one of the first formal "as your CISO..." conversations I have with any new IT leader deals with vulnerability management. How do we plan for and manage the inevitable disclosure/planning/remediation cycle for exploitable weaknesses in our systems? These are challenging conversations because they tend to land squarely at the intersection of multiple ambiguities:
- When will we find out we have a vulnerability, and what project and non-project work will I already have in flight that will have to be reprioritized?
- How severe will the vulnerability be, and who is exploiting it?
- Will I have a change window available to me that falls within the remediation SLA?
From the looks of it, you might think it was my exciting proof that I had built a better AI for Breakout than DeepMind/Google. I wish! Rather, representing known vulnerability data from cvedetails.com, it maps out the frequency (cells shaded) with which various common technologies have had serious security defects disclosed over the past 6+ quarters, with "serious" defined somewhat arbitrarily (by me) as "having a CVSS score above 6.0." (Note: see Roytman/Bellis for a healthy perspective on why CVSS is an imperfect rubric.) So, while it's hard to predict exactly when a certain product will need a vulnerability mitigated, past performance does give us a reasonably good idea of the number of "opportunities" to exercise mitigation plans in a large enough future window.
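The bucketing behind a table like this is easy to reproduce. A minimal sketch, assuming a cvedetails.com export reduced to (product, disclosure date, CVSS score) tuples — the sample rows below are illustrative stand-ins, not real CVE data:

```python
from collections import Counter
from datetime import date

# Illustrative disclosure records: (product, disclosure_date, cvss_score).
# A real analysis would load these from a cvedetails.com export.
records = [
    ("Flash",   date(2015, 1, 13), 9.3),
    ("Flash",   date(2015, 1, 27), 7.5),
    ("Reader",  date(2015, 1, 13), 6.8),
    ("Windows", date(2015, 1, 13), 5.0),   # below the cutoff; excluded
    ("IE",      date(2015, 2, 10), 9.3),
]

SERIOUS = 6.0  # the (admittedly arbitrary) CVSS cutoff used for the table

def serious_counts(rows, cutoff=SERIOUS):
    """Tally serious disclosures into (product, year-month) cells."""
    cells = Counter()
    for product, disclosed, score in rows:
        if score > cutoff:
            cells[(product, disclosed.strftime("%Y-%m"))] += 1
    return cells

cells = serious_counts(records)
print(cells[("Flash", "2015-01")])  # 2
```

Each nonzero cell is one "opportunity" to exercise a mitigation plan; shading the counts per product per month reproduces the heatmap view.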
A couple of interesting observations emerge from adopting this view. One is that the purple box covers a set of products that have staggeringly good odds of needing to be patched in any given month — enough so that it pays to have a robust, repeatable plan for scheduling and deploying fixes to these platforms on a set schedule, with predictable downtime published and expected by the user community that depends on them. In only one month out of the last 20 (October '14) did fewer than three of the four end-user computing platforms in this group (Flash, Reader, IE/Office, and the Windows OS itself) receive a significant patch. Yes, a good restart manager is becoming a critical success factor for many vulnerability management programs.
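To put rough numbers on those odds, each platform's historical hit rate can be read as a monthly patch probability. The counts below are illustrative stand-ins (not the actual cvedetails.com tallies), and treating the platforms as independent is a simplification:

```python
# 20 months of (hypothetical) history: how many months each platform
# saw at least one serious patch.
months_observed = 20
patched_months = {"Flash": 19, "Reader": 18, "IE/Office": 20, "Windows": 20}

# Per-platform monthly patch probability.
p_monthly = {k: v / months_observed for k, v in patched_months.items()}

# Chance that all four need patching in the same month,
# assuming (simplistically) that the platforms patch independently.
p_all_four = 1.0
for p in p_monthly.values():
    p_all_four *= p

print(round(p_all_four, 3))  # 0.855
```

With numbers like these, a patch month is the norm, not the exception — which is exactly why a standing, repeatable deployment plan beats ad-hoc scheduling.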
Another is that the blue box represents a set of technologies that need similarly robust plans defined to allow patching with very short turnaround times (because of the size of the population of possible attackers, i.e. "the whole Internet")... but with only Cisco and Juniper likely needing monthly attention, while the others prove less predictable, albeit equally important to resisting breaches.
A final, and less obvious observation: in my experience, one of the more interesting phenomena in IT is what feels eerily like an anti-vaxxer approach to remediating vulnerabilities. Some enterprises routinely resist taking their maintenance meds even though the data says the upside greatly exceeds the potential risks. While vendors like Microsoft, Google, Mozilla, Adobe, and the various Linux distributions have made it relatively easy to inoculate automatically, the adoption rate for these capabilities remains remarkably low in most enterprises.
One argument against "automatic vaccination" is that automated system updates are forbidden by ITIL. My opinion: there is room within ITIL for well-documented, automated change processes with a break-glass option to opt out if a patch is shown to cause deleterious effects during initial burn-in with a well-defined population. Having to opt in to patching for these platforms, given their clockwork-like frequency, creates a monthly mountain of potentially avoidable work — especially given the race to mitigate before exploitation becomes likely.
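What such a break-glass process might look like in policy form: auto-deploy is the default, and the opt-out triggers only when burn-in on a canary population goes badly. The patch ID, population size, and 2% failure threshold below are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class BurnInResult:
    patch_id: str
    canary_hosts: int   # size of the well-defined initial population
    failures: int       # hosts showing deleterious effects post-patch

def should_auto_deploy(result: BurnInResult, max_failure_rate: float = 0.02) -> bool:
    """Break-glass check: deploy by default, opt out only when burn-in
    shows problems above the tolerated rate."""
    if result.canary_hosts == 0:
        return False  # no burn-in evidence yet; hold the wider rollout
    return result.failures / result.canary_hosts <= max_failure_rate

print(should_auto_deploy(BurnInResult("PATCH-2015-03", canary_hosts=200, failures=1)))   # True
print(should_auto_deploy(BurnInResult("PATCH-2015-03", canary_hosts=200, failures=10)))  # False
```

The point is that the documented default is "apply," and human intervention is reserved for the exceptional case — the inverse of the opt-in posture most change boards enforce today.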
Another concern is that there are occasionally bad patches. The ones marked in red in the table above are notable examples from Microsoft that were pulled and re-released, yet the total proportion of re-issued fixes within the purple box remains under 3.2% - darned good odds, given the significant manual effort that IT teams still put into patch deployment today and the substantial regression testing the vendors listed above perform before a patch ships. In fact, one can argue that our vendors perform considerably more QA testing on their patches before releasing to the public than most IT shops do for a variety of other ad-hoc changes (managing storage allocations, updating group policy objects, reconfiguring network cards, etc.) that are routinely treated as lower-risk than applying security-related maintenance.
Where does this leave IT leaders looking to optimize their staff's time and resources while remaining breach-resistant?

- Looking at historical data *does* yield suggestions for dealing with unpredictable patch cycles;
- Multiple strategies emerge for gracefully handling certain high-volume/high-risk and low-volume/high-risk combinations;
- A more optimized strategy *may* be adopted, if the option to adjust traditional change management constraints and processes is available;
- [Betteridge's Law](https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines) should apply to this post, one remediation cycle at a time.