Unlike earthquakes and lightning strikes, for which unimpeachable data exists, information security trends suffer from underreporting, environmental factors and subjectivity that call into question any sort of estimate being done on a case-by-case basis.
– me, in /ciso/taking-the-bait/
There are times in your career when you feel vindicated, when someone you trust and respect echoes and validates ideas that have formed the foundation of your thinking. I was humbled and proud to have had the opportunity to read a relatively early draft of James Kaplan's Beyond Cybersecurity manuscript and find a kindred spirit who recognized, as did I, the importance of threat modeling over cyber-to-do checklists, of measurement over FUD[1].
With similar magnitude, but on the opposite end of the scale, I was humbled this year to read Hubbard and Seiersen's How to Measure Anything in Cybersecurity Risk and find that it tore apart my notions of what could and couldn't be quantified in a field that I've been practicing for 20 years.
I once read Jaquith's seminal Security Metrics expecting to crib some shortcuts on what to include in my then-embryonic scorecard – but instead learned much more profound lessons about data-to-ink ratios[2] and how to present any measurement effectively. Similarly, Hubbard's book lived up to its title as a valuable treatise on how to measure risk, rather than what to measure. The five most memorable things I learned:
- You need less data than you think, and you have more data than you realize
- Heat maps: you're (probably) using them incorrectly
- The four scales of risk measurement
- Decomposition strategies
- Getting to 90% confidence
You need less data than you think, and you have more data than you realize
One of the first myths that Hubbard begins to debunk is the idea that measuring risk is fraught with peril, and that decisions based on imperfect data are tainted by that imperfection. His most poignant argument on this point is the "rule of five," which declares that, for a data set of any arbitrary size, there is a 93.75% chance that its median lies between the highest and lowest of five randomly chosen data points.
Hubbard's brilliance is his ability to simplify the math, making a seemingly provocative assertion eminently accessible. And, as is often the case, the easiest demonstration is to compute how unlikely the opposite outcome is. In this case, given that any one randomly chosen point has a 50% chance of being either above or below the median, there is a 1 in 2^5 chance (3.125%) that all five points are above the median, and a similar 3.125% chance that all five points are below the median. Subtracting both tails leaves 1 - (2 × 3.125%) = 93.75%.
The likelihood of picking five random values that are all above (or below) the median is equivalent to the odds of getting heads on five consecutive coin tosses.
Therefore, Hubbard asserts that five random measurements are enough for a risk professional to circumscribe, with over 90% confidence, the median of (practically) any data set they need to describe quantitatively for risk management purposes! We need less data than we think, and we have more data than we realize.
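If you want to convince yourself of the arithmetic, a few lines of simulation will do it. The sketch below is my own illustration, not code from the book: it draws five random samples from an arbitrary, skewed population many times over and counts how often the true median falls between the smallest and largest of the five.

```python
import random
import statistics

# Illustrative only: any population will do; here, a heavily skewed one.
random.seed(42)
population = [random.lognormvariate(10, 1.5) for _ in range(100_000)]
true_median = statistics.median(population)

trials = 20_000
hits = 0
for _ in range(trials):
    sample = random.sample(population, 5)          # five random measurements
    if min(sample) <= true_median <= max(sample):  # did the sample range bracket the median?
        hits += 1

print(f"Median bracketed in {hits / trials:.1%} of trials (theory says 93.75%)")
```

The observed rate should land very close to the theoretical 93.75%, regardless of how lopsided the population is.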
Heat maps: you're (probably) using them incorrectly
At some point in our professional careers, we've all employed the trusty heat map to quickly put data into buckets. Low, medium, high, critical. Moderate, severe, catastrophic. Red, amber, green. Sometimes, we've even invented ways to perform arithmetic on these adjectives! Hubbard makes an important assertion about heat maps that, without hyperbole, shatters the notion that they can act as an effective guide to risk management.
Sooner or later, each of us will encounter (or possibly even produce) a heat map that looks something like this:
If we are having a good day, our heat map will at least attempt to quantify the ranges that accompany our qualitative descriptions of the X- and Y-axis tranches... but we don't always have good days. Hubbard's point is that, if we are to use any visualization to make risk management decisions, it has to assist us in prioritizing the treatment of one risk over another. The heat map guides the reader towards addressing the "highest" (by color? or likelihood? or impact?) risks first, but is that the optimal outcome?
The heat map method confuses rather than clarifies the risk management question. In the example above, the expected loss for each of the three data points – one "green," one "amber," and one "red" – is $2M. The risk manager either chooses to use information of lesser fidelity (based on an ordinal[3] color scale) or can do away with the heat map altogether and assess the three risks based on their quantitative probabilities (recall that Hubbard has already asserted that we have more data than we realize and need less data than we think). I can't promise that I'll never use a heat map again, but I suddenly struggle to think of situations where I'd choose this visualization over other available options.
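To make the arithmetic behind that example concrete, here is a minimal sketch with made-up probabilities and impacts (the figure isn't reproduced here, so these numbers are assumptions): three risks that an impact-weighted map might color differently, each carrying an identical $2M expected loss.

```python
# Hypothetical risks: (heat-map color, annual probability, impact in dollars).
# The figures are illustrative assumptions chosen so every expected loss is $2M.
risks = [
    ("green", 0.40,  5_000_000),
    ("amber", 0.20, 10_000_000),
    ("red",   0.10, 20_000_000),
]

for color, probability, impact in risks:
    expected_loss = probability * impact
    print(f"{color:>5}: p={probability:.2f}, impact=${impact:>12,} -> expected loss ${expected_loss:,.0f}")
```

The ordinal colors suggest three very different priorities; the ratio-scale arithmetic says the three risks are, in expectation, equally expensive.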
The four scales of risk measurement
A taxonomy that Hubbard introduces is Stevens' levels of measurement:
- The nominal scale sorts items into named buckets without declaring an order or a preference: soft/hard, solid/liquid/gas/plasma, etc.
- The ordinal scale also sorts items into named buckets, but the buckets have a suggested rank (red/amber/green, etc.) that implies sequence or preference.
- The interval scale allows us to perform some mathematical operations to demonstrate the degree of difference between two measurements in a consistent way: 10°C is the same amount warmer than 0°C as 30°C is to 20°C, etc., but 10°C is not twice as warm as 5°C.
- The ratio scale allows us to perform true mathematical comparisons between measurements. $10 is twice as much money as $5.
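A quick illustration of why the distinction matters (my own example, not the book's): only the interval and ratio scales support meaningful arithmetic, and only the ratio scale supports meaningful ratios.

```python
# Nominal: named buckets, no order. Equality is the only meaningful comparison.
state_a, state_b = "solid", "gas"
print(state_a == state_b)          # False; "solid < gas" would be meaningless

# Ordinal: ranked buckets. Order is meaningful; arithmetic on the ranks is not.
severity = {"green": 1, "amber": 2, "red": 3}
print(severity["red"] > severity["green"])   # True
# severity["red"] - severity["amber"] == 1 is not a real "unit" of risk.

# Interval: differences are meaningful, ratios are not (no true zero).
print((30 - 20) == (10 - 0))       # True: equal temperature differences in °C...
# ...but 10°C is not "twice as warm" as 5°C.

# Ratio: a true zero, so both differences and ratios are meaningful.
print(10_000 / 5_000)              # 2.0: $10k really is twice $5k
```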
Decomposition strategies
While Hubbard's nod to Stevens is not itself novel, it does serve as the backdrop to an important theme: decomposition. Hubbard argues that complex risks can be decomposed into simpler ones – ones that can be expressed in interval or ratio scales and with a high level of confidence (more on this later) – which can then be used to seed Monte Carlo simulations to evaluate likely loss scenarios and the impact of mitigation strategies (see the sketch after the list below). His test for a reasonable, value-added decomposition is that it is:
- clear,
- observable, and
- useful
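To make the decomposition-plus-simulation idea concrete, here is a minimal Monte Carlo sketch in the spirit of the book. The event probability and the 90% interval for loss-given-event below are assumptions invented for illustration, not figures from Hubbard.

```python
import math
import random
import statistics

random.seed(7)

# Decomposed inputs (illustrative assumptions):
#   - the probability that the loss event occurs in a given year
#   - an expert's 90% confidence interval for the loss if it does occur,
#     modeled here with a lognormal distribution
p_event = 0.15
lower, upper = 100_000, 5_000_000

# Convert the 90% CI into lognormal parameters (1.645 ~ z-score for a 90% interval).
mu = (math.log(lower) + math.log(upper)) / 2
sigma = (math.log(upper) - math.log(lower)) / (2 * 1.645)

years = 100_000
annual_losses = [
    random.lognormvariate(mu, sigma) if random.random() < p_event else 0.0
    for _ in range(years)
]

print(f"Mean annual loss:  ${statistics.mean(annual_losses):,.0f}")
print(f"95th percentile:   ${sorted(annual_losses)[int(0.95 * years)]:,.0f}")
print(f"P(loss > $1M):     {sum(loss > 1_000_000 for loss in annual_losses) / years:.1%}")
```

Each decomposed input is clear, observable (you can check it against incident history later), and useful: change a mitigation's assumed effect on p_event or on the loss interval, re-run the simulation, and compare the distributions.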
I particularly enjoyed his takedown of the "threat actor sophistication" decomposition that is so often used to justify vulnerability risk rankings. It fails each of the three tests: not only is attacker skill a subjective measurement (neither clear nor observable), but as individual exploits are democratized over time, i.e. through inclusion in kits or automated attack tools[4], the usefulness of the decomposition shifts. Do any of us want to be the kind of risk management partner who declares that a team has X days to patch a vulnerability, based on an assertion of exploit difficulty or availability of proof-of-concept code to sophisticated attackers, only to return later and declare a far more aggressive deadline because that subjective opinion was subsequently contradicted?
Getting to 90% confidence
The exact shape of the distribution of events is important, but not nearly so much as focusing on the outcomes that we can most effectively manage.
The last of the big, memorable ideas I gleaned was the notion that, with practice, we can all become "calibrated experts" at defining a confidence interval within which most risky events will land[5]. This is powerful precisely because it allows us as risk managers to focus on the most common outcomes and not spend our careers chasing fringe cases at the expense both of likely results and of credibility with our stakeholders. This is not to say that black swan events can or should be ignored – they require their own strategies such as having appropriate insurance, which itself requires a healthy dose of accurate measurement – but rather that risk management professionals tend to focus on these disproportionately: way more than 5% of our time is spent on scenarios that happen less than 5% of the time[6].
The tools that Hubbard provides for improving one's estimation abilities continue to stick with me:
- the "equivalent bet" (would I rather bet a large sum[7] that a roll of a 10-sided die does not land on a specific number... or bet that my 90% confidence interval is correct?)
- the absurd upper/lower bounds test (did I set my intervals to effectively exclude results that seem implausible to me as an expert?)
- the deconstruction of the 90% range into its 5%-above and 5%-below exceptions
Most importantly, Hubbard asserts that experts can be trained to become good at this with repetition and feedback, and provides simple-to-administer tests[8] for assessing whether an expert is over-confident (I was, the first time I calibrated myself), too conservative, or well calibrated.
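The bookkeeping for that feedback loop is simple enough to script. This is my own sketch of the scoring, with placeholder questions (substitute a real set of ten or twenty trivia questions with known answers, as Hubbard's tests do):

```python
# Each entry: (question, my 90% lower bound, my 90% upper bound, true answer).
# The three questions below are placeholders for a longer calibration quiz.
answers = [
    ("Year the first ARPANET link went live", 1965, 1972, 1969),
    ("Boiling point of water at sea level (°C)", 95, 105, 100),
    ("Main span of the Golden Gate Bridge (m)", 1000, 1500, 1280),
]

hits = sum(lower <= truth <= upper for _, lower, upper, truth in answers)
hit_rate = hits / len(answers)
print(f"{hits}/{len(answers)} = {hit_rate:.0%} of true values fell inside my 90% intervals")

# On a reasonably sized quiz, a hit rate well below 90% suggests over-confidence
# (intervals too narrow); one well above 90% suggests excessive conservatism.
```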
Epilogue
When I think about the five lessons above, I cannot help but reflect on how wrong I was in my Taking the Bait post a few years back, citing "underreporting, environmental factors and subjectivity." It is humbling to imagine how I've been viewed, as a risk manager who was (once) uncomfortable giving guidance because I didn't have the benefit of a geologically-sound[9] record of past events – neglecting to use the otherwise rich data already available to me, allowing the perfect to be the enemy of the good, granting edge and corner cases a disproportionate amount of my attention. The revolution I'm experiencing is the liberty that Hubbard and Seiersen give information risk managers to make decisions despite imperfect data and the uncertainties that come with it.
This post was not intended to be a substitute for reading Hubbard. There is 101-, 201-, and 301-level material throughout the book, and I plan on returning to it as I myself become better equipped to apply his methods. There are also parts of the book with which I disagree, especially the periodic appeals to authority in the form of Verizon's DBIR[10], whose own risk assessment methods have been called into question. Overall, however, Hubbard's has been one of the two or three most influential books I've read on the subject of information risk management. If you're a CISO or are thinking of becoming one, you could make many worse investments of your time than picking up a copy and spending an afternoon reading.
Author's note: this post was updated in January, 2017, to add better visualization and black swan / cyber insurance narrative to the 90% confidence section.
Fear, uncertainty, and doubt. ↩︎
see Dark Horse Analytics's work for my favorite animated GIFs explaining this very important concept. ↩︎
see the four scales section for more on ordinal scales and their more- or less-descriptive brethren. ↩︎
e.g. Metasploit ↩︎
I first came upon Peter Bernstein's concise definition of risk as the idea that not all things that can happen will in Dan Geer's excellent RSA 2014 keynote. ↩︎
For all that our industry gets excited about the latest zero-day threat or proposed mitigation, according to Verizon's 2016 DBIR, over 93% of vulnerabilities exploited during breaches in the previous year were over a year old. If you are one of the folks who believe that report is flawed (see one such opinion), multiple other sources corroborate an overall view of exploitation data that suggests that attention paid to good hygiene on older vulnerabilities is an exceedingly good investment. ↩︎
Hubbard suggests $1,000.00 ↩︎
scroll down to the "Chapter 7" section of HTMA's online site for some of Hubbard's supplemental expert calibration tests, or search the web for "calibrated probability assessments." ↩︎
earthquakes and lightning strikes being my two favorite hobby horses ↩︎