This article is part of the series "CEO vs. CSO Mindsets". Check out the rest:

To convince CEOs and CFOs to invest in data security software, CSOs have to speak their language. As I started describing in the previous post, corporate decision makers spend part of their time envisioning various business scenarios, and assigning a likelihood to each situation. Yeah, the C-level gang is good at poker, and they know all the odds for the business hand they were dealt.

For CSOs to get through to the rest of the C-suite, they’ll need to understand the language of risk and chance. It is expecting too much for upper-level non-IT executive to appreciate, say, operational reports on the results of vulnerability scans or how many bots were blocked per month. It’s definitely *helpful* for IT, but the C-suite is focused on far more fundamental measures.

They would want to know the answer to the following question: can you tell me what are the chances of a catastrophic breach or cyber event, perhaps costing over $10 million, occuring in the next 10 years? Once CEOs have this number, then they can price various options to offset the risk and keep their business on track.

I suspect most IT departments, except in all but the largest companies where formal catastrophe planning is part of their DNA, would be hard pressed to come up with even rough estimates for this number

In this post, I’ll guide you into doing a back-of-the-envelope calculation to answer this question. The approach is based on the FAIR Institute’s risk analysis model. Their core idea is that you can assign numbers to cyber risk using available data sources — both internal and external. In other words, you don’t have to fly completely blind in the current cyber threat environment.

FAIR (and other approaches as well) effectively break down the cost of a data breach into two components: the severity or magnitude of a *single* incident and the frequency at which these cyber events over a given period of time. Simple, right?

If you’re thinking, that you can multiple the separate averages of each — frequency and severity — to obtain an average for yearly cyber losses, you’re right (with some qualifications). The advantage of the FAIR model is that it allows you to go as deep as you want, depending on your resources, and you can get more granular information beyond broad averages.

I should add that FAIR is *not* re-inventing the risk wheel. Instead they’ve systemized techniques used principally by banks and insurance companies who have long had to handle catastrophes, financial crises and natural disasters, and know how much to a set aside for the proverbial rainy day.

## Mastering Data Disasters: How Bad Is the Risk?

In the last post, I took two years of HIPAA breach reporting data to derive what I called an exceedance loss curve. That’s a fancy way of saying I know the percentiles (or quantiles in risk-speak) for various cyber costs . For this post, I rearranged the curve to make it a bit more intuitive, and you can stare at the graph below:

We *finally* have a little more insight into answering the question a CEO of a health insurer might ask: how bad can it get?

Answer: Pretty bad!

The top 10% of healthcare cyber incidents can be very costly, starting at $8 million per attack. Yikes.

It’s also interesting to analyze the “weight” or average cost of the last 10% of this severity curve. As we’ll soon see, this is a power-law-ish Pareto style distribution that I talked about back here.

I did a quick calculation using the HIPAA data: the top 10% (or 90th percentile) of incidents, representing under 30 data points, carries a disproportionate *65% *of the total cost of all losses! With heavy-tailed curves, we need to focus on the extremes or tail because that’s where all the oomph is.

If we were to do a more sophisticated analysis along the lines of FAIR, we would then take the above healthcare loss severity distribution and merge with both internal loss data (if available) and a risk profile based on, perhaps, a survey of company infosec experts.

Naturally, it would be very helpful to conduct a data risk assessment to discover how much sensitive data is spread out across your file systems, and their associated permissions. This would be fed into the risk formulas.

There are some math-y methods to combine all this together using various weights, and you can learn more about this in Doug Hubbard’s RSA presentation, or (shameless plug) in his book: How to Measure Anything In Cybersecurity Risk.

## Healthcare Incident Rates and the Ultimate Average

For our purposes, let’s take the HIPAA reporting data as a good representation of how bad breach costs can be for our imaginary healthcare insurance company.

Now let’s deal with the next component. For the frequency or ate at which incidents occur, you may want to rely more heavily on your own internal data. There are also external data sets. For example, the Identity Theft Resource Center tracks breach incidents by industry sectors, and their health care numbers can guide you in your guesstimating.

Let’s say, for argument’s sake, that our insurer has has logged one significant cyber incident every four years, for an average rate of “.25 incidents” per year.

Drumroll . .. I multiply the average incident rate, .25, by the average loss or severity cost of the $4.2 million (from the above curve) to come up with an average annual loss of about $1 million.

This number may be eyebrow-raising to CISOs and CEOs of our hypothetical healthcare company. We are dealing with heavy-tailed data, and while this company may not have experienced a $4 million average for an incident (yet), the average tells how bad it *can* get. And this can help guide C-levels in deciding how much to spend on security risk mitigation — software, training, etc.

To go a little deeper, I used stat software — thank you EasyFit! — for some curve wrangling. I picked a power-law distribution, known as Pareto-2, to fit the data.

Though there are more comfy fits with other heavy-tail distribution in their software, it turns out that this is a *very* good approximation for the tail, which is really what we’re interested in. And as we’ll soon see, this function will help us more precisely say how bad it can get.

## Towards Value at Risk (VaR)

I just took you on a speedy tour through the first level of the FAIR approach. The average number we came up with, known in trade as AAL (average annual loss), is good baseline for understanding cyber risks.

As I suggested above, the CEO and (especially) the CFO want more precise information. This leads to Value at Risk or VaR, a biz-school formula that financiers and bankers use in their risk estimates.

If you’ve ever taken, as I have, “statistics for poets”, you’ll immediately recognize VaR as the 90% or 95% confidence formula for normal curves. Typically, CFOs prefer measuring the 99% level — 2.3 standard deviations from the mean — or the “once in a hundred year” event.

Why?

They are planning for extreme situations! You can think of CFOs grappling with how much to set aside to deal with the equivalent of a cyber tornado or hurricane, and for this, VaR is well suited.

Using my stats software, I came up with a 90% VaR formula — once in 10 year event — for my *non-normal* heavy-tailed distribution: it calculates a VaR of $8.2 million, which is close to the actual HIPAA data at the 10% mark in the graph above. Good work, EasyFit!

The advantage of having a VaR formula, which I’ll go into more detail next time, is that it enables us to *extrapolate*: we can answer other questions beyond what the data tells us.

For example: what can I expect in breach costs over a 10 year period assuming on average of, say, three cyber events in that period? I won’t hold you in suspense … the 90% VaR formula tells us it’s about $19 million, under 3 *times* the average cost of a single cyber event.

Let’s call it a day.

I’ll go over some of this material again, and tie it all up into a nice package in my next post. And we’ll learn more scary details about evil heavy-tailed breach loss curves.