Those of us in the infosec community eagerly await the publication of Ponemon’s annual breach cost analysis in the early summer months. What would summer be without scrolling through the Ponemon analysis to learn about last year’s average incident costs, average per record costs, and detailed industry breakdowns? You can find all this in the current report. But then Ponemon did something astonishing.
The poor souls who made it through my posts on breach cost stats learned that the datasets used here are not normal. I mean that literally. They don’t follow a normal distribution, or bell curve. We also know from more in-depth studies that the data points are skewed with “heavy tails”, and are more accurately represented by power laws.
What does that have to do with Ponemon’s cost analysis?
Ponemon has avoided the issues of dealing with skewed data by lopping off the outliers — they don’t look at breach incidents above 100,000 records. Sure, you lose some information, but then the stats are more meaningful to the companies — most of them — that don’t live in the long tail of the curve.
Monster Breaches Are Costly. Very Costly.
Brace yourself. For 2018, Ponemon started looking at the dragon’s tail. They’ve included an analysis of mega data breaches involving incidents of over one million records.
First, let’s get the bad news out of the way. Since Ponemon only had 11 companies in their mega breach sample, they had to perform, gulp, a Monte Carlo analysis. That’s a fancy way of saying they were forced to make some guesses about a few of the parameters in their model, and then randomly “sampled” values for those parameters to generate results.
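To make the idea of sampling guessed parameters concrete, here is a minimal Python sketch of a Monte Carlo estimate. It is not Ponemon’s model: the cost formula, the function name, and every parameter range are invented placeholders, chosen only to show the mechanics.

```python
import random
import statistics

def simulate_breach_cost(records, n_trials=10_000, seed=42):
    """Monte Carlo estimate of total breach cost for a given record count.

    The formula and all ranges below are hypothetical, not Ponemon's.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_trials):
        # Uncertain inputs: draw each from an assumed range instead of
        # pretending we know a single "true" value.
        scale = rng.uniform(50_000, 150_000)   # assumed cost multiplier
        exponent = rng.uniform(0.4, 0.6)       # assumed sub-linear growth
        totals.append(scale * records ** exponent)
    return statistics.mean(totals), statistics.median(totals)

mean_cost, median_cost = simulate_breach_cost(50_000_000)
print(f"mean ${mean_cost:,.0f}, median ${median_cost:,.0f}")
```

The output is only as good as the assumed ranges, which is the real weakness of working from an 11-company sample.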
The more important point is that in their graph below of breach costs vs. records stolen, the data points show a sub-linear (or, more technically, log-linear) relationship: the costs grow more slowly than a straight line. Double the number of records stolen, and the total breach cost is less than double.
And that’s exactly what other researchers have seen with breach costs. I also pointed this interesting factoid out in my breach cost series — you can learn more here.
For CFOs and CIOs, there’s a drop of good news in this slow-growing curve. It means that the cost per record drops as more records are involved.
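A small Python sketch shows why a sub-linear curve implies a falling per-record cost. The power-law form and the exponent of 0.5 are assumptions picked for illustration, not values taken from the Ponemon report; any exponent below 1 behaves the same way.

```python
def total_cost(records, scale=1.0, exponent=0.5):
    """Hypothetical sub-linear cost curve: scale * records ** exponent."""
    return scale * records ** exponent

small, large = 10_000_000, 20_000_000

# Doubling the records multiplies the total by 2 ** exponent, which is < 2.
ratio = total_cost(large) / total_cost(small)

# Per-record cost is the total divided by the record count.
per_record_small = total_cost(small) / small
per_record_large = total_cost(large) / large

print(f"doubling records multiplies total cost by {ratio:.2f}")
print(f"per-record cost falls: {per_record_large < per_record_small}")
```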
For a data theft of 20 million records, the graph above indirectly tells us the average cost is about $18 per record, and at 50 million records, the per record cost decreases to about $7.
I suppose that may sound benign when quickly said in a presentation, but on the other hand … the total cost for a 50 million record theft is over $350 million.
And that’s something no board of directors wants to hear!
NetDiligence: Real-World Verification
While Ponemon’s theoretical analysis of mega breach costs is interesting, there is a dataset that sheds more light on real-world costs of these huge breaches. This comes to us courtesy of NetDiligence, a data risk analysis firm that has obtained access to actual claims data processed by cyber-insurance companies.
I looked at NetDiligence’s latest report, covering the period 2014 – 2017. It provides further validation that data breaches at the high end are, indeed, expensive.
According to NetDiligence, the average breach cost for the 591 claims in their dataset was about $394,000. They also calculated a median cost at a mere $56,000. Hmmm, with 50% of the data, or 295 claims, above $56K, there have to be monster incidents to explain the fact that the average is about seven times higher than the median. This is the sign of a non-normal data set: the heavy-tailed curve that we typically see with breach stats.
I can do a quick back-of-the-envelope analysis to give you a better sense of mega costs lurking in the NetDiligence stats. Feel free to skip this next part if doing a multiplication with an average makes you slightly nauseous.
The total breach cost in the claims dataset is about $233 million (591 x 394,000). Only a small share of that total can sit at or below the median: at most 56,000 x 296, or about $17 million. That leaves at least $216 million in costs above the median, spread out over 295 claims. That means that the upper 50% has an average cost of at least $730,000.
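For anyone who wants to check the arithmetic, here is the same back-of-the-envelope bound as a short Python sketch. The function name is my own, and the split is taken directly from the claim count (for 591 claims, 295 above the median and 296 at or below it).

```python
def upper_half_floor(n_claims, mean_cost, median_cost):
    """Lower bound on the average cost of claims above the median."""
    total = n_claims * mean_cost
    n_above = n_claims // 2               # claims strictly above the median
    n_at_or_below = n_claims - n_above    # the rest, including the median
    # Worst case: every claim at or below the median costs the full median.
    max_lower_total = n_at_or_below * median_cost
    min_upper_total = total - max_lower_total
    return total, min_upper_total / n_above

total, floor = upper_half_floor(591, 394_000, 56_000)
print(f"total: ${total:,.0f}")
print(f"upper-half average is at least: ${floor:,.0f}")
```

The bound is conservative: most claims below the median cost far less than $56,000, so the true upper-half average is higher.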
If you make some other assumptions – similar to what I did here — you quickly get to breach incidents in the millions of dollars for the top percentiles.
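As one illustration of how such assumptions play out, the sketch below pretends the claim costs follow a lognormal distribution pinned to the reported median and average. The lognormal choice is mine, made for convenience; it is not NetDiligence’s model, and a power-law tail would give different numbers.

```python
import math
from statistics import NormalDist

def lognormal_percentile(median, mean, pct):
    """Cost at percentile `pct` for a lognormal with this median and mean."""
    mu = math.log(median)
    # For a lognormal, mean / median = exp(sigma ** 2 / 2).
    sigma = math.sqrt(2 * math.log(mean / median))
    z = NormalDist().inv_cdf(pct)  # standard normal quantile
    return math.exp(mu + sigma * z)

for pct in (0.90, 0.95, 0.99):
    cost = lognormal_percentile(56_000, 394_000, pct)
    print(f"{pct:.0%} of claims fall below about ${cost:,.0f}")
```

Under this assumption the 95th percentile already clears $1 million and the 99th clears $5 million.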
Anyway, NetDiligence doesn’t give away too many details about individual breach incidents in their analysis. But further down in the report, they reference some of the extreme costs in their claims dataset. This shows up in a table that breaks down costs by business size.
If you look at the “Max” column, you can see that there are several incidents above $10 million. That ain’t chicken feed.
One Last Thing
It’s worth mentioning that Ponemon also includes indirect costs for incidents, which are based heavily on customer churn. These costs don’t show up in the cyber insurance claims because the claims are based on hard costs: legal fees, fines, credit monitoring, remediation efforts, etc.
In other words, Ponemon’s incident cost analysis will always trend significantly higher than the numbers from actual insurance claims. Ponemon’s cost numbers, though, are probably closer to a real-world cost, particularly for larger companies, and especially for public companies, where breach incidents can affect overall valuations. Yahoo is one example.
The key takeaway is that the headline-making breach incidents (Equifax, Yahoo, etc.) tell us about the very far end of the cost tail. The NetDiligence report in particular shows that there are still expensive data breaches, in the tens of millions of dollars, living in the middle of the tail. And these are likely less publicized, and more typically experienced.
I’ll have more to talk about for both the Ponemon and NetDiligence reports in a future post.