Verizon 2018 DBIR: Phishing, Stolen Passwords, and Other Cheap Tricks

Like the rest of the IT security world last week, I had to stop everything I was doing to delve into the latest Verizon Data Breach Investigations Report. I spent some quality time with the 2018 DBIR (after drinking a few espressos), and I can sum it all up in one short paragraph.

Last year, companies faced financially driven hackers and insiders, who use malware, stolen credentials, or phishing as attack vectors. They get in quickly and then remove payment card information, PII, and other sensitive data. It often takes IT staff months to even discover there’s been a breach.

I just played a trick on you.

The above paragraph was taken word for word from my analysis of the 2016 DBIR. Depressingly, this same analysis applies to the 2018 DBIR and has been pretty spot on for the last few years of Verizon reports.

Swiss Cheese

The point is that hackers have found a very comfortable gig that’s hard to defend against. According to this year’s DBIR, stolen credentials and phishing take the first and third slots in the report’s table of the top 20 actions in breaches. (RAM scrapers, by the way, hold the second position and are used heavily in POS attacks.)

How big a problem are stolen credentials, the user names and passwords previously hacked from other sites?

In a post late last year, Brian Krebs explored the dark market in hot passwords. A hacker can buy a vanilla user name and password combination for around $15. But the price goes up to $60 for active accounts of military personnel, and tops out at $150 for active credentials from an online electronics retailer.

Let’s face it, credentials are relatively inexpensive, and, as it turns out, they are also plentiful. A study by Google puts the number of credentials available on the black market at almost two billion.

Obviously, this is very bad news. Until we have wider use of multi-factor authentication, hackers can get around perimeter defenses to harvest even more credentials and other personal data and then sell them back on the black market. In other words, there’s an entire dark economy at work to make it all happen.

And if hackers don’t have the cash to buy credentials in bulk, they can use phishing techniques to get through the digital door. There is a small ray of hope about phishing: the DBIR says that 80% of employees never click. Of course, the bad news is that 20% will.

Dr. Zinaida Benenson, our go-to expert on phishing, reported a similar percentage of clickers in her phishing experiments (which we wrote about last year): anywhere between 20% and 50% clicked, depending on how the message was framed.

It only takes one employee to take the bait for the hackers to get in. You can run your own Probability-101 calculation, as I did here, to discover that with near certainty a good phish mail campaign will succeed in placing a malware payload on a computer.
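That Probability-101 calculation is easy to sketch. A minimal illustration follows; the 20% click rate is the DBIR’s figure, while the recipient counts and the independence assumption are mine:

```python
# If each recipient independently clicks with probability p, the chance
# that at least one of n recipients clicks is 1 - (1 - p)^n.

def prob_at_least_one_click(p: float, n: int) -> float:
    """Probability that at least one of n phishing recipients clicks."""
    return 1 - (1 - p) ** n

p = 0.20  # the DBIR's "20% will click" figure
for n in (5, 10, 25, 50):
    print(f"{n:3d} recipients -> {prob_at_least_one_click(p, n):.1%}")
```

Even a modest campaign of 25 phish mails pushes the odds of at least one click past 99%, which is why “someone will click” is the safe planning assumption.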

In short: standard perimeter security defenses protecting against phishing attacks or hackers using stolen or weak credentials begin to resemble a beloved dairy product from a mountainous European country.

Scripty Malware

According to the DBIR, phish mail is the primary way malware enters an organization: their stats say it carries the hackers’ evil software over 90% of the time. Hackers don’t have to waste time finding openings in websites using injection attacks or other techniques: phishing is very effective and easier to pull off.

This year’s DBIR also has some interesting insights (see below) into the format of the malware that eventually lands inside the organization. The hackers are using scriptware — either JavaScript or VBScript — far more than binaries.

Source: Verizon 2018 DBIR

And it makes sense! It’s incredibly simple to write these scripts — this non-technical blogger could do it — and make them appear as, say, clickable PDF files in the case of JS or VBS, or insert a VBA script directly into a Word or Excel doc that will execute on opening.

You can learn about these malware-free techniques by reading my epic series of posts on this topic.

The attackers can also cleverly leverage the built-in script environments found in Microsoft Office. There’s even a completely no-sweat code-free approach that takes advantage of Microsoft Word’s DDE function used in embedded fields — I wrote about it here.

Typically, this initial payload allows the hackers to get a foot in the door, and its evil purpose is to then download more sophisticated software. The malware-free series, by the way, has real-world samples that show how this is done. Feel free to study them.

To quickly summarize: the MS Office scriptware involves launching a PowerShell session and then using the WebClient command to download the next stage of the attack over an HTTP channel.

Needless to say, the malware-free techniques — Office scripts, PowerShell, HTTP — are very hard to detect using standard security monitoring tools. The scripts themselves are heavily obfuscated — see the PowerShell obfuscation series to understand the full impact — and are regularly tweaked, so defenses that rely on scanning for specific keywords or calculating hashes are useless.

The Verizon 2018 DBIR validates what I’m saying. Their stats indicate that 70-90% of malware samples are unique to an organization. Or as they put it:

… basically boil down to “AV is dead.” Except it’s not really. Various forms of AV, from gateway to host, are still alive and quarantining nasty stuff every day. “Signatures alone are dead” is a much more appropriate mantra that reinforces the need for smarter and adaptive approaches to combating today’s highly varied malware.
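The “signatures alone are dead” point is easy to demonstrate. A hash-based signature matches one exact byte sequence, so the most trivial tweak to a script yields a brand-new hash. (The script bytes below are a harmless illustrative stand-in, not real malware.)

```python
import hashlib

# Two functionally identical script payloads: the second just appends a comment.
original = b'CreateObject("WScript.Shell").Run "calc.exe"'
tweaked = original + b" ' trailing comment"

h1 = hashlib.sha256(original).hexdigest()
h2 = hashlib.sha256(tweaked).hexdigest()

print(h1)
print(h2)
# The digests share nothing, so a signature built on the first sample
# never fires on the second -- one reason 70-90% of malware samples can
# appear "unique" to an organization.
```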

Towards a Better 2018

If you’ve been paying attention, then not too much of what the Verizon DBIR is saying should come as a shock. However, I do encourage you to read the introductory summary and then skip down to the industry vertical section to get more specifics relevant to your situation — mileage does vary. For example, ransomware is rampant in healthcare, and Remote Access Trojans (RATs) are more prevalent in banking.

And now for my brief sermon on what to do about the DBIR’s bleak statistics.

Perimeter defenses are not effective in keeping hackers out. You need them, just as you need locks on windows and doors, but the hackers have found simple and cheap methods to get around these security measures.

To make 2018 a better security year, your first step is to admit that expensive firewalls and scanner infrastructure won’t solve everything — admit it right now, take a huge weight off your shoulders, and feel better! — and so secondary defenses have to be in place.

This means finding and putting more restrictive access rights on your sensitive data files to limit what the hackers can potentially discover, and then using monitoring techniques that alert your security teams if the attackers access these files.

Want to move beyond perimeter security? Click here to request a free risk assessment today!

Interview With Wade Baker: Verizon DBIR, Breach Costs, & Selling Boardrooms on Data Security

Wade Baker is best known for creating and leading the Verizon Data Breach Investigations Report (DBIR). Readers of this blog are familiar with the DBIR as our go-to resource for breach stats and other practical insights into data protection. So we were very excited to listen to Wade speak recently at the O’Reilly Data Security Conference.

In his new role as partner and co-founder of the Cyentia Institute, Wade presented some fascinating research on the disconnect between CISOs and the board of directors. In short: if you can’t relate data security spending back to the business, you won’t get a green-light on your project.

We took the next step and contacted Wade for an IOS interview. It was a great opportunity to tap into his deep background in data breach analysis, and our discussion ranged over the DBIR, breach costs, phishing, and what boards look for in security products. What follows is a transcript based on my phone interview with Wade last month.

Inside Out Security: The Verizon Data Breach Investigations Report (DBIR) has been incredibly useful to me in understanding the real-world threat environment. I know one of the first things that caught my attention was that — I think this is pretty much a trend for the last five or six years — external threats or hackers certainly far outweigh insiders.

Wade Baker: Yeah.

IOS: But you’ll see headlines that say just the opposite, the numbers flipped around, like ‘70% of attacks are caused by insiders.’ I was wondering if you had any comments on that, and perhaps other data points that should be emphasized more?

WB: The whole reason that we started doing the DBIR in the first place, before it was ever a report, is just simply…I was doing a lot of risk-assessment related consulting. And it always really bothered me that I would be trying to make a case, ‘Hey, pay attention to this,’ and I didn’t have much data to back it up.

But there wasn’t really much out there to help me say, ‘This thing on the list is a higher risk because it’s, you know, much more likely to happen than this other thing right here.’

Interesting Breach Statistics

WB: Anyone who’s done those lists knows there’s a bunch of things on this list. When we started doing that, it was kind of a simple notion of, ‘All right, let me find a place where that data might exist, forensic investigations, and I’ll decompose those cases and just start counting things.’

Attributes of incidents, and insiders versus outsiders is one I had always heard, like you said. Up until that point, it was ‘80% of all risk’ or ‘80% of all security incidents are insiders.’ And it’s one of those things that I almost considered doctrine at that time in the industry!

Then we showed pretty much the exact opposite! This is the one stat that I think has made people the most upset out of my 10 years doing that report!

People would push back and kind of argue with things, but that is the one, like, claws came out on that one, like, ‘I can’t believe you’re saying this.’

There are some nuances there. For instance, when you study data breaches specifically, it does hold. Every single data set I ever looked at was weighted toward outsiders.

When you study all security incidents — no matter what severity, no matter what the outcome — then things do start leaning back toward insiders. Just when you consider all the mistakes and policy violations and, you know, just all that kind of junk.

Social attacks and phishing have been on the rise in recent years. (Source: Verizon DBIR)

IOS: Right, yes.

WB: I think defining terms is important, and one reason why there’s disagreement. Back to your question about other data points in the report that I love.

The ones that show the proportion of breaches that tie back to relatively simple attacks, which could have been thwarted by relatively cheap defenses or processes or technologies.

I think we tend to have this notion — maybe it’s just an excuse — that every attack is highly sophisticated and every fix is expensive. That’s just not the case!

The longer we believe those kind of things, I think we just sit back and don’t actually do the sometimes relatively simple stuff that needs to be done to address the real threat.

I love that one, and I also love the time to the detection. We threw that in there almost as a whim, just saying, ‘It seems like a good thing to measure about a breach.’

We wanted to see how long it takes, you know, from the time they start trying to get in, and from the time they get inside to the time they find data, and from the time they find the data to exfiltrating it. Then of course how long it takes to detect it.

I think that was some of the more fascinating findings over the years, just concerning that.

IOS: I’m nodding my head about the time to discovery. Everything we’ve learned over the last couple of years seems to validate that. I think you said in one of your reports that the proper measurement unit is months. I mean, minimally weeks, but months. It seems to be verified by the bigger hacks we’ve heard about.

WB: I love it because many other people started publishing that same thing, and it was always months! So it was neat to watch that measurement vetted out over multiple different independent sources.

Breach Costs

IOS: I’m almost a little hesitant to get into this, but recently you started measuring breach cost based on proprietary insurance data. I’ve been following the controversy.

Could you just talk about it in general and maybe some of your own thoughts on the disparities we’ve been seeing in various research organizations?

WB: Yeah, that was something that for so long, because of where we got our information, it was hard to get all of the impact side out of a breach. Because you do a forensic investigation, you can collect really good info about how it happened, who did it, and that kind of thing, but it’s not so great six months or a year down the road.

You’re not still inside that company collecting data, so you don’t get to see the fallout unless it becomes very public (and sometimes it does).

We were able to study some costs — like the premier, top-of-the-line breach cost stats you always hear about from Ponemon.

IOS: Yes.

WB: And I’ve always had some issues with that, not to get into throwing shade or anything. The per record cost of a breach is not a linear type equation, but it’s treated like that.

What you get many times is something like Equifax: 145 million records. You multiply that by $198 per record, you get some outlandish cost, and you see that cost quoted in the headlines. It’s just not how it works!

There’s a decreasing cost per record as you get to larger breaches, which makes sense.

There are other factors there that are involved. For instance, I saw a study from RAND, by Sasha Romanosky recently, where after throwing in predictors like company revenue and whether or not they’ve had a breach before — repeat offenders so to speak — and some other factors, then she really improves the cost prediction in the model.

I think those are the kind of things we need to be looking at and trying to incorporate, because I think the number of records, at best, describes about a third … I don’t even know if it gets to half of the cost of the breach.

Breach costs do not have a linear relationship with data records! (Source: 2015 Verizon DBIR)

IOS: I did look at some of these reports and I’m a little skeptical about the number of records itself as a metric, because it’s hard to know this, I think.

But if it’s something you do on a per incident basis, then the numbers look a little bit more comparable to Ponemon.

Do you think it’s a problem, looking at it on per record basis?

WB: First of all, an average cost per record, I would like to step away from that as a metric, just across the board. But tying cost to the number of records probably…I mean, it works better for, say, consumer data or payment card data or things like that where the costs are highly associated with the number of people affected. You then get into the cost of credit monitoring and the notifications. All of those types of things are certainly correlated to how many people or consumers are affected.

When you talk about IP or other types of data, there’s just almost no correlation. How do you count a single stolen document as a record? Do you count megabytes? Do you count documents?

Those things have highly varied value depending on all kinds of circumstances. It really falls down there.

What Boards Care About

IOS: I just want to get back to your O’Reilly talk. And one of the things that also resonated with me was the disconnect between the board and the CISOs who have to explain investments. And you talk about that disconnect.

I was looking at your blog and Cyber Balance Sheet reports, and you gave some examples of this — something that the CISO thinks is important, the board is just saying, ‘What?’

So I was wondering if you can mention one or two examples that would give some indication of this gap?

WB: The CISOs have been going to the board probably for several rounds now, maybe years, presenting information, asking for more budgets, and the board is trying to ‘get’ what they need to build a program to do the right things.

Pretty soon, many boards start asking, ‘When are we done? We spent money on security last month. Why are we doing it this quarter too?’

Security as a continual and sometimes increasing investment is different than a lot of other things that they look at. They think of, ‘Okay, we’re going to spend money on this project, get it done, and we’re going to have this value at the end of that.’

We can understand those things, but security is just not like that. I’ve seen this break down a lot with CISOs, who come in with, ‘We need to do this project.’

You lay on top of all this that the board is not necessarily going to see the fruits of their investment in security! Because if it works, they don’t see anything bad at all.

Another problem that CISOs have is: ‘How do I go to them when we haven’t had any bad things happen, and ask for more money?’ It’s just a conversation where you should be prepared to say why that is, and connect these things to the business.

By doing these things, we’re enabling these pieces of the business to function properly. It’s a big problem, especially for more traditional boards that are clearly focused on driving revenue and other areas of the business.

IOS: Right. I’m just thinking out loud now … Is the board comparing it to physical security, where I’m assuming you make this initial investment in equipment, cameras, and recording and whatever, and then your costs, going forward, are mostly people or labor costs?

They probably are looking at it and saying, ‘Why am I spending more? Why am I buying more cameras or more modern equipment?’

WB: I think so! I’ve never done physical security, other than as a sideline to information security. Even if there are continuing costs, they live in that physical world. They can understand why, ‘Okay, we had a break-in last month, so we need to, I don’t know, add a guard gate or something like that.’ They get why and how that would help.

Whereas in the logical or cyber security world, they sometimes really don’t understand what you’re proposing, why it would work. If you don’t have their trust, they really start trying to poke holes. Then if you’re not ready to answer the question, things just kind of go downhill from there.

They’re not going to believe that the thing you’re proposing is actually going to fix the problem. That’s a challenge.

IOS: I remember you mentioning during your O’Reilly talk that helpful metaphors can be useful, but it has to be the right metaphor.

WB: Right.

IOS: I mean, getting back to the DBIR. In the last couple of years, there was an uptick in phishing. I think probably this should enter some of these conversations because it’s such an easy way for someone to get inside. For us at Varonis, we’ve been focused on ransomware lately, and there are also DDoS attacks as well.

Will these new attacks shift the board’s attention to something they can really understand, since these attacks actually disrupt operations?

WB: I think it can, because things like ransomware and DDoS are apparent just kind of in and of themselves. If they transpire, then it becomes obvious and there are bad outcomes.

Whereas more cloak-and-dagger stealing of intellectual property or siphoning a bunch of consumer data is not going to become apparent, or if it is, it’s months down the road, like we talked about earlier.

I think these things are attention-getters within a company, attention-getters from the headlines. I mean, from what I’ve heard over the past year, as this ransomware has been steadily increasing, it has definitely received the board’s attention!

I think it is a good hook to get in there and show them what they’re doing. And ransomware is a good one because it has a corporate aspect and a personal aspect.

You can talk to the board about, ‘Hey, you know, this applies to us as a company, but this is a threat to you in your laptop in your home as well. What about all those pictures that you have? Do you have those things backed up? What if they got on your data at home?’

And then walk through some of the steps and make it real. I think it’s an excellent opportunity for that. It’s not hype, it’s actually occurring and top of the list in many areas!

Contrary to conventional wisdom, corporate boards of directors understand the value of data protection. (Source: Cyber Balance Sheet)

IOS: This brings something else to mind. Yes, you could consider some of these breaches as a cost of doing business, but if you’re allowing an outsider to get access to all your files, I would think, high-level executives would be a little worried that they could find their emails. ‘Well, if they can get in and steal credit cards, then they can also get into my laptop.’

I would think that alone would get them curious!

WB: To be honest, I have found that most of the board members that I talk to, they are aware of security issues and breaches much more than they were five to ten years ago. That’s a good thing!

They might sit on boards of other companies, and with all the reporting we’ve had, the chance that a board member has been with a company that’s experienced a breach, or knows a buddy who has, is pretty good by now. So it’s a real problem in their mind!

But I think the issue, again, is how do you justify to them that the security program is making that less likely? And many of them are terrified of data breaches, to be honest.

Going back to that Cyber Balance Sheet report, I was surprised when we asked board members what is the biggest value that security provides — you know, kind of the inverse of your biggest fear? They all said preventing data breaches. And I would have thought they’d say, ‘Protect the brand,’ or ‘Drive down risk,’ or something like that. But they answered, ‘Prevent data breaches.’

It just shows you what’s at the top of their minds! They’re fearful of that and they don’t want that to happen. They just don’t have a high degree of trust that the security program will actually prevent them.

IOS: I have to say, when I first started at Varonis, some of these data breach stories were not making the front page of The New York Times or The Washington Post, and that certainly has changed. You can begin to understand the fear. Getting back to something you said earlier about how simple approaches, or as we call it, block-and-tackle, can prevent breaches.

Another way to mitigate the risk of these breaches is something that you’ve probably heard of, Privacy by Design, or Security by Design. One of the principles is just simply reduce the data that can cause the risk.

Don’t collect as much, don’t store as much, and delete it when it’s no longer used. Is that a good argument to the board?

WB: I do, and I think there are several approaches. I’ve given this recommendation fairly regularly, to be honest: minimize the data that you’re collecting. Because I think a lot of companies don’t need as much data as they’re collecting! It’s just easy and cheap to collect it these days, so why not?

Helping organizations understand that it is a risk decision, not just a cost decision, is important. And then of what you collect, how long do you retain it?

Because the longer you retain it and the more you collect, you’re sitting on a mountain of data and you can become a target of criminals just through that fact.

For the data that you do have and you do need to retain … I’m a big fan of trying to consolidate it and not let it spread around the environment.

One of the metrics I like to propose is, ‘Okay, here’s the data that’s important to me. We need to protect it.’ Ask people where that lives or how many systems that should be stored on in the environment, and then go look for it.

You can multiply that number by 3 or 5, sometimes 10, and that’s the real answer! It’s a good metric to strive for: the number of target systems that information should reside within. Many breaches come from areas where that data should not have been.

Security Risk Metrics

IOS: That leads to the next question about risk metrics. One we use at Varonis is PII data that has Windows permissions marked for Everyone. Customers are always surprised during assessments when they see how large it is.

This relates to stale data. It could be, you know, PII data that hasn’t been touched in a while. It’s sitting there, as you mentioned. No one’s looking at it, except the hackers who will get in and find it!

Are there other good risk metrics specifically related to data?

WB: Yup, I like those. You mentioned phishing a while ago. I like stats such as the number of employees that will click-through, say, if you do a phishing test in the organization. I think that’s always kind of an eye-opening one because boards and others can realize that, ‘Oh, okay. That means we got a lot of people clicking, and there’s really no way we can get around that, so that forces us to do something else.’

I’m a fan of measuring things like number of systems compromised in any given time, and then the time that it takes to clean those up and drive those two metrics down, with a very focused effort over time, to minimize them. You mentioned people that have…or data that has Everyone access.

Varonis stats on loosely permissioned folders.

IOS: Yes.

WB: I always like to know, whether it’s a system or an environment or a scope, how many people have admin access! Because we’re highly over-privileged in most security environments.

I’ve seen eyes pop, where people say, ‘What? We can’t possibly have that many people that have that level of need to know on…for that kind of thing.’ So, yeah, that’s a few off the top of my head.

IOS: Back to phishing. I interviewed Zinaida Benenson a couple months ago — she presented at Black Hat. She did some interesting research on phishing and click rates. Now, it’s true that she looked at college students, but the rates were astonishing. It was something like 40% clicking on obvious junk links in Facebook messages and about 20% in email spam.

She really feels that someone will click and it’s just almost impossible to prevent that in an organization. Maybe as you get a little older, you won’t click as much, but they will click.

WB: I’ve measured click rates at about 23%, 25%. So 20% to 25% in organizations. And not only in organizations, but organizations that paid to have phishing trials done. So I got that data from, you know, a company that provides us phishing tests.

You would think these would be the organizations that say, ‘Hey, we have a problem, I’m aware. I’m going to the doctor.’ Even among those, one in four are clicking. By the time an attacker sends 10 emails within the organization, there’s like a 99% rate that someone is going to click.

Students will click on obvious spammy links. (Source: Zinaida Benenson’s 2016 Black Hat presentation)

IOS: She had some interesting things to say about curiosity and feeling bold. Some people, when they’re in a good mood, they’ll click more.

I have one more question on my list …  about whether data breaches are a cost of business or are being treated as a cost of business.

WB: That’s a good one.

IOS: I had given an example of shrinkage in retail as a cost of business. Retailers just always assume that, say, there’s a 5% shrinkage. Or is security treated — I hope it will be treated — differently?

WB: As far as I can tell, we do not treat it like that. But I’ll be honest, I think treating it a little bit like that might not be a bad thing! In other words, there have been some studies that look at the losses due to breaches and incidents versus losses like shrinkage and other things that are just very, very common, and therefore we’re not as fearful of them.

Shrinkage takes many, many more…I can’t remember what the…but it was a couple orders of magnitude more, you know, for a typical retailer than data breaches.

We’re much more fearful of breaches, even at the board level. And I think that’s because they’re not as well understood and they’re a little bit newer and we haven’t been dealing with it.

When you’re going to have certain losses like that and they’re fairly well measured, you can draw a distribution around them and say that I’m 95% confident that my losses are going to be within this limit.

Then that gives you something definite to work with, and you can move on. I do wish we could get there with security, where we figure out that, ‘All right, I am prepared to lose this much.’

Yes, we may have a horrifying event that takes us out of that, and I don’t want to have that. We can handle this, and we handle that through these ways. I think that’s an important maturity thing that we need to get to. We just don’t have the data to get there quite yet.

IOS: I hear what you’re saying. But there’s just something about security and privacy that may be a little bit different …

WB: There is. There certainly is! The fact is that security has externalities, where it’s not just affecting my company, like shrinkage. I can absorb those dollars. But my failures may affect other people: my partners, consumers, and, if you’re in critical infrastructure, society. I mean, that makes a huge difference!

IOS: Wade, this has been an incredible discussion on topics that don’t get as much attention as they should.

Thanks for your insights.

WB: Thanks Andy. Enjoyed it!

My Big Fat Data Breach Cost Post, Part III

This article is part of the series "My Big Fat Data Breach Cost Series."

How much does a data breach cost a company? If you’ve been following this series, you’ll know that there’s a huge gap between Ponemon’s average cost-per-record numbers and the Verizon DBIR’s (as well as other researchers’). Verizon was intentionally provocative in its $.58 per record claim. However, Verizon’s more practical (and less newsworthy) results were based on a different model that derived average record costs more in line with Ponemon’s analysis.

The larger issue, as I’ve been preaching, is that a single average for a skewed data set, or more precisely one that follows a power law, is not the best way to understand what’s going on. For a single number, the median, the value below which 50% of the data set lies, does a better job of summarizing it all.

Unfortunately, when we introduce averages based on record counts, the problem is made even worse. Long sigh.

Fake News: Ponemon vs. Verizon Controversy

In other words, there are monster breaches in the Verizon data (based on NetDiligence’s insurance claim data) at the far end of the tail that result in hundreds of millions of records — and therefore an enormous denominator in calculating the average.

I should have mentioned last time that Ponemon’s dataset is based on breaches of fewer than 100,000 records. Since cyber incidents involve some hefty fixed costs for consulting and forensics, you’ll inevitably have a higher average when dividing the incident cost by a smaller denominator.

In brief: Ponemon’s $201 vs. Verizon’s $.58 average cost per record is a made-up controversy comparing the extremes of this weird dataset.

As I showed, when we ignore record counts and use average incident costs we get better agreement between Verizon and Ponemon – about $6 million per breach.

There’s a “but”.

Since we’re dealing with power laws, the single average is not a good representation. Why? So much of the sample is found at the beginning of the tail and the median — the incident cost where 50% of the incidents lie below — is not even close to the average!

My power-law-fueled analysis in the last post led to my amazing 3-tiered IOS Data Incident Cost Table©. I broke the fat-tailed dataset (based on NetDiligence’s numbers) into three smaller segments — Economy, Economy Plus, and Business Class — to derive averages that are far more representative.

My Economy Class, which is based on 50% of the sample set, has an average incident cost of $1.4 million versus the overall average of $7.6 million. That’s an enormous difference! You can think of this average cost for 50% of the incidents as something like a hybrid of median and mean — it’s related to the creepy Lorenz curve from last time.

Ponemon and Pain

Let’s get back to the real world, and take another look at Ponemon’s survey. Their analysis is based on interviews with real people working for hundreds of companies worldwide.

Ponemon then calculates a total cost that takes into account direct expenses — credit monitoring for affected customers, forensic analysis — and fuzzier indirect costs, which can include extra employee hours and potential lost business.

These indirect costs are significant: in their 2015 survey, they represented almost 40% of the total cost of a breach!

As for the 100,000 record limit, Ponemon is well aware of this issue and warns that their average breach cost should not be applied to large breaches. For example, Target’s 2014 data breach exposed the credit card numbers of over 40 million customers, which works out to over $8 billion using the Ponemon average. Target’s actual breach-related costs were far less.

Once you go deeper into the Ponemon reports, you’ll find some incredibly useful insights.

In the 2016 survey, they note that having an incident response team in place lowers data breach costs by $16 per record; Data Loss Prevention (DLP) takes another $8 off; and data classification schemes lop off another $4.
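Taken at face value, those per-record savings stack up like this. A back-of-the-envelope sketch: the $201 baseline is Ponemon’s earlier average discussed in this series, and the savings are the 2016 figures quoted above:

```python
BASE_COST_PER_RECORD = 201  # Ponemon's average cost per record, in dollars

# Per-record savings from the 2016 Ponemon survey cited above.
savings = {
    "incident response team": 16,
    "data loss prevention (DLP)": 8,
    "data classification scheme": 4,
}

adjusted = BASE_COST_PER_RECORD - sum(savings.values())
print(f"adjusted cost per record: ${adjusted}")  # prints $173
```

In other words, the basic block-and-tackle controls together shave roughly 14% off the per-record average.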

Another interesting fact is that a large contributing factor to indirect costs is something called “churn”, which Ponemon defines as current customers who terminate their relationship as the result of loss of trust in the company after a breach.

Ponemon also estimates “diminished customer acquisition”, another indirect cost related to churn, which is the cost of lost future business because of damage to the brand.

These costs are based on Ponemon analysts reviewing internal corporate statistics and putting a “lifetime” value on a customer.

Feel the pain: Ponemon’s data on lost business.

Anyway, by comparing churn rates after a breach incident to historical averages, they can detect abnormal rates and then attribute the cost to the incident.
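That comparison of post-breach churn against historical rates can be sketched as a simple anomaly check. The threshold and all the rates below are hypothetical, and Ponemon’s actual methodology is more involved:

```python
import statistics

def abnormal_churn(historical_rates, post_breach_rate, z_threshold=2.0):
    """Flag a post-breach churn rate well above historical norms and
    estimate the excess churn attributable to the incident."""
    mean = statistics.mean(historical_rates)
    stdev = statistics.stdev(historical_rates)
    z_score = (post_breach_rate - mean) / stdev
    excess = max(post_breach_rate - mean, 0.0)  # churn beyond the norm
    return z_score > z_threshold, excess

# Hypothetical quarterly churn rates (fractions of the customer base).
history = [0.020, 0.022, 0.019, 0.021, 0.023, 0.020]
flagged, excess = abnormal_churn(history, post_breach_rate=0.035)
print(flagged, f"excess churn: {excess:.1%}")  # True, excess churn: 1.4%
```

Multiply that excess churn by the customer base and a per-customer “lifetime” value, and you get the kind of lost-business estimate Ponemon reports.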

Ponemon consolidated the business lost to churn, additional acquisition costs, and damage to “goodwill” into a bar chart (above) divided by country. For the US, the average opportunity cost of a breach is close to $4 million.

With that in mind, it’s helpful to view the average cost per record breached as a measure of overall corporate pain.

What does that mean?

In addition to actual expenses, you can think of Ponemon’s average as also representing extra IT, legal, call center, and consultant person-days of work and emotional effort; additional attention focused in future product marketing and branding; and administrative and HR resources needed for dealing with personnel and morale issues after a breach.

All of these factors are worth considering when your organization plans its own breach response program!

Some Additional Thoughts

In our chats with security pros, attorneys, and even a small business owner who directly experienced a hacking, we learned first-hand that a breach incident is very disruptive.

It’s not just the “cost of doing business,” as some have argued. In recent years, we’ve seen several CEOs fired. More recently, with the Equifax breach, along with the C-suite leaving or “retiring,” the company’s very existence is being threatened through lawsuits.

There is something different about a data breach. Information on customers and executives, as well as corporate IP, can be leveraged in various creative and evil ways: identity theft attacks, blackmail, and competitive threats.

While the direct upfront costs, though significant, may not reflect the $100 to $200 per record range that shows up in the press, a cyber attack resulting in a data exposure is still an expensive incident — as we saw above, over $1 million on average for most companies.

And for the longer term, Ponemon’s average cost numbers are the only measurement I know of that reflects the accounting for these unknowns.

It’s not necessarily a bad idea to be scared by Ponemon’s stats, and change your data security practices accordingly.


My Big Fat Data Breach Cost Post, Part II

This article is part of the series "My Big Fat Data Breach Cost Series". Check out the rest:

If I had to summarize the first post in this series in one sentence, it’s this: as a single number, the average is not the best way to understand a dataset. Breach cost averages are no exception! And when that dataset is skewed or “heavy tailed”, the average is even less meaningful.

With this background, it’s easier to understand what’s going on with the breach cost controversy as it’s being played out in the business press. For example, this article in Fortune magazine does a good job of explaining the difference between Ponemon’s breach cost per record stolen and Verizon’s statistic.

Regressions Are Better

The author points out that Ponemon does two things that overstate their cost per record average. One, they include indirect costs in their model — potential lost business, brand damage, and other opportunity costs. While I’ll get to this in the next post, Ponemon’s qualitative survey technique is not necessarily bad, but their numbers have to be interpreted differently.

The second point is that Ponemon’s $201 per record average is not a good predictor, as is true of any raw average, and for skewed datasets it’s an especially poor one.

According to our friends at the Identity Theft Resource Center (ITRC), which tracks breach stats, we’ve now reached over 1,000 breach incidents with over 171 million records taken. Yikes!

Based on Ponemon’s calculations, American business has experienced $201 x 171 million or about $34 billion worth of data security damage. That doesn’t make any financial sense.

Verizon’s average of $.58 per record is based on reviewing actual insurance claim data provided by NetDiligence. This average is also deficient because it likely understates the problem — high deductibles and restrictive coverage policies play a role.

Verizon, by the way, has said this number is also way off! They were making a point about averages being unreliable (and taking a little dig at Ponemon).

The Fortune article then discusses Verizon’s log-linear regression, and reminds us that breach costs don’t grow at a linear rate. We agree on that point! The article also excerpts the table from Verizon that shows how different per record costs would apply for various ranges. I showed that same table in the previous post, and further below we’ll try to do something similar with incident costs.

In the last post, we covered the RAND model’s non-linear regression, which incorporates other factors besides record counts. Jay Jacobs also has a very simple model that’s better than a straight-line fit. Verizon, RAND, and Jacobs’ regressions are all far better at predicting costs than a single average number.

I’ll make one last point.

The number of data records involved in a breach can be hard to nail down. The data forensics often can’t accurately say what was taken: was it 10,000 records or 100,000? The difference may amount to whether a single file was touched, and a factor-of-ten difference can change $201 per record into $20!

A more sensible approach is to look at the costs per incident. This average, as I wrote about last time, is a little more consistent, and is roughly in the $6 million range based on several different datasets.

The Power of Power Laws

Let’s gets back to the core issue of averages. Unfortunately, data security stats are very skewed, and in fact the distributions are likely represented by power laws. The Microsoft paper, Sex, Lies and Cyber-Crime Surveys, makes this case, and also discusses major problems — under-sampling and misreporting — of datasets that are based on power laws: in short, a few data points have a disproportionate effect on the average.

Those who are math phobic and curl up into the fetal position when they see an equation or hear the word “exponent” can skip to the next section without losing too much.

Let’s now look at the table from the RAND study, which I showed last time.

An incident of $750 million indicates that this is a spooky dataset. Boooo!

Note that the median cost for an incident — see the bottom total — is $250,000, while the average cost of $7.84 million is an astonishing 30 times greater! And the maximum value for this dataset is a monster-ish $750 million incident. We ain’t dealing with a garden variety bell-shaped or normal curve.

When the data is guided by power law curves, these leviathans exist, but they wouldn’t show up in data conforming to the friendlier and more familiar bell curve.

I’m now going to fit a power law curve to the above stats, or at least to the average — it’s a close enough fit for my purpose. The larger point is that you can have a fat-tailed dataset with the same average!

A brief word from our sponsor. Have I mentioned lately how great Wolfram Alpha is? I couldn’t have written this post without it. If I only had this app in high school. Back to the show.

The power law has a very simple form: it’s just the variable x, representing in this case the cost of an incident, raised to the power of negative alpha: x^(-α).

Simple. (Please don’t shout into your browser: I know there’s a normalizing constant, but I left it out to make things easier.)

I worked out an alpha of about 2.15 based on the stats in the above table. The alpha, by the way, is the key to all the math that you have to do.
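For the curious, the standard maximum-likelihood recipe for estimating that alpha (the Hill estimator) is only a few lines. The cost figures below are hypothetical; the method, not the numbers, is the point:

```python
import math

def estimate_alpha(samples, x_min):
    """MLE of the exponent alpha for a power-law density ~ x**(-alpha),
    fit to all samples at or above the cutoff x_min (Hill estimator)."""
    tail = [x for x in samples if x >= x_min]
    return 1 + len(tail) / sum(math.log(x / x_min) for x in tail)

# Hypothetical incident costs in dollars, with an assumed $250K cutoff.
costs = [2.5e5, 3.1e5, 4.0e5, 7.5e5, 1.2e6, 2.0e6,
         5.5e6, 9.0e6, 4.0e7, 7.5e8]
print(round(estimate_alpha(costs, x_min=2.5e5), 2))  # → 1.4
```

The choice of x_min matters a lot in practice; serious fits test several cutoffs rather than eyeballing one.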

However, what I really want to know is the weight or percentage of the total costs for all breach incidents that each segment of the sample contributes. I’m looking for a representative average for each slice of the incident population.

For example, I know that the median or 50% of the sample — that’s about 460 incidents — has incident costs below $1.8 million. Can I calculate the average costs for this group? It’s certainly not $7.84 million!

There’s a little bit more math involved, and if you’re interested, you can learn about the Lorenz curve here. The graph below compares the unequal distribution of total incidents costs (the blue curve) for my dataset versus a truly equal distribution (the 45-degree red line).

The Lorenz curve: beloved by economists and data security wonks. The 1% rule! (Vertical axis represents percent of total incident costs.)

As you ponder this graph — and play with it here — you see that the blue curve doesn’t really change all that much up to around the 80% or .8 mark.

For example, the median at .5 and below represents 9% of the total breach costs. Based on the stats in the above table, the total breach cost for all incidents is about $7.2 billion ($7.84 million x 921). So the first 50% of my sample represents a mere $648 million ($7.2 billion x .09). If you do a little more arithmetic, you find the average is about $1.4 million per incident for this group.
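For a pure Pareto tail, there’s actually a closed form for these shares. With the density proportional to x^(-α), the Lorenz curve is L(F) = 1 − (1 − F)^(1 − 1/(α − 1)), and plugging in my α ≈ 2.15 reproduces the 9% figure above. A sketch, valid only under that power-law assumption:

```python
def lorenz_share(f, alpha=2.15):
    """Share of total cost carried by the bottom fraction f of incidents,
    assuming a Pareto tail with density ~ x**(-alpha)."""
    return 1 - (1 - f) ** (1 - 1 / (alpha - 1))

print(f"bottom 50% of incidents: {lorenz_share(0.5):.0%} of total cost")   # 9%
print(f"top 10% of incidents: {1 - lorenz_share(0.9):.0%} of total cost")  # 74%
```

So under this fit, the top tenth of incidents carries roughly three quarters of all breach costs.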

The takeaway for this section is that most of the sample is not seeing an average incident cost close to $7.8 million! This also implies that at the tail there are monster data incidents pushing up the numbers.

The Amazing IOS Blog Data Incident Cost Table

I want to end this post with a simple table (below) that breaks average breach costs into three groups: let’s call them Economy, Economy Plus, and Business Class. These cover the first 50% of the data incidents, the next 40%, and the last 10%. It’s similar to what Verizon did in their 2015 DBIR for per record costs.

|                       | Economy               | Economy Plus          | Business Class       |
|-----------------------|-----------------------|-----------------------|----------------------|
| Data incidents        | 460                   | 368                   | 92                   |
| Percent of total cost | 9%                    | 15%                   | 74%                  |
| Total costs           | $648 million          | $1 billion            | $5.33 billion        |
| Average cost          | $1.4 million/incident | $2.7 million/incident | $58 million/incident |
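You can recompute the tier averages from the table’s own inputs: 921 incidents, the $7.84 million overall average, and each tier’s share of total cost. (Small differences from the table come from rounding the percentages.)

```python
TOTAL_COST = 921 * 7.84e6  # 921 incidents x $7.84M average ≈ $7.2 billion

tiers = {                      # (incident count, share of total cost)
    "Economy":        (460, 0.09),
    "Economy Plus":   (368, 0.15),
    "Business Class": (92,  0.74),
}

for name, (incidents, share) in tiers.items():
    avg = TOTAL_COST * share / incidents
    print(f"{name}: ${avg / 1e6:.1f} million per incident")
```

The Business Class tier, one tenth of the incidents, averages roughly forty times the Economy tier.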

If you’ve made it this far, you deserve some kind of blog medal. Maybe we’ll give you a few decks of Cards Against IT if you can summarize this whole post in a single, concise paragraph and also explain my Lorenz curve.

In the next, and (I promise) last post in this series, I’ll try to tell a story based on the above table, and then offer further thoughts on the Verizon vs. Ponemon breach cost battle.

Storytelling with just numbers can be dangerous. There are limits to “data-driven” journalism, and that’s where Ponemon’s qualitative approach has some significant advantages!

Continue reading the next post in "My Big Fat Data Breach Cost Series"

My Big Fat Data Breach Cost Post, Part I

This article is part of the series "My Big Fat Data Breach Cost Series". Check out the rest:

Data breach costs are very expensive. No, wait they’re not. Over 60% of companies go bankrupt after a data breach! But probably not. What about reputational harm to a company? It could be over-hyped but after Equifax, it could also be significant. And aren’t credit card fraud costs for consumers a serious matter? Maybe not! Is this post starting to sound confusing?

When I was tasked with looking into data breach costs, I was already familiar with the great Verizon DBIR vs. Ponemon debate: based on data from 2014, Ponemon derived an average cost per record of $201 while Verizon pegged it at $.58 per record. In my book, that’s an enormous difference. But it can be explained if you dive deeper.

After looking at one too many research papers, presentations, and blog posts on the subject of data breach costs, I started to see that once you absorb a few underlying ideas, you understand what everyone is yakking about.

That’s a roundabout way of saying that this will be a multi-part series.

Averages Can Cause Non-Average Problems

The first issue to take up is the average of a data sample. In fact, this blog’s favorite statistician, Kaiser Fung, lectured us on this point a while back. When looking at a data set, a simple average of the numbers works well enough as long as the distribution is not too skewed (a spike or clump at the tail end).

But as Fung points out, when this is not the case, the average leads to inconsistencies, as in the following hypothetical data set of breach record counts over two years:

| Company | Records breached (2015) | Records breached (2016) |
|---------|-------------------------|-------------------------|
| 1       | 100                     | 150                     |
| 2       | 200                     | 400                     |
| 3       | 150                     | 300                     |
| 4       | 225                     | 250                     |
| 5       | 75                      | 100                     |
| 6       | 1000                    | 1200                    |
| 7       | 1500                    | 1000                    |
| 8       | 8000                    | 1000                    |
| 9       | 300                     | 400                     |
| 10      | 175                     | 500                     |
| Average | 1172                    | 530                     |

For 2015, the average of 1172 is off by several multiples for seven of the ten companies! And if we compare this average to the following year’s average of 530, we could incorrectly conclude that breach counts are down.

Why? If we look at those seven companies, we see all their breach counts went, ahem, up.

This usually leads to a discussion of how numbers are distributed in a dataset, and of why the median (the value below which 50% of the data falls) is a better representation than an average, especially for skewed data sets. Kaiser is very good at explaining this.
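Running the 2015 column of the table above through Python’s statistics module makes the gap concrete:

```python
import statistics

# Breach record counts for the ten hypothetical companies (2015 column).
records_2015 = [100, 200, 150, 225, 75, 1000, 1500, 8000, 300, 175]

print(statistics.mean(records_2015))    # 1172.5
print(statistics.median(records_2015))  # 212.5
# One 8,000-record outlier drags the mean to over 5x the median; the
# median sits far closer to what most of the ten companies experienced.
```

Drop company 8 from the list and the mean falls to about 414, while the median barely moves; that sensitivity to a single outlier is the whole problem.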

For those who want to get a head start on the next post in this series, they can scan this paper, which has the best title on a data security topic I’ve come across, Sex, Lies and Cyber-crime Surveys. This was written by those crazy folks at Microsoft. If you don’t want to read it, the point is this: for skewed data, it’s important to analyze how each percentile contributes to the overall average.

Guesstimating Data Breach Costs

How does Ponemon determine the cost of a data breach? Generally, this information is not easily available. However, in recent years, these costs have started to show up in the annual reports of public companies.

But for private companies and for public companies that are not breaking breach costs out in their public financial reporting, you have to do more creative number crunching.

Ponemon surveys companies, asking them to rate the costs for common post-breach activities, including auditing & consulting, legal services, and identity protection fees. Ponemon then categorizes costs into whether they’re direct — for example, credit monitoring — or fuzzier indirect or opportunity costs — extra employee time or potential lost business.

It turns out that these indirect costs represent about 40% of the average cost of a breach based on their 2015 survey. These costs mean something, but they’re not really accounting costs. More on that next time.

Recently, other researchers have been able to get hold of far better estimates of direct breach costs by examining actual cyber insurance claims. Companies such as Advisen and NetDiligence have this insurance payout data and have been willing to share it.

The cyber insurance market is still immature and the actual payouts after deductibles and other fine print don’t represent the full direct cost of the breach. But this is, for the first time, evidence of direct costs.

Anyway, the friendly people over at RAND — yes, the very same company who worked this out — used these data sets to guesstimate an average breach cost per incident of about $6 million – wonks should review their paper. This tracks very closely with Ponemon’s $6.5 million per incident estimate for roughly the same period.

Per incident cost data based on insurance claims. Note the Max values! (Source: RAND)

Before you start shouting into your browser: yes, I realize I used an average above to summarize a very skewed (and, as we’ll see, heavy-tailed) data set.

In any case, several studies including the RAND one, have focused on per incident costs rather than per record costs. At some point, the Verizon DBIR team also began to de-emphasize the count of records exposed, realizing that it’s hard to get reliable numbers from their own forensic data.

In the 2015 DBIR report, the one where they announced their provocative $.58 per record breach cost claim, the researchers relied, for the first time, on a dataset of insurance claim data from NetDiligence.

Let me just say that the DBIR’s average cost ratio is heavily influenced by a few companies with humongous breached record counts — likely in the millions — in the denominator and smaller total insurance payouts in the numerator. As we saw in my made-up example above, the average in this case is not very revealing.

Why not use multiple averages customized for different breach count ranges? I hope you’re beginning to see that it’s far better to segment the cost data by record count: you look up in a table the costs appropriate for your case. And Verizon did something close to that in the 2015 DBIR to come up with a table of data that’s nearer Ponemon’s average for the lower tiers:

Ok, so maybe Verizon’s headline-grabbing $.58 per record breached is not very accurate.

Counting breached data records provides some insight into total costs, but there are other factors: the particular industry the company is in, the regulations it’s under, credit protection costs for consumers, and company size. For example, take a look at this breach cost calculator based on Ponemon’s own data.

Linear Thinking and Its Limits

You can understand why the average breach cost per record number is so popular: it provides a quick although unreliable answer for the total cost of a particular breach.

To derive the $201 average cost per record, Ponemon simply added up the costs (both direct and indirect) from their survey and divided by the number of records breached as reported by the companies.

This may be convenient for calculations but as a predictor, it’s not very good. I’m gently walking around the topic of linear regressions, which is one way to draw a “good” straight line through the dataset.

Wonks can check out Jay Jacobs’ great post on this in his Data Driven Security blog. He shows a linear regression beating out the simple Ponemon line with its slope of 201 — by the way, he gained direct access to Ponemon’s survey results. Jacobs’ beta is $103, which you can interpret as the marginal cost of an additional breached record. But even his regression model is not all that accurate.

I want to end this post with this thought: we want the world to look linear, but that’s not the way it ticks.

Why should breach costs go up by a fixed amount for each additional record stolen? And for that matter, why do we assume that 10% of the companies in a data breach survey will contribute 10% to the total costs, the next 10% will add another 10%, and so on?

Sure, for credit monitoring paid out to consumers and for cards reissued by litigious credit card companies, costs add up on a per-record basis.

On the other hand, I don’t know too many attorneys, security consultants, developers, or pen testers who say to new clients, “We charge $50 a data record to analyze or remediate your breach.”

Jacobs found a better non-linear model — technically log-linear, which is a fancy way of saying the record count variable has an exponent in it. In the graph below — thank you, Wolfram Alpha! — I compared the regression line (based on the Ponemon data) against the more sophisticated model from Jacobs. You can gaze upon the divergence or else click here to explore on your own.
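Here’s the shape of that comparison in code. The log-linear coefficients below are illustrative placeholders, not Jacobs’ published fit; the point is how the two models diverge as record counts grow:

```python
import math

def linear_estimate(records, cost_per_record=201):
    """Flat per-record estimate (the Ponemon-style multiplication)."""
    return records * cost_per_record

def log_linear_estimate(records, intercept=7.68, slope=0.76):
    """Log-linear model: ln(cost) = intercept + slope * ln(records).
    Coefficients here are assumptions for illustration only."""
    return math.exp(intercept + slope * math.log(records))

for n in (1_000, 100_000, 10_000_000):
    print(f"{n:>12,} records: linear ${linear_estimate(n):>13,.0f}, "
          f"log-linear ${log_linear_estimate(n):>13,.0f}")
# With a slope below 1, the log-linear curve grows sub-linearly: it sits
# above the flat per-record line for small breaches and far below it
# for mega-breaches.
```

That crossover is exactly why a single per-record average overcharges the Targets of the world and undercharges everyone else.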

If you made it this far, congratulations!

In the next post, I hope all this background will pay off as I try to connect these ideas to come up with a more nuanced way to understand data breach costs.

Continue reading the next post in "My Big Fat Data Breach Cost Series"

Verizon Data Breach Digest 2017

While we’re anxiously waiting for the next edition of the Data Breach Investigations Report (DBIR), Verizon released its annual Data Breach Digest (DBD) earlier this month. What’s the DBD? It condenses the various breach patterns discussed in the DBIR.  In this year’s report, Verizon reduced 12 patterns into a mere four generalized scenarios: the Human Element, Conduit Devices, Configuration Exploitation, and Malicious Software.

Of course, when you start abstracting and clustering information, you end up creating fuzzy caricatures. So they call what they’ve come up with “scenari-catures”.

You can’t accuse the Verizon research team of not having a wacky sense of humor.

If you play along with them, you can use the DBD to get a snapshot view of various breach scenarios. In fact, they’ve created attack-defend cards (below), each with their own threat persona.

Trade them, collect them! (Verizon Data Breach Digest 2017)

By gamifying all this complicated data, C-levels and other executives who are used to dealing in broad abstractions will find these cards very comforting. Think of it as baseball cards for the security minded.

In looking through the DBD, I’ll grudgingly admit that it has real practical value, especially for those who believe that the breach they just experienced is unique.

The DBD shows that they are not alone in being phished into accidentally making a wire transfer to the Cayman Islands or discovering that their systems were hacked because IT never caught up with patches even after they assured you everything was secure. Verizon calls that scenario “the Fiddling Nero”.

I told you they had a sense of humor.  Have as much fun looking through it as I did!

Overheard: “IT security has nothing to learn from the Mirai attack”

After my post last week on the great Mirai Internet takedown of 2016, I received some email in response. One of the themes in the feedback was, roughly, that ‘Mirai really doesn’t have anything to do with those of us in enterprise IT security’.

Most large companies probably don’t have hackable consumer-grade CCTV cameras or other low cost IoT gadgetry that can be de-authed and taken over by the neighborhood teenager.  At least we hope not.

I should mention that power and utility companies have leaped ahead in this area by adding lots of IoT monitoring devices to their 19th-century electric infrastructure. It’s a security problem for another day.

Anyway, back doors and other vulnerabilities like the ones we saw in the WiFi cameras exploited in the Mirai incident also show up in plain business-class networking gear.

The Defenseless Perimeter

Over the summer, Cisco disclosed an exploit, known as ExtraBacon, which lets attackers remotely execute code in one of their firewall products. And in July, a zero-day exploit involving a self-signed certificate was discovered in a Juniper product. It would let attackers monitor internal network traffic.

Perhaps even more disturbing is the hacking potential of firmware — the hardware-level code on which routers, phones, laptops, and other gadgets rely. Since firmware is not typically digitally signed, hackers can alter it to contain malware that then takes over the device.

Imagine an insider in the company’s data center, secretly working for a cybergang, who loads firmware containing a deadly implant onto a router.

Or even more sneakily, a cyber gang hacks into a router manufacturer’s website and replaces the good device firmware with an evil version, which is then downloaded to thousands of routers and firewalls around the world!

If you want to lose more sleep at night, read this article in Wired about the security holes in firmware. By the way, our frenemies at the NSA were way ahead of the curve in weaponizing this low-level code, which then recently found its way to cyber-criminal groups after our nation’s top security agency was itself …  hacked.

The larger point is that in the face of these deep vulnerabilities, the standard perimeter defense provides laughably little protection.

We’ve Defaulted!

The other lesson from the massive Mirai attack is how vulnerable we were as a result of our own IT laziness. Sure, we can excuse harried consumers for treating their home routers and IoT gadgetry like toasters and other kitchen appliances – just plug it in and forget about it.

Unfortunately, even easy-to-use and maintenance-free consumer routers — I’m talking to you, Linksys — require some attention, including changing default settings and putting complex passwords in place.

So what excuse do professional IT types have for this rookie-level behavior?

Not much!

Unfortunately, default-itis still plagues IT organizations.

As recently as 2014, the Verizon DBIR specifically noted that for POS-based attacks, the hackers typically scanned for public ports and then guessed weak passwords on the POS server or device — either defaults that were never changed or passwords created for convenience, like “admin1234”.

This is exactly the technique used in the Mirai botnet attack against the IoT cameras.

Even if hackers use other methods to get inside a corporate network — phishing, most likely — they can still take advantage of internal enterprise software in which default accounts were never changed.

That was the case in the mega-hack of Target. The hackers already knew about a default account and password used by a maker of popular IT management software installed on the Target network. The hackers leveraged this default account — which gave them privileged access — to copy credit card data to the exfiltration server.

Two Points

For those in IT who think that the Mirai botnet incident has nothing to do with them or have to convince their managers of this, here are the two points that summarize this post:

  1. The lesson of the Mirai botnet attack is that the perimeter will always have leaks. For argument’s sake, even if you overlook phishing scenarios, there will continue to be vulnerabilities and holes in routers, network devices, and other core infrastructure that allow hackers to get inside.
  2. Human nature tells us that IT will also continue to experience default-itis. Enterprise software is complicated. IT is often under pressure to quickly get apps and systems to work. As a result, default accounts and weak passwords that were set for reasons of convenience — thinking that users will change the passwords later — will always be an issue for organizations.

Conclusion: You have to plan for attackers breaching the first line of defenses, and therefore have in place security controls to monitor and detect intruders.

In a way, we should be thankful for the “script kiddies” who launched the Mirai botnet DDoS attack: it’s a great lesson for showing that companies should be looking inward, not at the perimeter, in planning their data security and risk mitigation programs.

Verizon 2016 DBIR: Same Old Thing

If you’re into security stats like we are, then Verizon’s annual release of their Data Breach Investigations Report (DBIR) is a big deal. Based heavily on data points supplied by US and international security agencies as well as private sources, the Verizon report looks back at the previous year’s breach and incident data. To no one’s surprise, the report yet again paints a troubling picture.

To summarize their results, they quote noted data security expert Yogi Berra, “It’s like déjà vu, all over again.” If you look at our past DBIR-related posts, they would apply this year as well.

In short, companies face financially-driven hackers and insiders, who use malware, stolen credentials, or phishing as attack vectors. They get in quickly and then remove payment card information, PII, and other sensitive data. It often takes IT staff months to even discover there’s been a breach.

And yet again, we’re bad at keeping patches up-to-date to address known CVEs, and lag in other basic block-and-tackle remediations, such as eliminating short or default passwords and restricting sensitive file data to authorized users.

In case you’re wondering, the Verizon team continues to recommend monitoring and auditing controls as a way to reduce risks.

We’re currently combing through the 90 pages of the DBIR for any unusual findings and security insights. As in previous years, we’ll let you know what we come up with in a future post.