How to Discover GDPR Data With Varonis

Michael Buckbee
Last updated May 9, 2022

GDPR goes into effect in less than 85 days – but there’s still time to prepare. The first step in getting ready for the upcoming deadline is to discover and classify your GDPR data.

More often than not, we’re seeing that customers have much more GDPR-eligible data than they thought they had – or even knew existed. A recent GDPR Readiness Assessment for a mid-sized insurance company revealed some eye-opening results. In the example below, we focused on a single data store with 12 TB of data in 20+ million files across 1.36 million folders.

Excerpt of GDPR data from a Varonis GDPR Readiness Assessment

On that single data store, we found over 15,000 files with GDPR-sensitive data. Ninety percent of the files that held German data – ranging from DE passport numbers to Personalausweisnummer (German identity card number) – were open to the entire company… and the German data was in the best shape. Classification hits for France, Spain, and Sweden were 100% exposed!

How Can I Identify My GDPR Data?

It can be difficult to discover and classify what data falls under the GDPR – so difficult, in fact, that we built GDPR-specific patterns on top of our classification engine to do just that.

The Varonis Data Security Platform maps your data stores, so that you can monitor and analyze data that falls under the GDPR. This map contains the folders and permissions for all storage volumes where GDPR-sensitive data can exist, from a NetApp server to EMC Isilon to Windows to Office 365 (and beyond).

Once you have that map of data, you can begin scanning those files for GDPR data. We see GDPR data in Word documents, spreadsheets, Notepad files, even XML files. Our Data Classification Engine is file-type agnostic, so we will find the data even if it’s zipped.

Varonis GDPR Patterns has over 250 patterns and regexes for GDPR data, covering all 28 EU countries. It identifies and flags data that looks like an IBAN, social security number, passport number, personal ID card, VAT number, mobile phone number, license plate number, tax registry number, and much more. You’ll be able to review the results in the DatAdvantage console with a GDPR category tag.
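To give a feel for what pattern-based matching involves, here is a minimal Python sketch for one data type, the IBAN. It is not one of the Varonis GDPR Patterns: the regex and the function names are illustrative assumptions. It pairs a loose regex with the standard mod-97 checksum so that random strings which merely look like an IBAN get filtered out.

```python
import re

# Illustrative pattern only (an assumption, not a Varonis pattern): a country
# code, two check digits, then 11 to 30 alphanumerics, optionally space-separated.
IBAN_CANDIDATE = re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]){11,30}\b")


def iban_checksum_ok(candidate: str) -> bool:
    """Apply the standard mod-97 check to a candidate IBAN."""
    compact = candidate.replace(" ", "")
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # Letters map to numbers (A=10, B=11, ..., Z=35); digits stay as they are.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def find_ibans(text: str) -> list[str]:
    """Return compact IBAN-like strings in text that pass the checksum."""
    hits = []
    for match in IBAN_CANDIDATE.finditer(text):
        if iban_checksum_ok(match.group()):
            hits.append(match.group().replace(" ", ""))
    return hits


if __name__ == "__main__":
    sample = "Customer IBAN: DE89 3704 0044 0532 0130 00, order ref AB12 3456 7890 1234."
    print(find_ibans(sample))
```

The regex alone would flag both strings in the sample; the checksum step is what separates a real-looking account number from an order reference. A production pattern would also need per-country length rules and smarter boundary handling, which this sketch leaves out.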

Sample of data classification matches for GDPR data – Germany

It can take a few weeks to scan all of your unstructured data stores if you run the system 24×7. This is one task where throwing processor power at the problem does make it go faster. You can also distribute the work across several Varonis Collector Servers to multiply the number of CPUs doing the work. The more, the merrier! And don’t worry – the Collector caps the amount of CPU the Data Classification Engine can use, so there’s minimal performance impact, and plenty of headroom left for the rest of the OS to do its work.

On an 8-CPU system, the Data Classification Engine can scan around 100 GB per hour per Varonis Collector Server. In a day, that comes to 2.4 TB of data per Collector.

Disclaimer: These numbers are based on internal testing; your mileage may vary.
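For rough planning, the throughput figure above can be turned into a back-of-the-envelope estimate. The Python sketch below is only arithmetic: the function name is hypothetical, and it assumes throughput scales linearly with the number of Collectors, which real deployments may not achieve.

```python
GB_PER_TB = 1000  # decimal units, matching 100 GB/hour -> 2.4 TB/day


def estimate_scan_days(total_tb: float, collectors: int = 1,
                       gb_per_hour: float = 100.0) -> float:
    """Estimate days for an initial scan running 24x7.

    Assumes each Collector sustains gb_per_hour and that adding
    Collectors scales throughput linearly (an optimistic assumption).
    """
    if collectors < 1:
        raise ValueError("at least one Collector is required")
    tb_per_day = gb_per_hour * 24 / GB_PER_TB * collectors
    return total_tb / tb_per_day


if __name__ == "__main__":
    # The 12 TB data store from the assessment example, with 1 and 4 Collectors.
    for count in (1, 4):
        print(count, "Collector(s):", round(estimate_scan_days(12, count), 2), "days")
```

Under those assumptions, the 12 TB data store from the assessment example would take about five days on a single Collector.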

How Can I Find New GDPR Data?

Data Classification Engine continues to scan your data after the initial scan is complete, since users will update and add data faster than you can lock it down. Varonis updates the previously mentioned folder and permissions map daily (or on whatever schedule you configure) and then adds modified folders back into the queue to get scanned again. Data Classification Engine does not stop, it doesn’t feel pity or remorse, it will find all the GDPR data, and then it still won’t stop.
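The general idea of re-queuing only what changed can be sketched in a few lines of Python. This is not how Varonis implements it; the snapshot format (folder path mapped to a last-modified stamp) and the function names are assumptions made for illustration.

```python
from collections import deque


def folders_to_rescan(previous: dict[str, float], current: dict[str, float]) -> list[str]:
    """Return folders that are new or whose modification stamp changed."""
    return sorted(path for path, stamp in current.items()
                  if previous.get(path) != stamp)


def refresh_queue(queue: deque, previous: dict[str, float],
                  current: dict[str, float]) -> deque:
    """Append changed folders to the scan queue, skipping ones already queued."""
    queued = set(queue)
    for path in folders_to_rescan(previous, current):
        if path not in queued:
            queue.append(path)
            queued.add(path)
    return queue


if __name__ == "__main__":
    yesterday = {"/hr": 100.0, "/finance": 200.0}
    today = {"/hr": 100.0, "/finance": 250.0, "/legal": 300.0}
    print(list(refresh_queue(deque(["/finance"]), yesterday, today)))
```

Comparing two snapshots keeps each pass proportional to what changed rather than to the full size of the data store, which is why rescans can run continuously after the initial scan.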

Once you discover your GDPR data, you need to figure out what to do with it – how to manage, process, and report on it – which I’ll cover in the next few parts of this series.

If you already know you need to prepare for GDPR, see how you’re doing with a free GDPR Readiness Assessment. We’ll do an assessment of your current state and present a report highlighting GDPR data, potential vulnerabilities, and strategies to protect that data.
