What is Data Security?

Data Security is the process of protecting files, databases, and accounts on a network by adopting a set of controls, applications, and techniques that identify the relative importance of different datasets, their sensitivity, and their regulatory compliance requirements, and then apply appropriate protections to secure those resources.

Like other approaches such as perimeter security, file security or user behavioral security, data security is not the be-all and end-all of a security practice. It’s one method of evaluating and reducing the risk that comes with storing any kind of data.

Why Data Security?

If the Data Security process is just one of many different ways to structure your organization’s information security systems, what makes it better than competing methods?

Broadly speaking, most other security processes are “user-centric”. They focus on questions like:

  • Is this user allowed to access this data?
  • Is this person authorized to be on this network?
  • Is this person abusing system resources?

That focus is great and necessary, but it struggles with many real-world issues: large organizations have hundreds or thousands of servers with haphazardly applied permissions, antiquated user groups, and gaps in knowing who is accessing what.

A data-centric security model is a practical way of approaching this from a different direction.

Data vs User Security Models

Imagine a scenario where a user on your customer service team places a spreadsheet containing customer Personally Identifiable Information like Social Security Numbers or other sensitive records onto a globally accessible shared folder.

User-Centric Model: this isn’t a problem, everyone has the proper rights to access that file.

Data Security Model: this is a huge problem, as sensitive information is now available to every intern, contractor or “coasting through their two weeks’ notice until they take a new job at your biggest competitor” employee with network access.

This scenario makes plain the big dependency of a Data Security approach: data classification.

What is Data Classification?

Data Classification is the technical term for knowing what data is held within any given file.

Classification is typically conducted along two different paths:

  1. File Types: certain files such as private SSH keys and PKI certificates are just inherently more sensitive.
  2. File Information: the actual data contained within the file.

File information is by far the more complicated of the two paths as it’s necessary to define what sensitive information “looks” like to the software application doing the classification.

In some cases, this is as straightforward as looking for certain tell-tale strings like “password” or “credit card” in a file, but more often it’s necessary to define patterns to be matched within a given file.
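As a rough illustration of the two paths, here is a minimal Python sketch. The extension list, the keyword list and the function names are all assumptions made for this example, not part of any particular classification product. Note that many sensitive files (a private SSH key named id_rsa, for instance) have no extension at all, so a real system would need more than a suffix check.

```python
from pathlib import Path

# Hypothetical rules for illustration only; real deployments tune both lists.
SENSITIVE_SUFFIXES = {".pem", ".key", ".pfx", ".p12"}
SENSITIVE_KEYWORDS = ("password", "credit card")


def classify_by_type(path: Path) -> bool:
    """Path 1: return True when the file extension alone marks the file as sensitive."""
    return path.suffix.lower() in SENSITIVE_SUFFIXES


def classify_by_keywords(text: str) -> list[str]:
    """Path 2 (simplest form): return the tell-tale keywords found in the text, ignoring case."""
    lowered = text.lower()
    return [keyword for keyword in SENSITIVE_KEYWORDS if keyword in lowered]
```

Keyword matching like this is cheap but crude: it flags a file that merely mentions the word “password” and misses a file full of card numbers with no label, which is why pattern matching is usually needed as well.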

Regular Expressions

Regular Expressions are one of the primary ways that people communicate patterns to computer programs, so it’s no surprise that they are one of the key components of many classification systems used in data security practices.

US Social Security numbers have a distinct pattern of 3 digits, a dash, 2 digits, a dash followed by 4 digits.


A minimal regular expression matching that pattern within a document would be:

\d{3}-\d{2}-\d{4}

If the above looks like printable line noise to you, you are not alone. While a comprehensive review of Regular Expressions is beyond the scope of this article, we encourage you to at least get a feel for their use with one of the online Regular Expression testers, which are an easy way to get instant feedback on your knowledge.
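To make the Social Security Number pattern concrete, here is a small sketch using Python’s built-in re module. The names SSN_PATTERN and find_ssn_candidates are made up for this example, and the pattern is the plain “3 digits, dash, 2 digits, dash, 4 digits” shape described earlier. It does not check whether a number was ever actually issued, so treat the results as candidates rather than confirmed hits.

```python
import re

# 3 digits, a dash, 2 digits, a dash, 4 digits.
# The \b anchors stop a match from starting or ending in the middle of a
# longer run of digits or letters.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_candidates(text: str) -> list[str]:
    """Return every substring of the text that has the shape of a US SSN."""
    return SSN_PATTERN.findall(text)
```

For example, find_ssn_candidates("Customer 078-05-1120 called about order 1234-56-78901") returns only "078-05-1120", because the order number has too many digits on either side to fit the pattern.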

Data Security Techniques

The following are broad techniques used in the field of Data Security to improve security.

Stop Collecting Unneeded Data

The last decade of IT management has seen a shift in how data is considered. Previously it was an asset: having more data was almost always better than less as you could never be sure ahead of time what you might want to do with it.

Today, data is a liability. The threat of a reputation-destroying data breach, losses in the millions or stiff regulatory fines all reinforce the thought that collecting anything beyond the minimum amount of sensitive data is extremely dangerous.

To that end: review all data collection procedures. Document why each data point is needed from a business standpoint.

Purge Stale Data

Data that is not on your network is data that can’t be compromised. Put in place systems that track file access and automatically archive files that haven’t been accessed for years. In the modern age of near yearly acquisitions, reorganizations and “synergistic relocations” it’s quite likely that networks of any significant size have multiple forgotten servers that are kept around for reasons no one is quite sure of.
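As a starting point for finding archive candidates, the sketch below walks a directory tree and lists files whose last-access time is older than a cutoff. The function name and parameters are hypothetical. It also assumes the file system records access times reliably, which is not a given: volumes mounted with options such as noatime or relatime update them rarely or never, so a dedicated access-auditing system is a better source of truth.

```python
import os
import time
from pathlib import Path
from typing import Optional


def find_stale_files(root: str, max_age_days: float, now: Optional[float] = None) -> list[Path]:
    """Return files under root whose last-access time is older than max_age_days."""
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                last_access = path.stat().st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if last_access < cutoff:
                stale.append(path)
    return sorted(stale)
```

The function only reports candidates. Whether they are then archived, moved to cold storage or deleted is a policy decision that belongs outside a script like this.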

Quarantine Sensitive Files

Earlier, we described a common scenario where a file containing sensitive data was placed on a share open to the entire company. Systems that continually classify data and take preemptive action to move those files to a secure location are worth their weight in gold as they dramatically shorten the length of time that data is not under the proper control.
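A very stripped-down version of that idea might look like the following sketch. It scans a share for files containing text shaped like a Social Security Number and moves them into a quarantine folder. Everything here is an assumption for illustration: the function name, the single SSN-only rule, and the requirement that the quarantine folder sits outside the share and already has restricted permissions. A production system would run continuously, handle file name collisions and log every move.

```python
import re
import shutil
from pathlib import Path

# Same shape as a US SSN: 3 digits, dash, 2 digits, dash, 4 digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def quarantine_sensitive_files(share: Path, quarantine: Path) -> list[Path]:
    """Move files under share that contain SSN-like text into quarantine.

    Assumes quarantine is NOT inside share. Files with the same name
    overwrite each other in this simplified sketch.
    """
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(share.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file; leave it in place
        if SSN_PATTERN.search(text):
            target = quarantine / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```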

Track User Behavior against Data Groups

The general problem plaguing rights management within an organization is “overpermissioning”. One-off temporary projects and rights grants on the network rapidly become a baroque and convoluted web of interdependencies, leaving users collectively with access to far more data on the network than they need for their roles.

Systems that profile user behavior and automatically put in place permissions to match that behavior limit the potential damage that any one user (or a malicious attacker who compromises their account) can do.
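At its core, spotting overpermissioning is a comparison between what users are allowed to touch and what they actually touch. The sketch below shows that comparison with plain Python sets. The function name and the shape of the inputs (a mapping from user name to a set of resource paths) are assumptions for this example. In practice the “granted” data would come from your directory and file system permissions, and the “accessed” data from access audit logs over a meaningful time window.

```python
def unused_grants(
    granted: dict[str, set[str]],
    accessed: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per user, the resources they may access but never touched."""
    report = {}
    for user, resources in granted.items():
        # Set difference: granted resources minus observed accesses.
        unused = resources - accessed.get(user, set())
        if unused:
            report[user] = unused
    return report
```

With granted = {"alice": {"/hr", "/finance", "/eng"}, "bob": {"/eng"}} and accessed = {"alice": {"/eng"}, "bob": {"/eng"}}, the report contains only alice, with "/hr" and "/finance" as candidates for revocation. A human review step before revoking anything is still a good idea, since some access is legitimately rare (year-end reporting, for example).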

Respect Data Privacy

Data Privacy is a distinct aspect of cybersecurity dealing with the rights of individuals and the proper handling of data under your control. For more, read our Guide to Data Privacy.

Data Security Regulations: GDPR, HIPAA and SOX

Regulations such as HIPAA (healthcare), SOX (public companies) and GDPR (anyone who knows that the EU exists) are best considered from a data security perspective. While there are other aspects of them, at their core they require that organizations:

  • Track what kinds of sensitive data they possess
  • Be able to produce that data on demand
  • Prove to auditors that they are taking appropriate steps to safeguard the data

These requirements do more than fit comfortably within a data security mindset: they all but require it.

Practical Data Security

For companies that hold data and have security obligations under GDPR or other regulations, understanding how Varonis approaches data security will help you manage and meet data protection and privacy requirements.

The mission at Varonis is simple: your data is our primary focus, and our data security platform protects your file and email systems from cyber attacks and insider threats. We’re fighting a different battle – so your data is protected first. Not last.

We continuously collect and analyze activity on your enterprise data, both on-premises and in the cloud. We then leverage five metadata streams to ensure that your organization’s data has confidentiality, integrity, and availability.

Users and Groups – Varonis collects user and group information and maps their relationships for a complete picture of how user accounts are organized.

Permissions – We add the file system structure and permissions from the platforms that we monitor, and combine everything into a single framework for analysis, automation, and access visualization.

Access Activity – Varonis continually audits all access activity, and records & analyzes every touch by every user. Varonis automatically identifies administrators, service accounts and executives and creates a baseline of all activity. Now you can detect suspicious behavior, whether it’s an insider accessing sensitive content, an administrator abusing their privileges, or ransomware like CryptoLocker.

Perimeter Telemetry – Varonis Edge analyzes data from perimeter devices such as VPN, proxy servers, and DNS, and combines this information with data access activity to detect and stop malware, APT intrusions, and data exfiltration.

Content Classification – We then scan for sensitive and critical data, and can absorb classification from other tools like DLP or e-Discovery. Now we know where sensitive data lives and where it’s overexposed.

Michael Buckbee

Michael has worked as a sysadmin and software developer for everyone from Silicon Valley startups to the US Navy, and everything in between.