Varonis Announces AI-Powered Data Discovery and Classification

Varonis' new LLM-driven data scanning gives customers deeper business context with unmatched precision and scale.

Last updated August 6, 2024

Today, we are strengthening our industry-leading data classification engine with powerful new AI data classification capabilities.

Using novel machine learning techniques to analyze sentiment and business context, Varonis can automatically discover, understand, and categorize customers’ unique data better than ever.

Without accurate and complete data classification, it’s impossible to prioritize risk, remediate exposures, or enforce downstream security controls. Gartner reports that over 35% of data security projects fail due to inadequate data discovery and classification.

Every Varonis customer is different, each with its own proprietary data types and formats. By combining the power of trainable AI classifiers and Varonis’ battle-tested classification, organizations can reap the benefits of multiple techniques to optimize accuracy, performance, and cost. No rigorous tuning, no black boxes.

Read on to learn more about how our next-gen AI classification works and what sets us apart from first-gen AI classification solutions.

Building on our market leadership

Varonis has long been considered the leading data security solution on the market, with nearly two decades of data classification expertise. Our classification engine is recognized in the Forrester Wave™ for Data Security Platforms for its scalability, accuracy, contextual awareness, and incremental scanning functionality.

Our data classification approach is based on a principle we call the three Cs:

Complete. We perform full scans on huge data stores. No blind spots.
Contextual. We can determine if sensitive data is exposed, misplaced, mislabeled, or under attack.
Current. We know what’s created and changed as it happens, so visibility is updated in real time.

Other solutions rely on sampling, even where it is illogical to do so. They provide limited or no context into exposure, identity, or data access activity, and they stay unaware of new or changed data until they perform time-consuming rescans.
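To make the contrast between full rescans and change-aware scanning concrete, here is a minimal Python sketch. It is an illustration only and does not describe how Varonis implements scanning. The `IncrementalScanner` class, the `fingerprint` helper, and the toy classifier in the usage note are all hypothetical names, and a production system would more likely react to file-change events than hash content on every pass.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used to detect content changes."""
    return hashlib.sha256(content).hexdigest()


class IncrementalScanner:
    """Hypothetical scanner that reclassifies only new or changed files."""

    def __init__(self, classify: Callable[[bytes], str]) -> None:
        self._classify = classify
        self._seen: Dict[str, str] = {}   # path -> last known fingerprint
        self.labels: Dict[str, str] = {}  # path -> latest classification label

    def scan(self, files: Dict[str, bytes]) -> List[str]:
        """Classify new or changed files and return the paths that were rescanned."""
        rescanned: List[str] = []
        for path, content in files.items():
            digest = fingerprint(content)
            if self._seen.get(path) != digest:
                # New or modified file: classify it and remember its fingerprint.
                self.labels[path] = self._classify(content)
                self._seen[path] = digest
                rescanned.append(path)
        return rescanned
```

Under these assumptions, a first call to `scan` classifies every file, while later calls touch only the files whose content has changed, so results stay current without repeating the whole scan.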

A CISO who switched to Varonis from another classification technology said, “Our three-year contract expired before our first scan finished. By then, the results were completely obsolete.”

We pride ourselves on the ability to act on classification results with real-time alerting on sensitive data sharing, misconfigurations, abnormal access, excessive access — anything that puts data in harm’s way or violates policy.

The ability to classify multi-petabyte environments has been essential for our success. We’ve addressed the gaps left by first-gen AI-based classification tools, making Varonis the ultimate classification solution for all your data, wherever it lives.

AI data classification done right

In speaking with customers about their experiences with first-gen AI classification, we identified several challenges with other solutions that were rushed to market or are over-reliant on general-purpose LLMs. These conversations translated into functional requirements for our AI.

Minimal training requirements

First-gen AI models required hundreds or thousands of true positives and true negatives to train custom classifiers. Varonis custom classifiers need very few true positives to achieve high levels of accuracy.

Cost-effective local scanning and data sovereignty

AI, when used inefficiently, can require significant compute. This has led many vendors to require customer data to be sent to the vendor’s cloud computing infrastructure to be classified. Varonis AI models are computationally efficient enough to run on local compute. Not only does this reduce cost, but it also means data doesn’t need to leave the customer’s environment to be scanned.

Transparency and flexibility

Users of first-gen AI classification reported that it was hard to know whether AI models were identifying the required data sets consistently, especially when combined with sampling, as is the practice with many vendors.

In other cases, when customers were able to verify that the AI was not identifying the required data sets consistently, they had no recourse but to wait for the vendor to assist, because the AI models were a “black box.” Varonis AI models are reasonably transparent, and customers can adjust them.

The magic combo of AI and pattern-matching

With the addition of AI classification, Varonis expands its already vast classification capabilities to provide teams with a full arsenal to choose the right tool for the job.

AI specializes in determining context and sentiment. However, AI can be less efficient and less accurate than rule-based classification methods at identifying many of the data elements our customers are tasked with finding, such as credit card numbers, credentials, account numbers, and other identifiers.

The real magic is in combining the two. In current testing, adding trainable classifiers to our existing classification policies increased default accuracy from ~95% to better than ~99%, reducing both false negatives and false positives.
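As a rough illustration of combining the two approaches, the Python sketch below pairs a rule-based credit card detector (a regular expression plus the Luhn checksum) with a score from a context classifier. It is a sketch under stated assumptions, not Varonis' implementation: `find_card_numbers`, `classify`, the `threshold` value, and the `keyword_stub` stand-in for a trained model are all hypothetical.

```python
import re
from typing import Callable, List

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Double every second digit, counting from the rightmost digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Rule-based step: regex candidates filtered by the Luhn checksum."""
    return [m.group() for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group())]


def keyword_stub(text: str) -> float:
    """Placeholder for a trained context classifier; NOT a real model."""
    return 1.0 if "confidential" in text.lower() else 0.0


def classify(text: str, context_score: Callable[[str], float], threshold: float = 0.5) -> dict:
    """Flag text as sensitive if either the rules or the model score fire."""
    cards = find_card_numbers(text)
    score = context_score(text)
    return {
        "card_numbers": cards,
        "context_score": score,
        "sensitive": bool(cards) or score >= threshold,
    }
```

In this sketch the rules catch well-structured identifiers cheaply and deterministically, while the model score covers content that has no fixed pattern, for example `classify("Confidential merger plan", keyword_stub)` is flagged even though it contains no card number.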

Ready to secure your data?

The right data classification strategy can help your company prevent breaches, investigate incidents quickly, and ensure you're meeting increasingly stringent regulations. By focusing on coverage, accuracy, and scale, the Varonis Data Security Platform can help you overcome your biggest security risks with virtually no manual effort.

Combine LLM-based and rule-based classification for fast and accurate results
Understand context around sensitive data exposure, permissions, and access activity
Automatically remediate exposures, enforce least privilege, and apply security policies
Automatically label data to enforce downstream DLP and DRM
Continuously monitor sensitive data and respond to abnormal behavior

If you have any questions, don’t hesitate to contact us.

Rob Sobers is a software engineer specializing in web security and is the co-author of the book Learn Ruby the Hard Way.