3 Steps to Secure Your Snowflake Data

Written by Eugene Feldman | May 29, 2025 5:31:36 PM

The modern enterprise relies on data to power everything from operational decisions to strategic initiatives. Organizations increasingly turn to cloud data platforms like Snowflake to gain scale, performance, and flexibility.

However, many enterprise security teams mistakenly assume that since Snowflake is a secure system, they don’t need to take any actions to secure their Snowflake data. In reality, Snowflake data is vulnerable if the right security measures aren’t taken.

In 2024, a large-scale campaign targeting Snowflake users led to data breaches at an estimated 165 companies. In this blog, we’ll discuss the shared responsibility model and how you can secure your sensitive Snowflake data to prevent data breaches.

Who is responsible for data security within your Snowflake instance?

Like most cloud platforms, Snowflake uses a shared responsibility model when it comes to data security.

Snowflake maintains robust security for the underlying platform, including infrastructure security, encryption of data at rest and in transit, and providing the mechanisms for access control. However, it’s the customers’ responsibility to leverage Snowflake access controls and other tools to make sure that they:

Only allow the right people to access the Snowflake instance
Identify all sensitive data and handle it in ways that comply with government and corporate policies
Right-size data access so that each Snowflake user can only access the data they need (least privilege)

[Figure: Traditional shared responsibility model]

Snowflake provides native data governance and security features such as role-based access control, usage monitoring, column- and row-level security, and network policies. It also supports single sign-on (SSO) and multifactor authentication (MFA).

However, organizations with complex Snowflake environments struggle to secure their Snowflake data using only these native capabilities because:

Data classification is manual and not scalable.
Roles and permissions are complex, leading to over-permissioning and data exposure.
Logs require a lot of processing to effectively detect malicious and abnormal access.

Organizations are forced to either build one-off security processes or accept a lack of visibility into where their sensitive data resides, overly broad user access, and an inability to detect malicious activity such as insider threats and compromised credentials.

What’s complex about securing Snowflake data?

It’s hard to figure out which data is sensitive

Snowflake lacks robust native data classification capabilities to automatically discover and classify sensitive information. Admins must rely on Snowflake’s tags and comments to manually classify and tag databases, schemas, tables, and columns.

These manual processes become completely impractical for even medium-sized Snowflake environments.

It’s hard to achieve and maintain least privilege

In Snowflake, roles define both the set of objects that a user can access and the set of privileges (Create, Read, Update, Delete) that users have for each object.

Roles can be granted to users as well as to other roles, creating inheritance chains with permissions cascading downward. A user gets the permissions of the roles assigned to them, plus the permissions of every role that has been granted, directly or indirectly, to those roles.
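To make the inheritance rule concrete, here is a minimal sketch that computes a user's effective privileges by walking the role graph. The role names, user name, and dictionaries are hypothetical stand-ins for grant data you would export from your own Snowflake account; this is an illustration of the rule described above, not a Snowflake API.

```python
from collections import deque

# Hypothetical grant data (assumption: exported from your own account).
# ROLE_GRANTS[r] lists the roles that have been granted to role r,
# so r inherits everything those roles can do.
ROLE_GRANTS = {
    "ANALYST": ["REPORT_READER"],
    "REPORT_READER": ["BASE_READER"],
    "BASE_READER": [],
}
# Privileges granted directly to each role, as (privilege, object) pairs.
ROLE_PRIVILEGES = {
    "ANALYST": {("SELECT", "SALES.ORDERS")},
    "REPORT_READER": {("SELECT", "SALES.REPORTS")},
    "BASE_READER": {("USAGE", "SALES")},
}
# Roles assigned directly to each user.
USER_ROLES = {"alice": ["ANALYST"]}


def effective_roles(user):
    """Return every role the user holds directly or through inheritance."""
    seen = set()
    queue = deque(USER_ROLES.get(user, []))
    while queue:
        role = queue.popleft()
        if role in seen:  # already visited; also protects against cycles
            continue
        seen.add(role)
        queue.extend(ROLE_GRANTS.get(role, []))
    return seen


def effective_privileges(user):
    """Union of the direct privileges of every effective role."""
    privileges = set()
    for role in effective_roles(user):
        privileges |= ROLE_PRIVILEGES.get(role, set())
    return privileges


print(sorted(effective_privileges("alice")))
```

Even though alice was assigned a single role, her effective privileges include everything granted two levels further along the chain.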

The most challenging aspects of Snowflake access controls are:

Multi-level inheritance: Permissions can cascade through several layers of roles.
Multiple inheritance paths: Users can receive the same privilege through different role combinations.
Privilege aggregation: Effective permissions are the sum of all inherited privileges.
Dynamic role changes: As roles evolve over time, the impact of earlier provisioning decisions becomes increasingly difficult to track.
Object-level privileges: Different privileges on the same object (SELECT vs. MODIFY) create additional complexity.
Future grants: Snowflake allows granting privileges on future objects in a schema, creating unpredictable access patterns as new objects get created.

As organizations scale their Snowflake deployments, the complexity created by nested roles makes it a virtual certainty that users get many more permissions than they need.

Example: How nested roles lead to unintended permissions

Let's say we have a data analyst who needs access to marketing data, so he gets assigned a MARKETING_ANALYST role.
At some point, the organization creates a CAMPAIGN_MANAGER role that includes the MARKETING_ANALYST role for convenience (i.e., fewer checkboxes for objects and privileges to check during the role creation process).
Our data analyst works on a Tableau marketing dashboard connected to Snowflake and needs access to campaign data to build performance metrics and visualizations, so he's also granted the CAMPAIGN_MANAGER role.
Marketing decides to run hyper-targeted campaigns that use PII data for more granular targeting. They add additional access to customer PII data to the CAMPAIGN_MANAGER role.

Unintended consequence: Our data analyst now has access to PII data that he never requested and probably isn't even aware of.
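The scenario above can be sketched as a small path search that answers "through which roles does this user reach this privilege?" The two role names come from the example; the user name, table names, and data layout are hypothetical, and the code is only an illustration of tracing grant paths by hand.

```python
# Role names are from the scenario; everything else is a hypothetical assumption.
USER_ROLES = {"analyst": ["MARKETING_ANALYST", "CAMPAIGN_MANAGER"]}
# CAMPAIGN_MANAGER includes (inherits) MARKETING_ANALYST.
ROLE_GRANTS = {
    "CAMPAIGN_MANAGER": ["MARKETING_ANALYST"],
    "MARKETING_ANALYST": [],
}
ROLE_PRIVILEGES = {
    "MARKETING_ANALYST": {("SELECT", "MARKETING.CAMPAIGN_STATS")},
    "CAMPAIGN_MANAGER": {
        ("SELECT", "MARKETING.CAMPAIGNS"),
        ("SELECT", "CUSTOMERS.PII"),  # added later for hyper-targeted campaigns
    },
}


def grant_paths(user, privilege):
    """Return every chain user -> role -> ... that ends at a role holding `privilege`."""
    paths = []

    def walk(role, path):
        if role in path:  # guard against cyclic role grants
            return
        path = path + [role]
        if privilege in ROLE_PRIVILEGES.get(role, set()):
            paths.append([user] + path)
        for inherited in ROLE_GRANTS.get(role, []):
            walk(inherited, path)

    for role in USER_ROLES.get(user, []):
        walk(role, [])
    return paths


# The PII privilege arrives only through the role granted for dashboard work.
print(grant_paths("analyst", ("SELECT", "CUSTOMERS.PII")))
# The original marketing privilege is reachable through two different paths.
print(grant_paths("analyst", ("SELECT", "MARKETING.CAMPAIGN_STATS")))
```

Listing the paths, rather than just the final set of privileges, shows which grant would have to change to remove the unintended access.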

This scenario isn't theoretical — it happens all the time in large enterprises where role structures evolve organically over time. The marketing team was simply trying to enable more effective campaigns, with no malicious intent or awareness of the security implications.

Yet, the outcome is a clear violation of least privilege principles that the organization is responsible for maintaining under the shared responsibility model. Moreover, it may constitute a violation of GDPR and other data privacy regulations, potentially subjecting the company to fines and reputational damage.

Over-permissioning is an even bigger problem in the age of AI

As organizations adopt machine learning and AI, over-permissioning becomes an immediate threat rather than a potential risk. Before the age of AI, an over-permissioned user might never actually access the sensitive data that they shouldn't have access to.

Most people usually only access what they need for their immediate tasks, leaving any excessive permissions unused. AI makes it much more likely that over-permissioned access leads to exposure:

AI consumes all the data it can access: Unlike humans, AI systems systematically analyze all the data they can access, making it virtually guaranteed that over-permissioned access will result in sensitive data exposure.
AI recognizes patterns across datasets: AI surfaces relationships between seemingly unrelated data points that most people would miss, potentially exposing sensitive insights from data that appears non-sensitive.
AI stores the data it analyzes: Once an AI system processes data, it becomes a part of its knowledge and creates lasting exposure beyond the initial access.
AI processes data in huge volumes: Since an over-permissioned AI agent processes data much faster than an over-permissioned person, sensitive data gets exposed at a tremendous scale and speed.

In the age of AI, data exposure due to excessive permissions is no longer a risk but rather a certainty. If an AI system can access sensitive data, it will access it, process it, store it, and expose it.

Thus, for organizations that run AI systems on Snowflake data, addressing over-permissioned users is an urgent imperative rather than a risk reduction/compliance exercise.

It’s hard to identify malicious or abnormal data access patterns

While Snowflake provides extensive logging that covers queries, object-level access event history, and login history, it’s up to the customer to make sense of the log data and determine what’s normal, what’s unusual, and what’s malicious. This is extremely hard to do, especially in large Snowflake deployments.

There are three key challenges:

Large volume of logs: Large organizations deal with hundreds of thousands of queries per day across thousands of objects.
No contextual awareness: While Snowflake offers extremely comprehensive logs, it doesn’t correlate different types of events like queries, logins, and data access. It also doesn’t have any metadata about the access events.
Technical barriers: While a lot of the data is technically available, it takes a lot of data engineering and data analysis time and effort to identify abnormal data access patterns and to create any useful alerts.
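As a small illustration of the correlation work these challenges imply, the sketch below attaches to each query the most recent earlier login by the same user, so that a query can be read alongside the client IP it likely came from. The record fields (user, ts, client_ip, query_id) are hypothetical and assume you have already exported login and query events into plain Python dictionaries.

```python
from bisect import bisect_right
from datetime import datetime


def attach_login_context(logins, queries):
    """Enrich each query with the client IP of the user's latest prior login."""
    logins_by_user = {}
    for login in sorted(logins, key=lambda event: event["ts"]):
        logins_by_user.setdefault(login["user"], []).append(login)

    enriched = []
    for query in queries:
        user_logins = logins_by_user.get(query["user"], [])
        login_times = [login["ts"] for login in user_logins]
        # Index of the first login strictly after the query time.
        index = bisect_right(login_times, query["ts"])
        login = user_logins[index - 1] if index else None
        enriched.append({**query, "client_ip": login["client_ip"] if login else None})
    return enriched


# Hypothetical exported events.
logins = [
    {"user": "alice", "ts": datetime(2025, 5, 29, 9, 0), "client_ip": "10.0.0.5"},
    {"user": "alice", "ts": datetime(2025, 5, 29, 23, 0), "client_ip": "203.0.113.9"},
]
queries = [
    {"user": "alice", "ts": datetime(2025, 5, 29, 9, 30), "query_id": "q1"},
    {"user": "alice", "ts": datetime(2025, 5, 29, 23, 5), "query_id": "q2"},
    {"user": "bob", "ts": datetime(2025, 5, 29, 10, 0), "query_id": "q3"},
]
for row in attach_login_context(logins, queries):
    print(row["query_id"], row["client_ip"])
```

This is only one join out of many a real pipeline would need, which is the point of the effort described above.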

#1 Identify sensitive data

While Snowflake database, schema, and table names may indicate that they contain sensitive data, there can be columns, views, and data objects that are not clearly labeled but do contain sensitive or highly regulated data like PII, PHI, or PCI.

Additionally, as data pipelines evolve and new datasets are ingested, sensitive information can appear in unexpected locations. This means you can't rely on database, schema, table, and column names to identify sensitive data, but instead, you must find a way to continuously scan your entire Snowflake environment and identify sensitive data as soon as it gets into your Snowflake instance.
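To illustrate classifying by content rather than by name, here is a minimal sketch that labels a column based on sampled values. The two regular expressions, the 80% threshold, and the column names are illustrative assumptions; a real classifier needs far broader coverage and validation, and this is not how any particular product works.

```python
import re

# Illustrative patterns only (assumption): real detection needs validation
# such as checksums and many more data types.
PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(values, threshold=0.8):
    """Return labels whose pattern matches at least `threshold` of the non-empty values."""
    sample = [str(value).strip() for value in values if value not in (None, "")]
    if not sample:
        return []
    labels = []
    for label, pattern in PATTERNS.items():
        hits = sum(1 for value in sample if pattern.match(value))
        if hits / len(sample) >= threshold:
            labels.append(label)
    return labels


# Hypothetical sampled values: the column names give no hint about the contents.
sampled = {
    "NOTES.FIELD_7": ["a@example.com", "b@example.org", None, "c@example.net"],
    "NOTES.FIELD_8": ["blue", "green", "red"],
}
findings = {column: classify_column(values) for column, values in sampled.items()}
print(findings)
```

Running a check like this on a schedule against newly loaded tables is one way to approximate the continuous scanning described above.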

#2 Ensure users can only access the data they need

You must be able to identify all your sensitive data in Snowflake, determine who can access it, and right-size permissions so that only those users who genuinely need access to a particular resource get it.

However, since Snowflake's role-based access model creates complex inheritance chains where users accumulate permissions through multiple role assignments, manually reviewing and right-sizing each user's effective permissions across all inheritance paths becomes impossible as your Snowflake deployment grows.
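One way to approach right-sizing is to compare what each user can access with what they have actually used. The sketch below assumes you already have two hypothetical inputs: a map of effective grants per user and a map of last-access dates per user and object. The 90-day idle window is an arbitrary assumption for illustration.

```python
from datetime import date, timedelta


def unused_grants(granted, last_access, today, max_idle_days=90):
    """List (user, object, last_access_date) for grants not used within the window."""
    cutoff = today - timedelta(days=max_idle_days)
    findings = []
    for user, objects in sorted(granted.items()):
        for obj in sorted(objects):
            last = last_access.get((user, obj))
            # Never used, or not used since the cutoff date.
            if last is None or last < cutoff:
                findings.append((user, obj, last))
    return findings


# Hypothetical inputs.
granted = {"analyst": {"MARKETING.CAMPAIGNS", "CUSTOMERS.PII"}}
last_access = {("analyst", "MARKETING.CAMPAIGNS"): date(2025, 5, 20)}

candidates = unused_grants(granted, last_access, today=date(2025, 5, 29))
print(candidates)
```

Each entry is a candidate for review rather than an automatic revocation, since some access is legitimately used only rarely.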

#3 Ensure access rights aren't misused

The majority of data breaches involve compromised credentials or insider threats.

Organizations must have a way to analyze user behavior patterns in Snowflake to ensure that access rights are only being used for legitimate purposes.

This is especially critical in the AI era, where automated systems may access vast amounts of data in ways that differ significantly from normal human usage patterns. However, Snowflake offers no native activity monitoring and only offers raw event logs that come in with a 45-minute to 3-hour delay and are only preserved for up to 90 days.

How Varonis helps organizations secure their Snowflake data

Varonis empowers security teams and data administrators to identify sensitive data, achieve and maintain least privilege, and detect and prevent unauthorized activity in Snowflake environments.

Varonis offers organizations complete security, visibility, and control over sensitive data in Snowflake, delivering the automated capabilities required to secure Snowflake data at enterprise scale. Varonis helps organizations:

#1 Identify sensitive data

Varonis uses advanced AI models and pattern matching to automatically discover and classify sensitive data across all Snowflake databases, schemas, tables, and columns. This includes structured data that may not be obviously labeled as sensitive, such as custom fields, derived tables, data that becomes sensitive when combined with other datasets, as well as unstructured data such as free-text fields and file stores.

AI classification identifies novel data types while our comprehensive library of 100+ out-of-the-box classification policies efficiently identifies PII, PHI, financial data, and other regulated information.

#2 Ensure users can only access the data they need

Rather than requiring manual analysis of complex role structures, Varonis automatically parses through Snowflake's complex role hierarchies and determines effective permissions for every user on every data resource. This automated approach enables organizations to achieve and maintain least privilege, even in complex Snowflake environments.

#3 Ensure access rights aren't misused

Varonis establishes behavioral baselines for every user and system accessing Snowflake, detecting abnormal patterns that may indicate compromise, insider threats, or AI systems accessing data inappropriately.

For example, while a data analyst running regular marketing reports is normal, it's suspicious if they suddenly access customer PII tables they've never used before, especially if this occurs outside normal business hours or involves unusually large data extracts.
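The kind of check described in this example can be sketched in a few lines. This is a deliberately simple illustration under stated assumptions, not a description of how Varonis implements detection: the event fields, the 8:00 to 18:00 business-hours window, and the three-standard-deviation threshold are all hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev


def build_baseline(history):
    """Record which tables each user has touched and how many rows they usually read."""
    baseline = {}
    for event in history:
        entry = baseline.setdefault(event["user"], {"tables": set(), "rows": []})
        entry["tables"].add(event["table"])
        entry["rows"].append(event["rows"])
    return baseline


def score_event(event, baseline, sensitive_tables, business_hours=(8, 18)):
    """Return the reasons, if any, that make an access event look unusual."""
    reasons = []
    entry = baseline.get(event["user"], {"tables": set(), "rows": []})
    if event["table"] in sensitive_tables and event["table"] not in entry["tables"]:
        reasons.append("first access to sensitive table")
    start, end = business_hours
    if not (start <= event["ts"].hour < end):
        reasons.append("outside business hours")
    if len(entry["rows"]) >= 2:
        typical, spread = mean(entry["rows"]), pstdev(entry["rows"])
        if event["rows"] > typical + 3 * max(spread, 1):
            reasons.append("unusually large extract")
    return reasons


# Hypothetical history of routine marketing reports.
history = [
    {"user": "analyst", "table": "MARKETING.CAMPAIGNS", "rows": rows,
     "ts": datetime(2025, 5, day, 10, 0)}
    for day, rows in [(26, 1000), (27, 1200), (28, 900)]
]
baseline = build_baseline(history)
suspicious = {"user": "analyst", "table": "CUSTOMERS.PII", "rows": 500000,
              "ts": datetime(2025, 5, 29, 2, 30)}
print(score_event(suspicious, baseline, sensitive_tables={"CUSTOMERS.PII"}))
```

Returning the reasons alongside the verdict gives whoever triages the alert the context for why an event was flagged.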

Varonis dramatically reduces false positives by focusing alerts specifically on threats to sensitive data and providing rich context about what makes each activity suspicious.

See how Varonis can help your organization secure Snowflake data

To protect your sensitive data in Snowflake, you must be able to identify where it lives, right-size who can access it, and detect how it's being accessed and modified.

Only Varonis provides all these critical capabilities in a single platform, addressing the key challenges of sensitive data identification and abnormal access detection that Snowflake's native tools struggle with.

If you're concerned about potential risks in your Snowflake environment, a free Snowflake Data Risk Assessment is the best way to get started. In less than 24 hours, you'll have a comprehensive, risk-based view of your most critical data assets and a clear roadmap to automated remediation that strengthens your security posture while maintaining business productivity.
