The modern enterprise relies on data to power everything from operational decisions to strategic initiatives. Organizations increasingly turn to cloud data platforms like Snowflake to gain scale, performance, and flexibility.
However, many enterprise security teams mistakenly assume that since Snowflake is a secure system, they don’t need to take any actions to secure their Snowflake data. In reality, Snowflake data is vulnerable if the right security measures aren’t taken.
In 2024, a large-scale campaign targeting Snowflake users led to data breaches at an estimated 165 companies. In this blog, we’ll discuss the shared responsibility model and how you can secure your sensitive Snowflake data to prevent data breaches.
Like most cloud platforms, Snowflake uses a shared responsibility model when it comes to data security.
Snowflake maintains robust security for the underlying platform, including infrastructure security, encryption of data at rest and in transit, and providing the mechanisms for access control. However, it’s the customers’ responsibility to leverage Snowflake access controls and other tools to make sure that they:
Traditional shared responsibility model
Snowflake provides native data governance and security features such as role-based access control, usage monitoring, column- and row-level security, and network policies. It also supports single sign-on (SSO) and multifactor authentication (MFA).
However, organizations with complex Snowflake environments struggle to secure their Snowflake data using only these native capabilities because:
Organizations are forced to either build one-off security processes or accept a lack of visibility into where their sensitive data resides, overly broad user access, and an inability to detect malicious activity such as insider threats and compromised credentials.
Snowflake lacks robust native data classification capabilities to automatically discover and classify sensitive information. Admins must rely on Snowflake’s tags and comments to manually classify and tag databases, schemas, tables, and columns.
These manual processes become completely impractical for even medium-sized Snowflake environments.
In Snowflake, roles define both the set of objects that a user can access and the set of privileges (Create, Read, Update, Delete) that users have for each object.
Roles can be granted to users as well as to other roles, creating inheritance chains with permissions cascading downward. A user gets the permissions of the roles assigned to them, plus the permissions of every role granted, directly or indirectly, to those roles.
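To make the inheritance behavior concrete, here is a minimal Python sketch that resolves a user's effective privileges by walking the role graph. The role names, table names, user name, and the dictionaries that hold the grants are all hypothetical and exist only for illustration. In a real deployment the grants would have to be read from Snowflake itself, and this sketch does not attempt that.

```python
# Hypothetical grants, for illustration only.
# role_grants[r] lists the roles granted TO role r, so r inherits their privileges.
role_grants = {
    "ANALYST": ["MARKETING_READ"],
    "MARKETING_READ": ["CUSTOMER_PII_READ"],
    "CUSTOMER_PII_READ": [],
}

# Privileges granted directly to each role, as (privilege, object) pairs.
role_privileges = {
    "ANALYST": {("SELECT", "REPORTS.CAMPAIGN_STATS")},
    "MARKETING_READ": {("SELECT", "MARKETING.LEADS")},
    "CUSTOMER_PII_READ": {("SELECT", "CUSTOMERS.PII")},
}

# Roles assigned directly to each user.
user_roles = {"dana": ["ANALYST"]}


def effective_roles(direct_roles, role_grants):
    """Return the direct roles plus every role reachable through role-to-role grants."""
    seen = set()
    stack = list(direct_roles)
    while stack:
        role = stack.pop()
        if role in seen:
            continue  # already visited; also protects against cyclic input
        seen.add(role)
        stack.extend(role_grants.get(role, []))
    return seen


def effective_privileges(user, user_roles, role_grants, role_privileges):
    """Union the privileges of every role the user holds directly or by inheritance."""
    privileges = set()
    for role in effective_roles(user_roles.get(user, []), role_grants):
        privileges |= role_privileges.get(role, set())
    return privileges


dana_privileges = effective_privileges("dana", user_roles, role_grants, role_privileges)
```

In this made-up example, `dana` was only ever assigned `ANALYST`, yet `dana_privileges` includes `("SELECT", "CUSTOMERS.PII")` because of a grant two levels down the chain. That is exactly the kind of access that is easy to miss when reviewing role assignments by hand.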
The most challenging aspects of Snowflake access controls are:
Unintended consequence: Our data analyst now has access to PII data that he never requested and probably isn't even aware of.
This scenario isn't theoretical. It happens all the time in large enterprises where role structures evolve organically over time. The marketing team was simply trying to enable more effective campaigns, with no malicious intent or awareness of the security implications.
Yet, the outcome is a clear violation of least privilege principles that the organization is responsible for maintaining under the shared responsibility model. Moreover, it may constitute a violation of GDPR and other data privacy regulations, potentially subjecting the company to fines and reputational damage.
As organizations adopt machine learning and AI, over-permissioning becomes an immediate threat rather than a potential risk. Before the age of AI, an over-permissioned user might never actually access the sensitive data that they shouldn't have access to.
Most people access only what they need for their immediate tasks, leaving any excessive permissions unused. AI makes it much more likely that over-permissioned access leads to exposure:
In the age of AI, data exposure due to excessive permissions is no longer a risk but rather a certainty. If an AI system can access sensitive data, it will access it, process it, store it, and expose it.
Thus, for organizations that run AI systems on Snowflake data, addressing over-permissioned users is an urgent imperative rather than a risk reduction/compliance exercise.
While Snowflake provides extensive logging that covers queries, object-level access event history, and log-in history, it’s up to the customer to make sense of the log data to figure out what’s normal, what’s unusual, and what’s malicious. However, this is extremely hard to do, especially in large Snowflake deployments.
There are three key challenges:
While Snowflake database, schema, and table names may indicate that they contain sensitive data, there can be columns, views, and data objects that are not clearly labeled but do contain sensitive or highly regulated data like PII, PHI, or PCI.
Additionally, as data pipelines evolve and new datasets are ingested, sensitive information can appear in unexpected locations. This means you can't rely on database, schema, table, and column names to identify sensitive data. Instead, you must continuously scan your entire Snowflake environment and identify sensitive data as soon as it lands in your Snowflake instance.
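As a rough illustration of content-based detection, the Python sketch below classifies columns by looking at sampled values rather than names. It is a simplified pattern-matching example under stated assumptions, not a description of how any particular product works: the two regular expressions, the 0.75 match threshold, and the column names and sample values are all hypothetical.

```python
import re

# Hypothetical patterns for illustration; a real policy library would be far larger.
PATTERNS = {
    "EMAIL": re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(values, threshold=0.75):
    """Label a column when at least `threshold` of its non-empty sampled values match a pattern."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return []
    labels = []
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            labels.append(label)
    return labels


# Made-up sampled values keyed by column name. Note that the names give no hint.
samples = {
    "notes_2": ["a@example.com", "b@example.org", "c@example.net"],
    "ref_code": ["123-45-6789", "987-65-4321", "n/a", "111-22-3333", "222-33-4444"],
    "region": ["EMEA", "APAC"],
}

findings = {column: classify_column(values) for column, values in samples.items()}
```

Here `notes_2` and `ref_code` are flagged even though nothing in their names suggests PII, while `region` comes back clean. Rerunning a scan like this whenever new data is ingested is what turns a one-time inventory into continuous discovery.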
You must be able to identify all your sensitive data in Snowflake, determine who can access it, and right-size permissions so that only those users who genuinely need access to a particular resource get it.
However, since Snowflake's role-based access model creates complex inheritance chains where users accumulate permissions through multiple role assignments, manually reviewing and right-sizing each user's effective permissions across all inheritance paths becomes impossible as your Snowflake deployment grows.
The majority of data breaches involve compromised credentials or insider threats.
Organizations must have a way to analyze user behavior patterns in Snowflake to ensure that access rights are only being used for legitimate purposes.
This is especially critical in the AI era, where automated systems may access vast amounts of data in ways that differ significantly from normal human usage patterns. However, Snowflake offers no native activity monitoring. It provides only raw event logs that arrive with a 45-minute to 3-hour delay and are preserved for up to 90 days.
Varonis empowers security teams and data administrators to identify sensitive data, achieve and maintain least privilege, and detect and prevent unauthorized activity in Snowflake environments.
Varonis offers organizations complete security, visibility, and control over sensitive data in Snowflake, delivering the automated capabilities required to secure Snowflake data at enterprise scale. Varonis helps organizations:
Varonis uses advanced AI models and pattern matching to automatically discover and classify sensitive data across all Snowflake databases, schemas, tables, and columns. This includes structured data that may not be obviously labeled as sensitive, such as custom fields, derived tables, and data that becomes sensitive when combined with other datasets, as well as unstructured data such as free-text fields and file stores.
AI classification identifies novel data types while our comprehensive library of 100+ out-of-the-box classification policies efficiently identifies PII, PHI, financial data, and other regulated information.
Rather than requiring manual analysis of role structures, Varonis automatically parses Snowflake's role hierarchies and determines effective permissions for every user on every data resource. This automated approach enables organizations to achieve and maintain least privilege, even in complex Snowflake environments.
Varonis establishes behavioral baselines for every user and system accessing Snowflake, detecting abnormal patterns that may indicate compromise, insider threats, or AI systems accessing data inappropriately.
For example, while a data analyst running regular marketing reports is normal, it's suspicious if they suddenly access customer PII tables they've never used before, especially if this occurs outside normal business hours or involves unusually large data extracts.
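The analyst example above can be sketched in Python as a simple baseline comparison. This is a toy illustration, not Varonis's detection logic: the `AccessEvent` record, the signal names, the 8:00 to 18:00 business-hours window, and the 10x volume factor are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical, simplified access record."""
    user: str
    table: str
    timestamp: datetime
    rows_returned: int


def build_baseline(history):
    """Record, per user, which tables they have touched and their largest past extract."""
    baseline = {}
    for event in history:
        profile = baseline.setdefault(event.user, {"tables": set(), "max_rows": 0})
        profile["tables"].add(event.table)
        profile["max_rows"] = max(profile["max_rows"], event.rows_returned)
    return baseline


def risk_signals(event, baseline, sensitive_tables, business_hours=(8, 18), volume_factor=10):
    """Return the ways an event deviates from the user's baseline."""
    profile = baseline.get(event.user, {"tables": set(), "max_rows": 0})
    signals = []
    if event.table in sensitive_tables and event.table not in profile["tables"]:
        signals.append("first_access_to_sensitive_table")
    start, end = business_hours
    if not (start <= event.timestamp.hour < end):
        signals.append("outside_business_hours")
    if event.rows_returned > volume_factor * max(profile["max_rows"], 1):
        signals.append("unusually_large_extract")
    return signals


# Made-up history: routine marketing queries during the day.
history = [
    AccessEvent("dana", "MARKETING.LEADS", datetime(2024, 5, 6, 10, 0), 5000),
    AccessEvent("dana", "MARKETING.LEADS", datetime(2024, 5, 7, 11, 0), 4000),
]
baseline = build_baseline(history)
sensitive = {"CUSTOMERS.PII"}

routine = AccessEvent("dana", "MARKETING.LEADS", datetime(2024, 5, 8, 14, 0), 4500)
suspicious = AccessEvent("dana", "CUSTOMERS.PII", datetime(2024, 5, 8, 2, 30), 900000)

routine_signals = risk_signals(routine, baseline, sensitive)
suspicious_signals = risk_signals(suspicious, baseline, sensitive)
```

The routine report produces no signals, while the 2:30 a.m. pull of 900,000 rows from a never-before-touched PII table trips all three. Requiring several signals at once, and weighting them by data sensitivity, is one way to keep alert volume manageable.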
Varonis dramatically reduces false positives by focusing alerts specifically on threats to sensitive data and providing rich context about what makes each activity suspicious.
To protect your sensitive data in Snowflake, you must be able to identify where it lives, right-size who can access it, and detect how it's being accessed and modified.
Only Varonis provides all these critical capabilities in a single platform, addressing the key challenges of sensitive data identification and abnormal access detection that Snowflake's native tools struggle with.
If you're concerned about potential risks in your Snowflake environment, a free Snowflake Data Risk Assessment is the best way to get started. In less than 24 hours, you'll have a comprehensive, risk-based view of your most critical data assets and a clear roadmap to automated remediation that strengthens your security posture while maintaining business productivity.