Stealthbits

Scanning for Sensitive Data in Snowflake with Stealthbits AnyData


Having multiple public/private clouds and data repositories has become ubiquitous in professional environments. For most, gone are the days of storing all data on local filers or even in a limited set of online repositories. The reality is that each organization’s sensitive data is being stored in many cloud databases, object storage repos, SMB implementations, version control, CRM software, and more.

These days the list seems to be never-ending – Azure Storage, GitHub, Snowflake, Salesforce, Google Cloud/Drive, AWS, Dropbox, Wasabi, Backblaze, and many, many more.

While the popularity of public and private cloud infrastructure comes with many benefits, it also means that your data, especially your sensitive data, is now fragmented across many locations, types of storage, and vendors. If your organization were to receive a Data Subject Access Request (DSAR), how prepared would you be to locate all of that user’s personally identifiable information (PII) in a timely manner?

This is where data loss prevention (DLP) and data classification software come into play. However, these engines traditionally target very specific environments, potentially only on-premises storage or a limited subset of cloud repositories. Stealthbits’ AnyData connector closes that gap: it allows our Sensitive Data Discovery (SDD) engine to be pointed at any cloud, on-premises storage, online database, CRM software, or other source that allows data to be fetched via an API.

Users are no longer limited to scanning pre-defined environments. As an example, let’s discuss how AnyData can be used to scan Snowflake databases. Snowflake describes itself as a data warehouse delivered as a cloud service (a software-as-a-service (SaaS) offering), and it is rapidly growing in popularity.

An Introduction to AnyData

At a high level, AnyData gives users the ability to take StealthAUDIT’s sensitive data engine and point it at any data source to scan for sensitive data. By leveraging PowerShell and the target source’s API, AnyData fetches data (e.g., text, files, or objects) from the target and scans that data in real time for pre-defined and user-defined criteria such as SSNs, credit card numbers, national IDs, medical information, and financial records.
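To make the idea of "criteria" concrete, here is a minimal sketch of pattern-based scanning. It is written in Python purely for illustration and is not the AnyData module or its Get-SensitiveData function; the criteria names, the regular expressions, and the scan_text helper are all assumptions made for this example, and production criteria are far more thorough.

```python
import re

# Hypothetical criteria for illustration only; not the patterns any product ships.
CRITERIA = {
    "US SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "Credit card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def scan_text(text, criteria=CRITERIA):
    """Return (criterion name, matched text) pairs found in `text`."""
    hits = []
    for name, pattern in criteria.items():
        for match in pattern.finditer(text):
            value = match.group()
            # A checksum cuts down false positives for card-like digit runs.
            if name == "Credit card" and not luhn_valid(value):
                continue
            hits.append((name, value))
    return hits


sample = "SSN 123-45-6789, card 4111 1111 1111 1111, order 1234 5678 9012 3456"
hits = scan_text(sample)
```

The order number in the sample looks like a card number but fails the checksum, so only the SSN and the well-known test card number are reported. Pairing a pattern with a validation step like this is one common way to keep user-defined criteria from flooding results with noise.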

This is groundbreaking stuff! Traditional DLP software requires complex integrations to target specific platforms, but with AnyData you just need to fetch the data in PowerShell and feed it to the Get-SensitiveData function provided by the AnyData PowerShell module. For many targets, we’ve provided the PowerShell for you (as StealthAUDIT jobs) so all you need to do is point AnyData at a target and start the scan. However, AnyData also gives users the flexibility to write their own PowerShell targeting any API; the possibilities are endless!

The PowerShell to feed AnyData is also designed with performance and local storage availability in mind. AnyData fetches one file or chunk of text from the target at a time, scans it, returns the results, and then discards the local copy of that data. This avoids the need to download everything from the cloud all at once, which could potentially require unrealistic amounts of local storage.
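The fetch, scan, discard loop described above can be sketched as follows. This is a Python illustration under stated assumptions, not AnyData's actual PowerShell: remote_objects stands in for whatever a target API returns one object at a time, and download_to_temp, scan_file, and scan_repository are hypothetical helper names.

```python
import os
import re
import tempfile

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative criterion only


def download_to_temp(payload):
    """Write one fetched object to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fh:
        fh.write(payload)
    return path


def scan_file(path):
    """Return every SSN-like string found in the file at `path`."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        return SSN.findall(fh.read())


def scan_repository(remote_objects):
    """Scan (name, bytes) pairs one at a time, keeping only the findings."""
    findings = {}
    for name, payload in remote_objects:
        path = download_to_temp(payload)
        try:
            hits = scan_file(path)
            if hits:
                findings[name] = hits
        finally:
            os.remove(path)  # discard the local copy before the next fetch
    return findings


# A generator stands in for an API that hands back one object per call.
remote = iter([
    ("hr/export.csv", b"id,ssn\n1,123-45-6789\n"),
    ("docs/readme.txt", b"nothing to see here"),
])
findings = scan_repository(remote)
```

Because the input is consumed lazily and each temporary file is removed in a finally block, local disk usage is bounded by the largest single object rather than the size of the whole repository.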

It’s also worth noting that while AnyData will primarily be used in the StealthAUDIT jobs that scan entire cloud repositories, it can also be pointed at individual files or strings of text for real-time scans by importing the AnyData module into an interactive PowerShell session. This can be useful for testing a new sensitive data criterion or verifying a larger workflow before scripting it out and deploying it to production.

AnyData + Snowflake

Now that we’ve established a general understanding of how AnyData complements and expands on existing StealthAUDIT sensitive data discovery workflows, we can dive into the specifics of targeting a Snowflake database environment.

In the case of Snowflake, the API is their ODBC driver. This provides easy access to the schema as well as database and table enumeration. Once we’ve enumerated the database(s), we sample several tables simultaneously to determine which columns contain sensitive data.
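As a rough sketch of that sampling step, the Python below checks a handful of rows from several tables in parallel and reports which columns contain matches. Everything here is an assumption for illustration: the TABLES dictionary stands in for a real warehouse, fetch_sample stands in for a sampling query issued through the ODBC driver, and the table and column names are made up.

```python
import re
from concurrent.futures import ThreadPoolExecutor

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # illustrative criterion only

# Stand-in for a warehouse: table name -> rows as dictionaries (made-up data).
TABLES = {
    "CUSTOMERS": [
        {"NAME": "Ada", "TAX_ID": "123-45-6789"},
        {"NAME": "Grace", "TAX_ID": None},
    ],
    "ORDERS": [
        {"ORDER_ID": 1, "NOTE": "gift wrap"},
    ],
    "TICKETS": [
        {"TICKET_ID": 7, "BODY": "My SSN is 987-65-4321, please update it."},
    ],
}


def fetch_sample(table, limit=100):
    """Hypothetical stand-in for a query returning up to `limit` rows."""
    return TABLES[table][:limit]


def sensitive_columns(table, limit=100):
    """Return (table, sorted names of columns whose sampled values match)."""
    flagged = set()
    for row in fetch_sample(table, limit):
        for column, value in row.items():
            if value is not None and SSN.search(str(value)):
                flagged.add(column)
    return table, sorted(flagged)


def scan_tables(tables, workers=4):
    """Sample several tables at once and map each to its flagged columns."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(sensitive_columns, tables))


report = scan_tables(list(TABLES))
```

Sampling keeps the scan fast on very large tables, at the cost of possibly missing sensitive values that only appear outside the sampled rows, so the sample size is a trade-off between speed and coverage.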

We then import this data to the Access Information Center (AIC) for a Resource Audit, which provides a friendly graphical view of the discovered sensitive data across your Snowflake databases.

That’s all there is to it! The goal of AnyData is to simplify the sensitive data discovery process for as many sources as possible. Whether the source is cloud or on-prem, storage or version control, file-based or text-based, AnyData abstracts away the complexities of DLP workflows.

Stealthbits Technologies

IDENTIFY THREATS. SECURE DATA. REDUCE RISK.

Stealthbits’ StealthAUDIT data access governance solution includes a sensitive data component that helps organizations identify where sensitive data is located, who has access to it, how it’s being accessed, and what they’re doing with it.

AnyData is the newest aspect of StealthAUDIT’s sensitive data capabilities and opens a world of possibilities for tracking down sensitive data and classifying it regardless of where that data is stored.

StealthAUDIT includes:

Host Discovery: Identify the different platforms within the network that may contain various unstructured and structured data repositories to ensure a comprehensive view of your organization’s sensitive data.

Sensitive Data Discovery + AnyData: Analyze content for patterns or keywords that match built-in or customized criteria.

Remediation Actions: Automate all or portions of the tasks you need to perform to remediate sensitive data violations. Learn more about Stealthbits’ Data Access Governance here.


© 2020 Stealthbits Technologies, Inc.
