Data Access Governance (DAG) has many use cases, most of which fall into three categories: data security, regulatory compliance, and operational efficiency. Much has been written about security because of the increasing frequency of ransomware attacks, and much is being written about compliance, most recently around privacy, but we haven’t talked much about the operational efficiency use case.
A good DAG program allows organizations to manage more data with fewer people and to identify data that can be safely archived or deleted, freeing up storage resources. In this blog series, we will walk through best practices for storage reclamation: identifying and deleting data that is no longer in use anywhere in your organization.
The amount of unstructured data managed by organizations continues to grow at a rapid pace. File servers provide a simple, scalable approach for sharing documents; however, file systems are rarely cleaned up. Over the years, the data stored on these file servers builds up and becomes dated, leading to difficulties in finding files when they are needed, increased costs for storage space, and significant risk if sensitive information is stored within these files.
Eventually, it becomes necessary to clean up unnecessary files, which can be a daunting task when there are millions of files and thousands of users who may be accessing them. Most organizations fail to address this problem and eventually face the consequences, including increased storage costs or losses from a breach or insider threat.
Here are the five capabilities needed to efficiently clean up a file server with minimal impact to end users.
A file server cleanup can be performed for any number of reasons, but it is important to have a clear goal in mind prior to engaging in a clean-up project. Some of the scenarios where file cleanups may be necessary include:
The following capabilities will enable a successful file cleanup effort, regardless of the business driver.
The simple question, “Where is business data stored?” can be unexpectedly hard to answer. Employees can store data wherever they have rights, and it can be difficult to enforce standards as to what data goes where. To start a file server cleanup project, some basic visibility is needed.
Identifying all file servers within the organization is a necessary first step for storage reclamation. Ideally, this information is centrally managed in a Configuration Management Database (CMDB). If not, a discovery scan of the entire environment will be required. Beyond just identifying where servers exist, it is useful to understand the size of the file servers, their operating systems, the number of shared folders, and the sharing protocols used (CIFS/NFS), all of which will be needed to plan the cleanup.
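As a rough illustration, the inventory details listed above could be captured in a simple record, which also makes it easy to split servers into small groups for a phased cleanup. The `FileServer` fields, the server names, and the `plan_batches` helper are all hypothetical and not taken from any particular CMDB or product; this is a minimal sketch, assuming the inventory has already been exported.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileServer:
    """One row of a hypothetical file server inventory (e.g. a CMDB export)."""
    name: str
    operating_system: str
    size_tb: float
    share_count: int
    protocols: List[str] = field(default_factory=list)


def plan_batches(servers, max_tb_per_batch):
    """Group servers into cleanup batches, largest first.

    A new batch is started whenever adding the next server would push the
    current batch over max_tb_per_batch. A single server larger than the
    cap still gets a batch of its own.
    """
    batches = []
    current = []
    current_tb = 0.0
    for server in sorted(servers, key=lambda s: s.size_tb, reverse=True):
        if current and current_tb + server.size_tb > max_tb_per_batch:
            batches.append(current)
            current, current_tb = [], 0.0
        current.append(server)
        current_tb += server.size_tb
    if current:
        batches.append(current)
    return batches


# Illustrative inventory; names and sizes are made up.
inventory = [
    FileServer("fs-hq-01", "Windows Server 2019", 12.0, 40, ["CIFS"]),
    FileServer("fs-lab-02", "Linux", 3.5, 8, ["NFS"]),
    FileServer("fs-branch-03", "Windows Server 2016", 6.0, 15, ["CIFS"]),
]
batches = plan_batches(inventory, max_tb_per_batch=10.0)
```

Sorting by size puts the servers with the largest potential savings first; a real plan would likely also weigh business ownership and maintenance windows.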
Once the file servers have been identified, it is necessary to investigate the files stored within them. Typically, this will be done by approaching a small subset of servers at a time rather than inspecting files across all file servers simultaneously. Some of the attributes that should be inspected are outlined below.
| Attribute | Value |
| --- | --- |
| File Extension | File extensions can identify which file types are stored on the system and help identify application, business, and personal file types. |
| File Size | Knowing where the largest files exist is useful for achieving the most storage savings. |
| Owner / Author | Communication with the person who created the file may be needed during the cleanup process. |
| Date Modified | This indicates how stale a file is. This does not take into account file reads. |
| Date Accessed | This indicates the last time a file was accessed. This may not be accurate in all cases. |
| Date Created | Useful for finding where files are actively being created. |
| Tags / Keywords | Provides data classification details to determine where sensitive or confidential data resides. This is only valuable if an organization is leveraging a data classification solution. |
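Several of these attributes can be read directly from file system metadata. The sketch below walks a directory tree with Python's standard library and totals file counts, bytes, and stale bytes per extension. The function names and the 365-day staleness threshold are assumptions made for illustration. Creation time and classification tags are left out because they are not exposed consistently across platforms, and the numeric owner ID is only meaningful on POSIX systems (it is 0 on Windows).

```python
import os
import time
from collections import defaultdict


def scan_tree(root):
    """Yield one metadata record per file entry found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # unreadable or already deleted: skip it
            yield {
                "path": path,
                "extension": os.path.splitext(filename)[1].lower(),
                "size_bytes": info.st_size,
                "owner_id": info.st_uid,   # POSIX user ID; 0 on Windows
                "modified": info.st_mtime,
                "accessed": info.st_atime,  # may be unreliable, as noted above
            }


def summarize(records, stale_after_days=365, now=None):
    """Total files, bytes, and stale bytes per file extension.

    A file counts as stale when its modified time is older than
    stale_after_days (an assumed threshold, not a recommendation).
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    by_extension = defaultdict(lambda: {"files": 0, "bytes": 0, "stale_bytes": 0})
    for record in records:
        bucket = by_extension[record["extension"]]
        bucket["files"] += 1
        bucket["bytes"] += record["size_bytes"]
        if record["modified"] < cutoff:
            bucket["stale_bytes"] += record["size_bytes"]
    return dict(by_extension)
```

Calling `summarize(scan_tree(path))` on a mounted share gives a quick view of which file types hold the most reclaimable space. At the scale of millions of files, a purpose-built scanner would likely be needed instead of a single-threaded walk.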
Any files containing sensitive information such as personally identifiable information (PII), Payment Card Industry (PCI) data, Protected Health Information (PHI), or other confidential data should be treated with special care during a cleanup campaign. These files should be managed under their own retention and security policies, and monitored more closely than non-sensitive files. Common approaches for identifying sensitive data include:
Awareness of sensitive data enables a much more secure cleanup campaign. Policies can be enacted for where sensitive data is stored, who has access to it, how long it is maintained, and how the data is monitored.
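One widely used way to flag candidate sensitive content is pattern matching. The sketch below is an illustration only, not the method of any specific product: it looks for US Social Security number formats and for digit runs that pass the Luhn checksum used by payment card numbers. The pattern names and the `find_sensitive` helper are hypothetical, and real classification tools add context checks to reduce false positives.

```python
import re

# Nine digits in the common 3-2-4 SSN layout, e.g. 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive(text):
    """Return (label, matched_text) pairs for SSN-like and card-like strings."""
    findings = []
    for match in SSN_PATTERN.finditer(text):
        findings.append(("ssn_like", match.group()))
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):  # drops most random digit runs
            findings.append(("card_like", match.group()))
    return findings
```

Files with findings could then be routed to the stricter retention and monitoring policies described above rather than deleted along with ordinary stale data.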
In the next post, we will continue working through the list, starting with monitoring activity and file usage and how they relate to storage reclamation.
As the VP of Product Marketing, Darin is responsible for product messaging and positioning as well as generating industry and market awareness for STEALTHbits products. He is an experienced leader who has worked in software for over 21 years.
Prior to joining STEALTHbits, he was VP of Marketing for Quorum and SecureAuth, and has held positions in product management & product marketing at Oracle, and Quest Software.
© 2022 Stealthbits Technologies, Inc.