
Data Retention. Wait, what?

Updated: Dec 14, 2021

By: Jerry Craft, vCISO, Sr. Security Consultant | Nth Generation


As a virtual CISO, I have the pleasure of meeting with executives across a variety of business verticals every day to discuss business risk. One risk that keeps coming up, only to get pushed aside, is data retention. Time and time again, I find that some executives are unsure how, or if, they should be concerned about data retention or data classification.


For example, I was speaking with a governance committee as part of their normal vCISO meeting cadence. We launched into our agenda items, and item fourteen was data retention. The IT team was trying to drive down operational costs by reducing the amount of data stored in Office 365 email, so IT wanted to delete everything from the past 5 years.


The CIO sat back and said, “I can see the reasoning, but I don’t know what we really hope to gain, what risk we are trying to solve, and why it hurts to keep the data.”


I started my conversation with him outlining the basics:


Data Retention is a conceptual tool we can use to control risk. It is a control mechanism that lets us limit the amount of data, and therefore the data risk, that we must protect, collect, delete, and manage over time.


Data Classification is the first step of any data retention program. We must assign a value to data so that we can prioritize protection, collection, deletion, and management over time.


Are we forced to do it? Yes and no. Some states have laws that address data retention, and there is some federal guidance as well. Since the company I am discussing is in California, the relevant law is the California Privacy Rights Act (CPRA). Source


What about other compliance frameworks? Yes, many of them address data retention, not always as “mandates”, but as guidance, goals, and requirements.


What if we keep all data? Then you will need to protect, collect, (not) destroy, and manage all of it the same way. Without classification, that HIPAA record is treated as just as valuable as that email asking your friend what’s for lunch: all data is equal and must be maintained equally. Source

What are the risks?

  • Compromise: If you only have 7 years of data, then only that data can be compromised. Recently we have seen compromises on “highly confidential” systems that only had one month of data because of data retention requirements, and thus only one month of data was stolen. A small consolation prize but a good one, especially if you have 24 years of data like the reference below. Source

  • Privacy: Again, privacy comes up all the time as a hot topic in government conversations. There will be more federal and state guidance soon.

  • Lawsuits: Whether it is a privacy case or some other type of lawsuit, you will be held to your data retention rule, so failing to follow it (and especially failing to make one at all) puts you at risk. Source

  • Corporate Overload: It has been found that 95% of corporate data is unmanaged and unnecessary for data retention purposes. Source

  • No CISO Role: CISOs play a central role in records management and data retention. We support the investigations done in many enterprises. Ask me about the time I told the FBI, “Sorry, that doesn’t exist anymore.” Source

  • No Breach / Forensic Evidence: If you had criminal activity on your network and state that you keep everything, you are at risk if you don’t have log data. Source

  • eDiscovery – Friend or Foe: Yes, this is the old discussion about having to produce something at someone’s request for a future lawsuit. Source

And there are other risks as well, but none of these results in a pure dollar loss, which makes them hard for executives to quantify. Much like breach planning, you must assume these risks will come true and be willing to accept the risk of falling short. So, many executives will accept the risk and delay these data-related projects, but let me be clear about upcoming trends.


Data is growing. It is not slowing down. And with analytics projected to become a core business function, it is important to realize that this risk is going to grow exponentially. Source

Thus, understanding and quantifying which data classifications are necessary to run the business and drive analytics, versus those less important data points (e.g. lunch meeting calendar events), becomes critical. You want to invest in protection, collection, destruction, and management for the data that has a true return on investment, not for every memo about nonbusiness activities, such as football pool memos.

Challenges:

So, what do you do about these new data retention challenges? How do you start? First, realize that some of that data doesn’t need to be there for 20 years, and begin a data classification project. Identify those systems that have true “confidential data” and why. Some of these systems are easy to qualify. A Customer Relationship Management system (CRM) is easily a “confidential” data source. Take that first CRM system and document it in policy as a confidential data system and define some basic parameters.

  1. What is backup retention for this confidential data source?

  2. What is the data retention for this confidential data source?

  3. How will we protect this resource? Antivirus? Security monitoring? Data Loss Prevention (DLP)?

  4. Make this “confidential data source” a standard.

  5. Now review what you have done with this one system and confirm it is correct and not missing any information. Perfection is the enemy here, so don’t aim for it. Chasing perfection will stop you from ever starting this project.
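
To make the list above concrete, here is a minimal sketch of how the answers for one confidential data source could be written down as a structured policy entry. Everything in it is an assumption for illustration: the `DataSourcePolicy` class, the `is_past_retention` helper, and especially the retention numbers, which in practice come from your own legal and compliance review.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Tuple


@dataclass(frozen=True)
class DataSourcePolicy:
    """One policy entry for a classified data source (hypothetical structure)."""
    name: str                     # the system being documented, e.g. the CRM
    classification: str           # e.g. "confidential"
    backup_retention_days: int    # question 1: how long backups are kept
    data_retention_days: int      # question 2: how long the data itself is kept
    protections: Tuple[str, ...]  # question 3: controls applied to the source


def is_past_retention(policy: DataSourcePolicy, created: date, today: date) -> bool:
    """Return True when a record is older than the policy's data retention period."""
    return today - created > timedelta(days=policy.data_retention_days)


# Placeholder values only; they are not recommendations.
crm_policy = DataSourcePolicy(
    name="CRM",
    classification="confidential",
    backup_retention_days=90,
    data_retention_days=7 * 365,
    protections=("antivirus", "security monitoring", "DLP"),
)
```

Once the first entry looks right, the same structure can be reused as the standard for the next confidential system, and `is_past_retention(crm_policy, record_date, date.today())` gives a simple yes or no on whether a record has outlived its retention period.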

Another area where organizations get stuck is when they don’t know where the confidential data resides. As you know, I never write about technology or services, but this is one case where there are obvious advantages. If you don’t know where the data is, then you need to find it. That means you need a scanner to look at all the nooks and crannies of your infrastructure and see where every IT administrator puts data, and what permissions, settings, and data reside on each system.
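
To illustrate the idea of scanning for confidential data, here is a toy sketch that walks a folder, flags files whose text matches a couple of simple patterns, and reports each file's permissions. It is not how any commercial scanner works; the `scan_tree` function and the two patterns are assumptions made up for this example, and a real project would define its own patterns and cover far more than a local file system.

```python
import re
import stat
from pathlib import Path
from typing import Iterator, Tuple

# Illustrative patterns only; a real classification project defines its own.
PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scan_tree(root: Path) -> Iterator[Tuple[Path, str, str]]:
    """Yield (path, pattern name, permission string) for each pattern a file matches."""
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
            mode = stat.filemode(path.stat().st_mode)
        except OSError:
            continue  # unreadable file: skip it rather than stop the scan
        for name, pattern in PATTERNS.items():
            if pattern.search(text):
                yield path, name, mode
```

Calling `list(scan_tree(Path("/some/share")))` (a hypothetical path) returns one entry per file and matching pattern, which is enough to start a list of where sensitive-looking data lives and who can read it.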


Varonis Data Risk Assessment is a free service by Nth. I have been using Varonis since 2008. I bought it in my previous operational role as CISO at a bank because I had no idea where the data was located. I had no way of knowing the finer details of the settings, permissions, or data collected in our systems. So, I had Varonis come out and scan my infrastructure, and in no time I had the answers to my questions. It wasn’t too ugly because the data was all in normal places, but now I knew how to write policy about the “dos and don’ts” of managing confidential data.

You can request more info on the service here.

That’s how I found the data, and that’s how I inform others.
