Rich Vining

Overcoming the Risk that Redundant Personal Data Brings Under GDPR

Blog Post created by Rich Vining on Mar 16, 2017

Most companies in Europe, and multinational companies doing business in Europe, are grappling with how to meet the daunting challenges of the General Data Protection Regulation (GDPR), which goes into effect in May 2018. The penalties for non-compliance are set to be extremely painful: up to €20 million or 4% of worldwide annual turnover (revenue) for the entire organization, whichever is higher. That’s an unthinkable amount of money for a Google, Apple, Facebook or Amazon, but it will be equally painful for any company found to violate the GDPR.

Considering the vast number of GDPR requirements to protect the personally identifiable information (PII) of the European residents that organizations interact with, including customers, partners and employees, it is becoming obvious that there is not, and will not be, a simple, comprehensive solution that assures compliance. Each article in the regulation poses its own challenges, and therefore each will require a custom approach to compliance. Compliance will require new processes, conversions or migrations of the data these companies already have, and a range of new technologies to help automate the processes.


One of the more interesting and challenging components of the GDPR is the requirement to access and copy (article 15), correct (article 16), delete (article 17), restrict usage of (article 18), or move (article 20) a PII record when requested by a protected individual. Among other things, a European resident will have the right to be forgotten and the right to move their data from one service provider to another. That’s a tough ask for any company that holds PII in its systems, and it will require major changes to policies and processes, and to the systems that manage and execute them.


Copy Data Management

Included within the new right to be forgotten is the requirement that the data controller delete all copies of the PII upon request, no matter where they are. Do you know where every copy of any file is, much less which data is in each file? For example, most organizations retain dozens of backup copies of all of their data for operational and disaster recovery capabilities. They also create copies that are provided to test and development (and DevOps) departments, to finance and legal functions, and to 3rd-party suppliers such as advertising agencies. Sales reps and others probably have PII on their laptops or in their sync-and-share accounts, which may be on 3rd-party services (e.g. Dropbox) that your corporate IT department is unaware of.


Analysts estimate that more than 60% of all data storage capacity is consumed by copy data, and that as much as 500PB of capacity will be shipped in 2018 to support copy data.


Is every copy under your direct control, whether it’s in your facility, in the cloud or with a 3rd party? Do you know how to find them, produce them and delete them on demand in a reasonable time frame? If they contain PII of residents of the European Union, you will need to be able to show that you can, prior to May 2018. That’s not much time to redesign policies and processes that are core to your business.
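
To make the "find them and delete them on demand" question concrete, here is a minimal sketch of a copy catalog, assuming the organization registers every copy it creates along with the data subjects whose PII it holds. The class names, locations and subject IDs are all hypothetical; this is an illustration of the bookkeeping involved, not a description of any product or a complete compliance mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class CopyRecord:
    """One known copy of a data set (hypothetical structure)."""
    location: str      # e.g. "primary-db", "nightly-backup", "test-dev"
    subject_ids: set   # data subjects whose PII this copy contains


@dataclass
class CopyCatalog:
    """Tracks which copies hold which individuals' PII (illustrative only)."""
    records: list = field(default_factory=list)

    def register(self, location, subject_ids):
        record = CopyRecord(location, set(subject_ids))
        self.records.append(record)
        return record

    def find_copies(self, subject_id):
        """Return every registered copy that contains this subject's PII."""
        return [r for r in self.records if subject_id in r.subject_ids]

    def erase_subject(self, subject_id):
        """Drop the subject from every copy; return the locations touched."""
        touched = []
        for record in self.find_copies(subject_id):
            record.subject_ids.discard(subject_id)
            touched.append(record.location)
        return touched


catalog = CopyCatalog()
catalog.register("primary-db", ["alice", "bob"])
catalog.register("nightly-backup", ["alice"])
catalog.register("test-dev", ["bob"])
print(catalog.erase_subject("alice"))
```

A catalog like this only helps for copies that were registered in the first place, which is exactly why unmanaged copies on laptops and 3rd-party services are the hard part.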


Again, this is just one aspect of the wide-ranging GDPR, but it is one for which there is no simple answer. It will take a number of approaches, programs and technology changes to achieve compliance just in this area.


[Image: Hitachi Data Instance Director is an example of a workflow automation system that manages and orchestrates the movement and copies of data.]

3 Ways to Reduce Risk By Reducing Copies

One approach that seems prudent is to reduce the number of copies that the organization creates and retains. Here are 3 ideas to consider:

  • Full backup copies: How many full backup copies of your data do you really need? Can you reduce the number and still meet recovery point objectives? Can you reduce the retention period? Even better, can you eliminate the need for full backups altogether with an incremental-forever backup model that copies each new file or data object only once?
  • IT-created copies: It is now possible, through copy data management (CDM) techniques, to create virtual copies for secondary re-purposing, such as test and development. The original data remains under the control of the data owner, with policy-based workflows that control the retention and refreshing of the virtual copies.
  • Distributed copies: GDPR compliance is going to force organizations to consider banning the use of 3rd-party data repositories, such as public archive and file sync-and-share services, and eliminate the threat that “shadow IT” poses. At the very least, it is incumbent on the organization to ensure that any outside entities that possess its PII data also comply with applicable GDPR provisions. A safer approach is to deploy data services, including sync-and-share, archiving and remote office backup, that are totally controlled by corporate IT.
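
The first idea above, an incremental-forever model that copies each new file or data object only once, can be sketched roughly as follows. This is a toy illustration under stated assumptions, not how any particular backup product works: the class and method names are hypothetical, and using SHA-256 content hashes to recognize objects that are already stored is an assumption made for the example.

```python
import hashlib


class IncrementalForeverStore:
    """Toy backup store in which each unique object is kept once (hypothetical)."""

    def __init__(self):
        self.objects = {}    # content hash -> bytes, stored exactly once
        self.snapshots = []  # each snapshot maps file name -> content hash

    def backup(self, files):
        """Back up a {name: bytes} mapping; return how many new objects were copied."""
        snapshot = {}
        copied = 0
        for name, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.objects:
                # Only content not seen before is physically copied.
                self.objects[digest] = data
                copied += 1
            snapshot[name] = digest
        self.snapshots.append(snapshot)
        return copied

    def restore(self, snapshot_index):
        """Rebuild the full file set for a given backup run."""
        snapshot = self.snapshots[snapshot_index]
        return {name: self.objects[digest] for name, digest in snapshot.items()}


store = IncrementalForeverStore()
day1 = {"customers.csv": b"alice,bob", "orders.csv": b"1,2,3"}
day2 = {"customers.csv": b"alice,bob", "orders.csv": b"1,2,3,4"}
print(store.backup(day1), store.backup(day2))
```

In this sketch the unchanged customer file is stored once even though it appears in both backup runs, so there is one physical copy to locate rather than one per run. Each stored object that contains the PII would still have to be found and handled.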


One Piece of a Complex Puzzle

Achieving GDPR compliance is going to require a number of new processes and policies, as well as new applications to identify personally identifiable information, control it, find it, move it and delete it. New approaches to data storage for compliance may be required. There will be consultants and lawyers involved. It’s all very complicated.


However, one thing is clear. Reducing the number of copies of PII can only help to simplify the challenge. If your organization is going to be impacted by GDPR, or other similar data privacy regulations in non-EU countries, a modern data protection and copy data management capability should be on your near-term wish list.


To learn more about the EU General Data Protection Regulation and how you can turn some of the investments in compliance into a competitive advantage, please join the webinar “EU GDPR: Hype? Cost? Or the opportunity to get more value out of your data?” on 28 March 2017 at 1500 Brussels time, or on demand after the event.


Rich Vining is a Sr. Product Marketing Manager for Data Protection Solutions at Hitachi Data Systems and has been publishing his thoughts on data storage and data management since the mid-1990s. The contents of this blog are his own.