When is anonymised data not anonymous?


Over the last decade there have been numerous instances of companies finding themselves in significant legal, commercial, and moral quandaries for (mis)using consumers' personal information.

When the European General Data Protection Regulation (GDPR) came into effect, our inboxes filled with articles containing dire warnings about the consequences of failing to meet the GDPR's stringent requirements. Since the GDPR was enacted, similar standards have come to apply in other jurisdictions through the California Privacy Rights Act of 2020, New Zealand's Privacy Act 2020, Singapore's Personal Data Protection Act 2012 and Australia's Privacy Act 1988 (Cth).

Yet the temptation to monetise consumer data is ever increasing. With the growth of AI and machine learning, the value of consumer data is growing all the time. Many large organisations understand that data is now among the most powerful of all intangible assets and are increasingly recognising that they are sitting on potential Smaug-like troves of value. Accordingly, more and more of these organisations are integrating a data strategy into their commercial model: "big data" can be a very powerful tool in the competitive arsenal. As just one example, EverEdge recently advised a large mall owner on the value of its data assets; our client concluded that there is potentially more margin to be made from monetising its data than from actually operating the mall.

However, merely having data does not a millionaire make. There are multiple challenges on the path to successful monetisation, which we have written about here and here. One of the most significant, however, and one we have not touched on before, is the issue of anonymisation.

Although personal consumer information has become increasingly protected from outright monetisation over time, there is an exception: as a general rule, data that has been sufficiently anonymised or "de-identified" is not subject to the protections and limitations that apply to personal data. However, businesses relying on this exception must be very clear about where the line between protected personal information and unprotected anonymised information sits. As with any asset, great value comes with significant technical, legal and reputational risk if a data-based intangible asset is not managed properly.

In 2014, while the GDPR was being formulated, the EU's Article 29 Data Protection Working Party published guidelines on anonymisation techniques. These guidelines stated that, in order for the anonymisation threshold to be reached, it should not be possible to:

  • single out an individual;
  • link records relating to an individual; or
  • infer information about an individual.

The same thread runs through New Zealand law, where Principle 10 of the Privacy Act 1993 (carried over unchanged into the Privacy Act 2020) requires the data holder to ensure that individuals are "not identified". Principle 6 of the Australian Privacy Principles contains similar requirements.

The concepts of anonymisation and de-identification are increasingly used in the privacy space, sometimes interchangeably. However, they relate to two quite different processes, with anonymisation being a much more challenging standard than de-identification. De-identification involves hiding or masking elements of the data that may allow an individual to be identified. Anonymisation relates to a process of transforming data into a form where individual data cannot be reverse engineered, even by cross referencing the anonymised data with other datasets.

An example of de-identification is the New Zealand Ministry of Health's publication of COVID-19 case data in which individuals were masked behind their National Health Index (NHI) numbers. The Ministry claimed the data was "anonymised", but of course if someone were able to cross-reference that data with a database of NHI information, the published information would no longer be anonymous. In truth, anonymisation and de-identification sit on a continuum, with the effort required to reverse engineer the data and identify individuals being the relevant factor: the more difficult that process becomes, the closer the data is to anonymised.
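The distinction can be illustrated with a small, entirely hypothetical sketch (the NHI values, names and case records below are invented): masking an identifier behind a one-way hash is de-identification, but a linkage attack against a second dataset of identifiers reverses it.

```python
import hashlib

def mask(nhi: str) -> str:
    """De-identify an identifier by replacing it with a SHA-256 digest."""
    return hashlib.sha256(nhi.encode()).hexdigest()

# Hypothetical published dataset: cases keyed by a masked (hashed) NHI number.
# No name appears anywhere, so at first glance the data looks anonymous.
published = [
    {"id": mask("ABC1234"), "region": "Auckland", "status": "recovered"},
    {"id": mask("XYZ9876"), "region": "Wellington", "status": "active"},
]

# Cross-referencing attack: an attacker holding a separate NHI database
# simply hashes every identifier they know and matches the digests.
nhi_database = {"ABC1234": "Jane Doe", "XYZ9876": "John Smith"}
lookup = {mask(nhi): name for nhi, name in nhi_database.items()}

reidentified = [(lookup.get(rec["id"]), rec["status"]) for rec in published]
# Every "anonymised" record now names a real person again.
```

Because the masking is deterministic and the identifier space is obtainable, the transformation fails all three Working Party tests: individuals can be singled out, linked and inferred about. That is de-identification, not anonymisation.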

A 2016 decision of the European Court of Justice gave some guidance on where this line sits, identifying two main situations in which de-identification can be regarded as anonymisation:

  • where the effort required to identify an individual would be so prohibitive as to be impractical in effect; or
  • where any step required to identify an individual is prohibited by law.

There are a number of other factors to take into account when assessing whether your level of de-identification or anonymisation is sufficient. One is the sensitivity of the data: users will be far less concerned about the release of data on their favourite toothpaste than about almost any information concerning their children. Others include the vulnerability of the individuals to whom the data relates, and the uniqueness of the data points: the more unique a data point, the greater the degree of de-linking from the raw data set that is required.
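The uniqueness point can be made concrete with a toy k-anonymity check on invented records (all field names and values below are hypothetical): the fewer records that share the same combination of quasi-identifiers, the easier each individual is to single out.

```python
from collections import Counter

# Hypothetical de-identified records: no names, only quasi-identifiers
# (age band, postcode) plus the attribute being published.
records = [
    {"age_band": "30-39", "postcode": "6011", "toothpaste": "BrandA"},
    {"age_band": "30-39", "postcode": "6011", "toothpaste": "BrandB"},
    {"age_band": "40-49", "postcode": "0632", "toothpaste": "BrandA"},
]

def k_anonymity(rows, quasi_identifiers):
    """Smallest group size over all quasi-identifier combinations."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return min(groups.values())

k = k_anonymity(records, ["age_band", "postcode"])
# k == 1 here: the "40-49"/"0632" record is unique on its quasi-identifiers,
# so that individual can be singled out even though no name appears.
```

A dataset with k of 1 on any quasi-identifier combination leaves at least one individual exposed to a linkage attack; coarsening the fields (wider age bands, partial postcodes) is one common way to push k up before release.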

The line between identifiable information and properly anonymised data is not a clear one; it turns on a myriad of factors, including the type of data involved and the type of business you are operating. The growing importance and value of data will encourage more businesses to look at the data they hold and formulate ways to exploit it. However, that process must start with a proper understanding of the data being commercialised and of how it can be made safe for external consumption.

Data is increasingly one of the most valuable of all intangible assets and has the potential to turbocharge business results: all five of the top five US companies by market capitalisation are either data plays or have data playing a critical role in their business models. However, with reward comes considerable risk that must be managed properly. One thing all companies can be certain of is that a failure to adequately protect data, or a release of data (intentional or not) that identifies individuals, is likely to result in terrible publicity and a loss of user trust, not to mention material negative legal consequences. In the current climate around privacy, and in light of some very public failures to adequately protect data by some of the world's largest entities, including Facebook, LinkedIn, Yahoo and Alibaba, this is an issue no commercial entity can take lightly.

Written by Ben Lenihan, Senior Manager & Legal Counsel, EverEdge.
