Anonymisation is the practice of transforming identifiable data into non-identifiable or non-informative data, most commonly by reducing the amount of information that is either present in the dataset or inferable from it. In practice, identifiers are removed, altered, aggregated or otherwise obscured. Note that identification does not just mean that a person can be named; it also covers the case where information about a person can be attached to them with certainty. Anonymisation is also known in some jurisdictions as de-identification.
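The operations mentioned above (removal, alteration, aggregation) can be illustrated with a minimal sketch; the record fields ("name", "age", "postcode") and the specific generalisations chosen are illustrative assumptions, not prescribed by any standard.

```python
# A minimal sketch of basic anonymisation operations on a tabular record:
# a direct identifier is removed, and quasi-identifiers are generalised.

def anonymise(record):
    """Drop the direct identifier and coarsen the quasi-identifiers."""
    out = dict(record)
    out.pop("name", None)                      # removal of a direct identifier
    out["age"] = (record["age"] // 10) * 10    # aggregation into 10-year bands
    out["postcode"] = record["postcode"][:3]   # truncation to a postcode prefix
    return out

records = [
    {"name": "Alice", "age": 34, "postcode": "SO171BJ"},
    {"name": "Bob",   "age": 37, "postcode": "SO171XY"},
]
print([anonymise(r) for r in records])
```

Note that the output records still carry information (age band, area) that could support inference, which is why anonymisation is better judged by reidentification risk than by the removal of names alone.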

Under the GDPR, fully anonymised data (called anonymous information) is not treated as personal data, because individuals cannot be identified in it. However, the expansive definition of ‘identifiable’ in the GDPR means that anonymisation, under the GDPR’s definition, must be irreversible. This requires, for instance, that the original, identifiable dataset be deleted. This gives the term ‘anonymisation’ an absolute meaning.

A more intuitive definition of anonymisation focuses on the risk of reidentification of persons represented within the anonymised dataset. This means that the anonymisation process is intended, not to make anonymisation irreversible, but rather to lower the risk of reidentification (or de-anonymisation) to acceptable (e.g., negligible) levels. Under this conception, anonymisation is a risk management process.

It can be shown that anonymisation in this latter sense is in principle reversible: an adversary may use auxiliary knowledge to identify data subjects in the anonymised dataset, and it can never be known in advance what data is available to an adversary. Several real-world scandals have occurred in which inadequately anonymised datasets were released and reidentification took place (for example, with AOL in 2006, Netflix in 2007 and data about New York yellow cabs released in response to a Freedom of Information request in 2014). These cases have been argued (for instance by Paul Ohm) to show that anonymisation is not an adequate defence of privacy, although in each case reidentification was possible principally because the anonymisation was inadequately planned and executed.
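The linkage attack described above can be sketched with toy, hypothetical data: an adversary joins an “anonymised” release to auxiliary knowledge on shared quasi-identifiers, and any unique match is a reidentification.

```python
# Toy illustration of a linkage attack (all data hypothetical).
# The release has identifiers removed but retains quasi-identifiers;
# the auxiliary table is the adversary's outside knowledge.

released = [
    {"age_band": 30, "postcode": "SO1", "diagnosis": "asthma"},
    {"age_band": 40, "postcode": "PO2", "diagnosis": "diabetes"},
]
auxiliary = [  # e.g. scraped from a public register
    {"name": "Alice", "age_band": 30, "postcode": "SO1"},
    {"name": "Bob",   "age_band": 40, "postcode": "PO2"},
]

def reidentify(released, auxiliary):
    """Match records whose quasi-identifiers agree; a unique match reidentifies."""
    hits = []
    for rec in released:
        matches = [a for a in auxiliary
                   if (a["age_band"], a["postcode"]) == (rec["age_band"], rec["postcode"])]
        if len(matches) == 1:  # unique match => the sensitive value is attached to a name
            hits.append({"name": matches[0]["name"], **rec})
    return hits

print(reidentify(released, auxiliary))
```

The defence against such attacks is to ensure that quasi-identifier combinations are not unique within the release, which is precisely what the anonymisation in the cited scandals failed to achieve.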

A more sophisticated approach called functional anonymisation considers that the risk lies in the relationship between data and the data environment. This opens up the possibility that, rather than altering the data, the data environment may be controlled, so that the ability of outsiders to interrogate the dataset is limited, for example by access controls or restrictions on linking to auxiliary datasets.
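Controlling the data environment rather than the data can be sketched as a query gateway that answers only aggregate queries and refuses any whose result covers too few people; the threshold of 5 is an illustrative choice, not a prescribed value.

```python
# A minimal sketch of environment control: outsiders never see the raw
# data, only the answers a gatekeeper is willing to release.

MIN_CELL = 5  # illustrative minimum cell size

def guarded_count(dataset, predicate):
    """Answer a count query only if it covers at least MIN_CELL records."""
    n = sum(1 for row in dataset if predicate(row))
    if n < MIN_CELL:
        raise PermissionError("query refused: result covers too few records")
    return n

data = [{"age": a} for a in range(20, 40)]            # 20 records
print(guarded_count(data, lambda r: r["age"] >= 25))  # broad query is answered
```

A narrow query (for example, one matching a single individual) is refused, so the same dataset presents a lower reidentification risk in this restricted environment than it would if published openly.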

Further reading:

See also: INFERENCE, PSEUDONYMISATION

  • Elliot, M., Mackey, E. and O’Hara, K., 2020. The Anonymisation Decision-Making Framework: European practitioners’ guide, 2nd edition. United Kingdom Anonymisation Network, https://ukanon.net/framework/.
  • Elliot, M., O’Hara, K., Raab, C., O’Keeffe, C.M., Mackey, E., Dibben, C., Gowans, H., Purdam, K. and McCullagh, K., 2018. Functional anonymisation: personal data and the data environment. Computer Law and Security Review, 34(2), 204–21, https://doi.org/10.1016/j.clsr.2018.02.001.
  • Hintze, M. and El Emam, K., 2018. Comparing the benefits of pseudonymisation and anonymisation under the GDPR. Journal of Data Protection and Privacy, 2(2), 145–58.
  • Ohm, P., 2010. Broken promises of privacy: responding to the surprising failure of anonymization. UCLA Law Review, 57, 1701–77, https://heinonline.org/HOL/LandingPage?handle=hein.journals/uclalr57&div=48&id=&page=.