ICO blog: The risk of revealing too much


It’s been a busy few weeks in Wilmslow, with data protection becoming headline news as politicians debated whether names should be included in a high-profile report.

That discussion was around whether more personal information should be included. For some time the ICO has been looking at a specific problem at the other end of the scale: organisations revealing too much personal information.

The issue relates to responses to freedom of information (FOI) requests that are provided in spreadsheets and inadvertently reveal personal information. Public authorities will often respond to requests by supplying the information requested in spreadsheet format. Sometimes that will be in the form of a ‘pivot table’, which can neatly summarise the information without showing the underlying personal information the summary is based on.

Unfortunately, it has come to our attention that public authorities are not always properly removing the underlying data before disclosing. Pivot tables, both in Microsoft Excel and other spreadsheet programs, retain a copy of the source data used. This information is hidden from view, but is easily accessible.
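As a rough illustration of where that hidden copy lives, the sketch below looks inside an .xlsx file for pivot cache parts. It assumes the Office Open XML layout that Excel conventionally writes, where an .xlsx file is a zip archive and cached pivot data sits under `xl/pivotCache/`. The function name `find_pivot_cache_parts` is hypothetical, and other formats (older .xls files, other spreadsheet programs) store this data differently, so an empty result is not proof that a file is safe to release.

```python
import zipfile


def find_pivot_cache_parts(xlsx_file):
    """Return archive members that look like pivot cache parts.

    Assumption: the file is an .xlsx (a zip archive) and cached pivot
    data is stored under xl/pivotCache/, as Excel conventionally does.
    Accepts a file path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith("xl/pivotCache/")
        ]
```

Calling `find_pivot_cache_parts("response.xlsx")` (an illustrative file name) on a workbook that contains a pivot table would typically list one or more parts; any non-empty result is a prompt for a closer manual check before disclosure.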

An example

Let’s look at a simple example. A public authority has been asked for a breakdown of which departments claim the most in expenses. The data has been provided on a spreadsheet:

[Image: expenses spreadsheet]

The public authority uses a pivot table to total the information, which it then sends to the requestor:

[Image: pivot table]

It appears that the public authority hasn’t shared any personal data. However, by simply double-clicking on the table, the requestor can view the original source data, including the personal details of who made the expenses claims:

[Image: source data revealed from the pivot table]

The problem has come to prominence through freedom of information disclosures made via the WhatDoTheyKnow website, but it applies to every disclosure made under the act, however it is sent: any disclosure under the Freedom of Information Act should be treated as a disclosure to the world.

The risk could also emerge in other scenarios outside of FOI, such as data sharing between two organisations. So while this guidance is primarily aimed at the public sector, it is important that data controllers in the private sector consider it too.

The ICO is actively considering a number of enforcement cases on this issue.

We’re working closely with the WhatDoTheyKnow team. Their constructive approach in relation to the issue is appreciated, and we’ll continue to liaise with them about possible breaches. A few weeks ago WhatDoTheyKnow posted a blog containing useful guidance, and we’d very much support these key messages.

Five key messages

We have five key messages for organisations (with a hat tip to WhatDoTheyKnow):

  1. Disclosure of hidden personal data in pivot table spreadsheets may be a breach of the Data Protection Act. The data is not secure and is easily accessible, even if not immediately viewable.
  2. Avoid using pivot tables for any disclosures or data sharing involving personal data. Consider using CSV files.
  3. Check the file sizes before disclosure – larger than expected file sizes should be a trigger for further checks.
  4. Ensure your organisation has the right procedures and checklists in place for staff involved in disclosing data.
  5. Consider running quick training sessions or drop-in surgeries to ensure staff understand how to safely prepare spreadsheets for release.
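To illustrate the second message, here is a minimal sketch of producing a summary as CSV text that contains only the totals, so there is no underlying source data travelling with it. The function name `department_totals_csv`, the field names `department` and `amount`, and the sample figures used with it are all made up for illustration; they are not taken from any real disclosure.

```python
import csv
import io
from collections import defaultdict


def department_totals_csv(claims):
    """Return CSV text with one total per department.

    Only the department name and its total are written; names and
    individual claim rows never reach the output.
    """
    totals = defaultdict(float)
    for claim in claims:
        totals[claim["department"]] += claim["amount"]

    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["Department", "Total claimed"])
    for department in sorted(totals):
        writer.writerow([department, f"{totals[department]:.2f}"])
    return buffer.getvalue()


# Illustrative, made-up claims (not real data).
example_claims = [
    {"name": "A. Example", "department": "Finance", "amount": 120.50},
    {"name": "B. Sample", "department": "HR", "amount": 80.00},
    {"name": "C. Person", "department": "Finance", "amount": 30.25},
]
summary = department_totals_csv(example_claims)
```

Because the output is plain text, it can be opened in a text editor and read in full before sending, which also makes an unexpectedly large file easier to spot.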

In short, make sure the right checks are in place before you send. We’ve published two updates to our guide to freedom of information this week to highlight the importance of checking before disclosure. We will revisit the need for further guidance when we have completed our enforcement cases.

We’d also recommend that organisations use the redaction toolkit guidance produced by the National Archives, as well as our general guidance about anonymising data in our related code of practice.

It’s worth mentioning another issue we’re currently focused on: the imminent dataset amendments to the Freedom of Information Act. These amendments will require public authorities to disclose datasets in open, reusable formats, which in practice means using a format such as CSV (comma-separated values). This should remove many of the risks of hidden data, as the spreadsheet formatting is taken away, making it clear what information has been included.

We’re expecting these changes to the act to happen in August. There’ll be ICO guidance to accompany the amends, and no doubt an accompanying blog.

Steve Wood’s department develops the outputs that explain the ICO’s policy position on the proper application of information rights law and good practice, through lines to take, guidance, internal training, advice and specific projects.