Information Age: News, analysis & insight for IT & business leaders

'Demand driven' open data poses privacy threat – report

14 September 2011  

An external review for the Cabinet Office suggests that the privacy challenges of open data will increase as public demand for datasets intensifies

As public interest in open data increases and the government comes under greater pressure to publish more datasets, threats to the privacy of citizens will grow, a report commissioned by the Cabinet Office has found.

Entitled "Transparent Government, Not Transparent Citizens" and written by Kieron O'Hara from the University of Southampton, the report makes a number of recommendations for maintaining privacy as the government's public data programme, data.gov.uk, becomes increasingly "demand-driven".

O'Hara's main thrust is that privacy concerns must be considered at every step in the process of publication of public data. "Privacy protection should ... be embedded in any transparency programme, rather than bolted on as an afterthought," he writes.

The report suggests that the technological definition of 'privacy' must be included in government thinking on public data. Legal definitions of privacy have proved inadequate, it says, and O'Hara recommends that "technologically-trained experts should be brought into procedures for deciding whether or not to release particular datasets."

He also recommends that the Information Commissioner's Office (ICO) develop greater technical awareness, although O'Hara stresses that the ICO is already making progress on this through the appointment of a Technology Policy Advisor and the creation of a Technology Reference Panel.

Other recommendations include creating a data asset register to allow the government to keep track of its datasets; setting up transparency panels to determine the privacy threat posed by data; and investigating the vulnerability of anonymised databases to "deanonymisation", whereby an individual's identity can be worked out by matching information across multiple sources.

The report points to work on deanonymisation by two computer scientists, Narayanan and Shmatikov from the University of Texas, which found that individuals could be identified based on anonymous film reviews on rental site Netflix.

"Our conclusion is that very little auxiliary information is needed [to] de-anonymize an average subscriber record from the Netflix Prize dataset," they wrote in a paper quoted in O'Hara's report. "With 8 movie ratings (of which 2 may be completely wrong) and dates that may have a 14-day error, 99% of records can be uniquely identified in the dataset."
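The kind of matching the researchers describe can be illustrated with a short sketch. The Python below is not Narayanan and Shmatikov's algorithm, which also weights rarely rated films more heavily; it is a simplified illustration that assumes a made-up three-record dataset and hypothetical names (`RECORDS`, `count_matches`, `candidates`). It counts how many items of outside knowledge agree with each "anonymised" record, tolerating a few wrong items and a 14-day error on dates.

```python
from datetime import date

# Hypothetical toy "anonymised" dataset: record id -> {film: (rating, date rated)}
RECORDS = {
    "r1": {"A": (5, date(2005, 1, 3)), "B": (2, date(2005, 2, 10)), "C": (4, date(2005, 3, 1))},
    "r2": {"A": (5, date(2005, 1, 5)), "B": (4, date(2005, 6, 1)), "D": (1, date(2005, 3, 2))},
    "r3": {"C": (4, date(2005, 3, 20)), "D": (3, date(2005, 4, 4)), "E": (5, date(2005, 5, 5))},
}


def count_matches(record, aux, max_days=14):
    """Count auxiliary items that agree with a record on rating and approximate date."""
    hits = 0
    for film, rating, when in aux:
        if film in record:
            rec_rating, rec_date = record[film]
            if rec_rating == rating and abs((rec_date - when).days) <= max_days:
                hits += 1
    return hits


def candidates(records, aux, max_wrong=2, max_days=14):
    """Return ids of records consistent with aux, allowing up to max_wrong bad items."""
    needed = len(aux) - max_wrong
    return sorted(
        rid for rid, rec in records.items()
        if count_matches(rec, aux, max_days) >= needed
    )


# What an outsider might know about one person: (film, rating, rough date).
# The last item is deliberately wrong to show the tolerance for errors.
AUX = [
    ("A", 5, date(2005, 1, 10)),
    ("B", 2, date(2005, 2, 1)),
    ("C", 4, date(2005, 3, 10)),
    ("Z", 3, date(2005, 1, 1)),
]

print(candidates(RECORDS, AUX, max_wrong=1))
```

If only one record survives the filter, the supposedly anonymous record has in effect been linked to a known individual, which is the risk the report asks the government to investigate before releasing datasets.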

