UK statistics and open data: MPs' inquiry report published

Post: 17 March 2014

This morning the Public Administration Select Committee (PASC), a cross-party group of MPs chaired by Bernard Jenkin, published its report on Statistics and Open Data.

This report is the product of an inquiry launched in July 2013. Witnesses gave oral evidence in three sessions; you can read the transcripts and written evidence as well.

The PASC press release leads with criticism of the Government’s decision to let the Postcode Address File (PAF) out of public hands by including it in the recent sale of Royal Mail. The report says:

“The PAF should have been retained as a public data set, as a national asset, available free to all, for the benefit of the public and for the widest benefit of the UK economy. Its disposal for a short-term gain will impede economic innovation and growth.”

“Public access to public sector data must never be sold or given away again.”

An open national address dataset is the closest thing the UK’s open data community has to a unifying cause célèbre, so this is a message that will find a receptive audience.

However, PASC has also dug quite deeply into the core problems with the UK Government’s open data policy. While I don’t agree with every aspect of the analysis, I was pleasantly surprised by the extent to which PASC has taken on board the evidence submitted to its inquiry.

The report is broadly supportive of the arguments for open data, and makes some helpful recommendations for improving government policy. Below I have picked out what I think are some key points.

Open data fundamentals

PASC has embraced the principle of ‘open data by default’:

“There should be a presumption that restrictions on government data releases should be abolished. It may be necessary to exempt certain data sets from this presumption, but this should be on a case-by-case basis, to provide for such imperatives as the preservation of national security or the protection of personal privacy.”

and seems to have been persuaded by the arguments against charging for public data:

“It is short-sighted in the extreme for Government to seek to maximise fee income from data while those fees penalise in particular small companies that can prove the most innovative, and which could establish the UK as global leader in this new economic sector.”

“Charging for some data may occasionally be appropriate, but this should become the exception rather than the rule.”

It has recognised that disruption of existing information markets is not a good excuse for avoiding open release:

“It is also acknowledged that wider access to more data and information will be disruptive to the structure of existing markets, leading to some firms winning and some firms losing. But our evidence suggests that, in all probability, consumers will gain.”

PASC also endorses Stephan Shakespeare’s proposal that the Government should adopt a 'twin-track' approach to data release, as:

“a practical and realistic way of maintaining the momentum on open data, which recognises that ‘the perfect should not be the enemy of the good: a simultaneous “publish early even if imperfect” imperative AND a commitment to a “high quality core”’.”

The ‘right to data’

PASC says there is an inherent ‘right to data’ and that the Government should bring forward legislation:

“The Government needs to recognise that the public has the inherent ‘right to data’, like Freedom of Information. The Government should clarify its policy and bring forward the necessary legislation, without delay.”

Unfortunately the Committee does not make it clear what it considers the ‘right to data’ to be. I am not sure PASC has understood the distinction between the ‘right’ to access data in a reusable format (provided for in last year’s changes to the Freedom of Information Act), and the idea of an information rights framework for open data itself, which is what several witnesses argued for.

I have written previously about the new dataset provisions in FOI and why they don’t go nearly far enough. It would have been more helpful if PASC had expressed clear support for a legal right to open data (by default at least). However, it has at least recognised that there is a need for additional legislation of some kind.

Public sector procurement and outsourcing

PASC recognises the complications for open data created by contracting and outsourcing of public services:

“Open data principles should be applied not only to government departments but also to the private companies with which they make contracts.

“We recommend that companies contracting with the Government to provide contracted or outsourced goods and services should be required to make all data open on the same terms as the sponsoring department. This stipulation should be included in a universal standard contract clause …”

This is along the right lines, though my own preference would be to specify a requirement that IP ownership of data created from the contract will be retained by the sponsoring department. That approach would support future decisions to release open data, rather than just following from practice within the sponsoring department at the time of contract negotiations.

Who is responsible?

PASC highlights “a lack of coordination on open data at Ministerial and official level, though this is improving.” Using the PAF decision as an example, it says:

“The Cabinet Office leads on the policy, but its mechanisms to hold Departments to account are weak …”

“Despite the enthusiastic rhetoric emanating from the Cabinet Office, our evidence indeed indicated something more serious - a lack of understanding of open data among most Ministers and apparently most officials.”

and also notes:

“There is an unwieldy plethora of open data bodies which tends to slow both decision-making and consultation.”

The report casts doubt on the early focus of the current Government’s open data strategy:

“There is no sign of the promised emergence of an army of armchair auditors. There is little or no evidence that the Cabinet Office is succeeding in encouraging greater public engagement in using data to hold the public sector to account.”

PASC says open data needs to be treated “as a major government programme in its own right”, and recommends that:

“The Minister for the Cabinet Office should be given explicit responsibility for all aspects of open data policy, including the commercial aspects.”

This is an attractive idea but I’m not sure it’s plausible. The Cabinet Office’s lack of influence over delivery departments is long-standing and entrenched. Formalising responsibility for open data policy will not overcome that problem.

I cannot see BIS or the Treasury relinquishing their influence over the commercial aspects of open data policy, particularly as those aspects are less about open data itself and more about conflicts with other policy agendas such as privatisation.

I did like this remark, though:

“The Transparency Strategy Board is too large to be effective in driving progress. A small group from that Board should work as a Programme Implementation Board.”

Perhaps we should call that smaller group the Data Strategy Board? (Regular observers of UK open data policy will understand the joke.)

Data.gov.uk and measuring open data progress

Data.gov.uk gets a bit of a kicking in the PASC report; not so much for the development side but for the underlying concepts and the Government’s tendency to use the site to meet the needs of Whitehall rather than those of data users:

“It is often pointed out that more than 13,000 datasets can now be found on data.gov.uk, but it is unclear how many of these represent simple republishing of data already published on other government sites. Some data sets are small and others large. And it is possible for departments to get more data out by publishing it in smaller bundles or updating it more frequently, in such a way that there is little or no extra public benefit. In these circumstances, measuring progress on this important agenda is difficult if not impossible.”

This makes refreshing reading, given the number of times Ministers and senior civil servants (and, more often than not, journalists as well) have cited the number of datasets listed on Data.gov.uk as if that were a meaningful metric for something.

PASC recommends a few alternatives. One is:

“We invite the Government to publish a clear list of open data, indicating when each data series became open in each case.”

This is a lovely idea but hopelessly impractical at this point. The Cabinet Office has had (and continues to have) great difficulty pulling together a cross-government inventory of public data assets. I doubt it will have much appetite for persuading departments to unearth the publication history of all the open data now listed on Data.gov.uk.

PASC also recommends the adoption of not one but two five-star rating systems:

“We recommend above that the Government adopt the ‘five-star’ system along the lines proposed by Involve, for open data engagement. A second ‘five-star’ rating system, developed by Full Fact for assessing the usability of government statistics, would support the efforts of statisticians to play a more active role in open data. This system should also be adopted by the Cabinet Office in assessing departmental progress on open data.”

These two systems are presumably in addition to the original five-star deployment system for linked open data devised by Tim Berners-Lee back in 2006, and the variant of that system already implemented (with questionable success) on Data.gov.uk.

My concern is that more of this kind of stuff will just feed the Cabinet Office’s managerialist tendencies …

This PASC recommendation is rather better:

“The Cabinet Office must give a much higher priority to ensuring that more interesting and relevant data is made open, and that the release mechanisms encourage people to use it and, where appropriate, hold Government and local authorities to account. Beginning in April 2014, targets should be set for the release of totally new government datasets not the republishing of existing ones.”

Simple when you think about it, isn’t it? We should judge the progress of UK open data policy on the steady release of new and useful public datasets.

Photo credit: Big Ben by Carlesmari, CC BY 3.0