Open Data or Not? A hard look at the Scottish Data Zone Boundaries licence

Post: 14 January 2013

Data licensing can be tricky.

There are data sets that are clearly open data and data sets that are clearly not. 

Then there are those that have potential, if you look at them in the proper light …

The Indices of Multiple Deprivation (IMD) are a multi-domain measure of relative levels of deprivation within small geographic areas of the UK.

Indices for England, Scotland, Wales and Northern Ireland are produced separately by Government and use slightly different criteria, but it’s possible to make comparisons across the data for different countries.

IMD data sets are used widely in the public sector and for academic research. I’ve used IMD data myself as inputs into risk models for the insurance industry. The crime domain in particular is useful in predicting geographic variability of economic losses to theft, civil disorder, etc.

Last month the 2012 version of the Scottish IMD was released by the Scottish Government. Alasdair Rae has put together a useful website that visualises the SIMD 2012 data interactively on a map.

In England and Wales the small-area geography used in the IMD is the Lower Super Output Area. LSOA boundaries are unambiguously available as open data, and may be downloaded from the ONS website and elsewhere.

However in Scotland the unit of geography underlying the IMD is called a Data Zone. The Data Zone boundaries currently in use were produced in 2004 but built up from 2001 Census output areas.

There’s background information on Data Zones here and here. This is what they look like:

[Image: Data Zone boundaries]

Pretty much all administrative geography for England, Scotland and Wales is now available as open data. So I was surprised to see the restrictive licensing terms presented on the download page for the Scottish Government’s Geography Data (which includes the Data Zone boundaries).

You can read the licence on the Scottish Neighbourhood Statistics website. (Click on Go to Data Download, then Download Geography. The SNS site doesn’t like direct links.)

The Geography Data licence starts like this:

We the Scottish Government grant you a non-exclusive non-transferable licence (without the right to sublicense) to copy and use the Data which is derived from Ordnance Survey data and as such is subject to the terms and conditions of the licence agreement between The Scottish Government and Ordnance Survey.

No transferability or sub-licensing, so right away that says it’s not an open licence. Without transferability it would be very difficult to put the Geography Data on a public website in a legally compliant manner.

The wording tells us clearly that the download includes Ordnance Survey derived data. However it’s ambiguous whether the Scottish Government is claiming any additional database rights.

The licence then says:

For any use of this material a Click-Use PSI Licence is also required.

with details of how to arrange that via the Office of Public Sector Information.

The problem is that the Click-Use Licence system was phased out from 2010, when the Open Government Licence (OGL) was introduced. The link provided on the SNS’s download page now resolves to a National Archives page about the OGL.

Click-Use Licences were typically issued for a five-year period, so in theory there could be re-users out there using the SNS Geography Data under those terms. However the OPSI is no longer issuing new Click-Use Licences.

Interpreted literally, that means nobody who downloads the Geography Data now, without an existing Click-Use Licence, can comply with the licensing terms on the download page.

It’s fairly obvious the licence on the download page is out of date. Does that mean we can substitute the terms of the Open Government Licence? That would enable us to do away with the restrictions on transferability and sub-licensing, since the OGL permits both.

There is certainly a convention that the OGL simply replaces the Click-Use Licence, and we have statements to that effect from National Archives.

But I’ve never been entirely comfortable that the Government nailed that down properly, and there are substantial differences between the Click-Use Licence and the OGL. The Click-Use Licence does not specifically set out the scope of permitted re-use, and it seems to exclude mapping data. National Archives has also muddied the waters by introducing additional licences to the UK Government Licensing Framework.

Back to the Geography Data licence:

You may only use the Data for your own internal business use, that is use of Data for the internal administration and operation of your business and not for any commercial purpose, and not for financial profit or gain. Financial gain would include any profit whether direct or indirect, or benefit from the use or publication of the Data in any form.

This tells us that when the licence was written there was no intention to automatically allow full commercial re-use of the Geography Data, as would be permitted under the OGL.

Then there is this addendum:

For any other use of the Data you will need to obtain permission from Ordnance Survey

The remainder of the licence is boilerplate to emphasise key terms from the Click-Use Licence.

Doesn’t look much like open data, does it?
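
As a rough illustration of the reading so far, the restrictions can be written down as a simple checklist. This is a minimal Python sketch and not a legal test: the LicenceTerms class, the openness_blockers function and the three yes/no fields are my own hypothetical simplification, and the values given for the OGL only reflect how the OGL is characterised in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LicenceTerms:
    """Hypothetical yes/no summary of what a licence permits (illustrative only)."""
    name: str
    redistribution: bool  # may the data be passed on, e.g. put on a public website?
    sublicensing: bool    # may re-users grant the same rights to others?
    commercial_use: bool  # is use for profit or gain permitted?


def openness_blockers(terms: LicenceTerms) -> List[str]:
    """Return the terms that stop a licence from passing this simple checklist."""
    blockers = []
    if not terms.redistribution:
        blockers.append("non-transferable")
    if not terms.sublicensing:
        blockers.append("no sub-licensing")
    if not terms.commercial_use:
        blockers.append("no commercial use")
    return blockers


# The Geography Data licence, read literally from the download page.
sns_literal = LicenceTerms("SNS Geography Data (literal reading)", False, False, False)
# The Open Government Licence, as characterised in this post (an assumption).
ogl = LicenceTerms("Open Government Licence", True, True, True)

for licence in (sns_literal, ogl):
    blockers = openness_blockers(licence)
    verdict = "open" if not blockers else "not open: " + ", ".join(blockers)
    print(f"{licence.name}: {verdict}")
```

On a literal reading the Geography Data licence fails every item on the list, which is why the rest of the argument depends on interpretation rather than on the wording of the download page.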

This is why we shouldn’t always take data licences at face value. Licences are contracts, and subject to argument and interpretation.

In this case a literal reading of the licence would prevent anyone from actually re-using the data even for internal business purposes, because the Click-Use Licence is no longer available. It’s a reasonable assumption that this is not the intention of either the Scottish Government or the Ordnance Survey. So there’s something wrong with the licence.

This is where it helps to understand the historical context, and the data specifications themselves. Reading the background information gives us some idea of which Ordnance Survey data was used to produce the Data Zones and, although it’s not conclusive, it looks as if it was simply older versions of data that is now included in Ordnance Survey’s open Boundary-Line product.

I did the sensible thing and sought an opinion from the Ordnance Survey. I had to persist a little but in due course received a detailed and helpful e-mail.

The key points from the Ordnance Survey e-mail are as follows:

Although we are unable to confirm which Ordnance Survey data has been used to create this data, we are happy to treat this data as being part of OS Boundary-Line and covered by the Open Data Licence.

As there is additional data present, we would always recommend that you seek permission of the map’s owner (in this case SNS) to ensure that there is no additional copyright enforced by them. It would appear that, as the link provided within SNS’ download licence (which refers to “Click-Use PSI Licence”) is now for the Open Government Licence, the data is covered by the Open Government Licence …

On the strength of the above I would personally be confident in treating the SNS Geography Data as open data. I think Ordnance Survey permission to apply the OS OpenData Licence was the only real barrier.

It is significant that the Ordnance Survey is willing to apply the OS OpenData Licence to older versions of Boundary-Line, and presumably also to older versions of the other OS OpenData products. That makes practical sense but I don’t think it is stated explicitly on the OS website.

As regards “additional data present”, it would certainly be better to have a re-written licence on the download page itself to remove any remaining ambiguity. However based on the existing wording, and with due regard to the remarks in the OS’s e-mail, I think there is a perfectly defensible argument that the Scottish Government has left the full licensing decision to Ordnance Survey.

_______________________________

Update 11 July 2013:

The Scottish Government has now updated the Data Zones 2001 metadata record on Data.gov.uk to confirm that the dataset is re-usable under the OS OpenData licence.

Win.