Ten awesome open data resources (missing from Data.gov.uk)

Post: 24 September 2012

Looking for UK open data? You probably know all about Data.gov.uk, the UK Government’s main portal for public data. Data.gov.uk lists more than 8,600 public data sets, most of them re-usable under the Open Government Licence.

Depending on your area of interest, you might also want to trawl through a more specialised data hub like the National Statistics website, the Department for Transport website or the NHS Information Centre.

But that’s far from everything. As I highlighted in my previous post, we don’t have a proper inventory of UK public sector information. It’s mainly up to individual data holders whether they register their open data on Data.gov.uk.

That means there are still quite a few significant public data resources, open for re-use either explicitly or effectively, that you won’t find in any of the usual places.

Below is a personal list of ten UK open data resources, all currently missing from Data.gov.uk, that I’ve found useful and think deserve wider attention.

(Pic by Jonathan Gray, Creative Commons CC0 1.0 Universal Public Domain Dedication)

1. EduBase (Department for Education)

EduBase is the DfE’s up-to-date database of educational establishments across England and Wales. It contains records for every school: addresses, geographic coordinates, school characteristics, pupil census data, and more. Colleges and universities are in there as well.

If you register on the website you can download the complete database up to twice a year, for free. The data is licensed for re-use under the terms of the Open Government Licence.

2. Free Public Data Product (Companies House)

The Free Public Data Product was released by Companies House at the start of June 2012. It’s a monthly “snapshot” containing basic information on all live companies on the Companies House register. That includes addresses, dates of incorporation and SIC codes.

Companies House says the data is “provided free of charge and will not be supported”. There’s no specific mention of licence terms on the website, but the data is described as openly re-usable in BIS’s Open Data Strategy and elsewhere.
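As a rough illustration of what you can do with a snapshot like this, here is a minimal Python sketch that counts companies per SIC code. The column names (`CompanyName`, `CompanyNumber`, `IncorporationDate`, `SICCode`) and the sample rows are my own assumptions for the example, not the actual layout of the Companies House file, so check the real headers before using it.

```python
import csv
import io
from collections import Counter


def count_by_sic(csv_file, sic_column="SICCode"):
    """Count rows per SIC code in a snapshot CSV.

    `sic_column` is a hypothetical column name; adjust it to match
    whatever header the real file uses.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        code = (row.get(sic_column) or "").strip()
        if code:  # skip companies with no SIC code recorded
            counts[code] += 1
    return counts


# Made-up sample standing in for the real monthly snapshot.
SAMPLE = """CompanyName,CompanyNumber,IncorporationDate,SICCode
EXAMPLE WIDGETS LTD,00000001,01/02/2003,6201
EXAMPLE BAKERY LTD,00000002,15/06/2010,1071
EXAMPLE SOFTWARE LTD,00000003,20/11/2011,6201
EXAMPLE HOLDINGS LTD,00000004,03/03/1999,
"""

if __name__ == "__main__":
    for code, n in count_by_sic(io.StringIO(SAMPLE)).most_common():
        print(code, n)
```

To run it against a downloaded file, pass an open file handle instead of the `io.StringIO` sample. The reader streams row by row, so the size of the full snapshot shouldn’t be a problem.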

Companies House released this data very, very quietly and it’s not that surprising they haven’t registered it on Data.gov.uk. As a Trading Fund, Companies House are not exactly enthusiastic about open data. But you probably guessed that from the amount of effort they put into naming the product.

3. Oil and Gas Data and Maps (Department of Energy and Climate Change)

Oil and Gas Data and Maps is a collection of DECC data sets that includes maps of both onshore and offshore oil and gas exploration and production (raster and vector), plus well data, field data, stratigraphy and lithology data, and so on.

This data should be useful for anyone with an interest in the energy sector, including investors, insurers and risk managers.

Or if you’re environmentally minded, it’s a great resource for monitoring which companies are exploring and drilling where around the UK. I used some of the DECC spatial data in a web article I wrote last year on shale gas exploration.

Most of the data in this collection is open for re-use, but check the documentation with individual data sets.

4. Postbox Data (Royal Mail)

As you might expect, Royal Mail maintains database records of all of its postboxes. This data is not available for download in bulk from the Royal Mail website. However, Royal Mail has provided the full data several times in response to FOI requests.

Most recently, in July, Royal Mail provided a data set of postbox locations and collection times from the Central Collections Management Database, and additional data on meter and private boxes from its Final Plate Database.

The FOI correspondence is silent on licensing, but as a matter of existing practice Royal Mail don’t seem to be discouraging open re-use.

5. Wind Development Data (Department of Energy and Climate Change)

DECC also maintains a collection of data sets to support the renewables side of the energy industry.

The Aviation Safeguarding Maps page provides access to location data for aerodromes and airfields, and wind farm grid references. Data on the DECC website is subject to the Open Government Licence unless otherwise stated. (You will note that the MOD Safeguarding Data on the same page is not open data.)

There’s also a Windspeed Database at 1 km square resolution. What I like most about this database is the DECC’s information warning. They’ve clearly inherited the data and have only a vague idea how it was produced.

6. Registered Places of Worship (General Register Office)

The General Register Office maintains a list of all registered places of worship in England and Wales, in compliance with the Places of Worship Registration Act 1855. The list includes addresses and religious denominations (excluding the Anglican Communion), and is useful for a number of geodemographic applications.

This is one of those data sets that’s safe for open re-use but you have to get hold of it first. The GRO doesn’t make a habit of publishing the list, but will release it under FOI.

The most recent list in general circulation is from an April 2010 request. Unfortunately it’s a PDF document. However, if you submit a new request the GRO should provide the list in a more re-usable format, per new provisions in the FOI Act.

7. Flood Risk Objections (Environment Agency)

The Environment Agency holds an enormous amount of data related to flood risk, very little of which is open for re-use. One of the exceptions is a monthly spreadsheet of objections to planning applications on flood risk grounds.

This is quite a useful source of information for anyone considering buying (or financing) a home in a new development. Although local planning authorities are obliged to consult the Environment Agency about developments within the flood zone, the EA normally has no power to actually block a development.

Availability of the objections data means anyone with an interest in a new development can easily check whether an objection was raised, and if so investigate the planning history further to make sure the EA’s flood concerns were addressed.

At the moment the most recent monthly data at the above link is from October 2011. However, I’ve made an inquiry and the EA has said the page will be updated with the missing information shortly.

8. Government Art Collection (Department for Culture, Media and Sport)

The Government Art Collection is a collection of over 10,000 works of art placed in British Government buildings in the UK and abroad. The DCMS maintains a searchable database with pictures, descriptions and locations of the art.

The information in the database is re-usable under the Open Government Licence. There is no link for bulk download of the data, but it’s a doddle to scrape.
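For anyone who hasn’t scraped before, the general approach looks something like the Python sketch below. It uses only the standard library. The HTML structure here (`li class="artwork"` with `title` and `location` spans) is entirely hypothetical and made up for the example; the real Government Art Collection pages will be marked up differently, so you’d need to inspect them and adapt the parser.

```python
from html.parser import HTMLParser


class ArtworkParser(HTMLParser):
    """Collect title/location pairs from a hypothetical results page."""

    def __init__(self):
        super().__init__()
        self.records = []    # one dict per artwork found
        self._field = None   # span we are currently inside, if any
        self._current = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "li" and cls == "artwork":
            self._current = {}
        elif tag == "span" and cls in ("title", "location"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            text = self._current.get(self._field, "") + data.strip()
            self._current[self._field] = text

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.records.append(self._current)
            self._current = {}


def parse_artworks(html):
    parser = ArtworkParser()
    parser.feed(html)
    return parser.records


# Made-up markup standing in for a real search results page.
SAMPLE_HTML = """
<ul>
  <li class="artwork"><span class="title">Example Landscape</span>
      <span class="location">Example Embassy</span></li>
  <li class="artwork"><span class="title">Example Portrait</span>
      <span class="location">Example Ministry</span></li>
</ul>
"""

if __name__ == "__main__":
    for record in parse_artworks(SAMPLE_HTML):
        print(record["title"], "|", record["location"])
```

In practice you’d fetch each results page, feed it to the parser, and pause between requests so you don’t hammer the server.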

9. 2011 Boundary Data (Census Dissemination Unit at Mimas)

The Census Dissemination Unit (CDU), based within Mimas at the University of Manchester, provides access and support for UK Census data and related data within the academic sector. Earlier this year CDU also opened up its services to the public.

2011 Boundary Data is a page of downloads of boundary data for England and Wales. Although the base boundary data is available from other sources, this is the best collection currently available for free to the public. The boundary data has been optimised for use at different resolutions and is available in various GIS formats including KML.

CDU also makes available reformatted versions of the 2011 Census data released by National Statistics, and Combined Census and Boundary Data.

10. HM Government e-petitions (Cabinet Office)

In May the Government Digital Service, a team within the Cabinet Office, released an API for the UK’s e-petitions system. Peter Herlihy of GDS blogged about it, as did Richard Parsons on the eDemocracy blog.

The API provides JSON feeds that include a master list of live petitions and anonymised statistics on supporters of each petition. Although the statistics are only at postcode district level, the data has substantial potential for re-use. Most obviously, the data can be used to analyse the geographic distribution of sentiment on contentious public issues.
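To give a flavour of that kind of analysis, here is a minimal Python sketch that rolls postcode district counts up into postcode areas (the leading letters, so “M1” and “M14” both count towards “M”). The JSON layout and field names (`signatures_by_district`, `district`, `count`) are assumptions I’ve invented for illustration, not the actual structure of the e-petitions feeds, and the figures are made up.

```python
import json
from collections import Counter

# Made-up feed; the real API's field names will differ.
SAMPLE_FEED = """
{
  "petition": "Example petition",
  "signatures_by_district": [
    {"district": "M1", "count": 120},
    {"district": "SW1A", "count": 45},
    {"district": "LS6", "count": 80},
    {"district": "M14", "count": 30}
  ]
}
"""


def postcode_area(district):
    """Return the leading letters of a postcode district."""
    letters = []
    for ch in district.strip().upper():
        if not ch.isalpha():
            break  # stop at the first digit
        letters.append(ch)
    return "".join(letters)


def signatures_by_area(feed_text):
    """Sum signature counts per postcode area from a JSON feed."""
    data = json.loads(feed_text)
    totals = Counter()
    for entry in data["signatures_by_district"]:
        totals[postcode_area(entry["district"])] += int(entry["count"])
    return totals


if __name__ == "__main__":
    for area, total in signatures_by_area(SAMPLE_FEED).most_common():
        print(area, total)
```

Raw counts like these mostly reflect where people live, so for any serious comparison you’d want to normalise by population in each area.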