The Department for Levelling Up, Housing & Communities (DLUHC)* publishes Energy Performance of Buildings Data for England and Wales at epc.opendatacommunities.org.

The site is updated quarterly and provides bulk access to data from domestic and non-domestic Energy Performance Certificates (EPCs) and Display Energy Certificates (DECs).

Most of the data comes from the 22 million or so domestic EPCs lodged on the public register up to the end of 2021. The data contains EPC ratings for residential properties, along with a multitude of building attributes related to energy efficiency.

According to DLUHC's licensing information, the domestic EPC data is re-usable under the Open Government Licence, with the exception of the address and postcode fields which are covered by more restrictive Royal Mail terms.

Previously, I would have said the Royal Mail terms made the EPC dataset inoperative as open data, because the address information was essential to the general utility of the dataset at record level.

However, since November 2021, DLUHC has appended UPRNs to most of the EPC data.

UPRNs (Unique Property Reference Numbers) are identifiers derived from Ordnance Survey's AddressBase. AddressBase itself is a commercial product, but the UPRNs and their point coordinates are open data.

Inclusion of the UPRNs means the EPC records can be geolocated at address level and combined with open data (including postcodes) from other sources without re-use of the Royal Mail fields. The domestic EPC dataset should therefore be operative as open data, because the data can now be exploited for a wide range of purposes.
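
As a rough illustration of that join, here is a minimal Python sketch. It assumes the EPC records carry a UPRN column and that an open lookup of UPRNs to coordinates is available as a CSV. The column names, the sample rows and the list of restricted fields are all assumptions for illustration, not the actual schema of either dataset.

```python
import csv
import io

# Hypothetical sample rows standing in for a domestic EPC bulk file.
# Column names and values are assumptions, not the real schema.
EPC_CSV = """UPRN,ADDRESS1,POSTCODE,CURRENT_ENERGY_RATING
100000000001,1 Example Street,ZZ1 1ZZ,C
100000000002,2 Example Street,ZZ1 1ZZ,E
,3 Example Street,ZZ1 1ZZ,D
"""

# Hypothetical sample rows standing in for an open UPRN coordinate lookup.
UPRN_CSV = """UPRN,LATITUDE,LONGITUDE
100000000001,51.5000,-0.1000
100000000002,51.5001,-0.1001
"""

# Fields assumed to be covered by the Royal Mail terms; dropped before re-use.
RESTRICTED_FIELDS = {"ADDRESS1", "POSTCODE"}


def load_coordinates(uprn_file):
    """Map each UPRN to its (latitude, longitude) pair."""
    return {
        row["UPRN"]: (float(row["LATITUDE"]), float(row["LONGITUDE"]))
        for row in csv.DictReader(uprn_file)
    }


def geolocate(epc_file, coordinates):
    """Yield EPC rows joined to coordinates, without the restricted fields."""
    for row in csv.DictReader(epc_file):
        point = coordinates.get(row["UPRN"])
        if point is None:
            continue  # no UPRN on the record, or no match in the lookup
        kept = {k: v for k, v in row.items() if k not in RESTRICTED_FIELDS}
        kept["LATITUDE"], kept["LONGITUDE"] = point
        yield kept


coordinates = load_coordinates(io.StringIO(UPRN_CSV))
located = list(geolocate(io.StringIO(EPC_CSV), coordinates))
for record in located:
    print(record)
```

Records without a UPRN simply drop out of the result, which is why the "most" in "most of the EPC data" matters: the join only works for the records DLUHC has managed to match.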

Except …


DLUHC maintains that domestic EPC records are personal data

The problem is that DLUHC's licensing information also says:

"Please note that this data contains personal data. If processing falls within the scope of the General Data Protection Regulation, or the Data Protection Act 2018, you will become a data controller and must comply with the data protection legislation."

A further page on data protection suggests this means all of the domestic EPC data, not just specific fields:

"Address level data concerning the energy performance of buildings constitute personal data for the purposes of the General Data Protection Regulation (GDPR) and Data Protection Act 2018 (DPA 2018). Anyone using personal data must comply with the data protection legislation."


Can personal data be open data?

Technically, there is no reason why personal data cannot be licensed for re-use as open data. Intellectual property and data protection are separate legal regimes, and an open licence does not circumvent the obligations of licensor and licensee to comply with data protection law.

In practice, open licensing of personal data is unusual. Even if the licensor can establish a lawful basis on which to publish personal data, the viable re-use cases will be limited. The statutory restrictions on processing of personal data mean that personal data cannot function effectively as open data, even if its re-use is covered by an open licence.

In this case, DLUHC has carried out a Data Protection Impact Assessment and considers that it can publish the EPC data for re-use even if it contains personal data.


The Open Government Licence does not cover personal data

However, DLUHC has applied the Open Government Licence to re-use of the EPC data.

The OGL contains a series of exceptions, the first of which is: "This licence does not cover personal data in the Information".

Effectively, DLUHC is maintaining two incompatible positions. The EPC data cannot be both personal data and licensed for re-use under the OGL.

This creates a dilemma for re-users. Legally, we can only re-use the data under the OGL if we are confident that DLUHC is wrong in its view that the EPC data is personal data.

(I am confident DLUHC is wrong. But I am not your lawyer.)


Why does DLUHC think the EPC data is personal data?

DLUHC's guidance provides a list of the fields in the domestic EPC dataset.

In fairness, property data is an area in which it is sometimes difficult to separate information about the owner or occupant of a property from information about the property itself.

However, to my mind, none of the fields in the published EPC data "relate to" an identifiable individual within the meaning of the definition of personal data in data protection law.

The DPIA does not explain how the EPC data relates to an identifiable individual. DLUHC starts from the premise that individuals can be identified from the data by combining it with other publicly available information, but offers no analysis to support that premise:

"For the purposes of this DPIA, MHCLG has treated the EPB data as personal data where the data set contains the address of the building. This is because that data, when combined with other publicly available information, (e.g. the electoral register), which would disclose information relating to the individual concerned, (e.g. information about the building in which that person lives), could enable the occupier of the building to be identified."

The privacy notice for the Energy Performance of Buildings Data does not provide any additional detail. (It also does not conform to the minimum requirements for privacy information as set out in UK GDPR and ICO guidance.)


Has DLUHC dug itself into a hole?

DLUHC's treatment of EPC data as personal data seems to have originated early in the life of the EPB registers, when support for the registers was outsourced to Landmark and there was resistance to sharing the data widely.

After initial releases of bulk data starting in 2016, DLUHC stopped publishing for two years, ostensibly due to "recent changes to data privacy regulation".

Publication restarted following a 2019 decision notice from the ICO. However, that decision hinged on resolution of Royal Mail's IP rights. It has never been clear why DLUHC thought the transition from DPA 1998 to GDPR made a difference to the status of the EPC data.

Another ICO decision notice, from 2011, confirmed that DLUHC took in-house legal advice on this subject, but found that DLUHC was entitled to withhold that advice.

The Energy Performance of Buildings (England and Wales) Regulations 2012 contain statutory requirements for disclosure of bulk access data. Schedule 1 does refer to personal data, but it's unclear whether this means all EPC data or specific information (such as names and contact details for EPC holders) that is not included in the published bulk data.

In general, the 2012 regulations are difficult to reconcile with the current arrangements for publication of bulk EPC data.

My guess is that DLUHC followed some bad data protection advice from its legal team ten years ago – or, possibly, had good advice that it did not follow. That position has become entrenched.


How has Scotland approached this problem?

Scotland's EPC data is pretty much the same as the data published for England and Wales. The data is re-usable under the Open Government Licence, with more restrictive Royal Mail terms applied to the address and postcode fields.

However, the Scottish Government has abandoned the notion that the EPC data is personal. A note on the register homepage says:

"Following advice from the Information Commissioner's Office, EPC and recommendations report are not considered to be 'personal data' and therefore the data 'opt-out' option is no longer in force. However, access will be restricted for the buildings where building owners used the 'opt-out', the restriction will remain in place until the EPC is updated or replaced."

(On the downside, the Scottish EPCs still don't have UPRNs.)

[Update 03/04/2022: the Scottish Government has disclosed to me a copy of its advice from the ICO.]


What can DLUHC do to resolve this problem?

There are two routes DLUHC can take to resolve the conflict in its positions on data protection and licensing for the EPC bulk data.

The first is to review and recant its view that the published EPC data is personal data at the point of release.

The sensible thing would be for DLUHC to seek an opinion from the ICO, as Scotland has done.

The second route is to license the EPC data (excluding the address and postcode fields) under another open licence, such as the Creative Commons Attribution License (CC BY). CC BY is compatible with the Open Government Licence but does not exclude personal data.

[Update 17/05/2022: I have asked DLUHC's EPC team for permission to re-use the data under CC BY. They have refused to engage with my request.]

[Update 31/05/2023: DLUHC has given me permission to re-use the data without licensing conditions, following a decision notice from the ICO. See my additional blog post for details.]


* DLUHC was the Ministry of Housing, Communities and Local Government (MHCLG) prior to September 2021 and the Department for Communities and Local Government (DCLG) prior to January 2018. This post uses DLUHC throughout.