Broken links on Data.gov.uk

Post: 23 May 2014

Broken links in the Data.gov.uk catalogue have been a growing problem for some time. Recently the DGU team has added reporting functionality, and individual broken links are now flagged with an error message:

image

The DGU report lists the number of broken dataset links per publisher. The table below shows the ten publishers with the highest number of datasets with broken links:

image

Of course, some publishers have many more datasets in the DGU catalogue than others. This table shows all publishers with more than 100 published datasets in the catalogue, along with the percentage of their datasets with broken links:

image

This table shows ministerial departments and the percentage of their published datasets with broken links:

image

Here’s the full list in a Google Spreadsheet:

Data.gov.uk catalogue broken links (22/05/2014)

(This is basically the DGU broken links report combined with additional information from a DGU catalogue data dump.)
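As a rough sketch of that combination step (not the actual process used to build the spreadsheet), the Python below joins per-publisher dataset totals with per-publisher counts of datasets with broken links, then works out the percentage. The function name, the two input dictionaries and the publisher names and figures are all hypothetical, invented purely for illustration.

```python
from collections import namedtuple

# One output row per publisher. Field names are illustrative only.
PublisherStats = namedtuple(
    "PublisherStats", "publisher total broken percent_broken"
)


def broken_link_percentages(total_datasets, broken_datasets, min_datasets=0):
    """Combine dataset totals with broken-link counts per publisher.

    total_datasets:  dict of publisher -> number of published datasets
                     (assumed to come from a catalogue data dump)
    broken_datasets: dict of publisher -> number of datasets with broken links
                     (assumed to come from the broken links report)
    min_datasets:    only keep publishers with MORE than this many datasets
    """
    rows = []
    for publisher, total in total_datasets.items():
        if total <= min_datasets:
            continue  # also skips publishers with zero datasets
        # Publishers missing from the report are treated as having none broken.
        broken = broken_datasets.get(publisher, 0)
        percent = round(100.0 * broken / total, 1)
        rows.append(PublisherStats(publisher, total, broken, percent))
    # Worst offenders first.
    rows.sort(key=lambda row: row.percent_broken, reverse=True)
    return rows


# Invented example figures, not taken from the DGU report.
totals = {"Publisher A": 400, "Publisher B": 120, "Publisher C": 50}
broken = {"Publisher A": 100, "Publisher C": 25}

for row in broken_link_percentages(totals, broken, min_datasets=100):
    print(row.publisher, row.total, row.broken, row.percent_broken)
```

Setting min_datasets=100 mirrors the "more than 100 published datasets" cut-off used for the table above, so the small publisher in the example is left out even though half of its datasets have broken links.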

Comments

Any large collection of links will suffer link rot over time if not actively maintained. However, there seem to be some additional drivers behind DGU’s broken link problem:

Some publishers have high numbers of broken links due to specific issues. For example:

In my view, the scale of the problem with broken links is also a function of the underlying approach to managing the DGU catalogue. Publishers are encouraged to provide direct links to datasets, rather than to a landing page on their own sites. Superficially this makes sense, because users can simply grab the data without navigating to another site. However, it means there are a lot of different links to maintain, and I think most data publishers probably have a low sense of “ownership” of the user’s experience on DGU.

Some users may save time by downloading data directly from DGU, but this is not necessarily a worthwhile trade-off. Even if the links do work, the contextual information provided by the DGU catalogue alone may be inadequate or out of date. Serious data users still need to research and confirm the latest position before making use of any dataset available via Data.gov.uk.
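To give a sense of what maintaining direct links involves, here is a minimal sketch of a link check a publisher could run against its own dataset URLs. It is not how the DGU report works, which I have not examined; the function name check_link and the rule that any status from 200 to 399 counts as working are my own assumptions. It uses only Python's standard urllib.

```python
import urllib.error
import urllib.request


def check_link(url, timeout=10, opener=urllib.request.urlopen):
    """Return (ok, detail) for one URL, using an HTTP HEAD request.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without real network access.
    """
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return False, "HTTP %d" % err.code
    except (urllib.error.URLError, OSError, ValueError) as err:
        # DNS failure, refused connection, timeout or malformed URL.
        return False, "unreachable: %s" % err
    # Assumption: anything from 200 to 399 counts as a working link.
    return 200 <= status < 400, "HTTP %d" % status
```

Calling check_link on each dataset URL and collecting the results where ok is False would give a basic broken-links list. Some servers reject HEAD requests even though the file downloads fine, so a real checker would probably retry those with GET before flagging them.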