One of the key aspects of our efforts to publish government expenditure data has been to work with departments so the files published are in open formats, well-structured, machine readable and timely. We have gained a lot of ground and have had tremendous support from departments in improving data quality overall across our datasets.
Today we announce the availability of an automatic reporting tool for spending data. It is the result of a collaboration between data.gov.uk and OpenSpending.org to make the spend data more visible and to make it easier to browse the substantial volume of datasets that make up the reporting of government expenditure on data.gov.uk.
The tool lists all public bodies registered as data publishers on data.gov.uk and details how they have followed the HM Treasury reporting guidelines. It also makes the whole of the reported data available for search and analysis both on data.gov.uk and on the OpenSpending site.
The main purpose of the tool is to allow both users and departments to ascertain two key points:
- Quality of the data (i.e. adherence to HMT reporting guidelines, well-structured data)
- Status of reporting (i.e. how complete the reports are or if there is a reporting period missing)
Having all of these datasets organised under a single catalogue at data.gov.uk in a simple spreadsheet format enabled us to create an extraction system that cleans the data on a regular basis. We have cleaned over 6000 column names to aid compliance with HMT guidance (http://nomenklatura.okfnlabs.org/uk25k-column-names).
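To give a feel for what cleaning column names involves, here is a minimal Python sketch. It is not the code used by the tool: the `CANONICAL_COLUMNS` table and the `normalise_header` function are hypothetical, and the handful of aliases shown are illustrative assumptions, whereas the real mapping is the much larger list linked above.

```python
import re

# Hypothetical alias table (an assumption for illustration only).
# Keys are lower-cased headers with punctuation collapsed to spaces.
CANONICAL_COLUMNS = {
    "department family": "Department Family",
    "dept family": "Department Family",
    "entity": "Entity",
    "date": "Date",
    "payment date": "Date",
    "expense type": "Expense Type",
    "expense area": "Expense Area",
    "supplier": "Supplier",
    "supplier name": "Supplier",
    "transaction number": "Transaction Number",
    "transaction no": "Transaction Number",
    "amount": "Amount",
}


def normalise_header(raw):
    """Return the canonical column name for a raw header, or None if unknown."""
    # Lower-case, then turn every run of non-alphanumeric characters into one space.
    key = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    return CANONICAL_COLUMNS.get(key)


def normalise_headers(headers):
    """Map a list of raw headers to canonical names, keeping unknown ones as-is."""
    return [normalise_header(h) or h for h in headers]
```

For example, `normalise_header("Transaction No.")` and `normalise_header("transaction_number")` both resolve to the same canonical name, while a header that is not in the table is left untouched so it can be reviewed by hand.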
The report generator then highlights, in red, departments that are registered as a publisher on data.gov.uk but have failed to publish any information on their spending. Those that have published data which cannot be interpreted as spending data (e.g. because it is in PDF format or does not comply with the template provided by HMT, http://www.hm-treasury.gov.uk/d/transparency_annexa100910.xls) are rated yellow, and those departments whose records have been updated as per the publication requirements (the latest data must have been published within the last 3 months) are rated green.
The first stage of this release deals with central departments, which are obliged to report all spending over £25k. Subsequent stages, to follow soon after, will monitor local councils and other government bodies, which have different reporting requirements. The interface will be useful both inside and outside government, to ensure transparency regulations are met and to better understand where gaps in the data may alter the completeness of the picture offered by government data.
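The red, yellow and green rating described above can be sketched as a small Python function. This is an illustration under stated assumptions rather than the report generator's actual logic: the function name, its parameters and the 92-day approximation of "3 months" are all hypothetical. The text also does not say how a publisher with valid but out-of-date data is rated, so the sketch assumes yellow for that case.

```python
from datetime import date
from typing import Optional

# Assumption: "3 months" is approximated as 92 days.
MAX_AGE_DAYS = 92


def rate_publisher(
    has_published_anything: bool,
    latest_valid_data: Optional[date],
    today: date,
) -> str:
    """Return 'red', 'yellow' or 'green' for a registered publisher.

    latest_valid_data is the date of the most recent file that could be
    interpreted as spending data, or None if no file could be interpreted.
    """
    if not has_published_anything:
        # Registered as a publisher but no spending information at all.
        return "red"
    if latest_valid_data is None:
        # Something was published, but nothing usable (e.g. PDF, wrong template).
        return "yellow"
    if (today - latest_valid_data).days <= MAX_AGE_DAYS:
        # Usable data published recently enough.
        return "green"
    # Assumption: usable but stale data is treated as yellow.
    return "yellow"
```

A department whose latest usable file is dated 15 November 2012 would be rated green when checked on 1 January 2013, while one that has only ever uploaded PDFs would be rated yellow.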
The code is available here: https://github.com/openspending/dpkg-uk25k