New Public Sector Transparency Board and Public Data Transparency Principles

Posted on 25/06/2010 22 comments

The Public Sector Transparency Board, which was established by the Prime Minister, met yesterday for the first time.

The Board will drive forward the Government’s transparency agenda, making it a core part of all government business and ensuring that all Whitehall departments meet the new tight deadlines set for releasing key public datasets. In addition, it is responsible for setting open data standards across the whole public sector, listening to what the public wants and then driving through the opening up of the most needed data sets.

The Board is chaired by Francis Maude, the Minister for the Cabinet Office. The other members are Sir Tim Berners-Lee, inventor of the World Wide Web; Professor Nigel Shadbolt from Southampton University, an expert on open data; Tom Steinberg, founder of mySociety; and Dr Rufus Pollock from Cambridge University, an economist who helped found the Open Knowledge Foundation.

In the words of Francis Maude:

“In just a few weeks this Government has published a whole range of data sets that have never been available to the public before. But we don’t want this to be about a few releases, we want transparency to become an absolutely core part of every bit of government business. That is why we have asked some of the country’s and the world’s greatest experts in this field to help us take this work forward quickly here in central government and across the whole of the public sector.”

At their first meeting yesterday they discussed some new Public Data Transparency Principles.

Working definition of “Public Data”

"Public Data" is the objective, factual, non-personal data on which public services run and are assessed, and on which policy decisions are based, or which is collected or generated in the course of public service delivery.

Draft Public Data Principles

  • Public data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form – and in addition to the legal Right To Data itself this overriding principle should apply to the implementation of all the other principles.
  • Public data will be published in reusable, machine-readable form – publication alone is only part of transparency – the data needs to be reusable, and to make it reusable it needs to be machine-readable. At the moment a lot of Government information is locked into PDFs or other unprocessable formats.
  • Public data will be released under the same open licence which enables free reuse, including commercial reuse – all data should be under the same easy to understand licence. Data released under the Freedom of Information Act or the new Right to Data should be automatically released under that licence.
  • Public data will be available and easy to find through a single easy to use online access point (data.gov.uk) – the public sector has a myriad of different websites, and search does not work well across them. It’s important to have a well-known single point where people can find the data.
  • Public data will be published using open standards, and following relevant recommendations of the World Wide Web Consortium. Open, standardised formats are essential. However, to increase reusability and the ability to compare data, this also means openness and standardisation of the content as well as the format.
  • Public data underlying the Government’s own websites will be published in reusable form for others to use – anything published on Government websites should be available as data for others to reuse. Public bodies should not require people to come to their websites to obtain information.
  • Public data will be timely and fine grained – Data will be released as quickly as possible after its collection and in as fine a detail as is possible. Speed may mean that the first release may have inaccuracies; more accurate versions will be released when available.
  • Release data quickly, and then re-publish it in linked data form – Linked data standards allow the most powerful and easiest re-use of data. However most existing internal public sector data is not in linked data form. Rather than delay any release of the data, our recommendation is to release it ‘as is’ as soon as possible, and then work to convert it to a better format.
  • Public data will be freely available to use in any lawful way – raw public data should be available without registration, although for API-based services a developer key may be needed. Applications should be able to use the data in any lawful way without having to inform or obtain the permission of the public body concerned.
  • Public bodies should actively encourage the re-use of their public data – in addition to publishing the data itself, public bodies should provide information and support to enable it to be reused easily and effectively. The Government should also encourage and assist those using public data to share knowledge and applications, and should work with business to help grow new, innovative uses of data and to generate economic benefit.
  • Public bodies should maintain and publish inventories of their data holdings – accurate and up-to-date records of data collected and held, including their format, accuracy and availability.

They are asking everyone to help shape and define these important principles and have set up a commentable version on our wiki. Please use the talk page to discuss the principles and the wiki page to make any changes needed.

By Transparency Board

Comments (22)

Open & Linked Data

Please be aware that UNIT4 (authors/solution provider of the Agresso finance system) are developing a free-of-charge utility that will enable our customers (including our 90+ local government customers) to produce data from Agresso in a linked format.

All of our customers can already produce/publish open data from Agresso themselves at no additional cost.

We are currently working directly with Royal Borough of Windsor & Maidenhead and will be producing further information in the near future on these developments.

Note this will be for all customers irrespective of sector and will also cover the provision of non finance data.

If anyone is interested in this work then please contact me for further information.

Anwen Robinson
MD
UNIT4 Business Software Ltd
anwen.robinson@unit4.com

APPSI's views on the Public Data Transparency Principles

At the meeting of APPSI on 22 July 2010 members heard a presentation by The National Archives staff on the Transparency Agenda. It was subsequently agreed that APPSI should express some views to the consultation now underway on the Public Data Transparency Principles and work programme. This note provides those views:

• APPSI has long argued that the government requires a strategy to prioritise information garnering rather than relying entirely on serendipitous data harvesting of what is readily available. We understand that there is no strategy in place to prioritise datasets for incorporation in data.gov.uk. We regard this as wasteful and unlikely to deliver the maximum benefit in the short or medium term.

• We welcome the Public Data Transparency Principles. But government’s working definition of ‘public data’ contradicts the ethos of the Principles in that it does not address the issue of public good. The existing definition is almost entirely predicated upon the management and policy needs of government. It also makes clear that the data are those created as a by-product of public service delivery. Taken at face value, all this is a reversion to the Rayner Review of the 1980s. Given the Public Data Principles, the Prime Minister’s letter to departments of 31 May 2010 (see: http://www.number10.gov.uk/news/statements-and-articles/2010/05/letter-t...) and existing and putative legislation, we suspect this phrasing is an oversight and urge that government should reconsider this definition. A version more in tune with the Principles would be: ‘Public data’ are the objective, factual, non-personal data collected by government at all levels to meet policy, service delivery and public accountability purposes, to enhance the capacity of individuals to be active citizens and to facilitate innovation.

• The first Public Data Principle (‘Public data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form’) cannot be met without effective consultation with users, both current and latent. Such consultation is difficult, as the long experience in the official statistics world makes clear. Without it, however, success will only be by luck. We understand that the Transparency Board will consider user representation. We urge a more purposeful and planned engagement with the user community rather than simply providing data in the hope that this will meet needs.

• In order for government to make data freely available it is important that the public task, which generates the information, is clearly defined. We are pleased to hear that this matter is under active discussion and look forward to seeing the results.

• APPSI’s members from the devolved administrations pointed out that the Transparency Agenda is very Whitehall-centric and more needs to be done to establish a relationship with those administrations.

• One member commented that, based on his experience, data.gov.uk is very confusing as the data is available in formats that can’t easily be re-used and metadata is very limited in explaining the characteristics (hence reliability) of the data. He recognised that this might be transitory given the early stage of development of the web site. Has there been any investigation of the usability of the web site and the active use of the data therein?

• It was agreed amongst APPSI members that measuring the economic and social value of data.gov.uk would be difficult, not least because of the shift of policy outcomes emphasis between administrations. Given the significance of the whole workstream, the expenditure of public funds and the strong political support, APPSI members nevertheless believe it would be responsible for a benchmark to be established now so that changes wrought by data.gov.uk could be assessed effectively at some stage (e.g. in three years’ time).

• In addition, APPSI members debated the trade-offs between continuing to publish data in existing, internationally-defined standards specific to a discipline and re-engineering them into the more universal form underpinning data.gov.uk. We concluded that the relative merits of these might be case-specific, that the resources required for any re-engineering were not clear to us and that indeed both approaches might end up running in parallel.

Advisory Panel on Public Sector Information
24 August 2010

data

It is surprising that a new government agency established to set standards for public data fails to recognise that "data" is a plural.

definitions

It is good to set out principles; as others have commented, there will be a need for detailed guidance.

As a starting point, in the spirit of access to data, please can we be told the salaries of those appointed to the Transparency Board, how many days per week they will be spending on this, whether they are now civil servants and if not where they are counted in official figures, as well as details of the fair and open competitive advertising which took place to appoint these people.

Please can we have copies of all those offering tender bids or applying for the posts, and details of the process used to decide whom to appoint.

Please can we have greater clarity on what will be cut in order to provide the additional access proposed - nothing is free, certainly not data provision nor government websites!

Users’ views

The following response is on behalf of the members of the Demographics User Group (see www.demographicsusergroup.co.uk).

DUG has been an active supporter of better access to public sector information for more than a decade, and is delighted with the progress being made with www.data.gov.uk.

We welcome the 10 draft Public Data Principles, and are especially pleased to see that the first is: “Public data policy and practice will be clearly driven by the public and businesses who want and use the data, including what data is released when and in what form – and in addition to the legal Right To Data itself this overriding principle should apply to the implementation of all the other principles”.

In order to achieve this it will be necessary for the Transparency Board to put significant effort into encouraging dialogue with a wide range of existing and potential user communities (the public, business, local government, etc.) and expertise (specialists, mainstream analysts, and occasional / new users) to establish priorities.

We also recognise that at this stage, the emphasis is on making use of data that are a by-product of public service delivery (rather than seeking a fresh view of information that is needed for the public good). In this context, we urge the Board to be creative in using existing data to create new information. For example, HMRC’s files of individual records should be evaluated to establish whether they can be used to create aggregate statistics on Incomes for small areas. There are many other similar examples of the potential to use personal administrative records to create anonymous statistics of great value to decision-makers and the public.

Keith Dugmore

What is the public sector?

The principles do not address the fundamental issues of what the public sector is...

+ Does it include statutory advisors (e.g. NDPBs)?
+ What about NGOs delivering government business in the big society?
+ Parish Councils?

Should we use an existing definition such as that in the EIRs?

Finally, not all public sector organisations are Crown organisations, so the Data.gov.uk licence can't be used by them as it is a Crown licence.

numbers and words

Hi,

More specific advice, please. Going to publish statistical data in a set of tables in .csv format.

The issue is that there is some commentary and notes included in the tables. What is the best way to tackle this mix of numbers and words to make it easily re-usable?
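
One possible approach, offered only as a sketch: keep the numbers in one plain CSV and move the commentary into a separate notes file, so that software reading the table never meets free text in a numeric column. The Python below assumes, purely for illustration, that note rows in the original table start with the marker "Note:"; the area names, figures and the function names split_table and to_csv are all made up for the example and are not from any data.gov.uk guidance.

```python
import csv
import io


def split_table(rows, note_marker="Note:"):
    """Separate data rows from free-text commentary rows.

    Assumption for this sketch: a commentary row is one whose text
    starts with note_marker. Real tables may need a different rule.
    """
    data, notes = [], []
    for row in rows:
        # Re-join the cells so a note containing commas is kept whole.
        joined = ",".join(row).strip()
        if joined.startswith(note_marker):
            notes.append(joined[len(note_marker):].strip())
        elif joined.strip(","):
            # Keep any row that has at least one non-empty cell.
            data.append(row)
    return data, notes


def to_csv(rows):
    """Serialise rows back to CSV text with plain newline endings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical table mixing figures and a commentary line.
raw = """area,year,count
Northtown,2009,120
Southville,2009,95
Note: Southville figure is provisional.
"""

data, notes = split_table(list(csv.reader(io.StringIO(raw))))
clean_csv = to_csv(data)                     # numbers only, header first
notes_csv = to_csv([[n] for n in notes])     # one note per line
```

The clean file can then be published alongside the notes file, with the notes also repeated in the dataset description so human readers do not lose the context.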

A single access point?

The fourth principle could be misinterpreted to imply one and only one access point. Surely not what is intended? There must be room for more specialised places that have more detailed and more specialised metadata related to particular domains (e.g. health, statistics, etc.). Perhaps the word "single" is not needed - we do need a place where everything can be found, but it surely should not be the only place.

Not just machine readable

I've written a few quick notes here about the importance of not ignoring the non-machine readable steps on the way to linked and open data - as for many use cases simply having access to 'facts' from government data will be valuable.

I've also in that post suggested it might be useful to encourage authorities to publish a list of datasets they have, but don't yet have open, in order to enable users to better drive the prioritisation of data release and reformatting.

OS Mastermap

I assume then that the government-owned OS MasterMap will be one of the data sets freely made available for reuse very soon.

Is there a non-discrimination clause in there?

Just wanted to check there's something to make sure that the principle of non-discrimination is embedded in the principles.

Accessibility by the public and cost to publish

This site aims at data reuse by technical people, but the public ask for data in a readable form (in Local Government). Some may be able to understand how to use Excel (csv) but others will need it in PDF format. The standard should be that data can be converted into many forms, not that one form is dictated.
Secondly, much of our data is locked away in older systems that do not allow extracts without additional development costs or a resource to do this manually - something we do not have as budgets are cut. Something to bear in mind please!

Encourage other countries

One more principle from my side :-). All countries should compulsorily take part in opening their Govt data to the public and this initiative should not restrict itself to U.K., U.S. and Australia.

PSI Data Principles

As a UK tax payer I fully accept that data collected at public expense should be made available for re-use, and free of extraneous licence conditions or fees. However as a UK PSI employee I do not see why UK assets should be free to use by non-UK based organisations in business which do not benefit UK citizens.

My question to the Board is this: In light of the UK's fiscal situation, why aren't we utilising these incredibly valuable assets to maximise UK licensing revenues based on intended use? Intended use UK = FOC / Intended use anywhere else = Licence fee?

Higher Education and Research

The scope of this could be clearer. Is it intended to cover all publicly funded bodies, including those in the Higher Education sector for example? What about research data, especially pre-publication? And will there be FOI-like exemptions for data that is commercially sensitive?

Commenting on the Draft Principles

Just a minor point, but if you had put named HTML anchors alongside each principle, it would have been easier for commentators on third party sites to post links that point directly back to specific principles. That would allow you to capture comments appearing on remote sites more easily, and enable the automated archiving of pointwise micro-discussion on things like Twitter.
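
As a rough illustration of what is meant, here is a minimal Python sketch that renders a list of principles as HTML with one anchor per item. The id scheme "principle-N" and the function name render_principles are assumptions made up for this example, not anything the site actually uses.

```python
from html import escape


def render_principles(principles):
    """Render principles as an ordered list with a linkable id on each item.

    Assumption for this sketch: ids follow the made-up pattern
    "principle-1", "principle-2", ... in list order.
    """
    items = []
    for number, text in enumerate(principles, start=1):
        anchor = f"principle-{number}"
        # The trailing link gives readers something to copy and share.
        items.append(
            f'<li id="{anchor}">{escape(text)} '
            f'<a href="#{anchor}">#</a></li>'
        )
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


page = render_principles([
    "Public data will be published in reusable, machine-readable form",
    "Public data will be timely and fine grained",
])
```

A third party could then link to the second principle by appending #principle-2 to the page address. Numbering by position is the simplest choice, but the ids would shift if the list were reordered, so a fixed short name per principle may be safer for a draft that is still changing.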

WriteToReply

The Write to Reply version just put up has anchors for each point:

http://writetoreply.org/doodlings/draft-public-data-principles/

Perhaps the Public Sector Transparency Board could consider taking into account comments left over on Write to Reply - as the Wiki page still seems not to be editable (at least I can't edit it from this log-in)...

Public sector information

What is the relationship between this new site, and the existing Information Asset Register at
http://www.opsi.gov.uk/iar/index

Will the IAR initiative be shut down under the recent Cabinet Office initiative on web site costs?
(http://www.cabinetoffice.gov.uk/newsroom/news_releases/2010/100624-websi...)

What resources will Central Government make available to its departments, agencies and non-departmental public bodies to create and maintain the inventories of data holdings (metadata, last data principle)? Many of the IAR records look pretty unloved and uncared for, i.e. out of date. Creating and maintaining metadata is itself a time-consuming task if its results are to be of any use.

Commenting on principles

Many thanks for sharing the principles at this stage.

One small technical issue. I'm logged in and able to edit other Wiki pages - but don't seem to be able to edit or add to the talk page of the Commentable principles. Is edit access restricted? Or is this a glitch with individual user accounts?

One small process issue. It would be great to know how you are planning to use comments people make, and how and when you will be taking them into account.

A few substantive issues

1) Relationship to other principles - many of these seem to map onto the '8 principles of government transparency' derived a number of years back. Would it be better to adopt those principles with clarifications / additions for the data.gov.uk context, rather than producing a completely new list of principles?

2) Meta-data - It would be good to give more explicit encouragement to the provision of good meta-data, and the progressive improvement of information around the data in the catalogue to help people make sense of datasets.

For example, there is now a lot of information scattered across the web in blog posts, note pads, PDFs etc. on how to make sense of COINS. An encouragement that would lead to the Treasury, for example, adding links to this information to the COINS record in data.gov.uk would be a good thing.

Some explicit (lightweight) base-line standards for meta data quality would be useful I think...

3) Linked Data. Linked data is not always the most end-user friendly format: and focussing on linked data prioritises the role of developer intermediaries over citizens who just want access to specific elements of data.

I would encourage a broader principle here, that doesn't undermine the value of linked data, but recognises the broader range of format needs.

Something along the lines of Release data quickly, and then work to make sure it is available in open standard formats, including Linked Data formats.

This would also capture the idea that it does not have to be government that does the format conversion. If, as in the COINS case again, bodies such as WDMMG and The Guardian have created interfaces onto the data that output CSV / JSON, then Govt should be encouraged to link to those, rather than try and create its own rendering of those formats; with the govt responsibility being to monitor the continued availability and accuracy of those data sources - not to replicate them.

If focussing on Linked Data only then there is a strong obligation on the data.gov.uk project to develop and user-test tools that mean individuals with absolute minimal technical knowledge can get hold of data in formats they can use with simple consumer tools (Excel; Google Maps etc.)

4) Actively encouraging re-use - I would encourage more reflection on whether 'The Government should work with business to help grow new, innovative uses of data and to generate economic benefit' sits best as a principle.

This seems to me to be a possible 'programme' or 'policy' for government to carry out; but to accord it status as a principle seems to be committing government to a substantive use of resources to subsidise a particular sector of private enterprise...

5) A minor point - the last principle would probably sit better further up in the list...

Thanks and praise

Thanks for providing this comment box which one can use in privacy without having to remember anything or get involved with any strange rituals e.g. "logging in/registering".
Long may comment boxes like this one serve parts of the public which otherwise might not be reached at all.

But with reference to;

"Public Data" is the objective, factual, non-personal data on which public services run and are assessed, and on which policy decisions are based, or which is collected or generated in the course of public service delivery."

One wonders if all our notes and comments, submitted to public and government services in the spirit of voluntarism albeit from the privacy of wherever we happen to be may also be treated as "public data". I hope so.

A good start

At a high level this is a good start and sets the overall tone and direction.

However, government bodies and departments will need more detailed guidance on what to make available (e.g. budget data) and will need considerable technical help to convert and publish data in machine-readable format.

Paul Cook
Director of Finance & ICT
Surrey and Sussex Probation Trust

Local government standards?

While this is clearly aimed at central government, it'd be good to see a similar or identical document adopted and promoted by CLG for use in local government. For example, most of the council spending data that has been released lacks an explicit open licence, some is aggregated rather than fine-grained and some is only available in PDF format or through a proprietary platform.