ODUG Benefits Case for the Open Release of DVLA Bulk Data

Note: The independent views in this blog are those of the Open Data User Group (ODUG) and are not an expression of the opinions of the Cabinet Office or the government. 

ODUG has received a number of requests for DVLA and other vehicle-related data and we are today publishing a new benefits case calling on the DVLA to release its bulk data as Open Data under an Open Government License (OGL).

This example illustrates public sector data licensing issues and has the flavour of most of the big issues ODUG looks into.

In general, where there is a commercial model or cost recovery mechanism in place for Public Sector Data the incumbent licensing regime falls way behind the current needs of the data industry, which is moving in directions which many public sector data licensing organisations don’t understand, at a pace which they cannot keep up with.

My Generic License Agreement to use paid for Public Sector data

This license grants me the right to use some public data to develop

  1. Either general purpose or specific;

1.1.    Products (which may be tightly defined); and/or

1.2.    Services (which may be tightly defined to specify the data and/or parts of the data which I might add value to; or combine with other data)

  2. To my specific markets (which may be restricted)
  3. Using direct and/or indirect delivery channels (each of which might be separately licensed, with a different pricing structure, according to the end-user of my product or service)

Providing that;

  4. My products and services do not compete with certain existing licensees on generic or bespoke license agreements; and/or
  5. Compete with the data holders:

5.1.    Own joint ventures;

5.2.    And/or partnerships; which may generate a bit of money for the Treasury at the moment

  6. And I probably won’t even have the rights to joint ownership for any improvements which I make to the original data (correcting format errors for example)

Therefore the commercial licensing of our Public Sector Information could be described as old fashioned, complicated and burdensome. It was probably designed for 1990s Britain. So the data, funded by our taxes, does not generate a fraction of the value that it should for the Treasury. The licensing and administrative functions in place to support its commercial deployment waste galactic levels of time and resources, generating endless paperwork and commercial complexity for data holders, licensees and sub-licensees alike. I hope you get my drift from the ‘pro forma’ above.

The open data premise is compelling and valuable to the economy because it avoids all of the above:

All this complexity gets stripped away allowing public data to flow through the data ecosystem where it can, without restriction, be put to good use by innovators, businesses large and small, public sector bodies, charities, academics and the public. 

Clearly our public data and its usage must protect individual privacy through suitable anonymisation where necessary. And this, content-based, aspect of public data re-use becomes the fundamental challenge, rather than the existing plethora of complex licensing terms and conditions, derived data restrictions, specific end-user audit and reporting conditions and so on. 

The benefits case we are publishing today shows that the DVLA has a small number of bulk data licensees whose license fees are recouped by DVLA on a cost recovery basis. DVLA had previously set out that third parties could source their data from these bulk data licensees and our example start-up business did just this, and started to deliver an online ‘pay-per-click’ vehicle checking service.

However, following complaints by a market competitor (an existing licensee), the DVLA suspended the start-up’s business for two months, although they are now back online with some specific DVLA-imposed restrictions on the use of the data. This is an example of Public Data Licensing complexities stifling opportunities for growth. In my view TotalCarCheck (the start-up) are to be commended for their resolve and tenacity in keeping the business afloat rather than throwing in the towel. Hopefully their revenues will soon allow them to cover their legal bills!

Our proposal to the DVLA is that they should open up this data to the innovator and developer market on a parity-for-all basis, as Open Data under an Open Government License. We believe DVLA could significantly reduce their costs, and that the maximum revenue loss to them would be around £0.5m per annum (they have plenty of other revenue). Their bulk data released as Open Data would be put to good use by innovative small businesses and freely available to public sector organisations too.

Come on DVLA – please get with the open data programme and help us create more opportunities for economic growth in the data product and services market!

DVLA data.

Although I welcome open data, I believe that making DVLA data free to everyone may be a mistake. As it stands you are able to purchase data for the purpose of resale from the current license holders. A case in point is http://www.carcheckuk.co.uk, who are able to run a commercial service. Making the data freely available would remove accountability. As it stands, the information needed to assist in cloning a vehicle, such as chassis and engine numbers, requires a purchase, which makes for a certain level of accountability: the check purchaser can be traced. I also believe that poor quality services may be created using this information. For example, the DVLA data only shows a written-off vehicle if it is Cat C and does not include Cat D write-offs. Consumers could easily be misled into believing the check they purchase would cover this and be left unaware that the vehicle they are about to purchase could have previously been written off.

I can see the need to restrict personal data from general public use. But any information that I can get by standing in front of a vehicle should be considered public domain. The argument about the VIN number is moot as this is on display in most windscreens. So as long as sensitive data is not revealed, that data should be open for innovation.

For example, if I'm stood in front of a car I can tell its Reg No, VIN number, Make, Model, Colour, Body Style, Trim Type and Colour, Number of Doors, Year of Registration, Engine Size etc. Other data like Number of Owners, Previous Colours etc should also be available.

This sort of data can be used to help speed up certain processes in numerous sectors within the vehicle trading and leasing industries, as well as reducing errors by using a consistent dataset.

DVLA data

I have to agree with Clearfly. From my experience, prices for basic vehicle data vary widely from supplier to supplier, and in most cases smaller companies choose not to provide a registration look up as they believe it is not cost effective. In the vehicle inspection market www.mycarinspections.co.uk buck that trend and use vehicle data to provide quotes online. They use a look up in conjunction with an XML quote sheet that provides vehicle inspection prices using only the vehicle make and model data. There are a number of companies who operate in this field; however, in addition to My Car Inspections, only www.theaa.com and www.rac.co.uk use an automatic look up service. If this data could be made available, even in its basic form, it would provide great benefits. More motor trade companies would be likely to adopt an online look up, which is likely to improve customer uptake by improving the end user's experience.

I agree with the posts

I agree with the posts supporting a limited release of data. I assume the first reply, from "imissmarmite", is related to the site linked and would lose out if data was made freely available, so I do understand their concern.

The main use of this data would NOT be by people reselling it to consumers, it would be companies who need to make the very most basic checks but at a high volume - eg. make, model, type, first registration date and last change of keeper date.  At a cost of several pence for a lookup for the PAYG services this can get expensive if a lookup is triggered from a public website (you need to implement rate limiting etc. by session/IP).
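To illustrate the kind of rate limiting mentioned above, here is a minimal sketch of a sliding-window limiter keyed by session ID or IP address. The class name, the limits and the injected clock are all assumptions made for illustration; none of this is taken from any particular PAYG provider's API.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_lookups` paid lookups per client in any `window_seconds` span.

    Hypothetical helper: the client key could be a session ID or an IP address.
    """

    def __init__(self, max_lookups, window_seconds, clock=time.monotonic):
        self.max_lookups = max_lookups
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client key -> timestamps of recent lookups

    def allow(self, client_key):
        """Return True and record the lookup if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client_key]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_lookups:
            return False
        hits.append(now)
        return True
```

A public website would call `allow(ip_or_session)` before triggering the paid lookup and show a "try again later" message when it returns False. The state here lives in one process; a site running several web servers would probably need a shared store instead.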

Even if people were to set up a "free vehicle check" service using the data, it's not a new business problem for one company to need to differentiate itself from another that it perceives as being lower quality. There are already various levels of service offered, and some appear to already be charging for searches from the basic sets of data.

As "Clearfly" mentions, I can't see vehicle cloners being helped much. Right now you can clone any vehicle simply by looking at it. I can't see how freely available data would increase what is already a serious criminal offence, although it might make it easier for those who are already intent on doing it. The immense savings to businesses from having the data will far outweigh any such assistance. I'm sure any police officer could list a dozen reasons why it wouldn't be a problem, and how cloning could be made much harder if the government cared about solving it.

It would still be possible to create an extremely useful set of data by releasing the vehicle information without the VIN, then companies will still be able to get the make/model/type/colour/dates from the VRM but it would protect the DVLA's existing customers who offer a service where the VIN is checked.  Another alternative is to hash the VIN so that a user-supplied VIN can be checked against the data without needing to release every VIN (although I could think of a few issues there).
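As a rough sketch of the hashing idea, the released data could carry a SHA-256 digest in place of each VIN, and a user-supplied VIN would be hashed the same way and compared. Mixing the VRM into the hash as a per-record salt is my own assumption, as are the function names and the sample VRM and VIN in the usage note. One of the issues is that VINs follow a known structure, so a determined party could still enumerate candidates for a given vehicle.

```python
import hashlib
import hmac


def _normalise(value):
    """Upper-case and drop spaces or punctuation so formatting does not change the hash."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def hash_vin(vrm, vin):
    """Return a hex SHA-256 digest of the VRM and VIN together (VRM used as a salt)."""
    material = f"{_normalise(vrm)}|{_normalise(vin)}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def vin_matches(published_hash, vrm, user_vin):
    """Check a user-supplied VIN against the published digest for that VRM."""
    return hmac.compare_digest(published_hash, hash_vin(vrm, user_vin))
```

For example, a record published as `hash_vin("AB12 CDE", "WVWZZZ1JZXW000001")` could later be checked with `vin_matches(record, "ab12cde", user_vin)` without the VIN itself ever appearing in the open data.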

I also saw a comment elsewhere that some current providers log/alert the police when a stolen vehicle is searched for.  While this is of questionable use in itself, surely it would be better to be made more widely available?  Again, if a list of stolen vehicles or VINs is sensitive information (my local police tweet stolen vehicle registrations often...) then an API could be set up for registered 3rd party users to optionally submit/check a vehicle.  If a hit is recorded then the data user can be contacted for additional information.

I agree that it should not be a free service

There are too many openings for abuse if you make it a totally free service. Industries have sprung up on the basis that people have to pay to get a complete vehicle check. http://freehpi.org, http://freecarchecks.com and http://vehiclecheckfree.co.uk give free reports for a certain amount of information and then people can choose whether to buy a more in-depth report at a charge. It's a helpful insight at first glance into whether you should proceed, based on the first report, to get the more in-depth one. I think people who buy more than one car a year should have a bulk option to search for a set amount; to have to do the report on each car would cost quite a bit.

The last thing the UK needs to do is become like the States, where everything about everybody and anything is available in the public domain; it is just an opening for abuse. That's jma :)

Technically correct

OK, where to start. There are numerous arguments both for and against allowing the DVLA data out in the wild. In my experience the DVLA data is poor quality at best, and when combined with multiple data sets it becomes unmanageable. Should this data be made available for free, the data cleansing side of this would go away and the general public would end up making large purchasing decisions on unclean data. HPI check providers such as TextReg would no longer need to use a paid-for data provider and would just supply dirty results to the general public.

HPI text checks would be served through untraceable providers and people like TextReg.co.uk would not be able to supply an audit trail for the Police in the event of a stolen vehicle. All in all, I think where the market is now for this data is where it should be.

A subset of the DVLA data should be free

My company develops B2B and B2C applications. I would like to see a subset of the DVLA data made freely available.

Our use case is car portals (think AutoTrader.co.uk and Motors.co.uk), but at a more regional level.

One of the barriers to entry is the time it can take car dealers or end users to fill in their car details when adding stock to the web site. It makes perfect sense to have all of the technical data available via a VRM lookup.

Our requirements would not need access to personal information, which I do believe needs to be controlled.

I'm not sure if this is the right forum to ask what progress is being made with the DVLA or if they have responded to the benefits case?

VRM lookup is not just for the motor trade

My company specialises in data visualisation for non-specialists. With VRM lookup we could make an augmented reality application that revealed the actual volume of carbon dioxide emitted by cars in real-time. This would help people make sense of transport - their own choices and issues around planning. For this application we would only need to be able to fetch the CO2 emissions for the car. There will be other non-motor-trade applications of VRM lookup (e.g. academic research in transport, geography, sociology, etc.) that would be enabled by opening this data, or subsets of it.
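As a back-of-envelope sketch of the conversion such an application would need, the function below turns a CO2 emissions figure in grams per kilometre into a gas volume using the ideal gas law. The assumption that a VRM lookup would return a g/km figure, the function name, and the default temperature and pressure are all illustrative choices, not part of any existing DVLA service.

```python
GAS_CONSTANT = 8.314462618  # J/(mol K)
CO2_MOLAR_MASS = 44.01      # g/mol


def co2_volume_litres(grams_per_km, distance_km,
                      temperature_c=15.0, pressure_pa=101_325.0):
    """Volume of CO2 emitted over a distance, treating CO2 as an ideal gas.

    Uses V = nRT/P, where n is the number of moles of CO2 emitted.
    """
    moles = grams_per_km * distance_km / CO2_MOLAR_MASS
    cubic_metres = moles * GAS_CONSTANT * (temperature_c + 273.15) / pressure_pa
    return cubic_metres * 1000.0  # 1 cubic metre = 1000 litres
```

Calling `co2_volume_litres(120, 1)` gives the volume for a hypothetical car rated at 120 g/km driven for one kilometre; a real-time display would multiply by the distance actually travelled.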

