This glossary is intended to be an authoritative explanation of the meaning of technical terms, for all users of data.gov.uk.  Users are encouraged to improve it by suggesting a better way of explaining the definitions, and by adding new definitions.

A

Aggregated data

Data produced by combining unit records so that individual details are not disclosed; a form of anonymisation.
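
As a rough illustration, the sketch below combines a handful of invented unit records into per-region counts and averages, so that no individual row is published. The records, field names and the `aggregate_by_region` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical unit records: one row per individual (invented values).
unit_records = [
    {"region": "North", "age": 34, "income": 21000},
    {"region": "North", "age": 51, "income": 30000},
    {"region": "South", "age": 29, "income": 25000},
    {"region": "South", "age": 42, "income": 27000},
    {"region": "South", "age": 63, "income": 20000},
]


def aggregate_by_region(records):
    """Return per-region counts and mean income, dropping individual rows."""
    totals = defaultdict(lambda: {"count": 0, "income_sum": 0})
    for row in records:
        group = totals[row["region"]]
        group["count"] += 1
        group["income_sum"] += row["income"]
    return {
        region: {
            "count": group["count"],
            "mean_income": group["income_sum"] / group["count"],
        }
        for region, group in totals.items()
    }


aggregated = aggregate_by_region(unit_records)
```

Very small groups can still reveal individuals, so aggregation on its own may not be sufficient.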

Anonymisation

The process of adapting data so that individual people or businesses cannot be identified from it.

Application Programming Interface (API)

A specification intended to be used as an interface by software components to communicate with each other. An API may include specifications for routines, data structures, object classes, and variables.
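
One way to picture this is as an interface that client code relies on without knowing how it is implemented. The sketch below is illustrative only: `CatalogueAPI`, `DatasetSummary` and `InMemoryCatalogue` are hypothetical names, not part of any real data.gov.uk interface.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class DatasetSummary:
    """A data structure that the hypothetical API promises to return."""
    identifier: str
    title: str


class CatalogueAPI(Protocol):
    """The specification: routines that any implementation must provide."""

    def find(self, identifier: str) -> DatasetSummary: ...


class InMemoryCatalogue:
    """One possible implementation that satisfies the specification."""

    def __init__(self) -> None:
        self._items = {
            "road-safety": DatasetSummary("road-safety", "Road safety data"),
        }

    def find(self, identifier: str) -> DatasetSummary:
        return self._items[identifier]


def describe(api: CatalogueAPI, identifier: str) -> str:
    """Client code depends only on the interface, not the implementation."""
    summary = api.find(identifier)
    return f"{summary.identifier}: {summary.title}"


result = describe(InMemoryCatalogue(), "road-safety")
```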

Attribution licence

A licence that requires that the original source of the licensed material is cited (attributed).

Authoritative

Able to be trusted as being accurate or true; reliable: e.g. "clear, authoritative information".

B

Big data

A loose term, not formally defined, for high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing, that can give enhanced insight and decision making.

Big data analytics

The process of examining and interrogating big data assets to derive insights of value for decision making.

C

Click-Use

The online licensing system for Crown and Parliamentary copyright information developed by the Office of Public Sector Information in 2001. Click-Use was replaced by the Open Government Licence and the Open Parliament Licence but remains historically significant.

Commercial use/re-use

Use that is intended for or directed toward commercial advantage or private monetary compensation. For the purposes of the UK Government Licensing Framework, 'private monetary compensation' does not include the exchange of the Information for other copyrighted works by means of digital file-sharing or otherwise provided there is no payment of any monetary compensation in connection with the exchange of the Information.

Compiled database right

The legal protection provided by EC and UK law to a collection of databases (which have been compiled from a number of different sources and normalised to facilitate cross searching).

Content

Published information

Copyright

Part of the family of intellectual property rights including trademarks, designs and patents. Copyright applies automatically when a work is created in a material form. Copyright applies to literary works, such as website articles and annual reports; artistic works, such as maps, drawings, paintings and photographs; films; sound recordings; and typographical arrangements. The first owner of copyright will normally be the artist/author or organisation that created the work (except for Crown copyright). Copyright subsists in a work regardless of the level of artistic or literary merit. The standard term of copyright is the life of the author plus 70 years.

Core-reference data

Authoritative or definitive data necessary to use other information, produced by the public sector as a service in itself due to its high importance and value.

Costs - Fixed

Costs which do not vary with the level of activity in the short run.

Costs - Full

The total cost of all the resources used in providing a good or service in any accounting period (usually one year). This will include all direct and indirect costs of producing the output (both cash and non-cash costs), including a full proportional share of overhead costs and any selling and distribution costs, insurance, depreciation, and the cost of capital, including any appropriate adjustment for expected cost increases.

Costs - Marginal

The incremental cost of providing one further unit of a good or service.
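
The three cost definitions above can be related with a small worked example. This is a deliberately simplified sketch: the figures are invented, and everything that makes up full cost (overheads, depreciation, cost of capital and so on) is lumped into one assumed fixed amount plus one assumed per-unit amount.

```python
FIXED_COST = 1000.0            # assumed: does not vary with activity in the short run
VARIABLE_COST_PER_UNIT = 2.0   # assumed cost of each extra unit supplied


def full_cost(units: int) -> float:
    """Total cost of supplying `units` in the period (simplified model)."""
    return FIXED_COST + VARIABLE_COST_PER_UNIT * units


def marginal_cost(units: int) -> float:
    """Incremental cost of supplying one further unit beyond `units`."""
    return full_cost(units + 1) - full_cost(units)


cost_of_500 = full_cost(500)
extra_for_next_unit = marginal_cost(500)
```

Under these assumptions the marginal cost stays at the per-unit figure however many units are supplied, while the fixed cost is incurred even when nothing is supplied.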

Creative Commons

A US non-profit organisation which offers a suite of licences to copyright holders to enable them to license their work. The licences are all free and allow the copyright holder to stipulate certain conditions on how the work may be re-used.

Crown Copyright

Crown copyright covers material created by civil servants, ministers and government departments and agencies. It is legally defined under section 163 of the Copyright, Designs and Patents Act 1988 as works made by officers or servants of the Crown in the course of their duties. Copyright can also come into Crown ownership by means of an assignment or transfer of the copyright from the legal owner of the copyright to the Crown.

D

Data (can be singular or plural in common usage)

Factual information, especially information organised for analysis or used to reason or make decisions. In computer science, numerical or other information represented in a form suitable for processing by computer. (The terms data, information and knowledge are frequently used for overlapping concepts. The main difference is in the level of abstraction being considered. Data is a broad term, embracing others, but is often the lowest level of abstraction; information is the next level; and, finally, knowledge is the highest level.) See also Raw data, Derived data, Metadata.

Data discovery

The process of finding out what data exists and how it can be accessed.

Data sharing

The disclosure of data from one or more organisations to a third party organisation or organisations, or the sharing of data between different parts of an organisation.

Database rights

An intellectual property right which applies to databases, defined by the Copyright and Rights in Databases Regulations 1997 as 'a collection of independent works or materials arranged in a systematic or methodical way and that are individually accessible by electronic or other means'. Database rights apply only to the collection of works, not to the individual works contained within it. Database right protection lasts for 15 years from when the database was completed, but the 15-year period will restart if the database is altered significantly.

Dataset

A collection of data, usually in tabular form, presented either electronically or in other formats.

De-anonymisation

The process of determining the identity of an individual to whom a pseudonymised dataset relates.

Definitive

Of recognised authority or excellence.

Delegations of Authority

Authority granted by the Controller of Her Majesty's Stationery Office to Crown bodies enabling them to license the re-use of information which they produce. Crown bodies with complete delegations to license information include trading funds; however, some departments have partial delegations to license the use of particular information. All Crown bodies with delegations of authority are subject to the supervision of the Information Fair Trader Scheme.

Derived data

A data element adapted from other data elements using a mathematical, logical, or other type of transformation, e.g. arithmetic formula, composition, aggregation. See also Value-added data.
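
For instance, a population density figure can be derived from two other data elements with an arithmetic formula. The records and the `with_density` helper below are hypothetical and use invented values.

```python
# Hypothetical source data elements for two areas (invented values).
areas = [
    {"name": "Area A", "population": 12000, "area_km2": 40.0},
    {"name": "Area B", "population": 4500, "area_km2": 90.0},
]


def with_density(record):
    """Return a copy of the record with a derived 'density_per_km2' element."""
    derived = dict(record)
    derived["density_per_km2"] = record["population"] / record["area_km2"]
    return derived


derived_areas = [with_density(area) for area in areas]
```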

Digital rights management

A class of access control technologies that are used by hardware manufacturers, publishers, copyright holders and individuals with the intent to limit the use of digital content and devices after sale.

Disclosive

Data is potentially disclosive if, despite the removal of obvious identifiers, characteristics of this dataset in isolation or in conjunction with other datasets might lead to identification of the individual to whom a record belongs.

Document

Any content whatever its medium (written on paper or stored in electronic form or as a sound, visual or audiovisual recording).

F

Free at point of use

Where there is no charge or fee to the end-user for the use or re-use of information.

Freemium

A business model by which a product or service (typically a digital offering such as software, media, games or web services) is provided free of charge, but a premium is charged for advanced features or functionality.

G

Geospatial data

Also known as spatial data or geographic information: data or information that identifies the geographic location of features and boundaries on Earth, such as natural or constructed features and oceans. Spatial data is usually stored as coordinates and topology, and can be mapped.

I

Information

Interpretation and analysis of data that when presented in context represents added value, message or meaning. See also Data.

Information Asset Registers (IAR)

Registers specifically set up to capture and organise metadata about the vast quantities of information held by government departments and agencies. A comprehensive IAR includes databases, old sets of files, recent electronic files, collections of statistics, research and so forth.

Information Fair Trader Scheme (IFTS)

A scheme to set and assess standards for public sector bodies in allowing the re-use of their information. Any public sector body may apply to become IFTS accredited. However, all Crown bodies that hold a delegation of authority from the Controller of HMSO must become IFTS accredited. IFTS measures members' performance against the six principles of maximisation, simplicity, transparency, fairness, challenge and innovation. It considers both the commercial re-use of public sector information and non-commercial citizen access to information.

Information provider

The person, creator or organisation providing the information for re-use under the Open Government Licence or the Non-Commercial Government Licence.

Intellectual property (rights)

A set of property rights that grant the right to protect the created materials. Intellectual property rights comprise trade marks, patents, registered designs, copyright and database rights.

L

Licence (noun)

A legal document giving permission to use information.

License (verb)

To give formal (usually written) authorisation by granting a licence.

Linked data

The term used to describe the recommended best practice for exposing, sharing and connecting items of data on the semantic web using uniform resource identifiers (URIs) and the Resource Description Framework (RDF).
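
A minimal sketch of the idea: each statement is a subject, predicate and object, with URIs naming the things being connected. The `example.org` URIs and the `to_ntriples` helper are hypothetical; the predicates come from the Dublin Core Terms vocabulary. The output follows a simplified N-Triples style and does not handle escaping.

```python
# Hypothetical URIs under example.org; the predicates use the
# Dublin Core Terms namespace.
DCT = "http://purl.org/dc/terms/"
dataset = "http://example.org/dataset/road-safety"
publisher = "http://example.org/organisation/transport"

# Each RDF statement is a (subject, predicate, object) triple.
# Objects are either a quoted literal or a URI in angle brackets.
triples = [
    (dataset, DCT + "title", '"Road safety data"'),
    (dataset, DCT + "publisher", f"<{publisher}>"),
]


def to_ntriples(statements):
    """Serialise triples in a simplified N-Triples style, one per line."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in statements)


ntriples = to_ntriples(triples)
```

Because the publisher is itself named by a URI, another dataset can point to the same URI, which is what links the data together.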

M

Metadata

Data that describes or defines other data. Anything that users need to know to make proper and correct use of the real data, in terms of reading, processing, interpreting, analysing and presenting the information. Thus metadata includes file descriptions, codebooks, processing details, sample designs, fieldwork reports, conceptual motivations, etc., in other words, anything that might influence the way in which the information is used.
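
A codebook is a simple case. In the sketch below, the coded rows mean little on their own until the accompanying metadata is used to interpret them. The rows, codes and the `decode` helper are hypothetical.

```python
# Hypothetical data file contents: coded values that are opaque on their own.
rows = [{"sex": 1, "tenure": 2}, {"sex": 2, "tenure": 1}]

# Hypothetical metadata: a short file description plus a codebook.
metadata = {
    "description": "Illustrative survey extract (invented)",
    "codebook": {
        "sex": {1: "male", 2: "female"},
        "tenure": {1: "owner-occupied", 2: "rented"},
    },
}


def decode(row, codebook):
    """Translate coded values into labels using the codebook."""
    return {field: codebook[field][value] for field, value in row.items()}


decoded = [decode(row, metadata["codebook"]) for row in rows]
```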

Modelled data

Information created by mathematical representation of data relationships; sometimes more reliable and internally consistent than sampled observations.

Mosaic/jigsaw effect

The process of combining anonymised data with auxiliary data in order to reconstruct identifiers linking data to the individual it relates to.
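
The risk can be illustrated with two small invented tables. Neither holds a name alongside a sensitive value, yet matching on shared characteristics re-attaches a name to a record. All records, field names and the `link_records` helper are hypothetical.

```python
# Hypothetical "anonymised" records: names removed, other characteristics kept.
anonymised = [
    {"postcode_area": "AB1", "birth_year": 1970, "condition": "asthma"},
    {"postcode_area": "CD2", "birth_year": 1985, "condition": "diabetes"},
]

# Hypothetical auxiliary data available from another source.
auxiliary = [
    {"name": "Person X", "postcode_area": "AB1", "birth_year": 1970},
    {"name": "Person Y", "postcode_area": "EF3", "birth_year": 1990},
]


def link_records(anon_rows, aux_rows, keys=("postcode_area", "birth_year")):
    """Match rows whose shared characteristics agree, re-attaching names."""
    index = {tuple(row[k] for k in keys): row["name"] for row in aux_rows}
    matches = []
    for row in anon_rows:
        name = index.get(tuple(row[k] for k in keys))
        if name is not None:
            matches.append({"name": name, **row})
    return matches


reidentified = link_records(anonymised, auxiliary)
```

Only the first record is matched here, but one match is enough to make the dataset disclosive for that individual.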

N

Non-Commercial Government Licence

The Non-Commercial Government Licence offers a legal solution to enable the provision and use of public sector information under a common set of terms and conditions at no charge for Non-Commercial use only. It enables any public sector information holder to make their information available for use and re-use under its terms. The main requirement for re-users is to attribute the information provider and source.

Non-commercial use

Use that is not intended for or directed toward commercial advantage or private monetary compensation. For the purposes of the UK Government Licensing Framework, 'private monetary compensation' does not include the exchange of the Information for other copyrighted works by means of digital file-sharing or otherwise provided there is no payment of any monetary compensation in connection with the exchange of the Information.

O

Ontology

Formal representation of knowledge as a set of concepts within a domain, and the relationships among those concepts.

Open access (academic)

Provision of free access to peer-reviewed academic publications.

Open data

Data which can be used, re-used and re-distributed freely by anyone, subject at most to the requirement to attribute and share-alike. There may be some charge, usually no more than the cost of reproduction.

Open Data Commons

An Open Knowledge Foundation project run by its Advisory Council. Like the Foundation, it is a not-for-profit effort working for the benefit of the general open knowledge community. Open Data Commons is the home of a set of legal 'tools' to help others provide and use open data.

Open Government Licence (OGL)

The Open Government Licence offers a legal solution to enable the provision and use of public sector information under a common set of terms and conditions. It enables any public sector information holder to make their information available for use and re-use under its terms. The main requirement for re-users is to attribute the Information Provider and source.

P

Personal data

As defined by the Data Protection Act 1998, data relating to a specific individual where the individual is identified or identifiable in the hands of a recipient of the data.

Pseudonymised data

Data relating to a specific individual where the identifiers have been replaced by artificial identifiers to prevent identification of the individual.
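
A minimal sketch of one way this might be done, assuming a simple sequential scheme: each distinct identifier is swapped for an artificial one, and the same individual always receives the same replacement. The `pseudonymise` helper, the `P0001` style of identifier and the records are all hypothetical.

```python
def pseudonymise(records, id_field="name"):
    """Replace the identifier in each record with an artificial one.

    Returns the pseudonymised records and the lookup table, which would
    need to be kept separately and securely.
    """
    lookup = {}
    output = []
    for row in records:
        original = row[id_field]
        if original not in lookup:
            lookup[original] = f"P{len(lookup) + 1:04d}"
        replaced = dict(row)
        replaced[id_field] = lookup[original]
        output.append(replaced)
    return output, lookup


# Hypothetical records (invented values).
visits = [
    {"name": "Person X", "visit": "2012-01-03"},
    {"name": "Person Y", "visit": "2012-01-04"},
    {"name": "Person X", "visit": "2012-02-10"},
]

pseudonymised, key_table = pseudonymise(visits)
```

Anyone holding the lookup table can reverse the replacement, which is why pseudonymised data is not the same as anonymised data.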

Public domain

Works that are publicly available and in which the intellectual property rights have expired or been waived.

Public sector bodies

State, regional or local authorities, bodies governed by public law and associations formed by one or several such authorities or one or several such bodies governed by public law.

Public Sector Information (PSI)

The wide range of information that public sector bodies collect, produce, reproduce and disseminate in many areas of activity while accomplishing their Public Task.

Public task

Public task information consists of information that a public sector body must produce, collect or provide to fulfil its core role and functions, whether these duties are statutory in nature or are established through custom and practice. The term ‘public task’ features in the Re-use of Public Sector Information Regulations 2005 (SI 2005 No. 1515) and the INSPIRE Regulations 2009 (SI 2009 No. 3157).

R

Raw data 

In the context of PSI, raw data is data collected which has not been subjected to processing or any other manipulation beyond that necessary for its first use. Raw data, i.e. unprocessed data, is a relative term; data processing commonly occurs by stages, and the 'processed data' from one stage may be considered the 'raw data' of the next.

Re-use (noun/verb)

The use by persons or legal entities of documents held by public sector bodies, for commercial or non-commercial purposes other than the initial purpose within the public task for which the documents were produced. Exchange of documents between public sector bodies purely in pursuit of their public tasks does not constitute re-use.

Resource Description Framework (RDF)

RDF, a W3C standard, is the foundation of several technologies for modelling distributed knowledge and is meant to be used as the basis of the Semantic Web.

S

Sample of Anonymised Records (SARs)

A set of unit records available for research where key information has been removed to ensure anonymity.

Semantic Web

A web of data that can be processed directly and indirectly by machines, providing a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is based on the Resource Description Framework (RDF).

Share-alike licence

A Creative Commons style licence that requires users of a work to provide the content under the same or similar conditions as the original.

Star rating

In UK Linked Data, a system of ranking data sources that indicates ease of machine readability. Also, the APPSI subjective score for the quality of a definition (q.v.).

Synthetic population

A particular application of simulated data that attempts to generate a complete base micro-view of individual subjects of interest.

T

Taxonomy

The science or technique of classification.

Third party rights

Information, the rights for which are not owned by the Information Provider or Licensor.

Trading Funds

An organisation (either within a government department or forming one) which is largely or wholly financed from commercial revenue generated by its activities. Its Estimate shows its net impact, allowing its income from receipts to be devoted entirely to its business.

U

Uniform Resource Identifier (URI)

The generic term for all types of names and addresses that refer to objects on the World Wide Web. A URL is one kind of URI.

Uniform Resource Locator (URL)

A type of URI that identifies a resource via a representation of its network location.
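
The difference between the two terms can be seen by splitting some identifiers into their parts with Python's standard `urllib.parse` module. The dataset path in the first address is made up for illustration; the second identifier is a URN, a kind of URI that names something without saying where to find it.

```python
from urllib.parse import urlparse

# A URL identifies a resource by its network location (the host part).
url = urlparse("https://data.gov.uk/dataset/example?format=csv")

# A URN is a URI with no network location.
urn = urlparse("urn:isbn:0451450523")

url_parts = (url.scheme, url.netloc, url.path, url.query)
urn_parts = (urn.scheme, urn.netloc, urn.path)
```

Here the URL has a non-empty network location (`data.gov.uk`), while the URN's network location is empty.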

Unit records

Individual items of information from surveys or observations that often contain confidential details.

V

Value-added information (or data)

Raw data to which value has been added to enhance and facilitate its use and effectiveness by or for users. Also called derived data.
