
NOTE

These pages are primarily for internal audiences rather than users of the Data Platform; we will host user-facing documentation separately.

Data Product Specification

Use the specification section of your Data Product YAML definition file to provide a name, title and description for your Data Product, along with tags to help users find it. You can also write a short value proposition outlining why the product exists and what data analysis or visualisation might arise from it. In general, the specification section contains information that users of the data will see.

You must also provide contact information for the product.

Example

name: example_data_product
description: Example Data Product contains published prison population from 2001 to present
status: production
email: data.product.contact@justice.gov.uk

Including optional fields:

name: example_data_product
description: Example Data Product contains published prison population from 2001 to present
status: production
email: data.product.contact@justice.gov.uk
tags: {"Sandbox": "True"}

Notes

Naming and tagging

The product’s name must be reasonably short and unique; we currently check for uniqueness when Data Product requests are created.

Use only alphanumeric characters and underscores. Don’t use spaces; any spaces will be replaced with underscores.
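The naming rule above can be sketched as a small check. This is a minimal illustration, not the platform's actual validation code: the function name `normalise_product_name` is hypothetical, and the sketch assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Hypothetical helper for illustration; not part of the Data Platform codebase.
# Assumption: "alphanumeric" means ASCII letters and digits.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def normalise_product_name(raw: str) -> str:
    """Replace spaces with underscores, then reject any other illegal character."""
    candidate = raw.replace(" ", "_")
    if not _VALID_NAME.fullmatch(candidate):
        raise ValueError(f"invalid Data Product name: {raw!r}")
    return candidate


print(normalise_product_name("example data product"))  # example_data_product
```

A name such as `bad-name!` would raise `ValueError` in this sketch, because hyphens and punctuation are neither alphanumeric nor underscores.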

The product description can be longer and give more information; it will help users find your data.

NOTE: the product-version attribute is not yet documented on this page. It is incremented automatically when the Data Product is updated. Versioning logic is documented separately, but in brief, breaking changes constitute Major version bumps (v1.0 -> v2.0), whereas backwards-compatible changes and updates are Minor version bumps (v1.0 -> v1.1).
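The Major/Minor rule described in the note can be illustrated as follows. This is a sketch only: `bump_version` is a hypothetical name, and it assumes versions always take the `vMAJOR.MINOR` form shown in the examples above. The platform's real versioning logic may differ.

```python
def bump_version(version: str, breaking: bool) -> str:
    """Return the next version string for a 'vMAJOR.MINOR' version.

    Hypothetical helper; assumes the 'v1.0' format used in the note above.
    """
    major, minor = (int(part) for part in version.lstrip("v").split("."))
    if breaking:
        # Breaking change: increment Major and reset Minor to 0.
        return f"v{major + 1}.0"
    # Backwards-compatible change: increment Minor only.
    return f"v{major}.{minor + 1}"


print(bump_version("v1.0", breaking=True))   # v2.0
print(bump_version("v1.0", breaking=False))  # v1.1
```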

tags is a dictionary that maps each tag key to a string boolean (for example "True") indicating whether the tag is active. Tags further aid discoverability; think of them as search keywords.
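As an illustration of how a consumer might read the tags dictionary, the sketch below keeps only the keys whose value is the string "True". The function name `active_tags` and the `Legacy` tag key are hypothetical, and treating the comparison as case-insensitive is an assumption rather than documented behaviour.

```python
def active_tags(tags: dict) -> list:
    """Return the tag keys whose string-boolean value marks them as active.

    Hypothetical helper; assumes values are strings such as "True" or "False".
    """
    return [key for key, flag in tags.items() if flag.strip().lower() == "true"]


# "Legacy" is an invented key, used here only to show an inactive tag.
print(active_tags({"Sandbox": "True", "Legacy": "False"}))  # ['Sandbox']
```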

Product status

  • status: use one of the following to indicate to users the status of the data:
    • draft
    • development
    • testing
    • production
    • sunset
    • retired

Most Data Products will have the status production; the other statuses flag to users that special handling is required. If status is omitted, we assume it is production.
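The status rule, including the default, might be expressed like this. It is a minimal sketch: `resolve_status` and `ALLOWED_STATUSES` are hypothetical names, and it assumes the specification section has already been loaded into a Python dictionary.

```python
# The six statuses listed above; the constant name is illustrative.
ALLOWED_STATUSES = ("draft", "development", "testing", "production", "sunset", "retired")


def resolve_status(spec: dict) -> str:
    """Return the product status, defaulting to 'production' when omitted."""
    status = spec.get("status", "production")
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    return status


print(resolve_status({"name": "example_data_product"}))  # production
print(resolve_status({"name": "example_data_product", "status": "draft"}))  # draft
```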

Contact information

You must supply an email address to indicate the point of contact between consumers and maintainers of the Data Product. It could be the email address of the Data Product owner, some other individual familiar with the Data Product, or a distribution list, but the email address must be monitored and responsive.