
NOTE

These pages are primarily for internal audiences rather than users of the Data Platform; we will host user-facing documentation separately.

Defining Data Product Governance

Use the governance section in your Data Product YAML definition file to tell users, and the Data Platform itself, how the data should be handled and where it has come from. A correctly specified set of governance attributes gives you and platform users confidence in handling and using the data.

Example

domain: HMPPS
dataProductOwner: data.product.owner.name@justice.gov.uk
dataProductOwnerDisplayName: Data Product Owner
dpiaRequired: false
retentionPeriod: 400

Including optional fields:

domain: HMPPS
dataProductOwner: data.product.owner.name@justice.gov.uk
dataProductOwnerDisplayName: Data Product Owner
dataProductMaintainer: data.product.maintainer.name@justice.gov.uk
dataProductMaintainerDisplayName: Data Product Maintainer
dpiaRequired: false
dpiaLocation: s3://data-platform-data/civil-courts-data/v1/
retentionPeriod: 400

Notes

Ownership and handling attributes

required:

  • dataProductOwner - the email address of the owner for this Data Product.
  • dataProductOwnerDisplayName - display name of the owner for this Data Product.
  • dpiaRequired - true or false, indicating whether a Data Protection Impact Assessment (DPIA) is required for this Data Product

optional:

  • dpiaLocation - location of the DPIA document associated with the Data Product
  • dataProductMaintainer - a secondary party who can approve DPIA access requests, but who may or may not be legally responsible for the data
  • dataProductMaintainerDisplayName - display name of the maintainer for this Data Product
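
The required and optional attributes listed above can be checked before a definition is submitted. The following is a minimal sketch, not the platform's actual validation logic: the function name check_ownership_fields and the type expectations are assumptions made for illustration, and the governance section is assumed to have already been loaded from YAML into a Python dict.

```python
# Hypothetical helper: the Data Platform's real validation is not described on this page.
REQUIRED_FIELDS = {
    "dataProductOwner": str,
    "dataProductOwnerDisplayName": str,
    "dpiaRequired": bool,
}
OPTIONAL_FIELDS = {
    "dpiaLocation": str,
    "dataProductMaintainer": str,
    "dataProductMaintainerDisplayName": str,
}


def check_ownership_fields(governance: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Required fields must be present and of the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in governance:
            problems.append(f"missing required field: {name}")
        elif not isinstance(governance[name], expected_type):
            problems.append(f"{name} should be of type {expected_type.__name__}")
    # Optional fields are only type-checked when they are supplied.
    for name, expected_type in OPTIONAL_FIELDS.items():
        if name in governance and not isinstance(governance[name], expected_type):
            problems.append(f"{name} should be of type {expected_type.__name__}")
    return problems


# The minimal example from this page, expressed as a dict.
governance = {
    "domain": "HMPPS",
    "dataProductOwner": "data.product.owner.name@justice.gov.uk",
    "dataProductOwnerDisplayName": "Data Product Owner",
    "dpiaRequired": False,
    "retentionPeriod": 400,
}
print(check_ownership_fields(governance))  # prints []
```

Only the three attributes this page lists as required are enforced here; domain and retentionPeriod are left unchecked because the page does not state whether they are mandatory.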

DO NOT SEND SECRET OR TOP SECRET DATA TO THE PLATFORM.

Data retention

You should indicate to the platform how long the data should be kept before removal using the retentionPeriod field.

  • retentionPeriod - number of days before the data is removed from the platform based on the date the Data Product was added to the platform

For example: retentionPeriod: 400 is implemented as “delete any data older than 400 days, based on when the data was added to the Platform”.

You can specify 0 for data which is allowed to be kept indefinitely - for example anonymised management information or published statistics.
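
To make the arithmetic concrete, the sketch below works out when data would become eligible for removal under the rule described above. It is illustrative only: the function name removal_date is hypothetical, rejecting negative values is an assumption, and the exact boundary the platform uses (for example, whether removal happens on or after the computed day) is not specified on this page.

```python
from datetime import date, timedelta
from typing import Optional


def removal_date(added_on: date, retention_period: int) -> Optional[date]:
    """Return the date the data reaches its retention limit, or None if kept indefinitely."""
    if retention_period < 0:
        # Assumption: negative values are not meaningful.
        raise ValueError("retentionPeriod must be 0 or a positive number of days")
    if retention_period == 0:
        # 0 means the data may be kept indefinitely.
        return None
    return added_on + timedelta(days=retention_period)


print(removal_date(date(2023, 1, 1), 400))  # prints 2024-02-05
print(removal_date(date(2023, 1, 1), 0))    # prints None
```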

Data Product owners will be notified of retention policies being applied using the contact information supplied in the Data Product specification.

Data lineage

It is important to users of the data that they know where it has come from. This is the purpose of the lineage attribute:

  • domain - defines at a high level which part of the organisation the data has come from. Must be one of: “HMCTS”, “HMPPS”, “LAA”, “OPG”, “HQ”
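
A check against the permitted domain values could look like the following sketch. The name is_valid_domain is hypothetical, and treating the match as case-sensitive is an assumption; this page does not say how the platform compares values.

```python
# The allowed values are taken from the list above.
ALLOWED_DOMAINS = {"HMCTS", "HMPPS", "LAA", "OPG", "HQ"}


def is_valid_domain(domain: str) -> bool:
    """Return True if the domain is one of the permitted values (case-sensitive, by assumption)."""
    return domain in ALLOWED_DOMAINS


print(is_valid_domain("HMPPS"))  # prints True
print(is_valid_domain("hmpps"))  # prints False
```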

Other attributes

Further reading

Index of documentation for Data Product definition

Example Data Product

This page was last reviewed on 19 October 2023. It needs to be reviewed again on 19 April 2024 by the page owner #data-platform-notifications.