NOTE
These pages are primarily for internal audiences rather than users of the Data Platform; we will host user-facing documentation separately.
Defining Data Product Governance
Use the governance section in your Data Product YAML definition file to indicate to users, and to the Data Platform itself, how the data should be handled and where it has come from. A correctly specified set of governance attributes gives you and platform users confidence in handling and using the data.
Example
domain: HMPPS
dataProductOwner: data.product.owner.name@justice.gov.uk
dataProductOwnerDisplayName: Data Product Owner
dpiaRequired: false
retentionPeriod: 400
Including optional fields:
domain: HMPPS
dataProductOwner: data.product.owner.name@justice.gov.uk
dataProductOwnerDisplayName: Data Product Owner
dataProductMaintainer: data.product.maintainer.name@justice.gov.uk
dataProductMaintainerDisplayName: Data Product Maintainer
dpiaRequired: false
dpiaLocation: s3://data-platform-data/civil-courts-data/v1/
retentionPeriod: 400
Notes
Ownership and handling attributes
Required:
- dataProductOwner - the email address of the owner for this Data Product.
- dataProductOwnerDisplayName - the display name of the owner for this Data Product.
- dpiaRequired - true or false.
Optional:
- dpiaLocation - the location of the DPIA document associated with the Data Product.
- dataProductMaintainer - a secondary party who is able to approve DPIA access requests, but who may or may not be legally responsible for the data.
- dataProductMaintainerDisplayName - the display name of the maintainer for this Data Product.
DO NOT SEND SECRET OR TOP SECRET DATA TO THE PLATFORM.
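The required and optional ownership attributes listed above could be checked before a definition is submitted. The following is a minimal sketch in Python, not the Data Platform's own validation: the function name validate_ownership, the error messages, and the choice to represent the governance section as a plain dict are all assumptions made for illustration.

```python
# Hypothetical helper: not part of the Data Platform codebase.
# Field names are taken from the attribute lists above; the types are assumed.
REQUIRED_FIELDS = {
    "dataProductOwner": str,
    "dataProductOwnerDisplayName": str,
    "dpiaRequired": bool,
}
OPTIONAL_FIELDS = {
    "dpiaLocation": str,
    "dataProductMaintainer": str,
    "dataProductMaintainerDisplayName": str,
}


def validate_ownership(governance: dict) -> list[str]:
    """Return a list of problems found in the ownership and handling attributes."""
    problems = []
    # Required fields must be present and of the expected type.
    for name, expected in REQUIRED_FIELDS.items():
        if name not in governance:
            problems.append(f"missing required field: {name}")
        elif not isinstance(governance[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    # Optional fields are only type-checked when they are supplied.
    for name, expected in OPTIONAL_FIELDS.items():
        if name in governance and not isinstance(governance[name], expected):
            problems.append(f"{name} should be of type {expected.__name__}")
    return problems


example = {
    "dataProductOwner": "data.product.owner.name@justice.gov.uk",
    "dataProductOwnerDisplayName": "Data Product Owner",
    "dpiaRequired": False,
}
print(validate_ownership(example))  # []
```

An empty list means no problems were found. Returning every problem at once, rather than stopping at the first, lets an author fix a definition in a single pass.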
Data retention
You should indicate to the platform how long the data should be kept before removal using the retentionPeriod field.
- retentionPeriod - number of days before the data is removed from the platform, based on the date the Data Product was added to the platform
For example, retentionPeriod: 400 is implemented as “delete any data older than 400 days, based on when the data was added to the Platform”.
You can specify 0 for data which is allowed to be kept indefinitely, for example anonymised management information or published statistics.
Data Product owners will be notified of retention policies being applied using the contact information supplied in the Data Product specification.
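The retention rule can be illustrated with a short date calculation. This is a sketch under stated assumptions, not the platform's implementation: the function name is_expired and its parameters are hypothetical, and "older than" is read here as strictly more than the given number of days.

```python
from datetime import date, timedelta


def is_expired(added_on: date, retention_period: int, today: date) -> bool:
    """Return True if data added on `added_on` is older than the retention period.

    A retention_period of 0 means the data may be kept indefinitely.
    Hypothetical helper: the strict "older than" comparison is an assumption.
    """
    if retention_period == 0:
        return False
    return today - added_on > timedelta(days=retention_period)


# With retentionPeriod: 400, data added on 1 January 2023 is exactly 400 days
# old on 5 February 2024 and becomes older than 400 days the following day.
print(is_expired(date(2023, 1, 1), 400, date(2024, 2, 5)))  # False
print(is_expired(date(2023, 1, 1), 400, date(2024, 2, 6)))  # True
print(is_expired(date(2023, 1, 1), 0, date(2030, 1, 1)))    # False
```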
Data lineage
It is important to users of the data that they know where it has come from. This is the purpose of the lineage attribute:
- domain - defines at a high level which part of the organisation the data has come from. Must be one of: “HMCTS”, “HMPPS”, “LAA”, “OPG”, “HQ”
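The restriction on domain values could be checked with a simple membership test. This is an illustrative sketch only: the allowed values come from the list above, while the helper name is_valid_domain and the case-sensitive comparison are assumptions.

```python
# Allowed values are taken from the list above; the helper itself is hypothetical.
ALLOWED_DOMAINS = {"HMCTS", "HMPPS", "LAA", "OPG", "HQ"}


def is_valid_domain(domain: str) -> bool:
    """Return True if `domain` is one of the allowed organisation domains."""
    return domain in ALLOWED_DOMAINS


print(is_valid_domain("HMPPS"))  # True
print(is_valid_domain("hmpps"))  # False: this sketch assumes a case-sensitive match
```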
Other attributes
Further reading
Index of documentation for Data Product definition