File Delivery Options

This document describes the options available for file-based deliveries.

Defaults

  • File Format: CSV
  • File Compression: GZIP
  • Folder structure: see the Folder Structure section below
  • Files are automatically partitioned
  • A success file is provided by default

File Formats

CSV

We deliver .csv files with a header row and , as the delimiter.

(ND)JSON

We deliver newline-delimited JSON (NDJSON) with the .json extension. Each line of the file is one JSON value.
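
Because an NDJSON file is a sequence of independent JSON values, one per line, it cannot be loaded with a single JSON parse of the whole file. A minimal sketch of line-by-line parsing follows; the field names id and value are made up for illustration and do not describe any real dataset.

```python
import json
from typing import Any, Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


# Hypothetical sample content standing in for a delivered .json file.
sample = '{"id": 1, "value": "a"}\n{"id": 2, "value": "b"}\n'
records = list(parse_ndjson(sample.splitlines()))
print(records)
```

Passing an open text file handle instead of sample.splitlines() works the same way, since iterating a file yields its lines.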

File Compression

Each delivered file can be compressed with GZIP. Compressed files carry the .gz suffix, e.g. .csv.gz.
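
As an illustration, a GZIP-compressed CSV delivery can be read with the Python standard library alone. This is a sketch under assumptions: the UTF-8 encoding and the column names id and value are not stated in this document and are used here only as examples.

```python
import csv
import gzip
import io
from typing import Dict, List


def read_csv_gz(data: bytes) -> List[Dict[str, str]]:
    """Decompress GZIP bytes and parse them as comma-delimited CSV with a header row."""
    # UTF-8 is an assumption; adjust if your deliveries use another encoding.
    with gzip.open(io.BytesIO(data), mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter=","))


# Hypothetical payload standing in for the bytes of a delivered .csv.gz file.
payload = gzip.compress(b"id,value\n1,a\n2,b\n")
rows = read_csv_gz(payload)
print(rows)
```

For a file on disk, gzip.open also accepts a path in place of the in-memory buffer.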

Folder Structure

This describes the folder structure used to separate consecutive deliveries of data. Files are partitioned by date and time of the shipment.

[/PREFIX]/[dataset_version_string][is_backfill]/[DELIVERY_DATE_AND_HOUR]/[PERIOD_START_DATE]/
  • PREFIX is optional and customisable
  • dataset_version_string is a unique name for the version of the dataset received
  • is_backfill is the suffix _backfill, appended to the dataset_version_string when the delivery is a backfill
  • DELIVERY_DATE_AND_HOUR is the date and hour of the delivery (UTC), e.g. 2023/01/01/13/ is a delivery that started on Jan 1st, 2023 at hour 13 UTC.
  • PERIOD_START_DATE is the time partition of the data and indicates the start date of each observation period, e.g. 2022/12/01 is data describing December 2022.
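
The layout above can be decoded mechanically by reading a folder path from the right. The following is a sketch, not an official parser: the class and function names are hypothetical, the sample path uses an invented prefix and dataset name, and a dataset version string that itself ends in _backfill would be misread as a backfill.

```python
from dataclasses import dataclass
from datetime import date, datetime, timezone
from typing import Optional

BACKFILL_SUFFIX = "_backfill"


@dataclass
class DeliveryFolder:
    prefix: Optional[str]
    dataset_version: str
    is_backfill: bool
    delivered_at: datetime  # delivery date and hour, UTC
    period_start: date      # start date of the observation period


def parse_delivery_folder(path: str) -> DeliveryFolder:
    """Split a delivery folder path into its documented components."""
    parts = [part for part in path.split("/") if part]
    if len(parts) < 8:
        raise ValueError(f"not a delivery folder path: {path!r}")
    name = parts[-8]
    is_backfill = name.endswith(BACKFILL_SUFFIX)
    if is_backfill:
        name = name[: -len(BACKFILL_SUFFIX)]
    return DeliveryFolder(
        prefix="/".join(parts[:-8]) or None,  # everything before the dataset name
        dataset_version=name,
        is_backfill=is_backfill,
        delivered_at=datetime(*map(int, parts[-7:-3]), tzinfo=timezone.utc),
        period_start=date(*map(int, parts[-3:])),
    )


# Hypothetical path: "exports" and "my_dataset_v1" are illustrative names only.
folder = parse_delivery_folder("exports/my_dataset_v1_backfill/2023/02/24/14/2022/11/01/")
print(folder)
```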

File Partitioning

Files are automatically partitioned into several chunks/files. These files are numbered.

There is no guaranteed sort order between the chunks/files in a single delivery, so consumers that need ordered data must sort it after loading.
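
Since rows can land in any chunk, a consumer that needs a particular order has to combine all chunks of a delivery and sort on its own key. A minimal sketch, assuming GZIP-compressed CSV chunks in UTF-8 and a hypothetical numeric id column:

```python
import csv
import gzip
import io
from typing import Dict, Iterable, Iterator


def iter_rows(chunks: Iterable[bytes]) -> Iterator[Dict[str, str]]:
    """Yield the rows of every chunk, skipping each chunk's own header row."""
    for chunk in chunks:
        with gzip.open(io.BytesIO(chunk), mode="rt", encoding="utf-8", newline="") as handle:
            yield from csv.DictReader(handle)


# Hypothetical chunks of one delivery; rows are deliberately out of order.
chunks = [
    gzip.compress(b"id,value\n3,c\n1,a\n"),
    gzip.compress(b"id,value\n2,b\n"),
]
# Sort across all chunks on the consumer side; the key column is an assumption.
ordered = sorted(iter_rows(chunks), key=lambda row: int(row["id"]))
print([row["id"] for row in ordered])
```

Sorting in memory is only practical for deliveries that fit in RAM; larger deliveries are better loaded into a database or dataframe engine and ordered there.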

Success File

If enabled (the default), a success file is provided.

The success file is simply a file that is written after all files within one delivery have been successfully written to the target.

For a backfill, when multiple time periods of data are delivered at once, one success file is provided per period, in its respective folder.

In the backfill example below, there is one success file for each period of data. Data is delivered for November and December 2022.

[<PREFIX>/]<DATASET_VERSION>_backfill/2023/02/24/14/2022/11/01/_SUCCESS
[<PREFIX>/]<DATASET_VERSION>_backfill/2023/02/24/14/2022/11/01/<DATASET_VERSION>000000000000.csv.gz
[<PREFIX>/]<DATASET_VERSION>_backfill/2023/02/24/14/2022/12/01/_SUCCESS
[<PREFIX>/]<DATASET_VERSION>_backfill/2023/02/24/14/2022/12/01/<DATASET_VERSION>000000000000.csv.gz

The success file is named _SUCCESS. Its content carries no particular information; only its presence matters.
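
A consumer can use the marker to avoid reading a period folder that is still being written. The sketch below assumes the delivery target is a local or mounted filesystem; for object storage the same check would be done with that service's listing API. The function name and the file names in the demo are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List


def completed_folders(delivery_root: Path) -> List[Path]:
    """Return every folder under delivery_root that contains a _SUCCESS file."""
    return sorted(
        marker.parent
        for marker in delivery_root.rglob("_SUCCESS")
        if marker.is_file()
    )


# Demo on a throwaway directory: one finished period, one still in progress.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    done = root / "2022" / "11" / "01"
    pending = root / "2022" / "12" / "01"
    done.mkdir(parents=True)
    pending.mkdir(parents=True)
    (done / "data000000000000.csv.gz").touch()
    (done / "_SUCCESS").touch()
    (pending / "data000000000000.csv.gz").touch()  # no _SUCCESS yet
    ready = [path.relative_to(root).as_posix() for path in completed_folders(root)]

print(ready)
```

Only the folders returned by completed_folders should be processed; the others can be picked up on a later run once their marker appears.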