File Delivery Options
This document describes the options available for file-based deliveries.
Defaults
- File Format: CSV
- File Compression: GZIP
- Folder structure: see the Folder Structure section below
- Files are automatically partitioned
- A success file is provided by default
File Delivery Strategies
When configuring a file feed, one of the following strategies must be selected to determine how new and updated data is shared.
Forward
Only data that is new in time is delivered. As time passes, only the net new data going forward in time is delivered.
When locations are added or updated, backfills for these will not be sent.
Everything
For each delivery, all data is sent. This makes ingestion straightforward, especially for smaller datasets.
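The difference between the two strategies on the receiving side can be sketched as follows. This is only an illustration under stated assumptions: the function name `apply_delivery` and the strategy labels `"forward"` and `"everything"` are hypothetical and not part of the delivery configuration. It assumes a consumer appends Forward deliveries to what it already holds and replaces its copy on each Everything delivery.

```python
def apply_delivery(existing_rows, delivered_rows, strategy):
    """Return the consumer's dataset after one delivery (hypothetical helper)."""
    if strategy == "forward":
        # Forward: the delivery holds only net new data, so append it.
        return list(existing_rows) + list(delivered_rows)
    if strategy == "everything":
        # Everything: the delivery holds all data, so it replaces the old copy.
        return list(delivered_rows)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

With Everything, no merge logic is needed, which is one reading of why ingestion is simpler for small datasets.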
File Formats
CSV
We deliver .csv files with a header row and , as the delimiter.
File Compression
We offer to compress each file delivered with GZIP.
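A delivered file can be read with the Python standard library alone. The sketch below is a minimal example, assuming UTF-8 encoding (the encoding is not stated in this document); the function name `read_delivery_file` is hypothetical.

```python
import csv
import gzip

def read_delivery_file(path):
    """Return the rows of one gzip-compressed CSV file as a list of dicts."""
    # "rt" decompresses and decodes to text; newline="" leaves line-ending
    # handling to the csv module. UTF-8 is an assumption.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # The first line is the header, so DictReader keys each row by column name.
        return list(csv.DictReader(handle, delimiter=","))
```

All values come back as strings; any type conversion is left to the consumer.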
Folder Structure
This describes the folder structure used to separate consecutive deliveries of data. Files are partitioned by date and time of the shipment.
[/PREFIX]/[DATASET]/[DELIVERY_DATE_AND_HOUR]/[PERIOD_START_DATE]/
- PREFIX: optional and customisable
- DATASET: the name of the data delivered in the folder, e.g. foot_traffic_week or trade_area
- DELIVERY_DATE_AND_HOUR: the date and hour of the delivery (UTC), e.g. 2023/01/01/13/ would be data whose delivery started on Jan 1st, 2023 at hour 13 UTC
- PERIOD_START_DATE: time partitions of data; indicates the start date of each observation period, e.g. 2022/12/01 would be data describing December 2022
Custom folder paths and file names can be accommodated upon request, as long as they use the building blocks shown above.
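To show how the default layout breaks down, here is a small sketch that splits a folder path into its building blocks. It assumes the default structure (not a custom one), with the delivery date and hour taking four path segments and the period start date taking three. The names `DeliveryFolder` and `parse_delivery_folder` are hypothetical.

```python
from datetime import date, datetime, timezone
from typing import NamedTuple

class DeliveryFolder(NamedTuple):
    prefix: str             # empty string when no prefix is configured
    dataset: str
    delivered_at: datetime  # delivery date and hour, UTC
    period_start: date      # start date of the observation period

def parse_delivery_folder(path: str) -> DeliveryFolder:
    """Split a default-layout folder path into its components."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 8:
        raise ValueError(f"not a default-layout delivery folder: {path!r}")
    # Read from the right: 3 period segments, 4 delivery segments, then dataset.
    *head, d_year, d_month, d_day, d_hour, p_year, p_month, p_day = parts
    return DeliveryFolder(
        prefix="/".join(head[:-1]),
        dataset=head[-1],
        delivered_at=datetime(int(d_year), int(d_month), int(d_day),
                              int(d_hour), tzinfo=timezone.utc),
        period_start=date(int(p_year), int(p_month), int(p_day)),
    )
```

Parsing from the right keeps the function independent of how many segments the optional prefix contains.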
File Partitioning
Files are automatically partitioned into several chunks/files. These files are numbered.
Sort order is not guaranteed across the chunks/files in a single delivery.
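Because no ordering holds across chunks, a consumer that needs sorted data has to combine all chunks first and sort afterwards. A minimal sketch, assuming the delivery folder has been copied to a local directory, the files are UTF-8, and chunk names end in .csv.gz; `read_all_chunks` and its `sort_key` parameter are hypothetical names.

```python
import csv
import gzip
from pathlib import Path

def read_all_chunks(folder, sort_key=None):
    """Read every .csv.gz chunk in a folder; optionally sort the combined rows."""
    rows = []
    for chunk in sorted(Path(folder).glob("*.csv.gz")):
        with gzip.open(chunk, "rt", encoding="utf-8", newline="") as handle:
            rows.extend(csv.DictReader(handle))
    if sort_key is not None:
        # Chunk numbering implies no row order, so sort after combining.
        # Values are strings, so this is a lexicographic sort.
        rows.sort(key=lambda row: row[sort_key])
    return rows
```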
Success File
If enabled, a success file can be provided.
The success file is simply a file that is written after all files within one folder have been successfully written to the target.
Before the success file is written, each file is validated by checksum to ensure its integrity after the data has been moved over the wire.
[<PREFIX>/]<DATASET>/2023/02/24/14/2022/11/01/_SUCCESS
[<PREFIX>/]<DATASET>/2023/02/24/14/2022/11/01/<DATASET>000000000000.csv.gz
[<PREFIX>/]<DATASET>/2023/02/24/14/2022/12/01/_SUCCESS
[<PREFIX>/]<DATASET>/2023/02/24/14/2022/12/01/<DATASET>000000000000.csv.gz
The success file is named _SUCCESS and contains no particular information.
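A consumer can use the marker to avoid reading folders that are still being written. The sketch below assumes the delivery target is reachable as a local directory tree; for object storage the same check would be done with that store's listing API. The function name `completed_folders` is hypothetical.

```python
from pathlib import Path

def completed_folders(root):
    """Return folders under root that contain a _SUCCESS marker, sorted by path."""
    return sorted(
        marker.parent
        for marker in Path(root).rglob("_SUCCESS")
        if marker.is_file()
    )
```

Only the presence of the file matters, so the sketch never opens it.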