Data Ingestion

With the products and their cataloged datasets, it is possible to perform a data ingestion process. In this process, transformations such as reprojection and tiling can be applied to the data to be simpler to be loaded by the ODC and become more usual for different user analyses.

As in all other steps presented in this tutorial, the completion of the ingestion process requires a YAML file. For this case, stac2odc has no utilities available yet

To get around this issue and make the ingestion process easy, use the file CB4_64_16D_STK-1.ingest.yaml is shown below, which contains the description of this operation.

Code snippet 3 - File with the description of the ingestion process of the product CB4_64_16D_STK-1.
source_type: CB4_64_16D_STK_1
output_type: CB4_64_16D_STK_1_ingested

description: CBERS-4 ingested data

location: '/data/ingested/'
file_path_template: 'CB4_64_16D_STK_1_ingested/CB4_64_16D_STK_1_ingested_{tile_index[0]}_{tile_index[1]}_{start_time}.nc'

global_attributes:
    title: CBERS-4 (WFI) Cube Stack 16 days (v001) - Tiled
    summary: CBERS-4 data product
    source: CBERS 4 version 1
    institution: INPE
    instrument: WFI
    cdm_data_type: Grid
    keywords: REFLECTANCE,CBERS,EARTH SCIENCE
    platform: CBERS-4
    processing_level: L2
    product_version: 'v001'
    product_suite: INPE CBERS4
    project: BDC
    naming_authority: bdc.inpe
    acknowledgment: CBERS4 is provided by the National Institute for Space Research (INPE).

storage:
    driver: NetCDF CF

    crs: +proj=aea +lat_0=-12 +lon_0=-54 +lat_1=-2 +lat_2=-22 +x_0=5000000 +y_0=10000000 +ellps=GRS80 +units=m +no_defs
    tile_size:
            x: 100000.0
            y: 100000.0
    resolution:
            x: 64
            y: -64
    chunking:
        x: 200
        y: 200
        time: 1
    dimension_order: ['time', 'y', 'x']

measurements:
    - name: BAND13
      dtype: int16
      nodata: -9999
      resampling_method: nearest
      src_varname: 'BAND13'
      zlib: True
      attrs:
        alias: "blue"

    - name: EVI
      dtype: int16
      nodata: -9999
      resampling_method: nearest
      src_varname: 'EVI'
      zlib: True
      attrs:
        alias: "evi"

    - name: BAND14
      dtype: int16
      nodata: -9999
      resampling_method: nearest
      src_varname: 'BAND14'
      zlib: True
      attrs:
        alias: "green"

    - name: NDVI
      dtype: int16
      nodata: -9999
      resampling_method: nearest
      src_varname: 'NDVI'
      zlib: True
      attrs:
        alias: "ndvi"

    - name: BAND15
      dtype: int16
      nodata: -9999
      resampling_method: nearest
      src_varname: 'BAND15'
      zlib: True
      attrs:
        alias: "red"

Note

More YAML for ingestion is available in bdc-odc repository.

The ingestion process present in the file CB4_64_16D_STK-1.ingest.yaml performs the data’s compression and applies tiling so that the data’s recovery is made faster. The generation of the ingestion is presented below:

sudo datacube -v ingest \
              -c CB4_64_16D_STK-1.ingest.yaml \
              --executor multiproc 2

Besides the option --executor, applied in the above command, several others can be used to accelerate the ingestion process and guarantee its reproducibility. For more information, consult the ODC documentation.