
  • What shall it contain? (This depends highly on your sensor's data.)
    • If your sensor produces only one file type, all files are stored in your sensor's root directory under platforms/urn/exdata/, there are no duplicates, and all existing files shall be published, then we require the following:
      • A description of how we can extract date and time information from your filenames.
      • A data format description if the data is not in a standard format. This can be a description, a persistent link to a format description, or software that can be used to open and process the data.
    • If your sensor produces more than one file type, we additionally need:
      • A file name description of how to distinguish between the different files (e.g. file endings, prefix in filename, ...).
    • If there are directories below the sensor's root directory, we additionally need:
      • A description of whether we shall archive files from all directories or discard certain folders.
      • A description of whether we can extract information such as Expedition, Event, Date and Time from the directory name.
  •  What is it used for?  We use this data description to create an ingest template. Your descriptions are used to create regular expressions that match the files in your directories. If you are familiar with regular expressions, we appreciate it if you provide them directly in your data description. Your data description is translated into a template like this:

      "_ingest": {
        "_sourceRegex": ".*(?P<campaign>PS[0-9]{2,3})",
        "_columns": [
          {
            "regex": "^((Screenshots|screenshots))\\/.*\\.((JPG|jpg|PNG|png))",
            "column": "Binary Object []",
            "comment": "Screenshots",
            "description": ""
          },
          {
            "regex": "^PHF_ASD_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/HS3PHF_(?P<year>[1-2][0-9][0-9][0-9])-(?P<month>[0-1][0-9])-(?P<day>[0-3][0-9])T(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9]).*Z_[0-9]*\\.asd",
            "column": "Binary Object []",
            "comment": "PHF_ASD files",
            "description": ""
          },
          {
            "regex": "^PHS_ASD_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/HS3PHS_(?P<year>[1-2][0-9][0-9][0-9])-(?P<month>[0-1][0-9])-(?P<day>[0-3][0-9])T(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9]).*Z_[0-9]*\\.asd",
            "column": "Binary Object []",
            "comment": "PHS_ASD files",
            "description": ""
          },
          {
            "regex": "^S7K_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/(?P<year>[1-2][0-9][0-9][0-9])(?P<month>[0-1][0-9])(?P<day>[0-3][0-9])_(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9])_.*\\.((s7k|S7K))$",
            "column": "Binary Object []",
            "comment": "RESON-S7K files",
            "description": ""
          }
        ]
      }
    
    
    
    
    • The key "_sourceRegex" can be used to extract campaign or event information from the directory. This is important if you wish to publish data from several campaigns or events together. The following regex groups are supported by our framework: "campaign", "leg", "science_operation", "device_operation"
    • Each element in the list below "_columns": [ describes one file type. "regex": provides the regular expression to match this file type. The following regex groups are supported to extract information here: "year", "month", "day", "hour", "minute", "second"
  • Finally, this template is used to create the data table on the PANGAEA publication page and to archive the files listed there accordingly. The date and time information extracted from the file name can be used to provide a georeference based on the campaign's master track, if desired.
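
If you want to check your regular expressions before sending them to us, the following minimal Python sketch may help. It is not our ingest framework; it only illustrates how a template like the one above could be applied. It assumes that "_sourceRegex" is matched against the source directory and that each column "regex" is matched against the file path relative to the sensor's root directory. The function names, the campaign directory "PS122" and the file name are hypothetical.

```python
import json
import re
from datetime import datetime

# Abbreviated copy of the template above; only the RESON-S7K column is kept.
TEMPLATE = json.loads(r'''
{
  "_ingest": {
    "_sourceRegex": ".*(?P<campaign>PS[0-9]{2,3})",
    "_columns": [
      {
        "regex": "^S7K_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/(?P<year>[1-2][0-9][0-9][0-9])(?P<month>[0-1][0-9])(?P<day>[0-3][0-9])_(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9])_.*\\.((s7k|S7K))$",
        "column": "Binary Object []",
        "comment": "RESON-S7K files",
        "description": ""
      }
    ]
  }
}
''')
INGEST = TEMPLATE["_ingest"]


def extract_source_info(source_dir, ingest):
    """Return the named groups (e.g. campaign) found in the source directory.

    Assumption: the directory is matched from its start, as re.match does.
    """
    match = re.match(ingest["_sourceRegex"], source_dir)
    return match.groupdict() if match else {}


def classify_file(relative_path, ingest):
    """Return (comment, timestamp) of the first column whose regex matches.

    Returns None if no column matches, i.e. the file would not be listed.
    The timestamp is None if the regex has no year/month/day groups.
    """
    for column in ingest["_columns"]:
        match = re.match(column["regex"], relative_path)
        if not match:
            continue
        # Only named groups appear in groupdict(); convert them to integers.
        parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
        timestamp = None
        if {"year", "month", "day"} <= parts.keys():
            timestamp = datetime(
                parts["year"], parts["month"], parts["day"],
                parts.get("hour", 0), parts.get("minute", 0), parts.get("second", 0),
            )
        return column["comment"], timestamp
    return None


# Hypothetical directory and file name, for illustration only.
print(extract_source_info("platforms/urn/exdata/PS122", INGEST))
print(classify_file("S7K_20200115/20200115_093045_hypothetical.s7k", INGEST))
```

Running a few of your real directory and file names through such a check shows quickly whether a pattern is too strict (files are missed) or too loose (unwanted files are matched).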

