  • What shall it contain? (This depends highly on your sensor's data.)
    • If your sensor produces only one file type, all files are stored in your sensor's root directory under platforms/urn/exdata/, there are no duplicates, and all existing files shall be published, then we require the following:
      • A description of how we can extract date and time information from your filenames
      • A data format description, if it is a non-standard data format. This can be a description, a persistent link to a format description, or software that can be used to open and process the data.
    • If your sensor produces more than one file type we additionally need:
      • A file name description explaining how to distinguish between the different files (e.g. file endings, prefix in filename, ...)
    • If there are directories below the sensor's root directory, we additionally need:
      • A description of whether we shall archive files from all directories or discard certain folders
      • A description of whether we can extract information such as Expedition, Event, Date and Time from the directory name
  •  What is it used for?  We use this data description to create an ingest template. Your descriptions are used to create regular expressions that match your files in your directories. If you are familiar with regular expressions, we would appreciate it if you provided them directly in your data description. Your data description is translated into a template such as this:


      "_ingest": {
        "_sourceRegex": ".*(?P<campaign>PS[0-9]{2,3})",
        "_columns": [
          {
            "regex": "^((Screenshots|screenshots))\\/.*\\.((JPG|jpg|PNG|png))",
            "column": "Binary Object []",
            "comment": "Screenshots",
            "description": ""
          },
          {
            "regex": "^PHF_ASD_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/HS3PHF_(?P<year>[1-2][0-9][0-9][0-9])-(?P<month>[0-1][0-9])-(?P<day>[0-3][0-9])T(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9]).*Z_[0-9]*\\.asd",
            "column": "Binary Object []",
            "comment": "PHF_ASD files",
            "description": ""
          },
          {
            "regex": "^PHS_ASD_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/HS3PHS_(?P<year>[1-2][0-9][0-9][0-9])-(?P<month>[0-1][0-9])-(?P<day>[0-3][0-9])T(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9]).*Z_[0-9]*\\.asd",
            "column": "Binary Object []",
            "comment": "PHS_ASD files",
            "description": ""
          },
          {
            "regex": "^S7K_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/(?P<year>[1-2][0-9][0-9][0-9])(?P<month>[0-1][0-9])(?P<day>[0-3][0-9])_(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9])_.*\\.((s7k|S7K))$",
            "column": "Binary Object []",
            "comment": "RESON-S7K files",
            "description": ""
          }
        ]
      }
    
    
    
    


    • The key "_sourceRegex" can be used to extract campaign or event information from the directory. This is important if you wish to publish data from several campaigns (expedition legs) or events together. The following regex groups are supported by our framework: "campaign", "leg", "science_operation", "device_operation"
    • Each element in the list below "_columns": [ describes one file type. "regex" provides the regular expression that matches this file type. The following regex groups are supported to extract information here: "year", "month", "day", "hour", "minute", "second"
  • Finally, this template is used to create the data table for the PANGAEA publication page and to archive the list of files there accordingly. The date and time information from the file can be used to provide a georeference based on the campaign's master track, if desired.
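
If you want to check your own regular expressions before sending them to us, the following is a minimal sketch of how such a template could be applied. It is not our ingest framework: the helper names `extract_campaign` and `match_file`, the directory `exdata/PS122` and the file path are hypothetical, and the use of Python's `re.match` is only an assumption about how the expressions are evaluated. The template excerpt contains just the RESON-S7K column from the example above.

```python
import json
import re
from datetime import datetime

# Excerpt of the template shown above (S7K column only). It is kept as JSON
# text so the doubled backslashes are decoded exactly as from a template file.
TEMPLATE_JSON = r'''
{
  "_ingest": {
    "_sourceRegex": ".*(?P<campaign>PS[0-9]{2,3})",
    "_columns": [
      {
        "regex": "^S7K_[1-2][0-9][0-9][0-9][0-1][0-9][0-3][0-9].*\\/(?P<year>[1-2][0-9][0-9][0-9])(?P<month>[0-1][0-9])(?P<day>[0-3][0-9])_(?P<hour>[0-2][0-9])(?P<minute>[0-5][0-9])(?P<second>[0-5][0-9])_.*\\.((s7k|S7K))$",
        "column": "Binary Object []",
        "comment": "RESON-S7K files",
        "description": ""
      }
    ]
  }
}
'''

ingest = json.loads(TEMPLATE_JSON)["_ingest"]


def extract_campaign(source_dir, ingest):
    """Return the "campaign" group of _sourceRegex, or None if it does not match.

    Hypothetical helper; assumes the regex is applied with re.match.
    """
    match = re.match(ingest["_sourceRegex"], source_dir)
    return match.group("campaign") if match else None


def match_file(relative_path, ingest):
    """Return (comment, timestamp) for the first column whose regex matches.

    The timestamp is None when the regex has no date/time groups.
    Returns None when no column matches, i.e. the file would be skipped.
    """
    for column in ingest["_columns"]:
        match = re.match(column["regex"], relative_path)
        if match is None:
            continue
        # groupdict() only contains the named groups such as "year" or "day".
        parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
        timestamp = None
        if "year" in parts:
            timestamp = datetime(
                parts["year"], parts["month"], parts["day"],
                parts.get("hour", 0), parts.get("minute", 0), parts.get("second", 0),
            )
        return column["comment"], timestamp
    return None


# Hypothetical directory and file names, for illustration only.
print(extract_campaign("exdata/PS122", ingest))
print(match_file("S7K_20200115_line01/20200115_123456_PS122.s7k", ingest))
print(match_file("notes/readme.txt", ingest))
```

A path that matches none of the column expressions returns `None` in this sketch, which is a quick way to spot files that your description does not yet cover.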