Measurement data in the database can be explored in several ways. Here we describe the graphical user interface (GUI) at https://dashboard.awi.de/data-xxl/.

Reminder: The O2A near real-time data service provides data as-is. This means that no or only basic quality control procedures are applied, so use the data with caution.

1 Facets

On the left-hand side you can find several facets, which are an easy way to restrict your search. The following applies to every facet:

  • If there is a label that reads > show more..., more entries are available.
  • Only available fields can be shown. If you filter for a type (e.g. aircraft) and no specific parameter (e.g. pressure) is available for it, that parameter cannot be displayed in the facet.
  • Metadata that should be searched and filtered for needs to exist.

The first facet allows you to filter for a certain Type of item, e.g. only parameters measured aboard vessels. Then only data streams associated with a vessel (no matter which) are displayed.

Filtering for Parameter types corresponds directly to the entries made in sensor.awi.de in the parameter section (https://spaces.awi.de/x/1Ia0FQ).

To look for specific events during a campaign or for a certain subset of data, the ‘Actions’ (https://spaces.awi.de/x/0oa0) from sensor.awi.de can be used. On the one hand, this can be achieved with the facet Mission, which incorporates all valid entries of the type mission from sensor.awi.de. On the other hand, if finer granularity is needed or if you know a certain label title, use the facet Action. These labels originate either from manual entries in sensor.awi.de or from the Dship (https://spaces.awi.de/x/EQDnDw) Action Log labels, e.g. PS122/1_1-220.

Under Collection, current as well as historical collections (once) defined in sensor.awi.de appear.

The facet Contact is a direct read from sensor.awi.de. All available contacts with all contact roles are displayed here.

Additionally the field organisation is read per contact. The summary is displayed in the facet Organisation and can be used to filter according to institutions.

2 Search Bar

Remark: In general it is useful to know in advance what you are searching for, e.g. by exploring relevant entries at sensor.awi.de. This makes it much easier to formulate search queries.

The first field of the search bar allows the user to apply a full-text search on metadata. Searching for item/parameter URNs has proved useful. URNs can be shortened with wildcards (asterisk *) so that a single query matches several items or parameters.

Examples:

  • station:heluwobs:heluw1:adcp_awi_12868:adcp_temp -> exactly one parameter
  • station:heluwobs:heluw1:adcp_awi_12868:* -> matches all parameters of adcp_awi_12868
  • station:heluwobs:heluw1:* -> all items and parameters at Helgoland Underwater Observatory
  • station:heluwobs:heluw1:adcp_awi_12868:dtb* -> only the parameters starting with ‘dtb’ (distance to bottom)
  • station:heluwobs:heluw1:*temperature -> every parameter at Helgoland Underwater Node 1 that ends with temperature (as of today 18 parameters)
  • station:heluwobs:heluw1:*temp -> every parameter at Helgoland Underwater Node 1 that ends with temp (as of today 19 parameters)
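The effect of the asterisk can be tried out offline. The following is a minimal sketch that assumes shell-style wildcard semantics (Python's fnmatch module); the actual matching rules of the dashboard search are not documented here and may differ. The URN list is a hypothetical sample modelled on the examples above, not a listing from the database.

```python
from fnmatch import fnmatchcase

# Hypothetical sample URNs for illustration only (not read from the database).
URNS = [
    "station:heluwobs:heluw1:adcp_awi_12868:adcp_temp",
    "station:heluwobs:heluw1:adcp_awi_12868:dtb_01",
    "station:heluwobs:heluw1:ctd_awi_578:temperature",
    "station:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03",
]


def match_urns(pattern, urns=URNS):
    """Return all URNs matching a pattern in which * stands for any characters."""
    return [urn for urn in urns if fnmatchcase(urn, pattern)]


# All parameters of one item.
print(match_urns("station:heluwobs:heluw1:adcp_awi_12868:*"))
# Parameters whose URN ends with "temp" (assumed strict glob matching).
print(match_urns("station:heluwobs:heluw1:*temp"))
# Parameters whose URN ends with "temperature".
print(match_urns("station:heluwobs:heluw1:*temperature"))
```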

In addition, you can search for Action labels, Contact information or any other information shown in the facets. We recommend using a proper search term first and then restricting the query results by selecting facets.

The age of the incoming data streams can be restricted using the second field. The option all literally means that everything available in the database is shown as long as it meets the restrictions provided by the user. The age can be set to

  • within the last month
  • within the last week
  • within the last day
  • within the last 12h
  • within the last 6h.

By ticking the only QF field, only data streams that have undergone basic quality control (https://spaces.awi.de/x/22WjEw) during ingest are displayed. When the sorting is set to Code, a strict alphanumerical order is kept. When Relevance is selected, how closely the result meets your expectation depends strongly on your search term.

3 Query Results

The query results are displayed according to your preferred sorting (see above). When ticked, the selected data streams stay on top of the selection, regardless of the pages you move to. The table consists of five columns. The first one is just the selection square (with no column name).


The second column (Code) represents the full parameter URN without unit (see https://spaces.awi.de/x/zYa0FQ at ‘Short names and URN syntax’ for more details). The icon on the left-hand side of the parameter URN can be clicked to see more details about the parameter. By clicking ‘Detailed description’ you are forwarded to the respective entry in sensor.awi.de. When there is a PI available for the parameter’s item in sensor.awi.de (https://spaces.awi.de/x/0Ia0FQ), it is displayed here. Note that this works only with the contact role ‘PI’; otherwise the field is blank. Furthermore, the properties explained at https://spaces.awi.de/x/2Ia0FQ are listed here if they have been set by the item editor.

On the right-hand side of the parameter URN, the icon copies the full parameter URN to your clipboard. To the right of the copy tool, a green icon usually indicates that the data stream originates from our NRT database. Occasionally a blue icon denotes that the corresponding data stream is part of the AWI datapool, a Hadoop cluster containing non-real-time mass data (more information about that project will be announced here too). These data streams can be requested, but this takes longer (in some cases much longer) than for the NRT streams.

The third column gives the Age of the data stream, i.e. the time elapsed between the last timestamp of the ingested data and now. Negative values, shown with a purple background, indicate that the values are presumably erroneous. Valid ages are given in minutes, hours or days. A green background means the age is < 10 minutes, an orange background means the age is < 60 minutes. Every data stream older than one hour appears with a red background.

The fourth column shows the last ingested Value plus unit (for legacy reasons some data streams do not show units, although they do have one).

The fifth column is reserved for the quality flag created by autoQC (see https://spaces.awi.de/x/2Ia0FQ). If autoQC applied no quality control, a 0 is printed; otherwise the last flag is printed here.
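The colour rules for the Age column can be summarised as a small function. This is only a sketch of the thresholds described above, not the dashboard's own code; in particular, treating an age of exactly 60 minutes as red is an assumption.

```python
from datetime import timedelta


def age_colour(age):
    """Map the age of a data stream (a timedelta) to its background colour."""
    if age < timedelta(0):
        return "purple"   # negative age: presumably erroneous timestamps
    if age < timedelta(minutes=10):
        return "green"
    if age < timedelta(minutes=60):
        return "orange"
    return "red"          # one hour or older (boundary handling assumed)


for minutes in (-5, 3, 45, 600):
    print(minutes, "min ->", age_colour(timedelta(minutes=minutes)))
```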


4 Configuration Panel

In the first field the user can optionally restrict the time span of the data to be downloaded using the information contained in various Action labels. Technically this works in the same way as in the facet search. The only restriction is that it applies only to actions of the type ‘mission’ and ‘deployment’. As you type, a search algorithm starts to query all events in sensor.awi.de. If a suitable event is found, it can be selected, and its start and end timestamps are filled into the fields From begin date (UTC) and Until end date (UTC). Of course these fields can be filled manually too. Just click the icon.

REMARKS:

  1. When data streams are selected and the time range is to be restricted by actions derived from sensor.awi.de, these two entities must match. In other words: the selected data streams need to have the respective action label (https://spaces.awi.de/x/0oa0FQ) assigned, otherwise the result might be unintended (misleading data or no data at all). There are no (validity) checks on these entries from the AWI side.
  2. The concept of using deployment timestamps has some drawbacks in practice. Most labels derived from the Dship Action Log mark a single point in time rather than a time range, e.g. a deployment timestamp with a corresponding but separate recovery label. In the DWS, however, a time range is needed. Users are asked to check the entries for validity.

The field Aggregation allows you to aggregate data to minutes, hours or days. By default, an aggregation type always refers to the arithmetic mean. If this is not the preferred statistical base, you can alternatively choose between

  • minimum,
  • 25th percentile,
  • median,
  • 75th percentile,
  • maximum,
  • standard deviation,
  • variance, and
  • count (number of values in chosen time interval).

These functions are not available for data at higher resolution. That means that when you choose seconds, milliseconds or microseconds, the data is offered for download as is.
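To make the statistical options concrete, the sketch below computes all of them for the values of a single aggregation interval using Python's statistics module. It illustrates the terms only: which percentile interpolation the service uses, and whether it computes the sample or the population variant of standard deviation and variance, is not stated here, so those choices are assumptions.

```python
import statistics


def aggregate_interval(values):
    """Compute the listed statistics for one interval (needs >= 2 values)."""
    # Assumption: "inclusive" interpolation for the percentiles.
    p25, _, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "minimum": min(values),
        "p25": p25,
        "median": statistics.median(values),
        "p75": p75,
        "maximum": max(values),
        "stdev": statistics.stdev(values),        # assumption: sample stdev
        "variance": statistics.variance(values),  # assumption: sample variance
        "count": len(values),
    }


result = aggregate_interval([1, 2, 3, 4, 5])
for name, value in result.items():
    print(f"{name}: {value}")
```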

REMARK: The data web service is limited to one million values per call altogether. Hence, two parameters halve the possible length of a time series to 500 000 rows each, three parameters leave roughly 333 000 rows of data for each parameter, and so on. That means very high-resolution data needs to be called/downloaded in small chunks if the time span significantly exceeds one million values.

Under Quality you can specify whether the data export shall contain quality flags or not. By default this feature is disabled. You can activate it by ticking the checkbox. Please keep in mind that

  1. properties (https://spaces.awi.de/x/2Ia0FQ) need to be set in order to create quality flags during ingest,
  2. quality flags are de facto an extra column of data to be downloaded. Thus quality flags double the number of values to be downloaded.

Note on quality flags and aggregation: When data is aggregated (e.g. hourly values aggregated to daily values) and quality flags created by autoQC (https://spaces.awi.de/x/22WjEw) are available, only data values with a quality flag <= 3 are used for aggregation. Values with a quality flag > 3 are omitted. The quality flags themselves are aggregated as well. If a single data value has a quality flag of 0, the aggregated quality flag is set to 0. Otherwise the highest quality flag (in the sense of best quality available) is used for the respective interval. The following synthetic value tables might clarify the procedure:
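The arithmetic behind this limit, and one way to plan chunked downloads, can be sketched as follows. The function names are hypothetical helpers, the sampling interval has to be supplied by the user, and the chunks are treated as half-open intervals; whether the service includes the end timestamp is an assumption to verify.

```python
from datetime import datetime, timedelta

MAX_VALUES_PER_CALL = 1_000_000  # limit stated in the remark above


def rows_per_call(n_parameters, limit=MAX_VALUES_PER_CALL):
    """Rows (timestamps) that fit into one call with n values per row."""
    if n_parameters < 1:
        raise ValueError("at least one parameter is required")
    return limit // n_parameters


def split_time_range(begin, end, step, n_parameters):
    """Split [begin, end) into chunks of at most rows_per_call() rows each.

    step is the assumed sampling interval of the data (a positive timedelta).
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    span = step * rows_per_call(n_parameters)
    chunks = []
    start = begin
    while start < end:
        stop = min(start + span, end)
        chunks.append((start, stop))
        start = stop
    return chunks


# One week of 1 Hz data for three parameters needs two calls.
chunks = split_time_range(
    datetime(2021, 6, 1), datetime(2021, 6, 8), timedelta(seconds=1), 3
)
for start, stop in chunks:
    print(start.isoformat(), "->", stop.isoformat())
```

If quality flags are exported as well, they count as additional values, so the number of parameters passed to such a helper would have to be doubled.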



Example data

    datetime             data  qf  note
 1  2021-06-24 01:00:00     3   2  in
 2  2021-06-24 02:00:00     3   4  out
 3  2021-06-24 03:00:00     4   1  in
 4  2021-06-24 04:00:00     7   1  in
 5  2021-06-24 05:00:00    11   3  in
 6  2021-06-24 06:00:00     7   2  in
 7  2021-06-24 07:00:00    14   3  in
 8  2021-06-24 08:00:00     7   3  in
 9  2021-06-24 09:00:00     6   3  in
10  2021-06-24 10:00:00     6   4  out
11  2021-06-24 11:00:00     4   2  in
12  2021-06-24 12:00:00     4   2  in
13  2021-06-24 13:00:00    11   1  in
14  2021-06-24 14:00:00    10   1  in
15  2021-06-24 15:00:00    13   2  in
16  2021-06-24 16:00:00     4   4  out
17  2021-06-24 17:00:00     9   3  in
18  2021-06-24 18:00:00    13   4  out
19  2021-06-24 19:00:00    10   1  in
20  2021-06-24 20:00:00     9   2  in
21  2021-06-24 21:00:00    11   3  in
22  2021-06-24 22:00:00     6   1  in
23  2021-06-24 23:00:00     6   1  in
24  2021-06-24 24:00:00     5   1  in

When all rows with a quality flag > 3 are excluded, the following table remains for aggregation:

Foundation for aggregation

    datetime             data  qf  note
 1  2021-06-24 01:00:00     3   2  in
 3  2021-06-24 03:00:00     4   1  in
 4  2021-06-24 04:00:00     7   1  in
 5  2021-06-24 05:00:00    11   3  in
 6  2021-06-24 06:00:00     7   2  in
 7  2021-06-24 07:00:00    14   3  in
 8  2021-06-24 08:00:00     7   3  in
 9  2021-06-24 09:00:00     6   3  in
11  2021-06-24 11:00:00     4   2  in
12  2021-06-24 12:00:00     4   2  in
13  2021-06-24 13:00:00    11   1  in
14  2021-06-24 14:00:00    10   1  in
15  2021-06-24 15:00:00    13   2  in
17  2021-06-24 17:00:00     9   3  in
19  2021-06-24 19:00:00    10   1  in
20  2021-06-24 20:00:00     9   2  in
21  2021-06-24 21:00:00    11   3  in
22  2021-06-24 22:00:00     6   1  in
23  2021-06-24 23:00:00     6   1  in
24  2021-06-24 24:00:00     5   1  in

The resulting aggregated single-day value would be:

Aggregation of example data to daily values using arithmetic mean
datetime    data  qf
2021-06-24  7.85   1
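The procedure can be reproduced with a few lines of Python. This is a sketch of the rules as described in the note above, not the service's implementation; reading "best quality available" as the lowest remaining flag value is an interpretation that matches the example result.

```python
from statistics import fmean

# (data, qf) pairs of the 24 hourly example values, in row order.
ROWS = [
    (3, 2), (3, 4), (4, 1), (7, 1), (11, 3), (7, 2), (14, 3), (7, 3),
    (6, 3), (6, 4), (4, 2), (4, 2), (11, 1), (10, 1), (13, 2), (4, 4),
    (9, 3), (13, 4), (10, 1), (9, 2), (11, 3), (6, 1), (6, 1), (5, 1),
]


def aggregate_with_flags(rows, max_flag=3):
    """Return (mean, flag) of one interval, ignoring values with qf > max_flag."""
    kept = [(value, qf) for value, qf in rows if qf <= max_flag]
    if not kept:
        return None, None  # nothing left to aggregate
    flags = [qf for _, qf in kept]
    # A single flag of 0 forces 0; otherwise take the best (lowest) flag.
    flag = 0 if 0 in flags else min(flags)
    return fmean([value for value, _ in kept]), flag


mean, flag = aggregate_with_flags(ROWS)
print(round(mean, 2), flag)  # 7.85 1
```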


Your selected data streams can be downloaded in different flavors. As output Format you can choose between JSON and tab-delimited (CSV).

Example Data – JSON type

{
    "beginDate": "2021-06-08T13:00:00.000",
    "endDate": "2021-06-08T13:00:30.000",
    "qualityFlags": [],
    "withQualityFlags": false,
    "sensors": [
        "station:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03"
    ],
    "data": [
        [
            "2021-06-08T13:00:00.000",
            20.58
        ],
        [
            "2021-06-08T13:00:01.000",
            24.77
        ],
        [
            "2021-06-08T13:00:02.000",
            21.3
        ],
        [
            "2021-06-08T13:00:03.000",
            33.83
        ],
        [
            "2021-06-08T13:00:04.000",
            23.88
        ],
        [
            "2021-06-08T13:00:05.000",
            31.34
        ]
    ]
}
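A downloaded JSON file of this shape can be read with the Python standard library. The sketch below uses a shortened copy of the example; the assumption that a request with several sensors returns one value per sensor in each data record, in the order of the sensors list, is not confirmed by the single-sensor example.

```python
import json
from datetime import datetime

# Shortened copy of the example above.
RESPONSE = """
{
    "beginDate": "2021-06-08T13:00:00.000",
    "endDate": "2021-06-08T13:00:30.000",
    "qualityFlags": [],
    "withQualityFlags": false,
    "sensors": ["station:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03"],
    "data": [
        ["2021-06-08T13:00:00.000", 20.58],
        ["2021-06-08T13:00:01.000", 24.77]
    ]
}
"""


def parse_response(text):
    """Return a list of (timestamp, {sensor URN: value}) tuples."""
    doc = json.loads(text)
    sensors = doc["sensors"]
    rows = []
    for record in doc["data"]:
        timestamp = datetime.fromisoformat(record[0])
        # Assumption: values follow the timestamp in the order of "sensors".
        rows.append((timestamp, dict(zip(sensors, record[1:]))))
    return rows


for timestamp, values in parse_response(RESPONSE):
    print(timestamp, values)
```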

Example Data – CSV type

datetime    station:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03 [µg/l]
2021-06-08T13:00:00.000 20.58
2021-06-08T13:00:01.000 24.77
2021-06-08T13:00:02.000 21.3
2021-06-08T13:00:03.000 33.83
2021-06-08T13:00:04.000 23.88
2021-06-08T13:00:05.000 31.34
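The tab-delimited export can be read in a similar way with the csv module. This sketch assumes that the columns are separated by tab characters, as the format name suggests, and that missing values appear as empty fields; the sample string is a shortened copy of the example.

```python
import csv
import io

# Shortened copy of the example above, with explicit tab separators.
EXPORT = (
    "datetime\tstation:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03 [µg/l]\n"
    "2021-06-08T13:00:00.000\t20.58\n"
    "2021-06-08T13:00:01.000\t24.77\n"
)


def read_export(text):
    """Return (column names, rows); each row is (timestamp string, values)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    columns = header[1:]  # first column holds the timestamp
    rows = []
    for line in reader:
        if not line:
            continue  # skip empty lines
        # Assumption: an empty field means "no value".
        values = [float(v) if v else None for v in line[1:]]
        rows.append((line[0], values))
    return columns, rows


columns, rows = read_export(EXPORT)
print(columns)
print(rows)
```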


Finally, the request can be submitted and is then processed. A list is generated. The first line is a suggestion on how to cite the data. If the item has a declared PI in sensor.awi.de, she/he is included in the citation as first author. Otherwise the generic citation is used: O2A Data Services (2021): Data from provided by O2A Data Services. Alfred-Wegener-Institut, https://dashboard.awi.de/data-xxl. Below the citation three options are shown:

  1. The data set will be – with regard to your choice – downloaded after clicking.

  2. The download link will be copied to your clipboard so you can paste it elsewhere.

  3. A short summary of the data set can be generated, annotated with some links to more information about the parameter. This is just an excerpt from sensor.awi.de.

// The following metadata is prepared based on sensor.awi.de descriptions.
// Always quote the principal investigator and this data service when using data! The licence is  CC BY 4.0.
// See O2A documentation: https://spaces.awi.de/display/DM/

station:heluwobs:heluw1:ctd_awi_578:chlorophyll_a_03
- name: Chlorophyll A
- unit: µg/l
- type: chlorophyll a
- principal investigator(s):
  Fischer, Philipp <philipp.fischer@awi.de>
- resources:
  JSON representation [https://sensor.awi.de/rest/sensors/item/getDetailedItem/3833?includeChildren=true]
  SensorML representation [https://sensor.awi.de/rest/sensors/item/getItemAsSensorML/3833]
  Web page [https://sensor.awi.de/?id=3833]