Why do I have to publish “my” MOSAiC data?

  • By agreeing to the MOSAiC Data Policy, MOSAiC participants help ensure that MOSAiC is a successful and resource-effective research project. Each PI / lead author ensures that both raw and primary (processed) data are published. Publication in a dedicated data repository makes the data Findable, Accessible, Interoperable and Reusable (FAIR data principles).
  • See MOSAiC Data Policy for details.

What is data publication?

  • A data publication is a published data set or data collection with a complete set of metadata. It is fully citable because it has a title, authors, an abstract and a persistent identifier (usually a DOI). It may, but need not, reference a scientific paper.

Why distinguish between a data publication and a manuscript?

  • Authorship and acknowledgement:
    • The data authors and the paper authors often overlap, but not always.
    • A data publication acknowledges the contributions of scientists who generated the data but did not contribute to the interpretation or manuscript writing.
  • FAIR data principles:
    • Findability, Accessibility, Interoperability and Reusability are easier to guarantee in data repositories than with journal publishers.
    • Data publishers focus on metadata.

What isn't data publication?

  • Data publication isn’t adding a data table as a supplement to a published scientific paper. Such supplements are often an xlsx file or a table in a PDF. If the paper is not open access, the supplement isn't open access either. The dataset isn't citable.
  • Data publication isn’t sharing data at MOSAiC Central Storage (MCS).
  • Data publication isn't a statement in the paper such as "Data used for this manuscript were uploaded to PANGAEA and will be available soon."

Where do I publish “my” MOSAiC data?

  • The default repository for MOSAiC is PANGAEA.
  • My national funding agency requires depositing data in a special national repository. What should I do? These cases are handled as exceptions (see Data Policy) and are a legitimate reason for not publishing the data in PANGAEA. At the moment, written agreements have been signed with several repositories: Arctic Data Center (ADC), Atmospheric Radiation Measurement (ARM) data center, British Oceanographic Data Centre (BODC), UK Polar Data Centre and Centre for Environmental Data Analysis (CEDA). When archiving data in these repositories, always acknowledge the MOSAiC project (see Data Policy). The agreements ensure FAIR data publication and future findability of MOSAiC data from a single access point (portal to project and data).
  • Other exceptions are possible for special data types (e.g., genomics, source code, high-volume model data) for which PANGAEA is not a suitable repository and a dedicated community repository exists that fulfills the FAIR criteria. When archiving data in these repositories, always acknowledge the MOSAiC project (see Data Policy). Otherwise, findability of MOSAiC data from a single access point (portal) cannot be ensured in the future. If you are unsure whether this applies, contact the PANGAEA team.
  • PANGAEA does not assign or link its own DOIs to data sets published elsewhere.

Datasets in PANGAEA may be archived as stand-alone publications of data (e.g., https://doi.org/10.1594/PANGAEA.753658) or as supplements to an article (e.g., https://doi.org/10.1594/PANGAEA.846130). Data can be submitted to and published in PANGAEA with access restrictions in place for a predefined period (until article publication, or during an embargo period).

Metadata must be submitted together with the data. Minimal requirements are:

  • dataset author(s)
  • PI
  • dataset title
  • MOSAiC device operation ID(s) / event labels associated with individual data / data files
  • related institute(s) or publication(s)
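
Before opening a submission, it can help to check locally that none of these minimal fields is empty. The following is a minimal sketch in Python, not part of any PANGAEA tool: the field names, the `missing_metadata` helper and the example record are all hypothetical and do not correspond to an official PANGAEA schema.

```python
# Hypothetical pre-submission checklist. The field names are illustrative
# only and do not correspond to an official PANGAEA schema.
REQUIRED_FIELDS = (
    "authors",       # dataset author(s)
    "pi",            # principal investigator
    "title",         # dataset title
    "event_labels",  # MOSAiC device operation ID(s) / event labels
    "related",       # related institute(s) or publication(s)
)


def missing_metadata(record):
    """Return the required fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Made-up draft record with the event labels still to be filled in.
draft = {
    "authors": ["A. Author", "B. Author"],
    "pi": "A. Author",
    "title": "Example observations, MOSAiC leg 1",
    "event_labels": [],
    "related": ["Example Institute"],
}

print(missing_metadata(draft))
```

An empty result means that every minimal field has a value; it says nothing about whether the values are correct, which remains the job of the authors and the PANGAEA editors.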

Any documentation (e.g., MOSAiC Standard Operating Procedures, MSOPs) that helps to understand the data can and should be linked to the dataset(s). If no persistent link to the documents can be provided, PANGAEA can archive the documents permanently alongside the data.

The granularity of the data is up to the author(s) of the dataset. Lower-granularity datasets can be combined in a (time-)series collection dataset, as in https://doi.org/10.1594/PANGAEA.873032.

During submission (https://www.pangaea.de/submit/), the connection with MOSAiC has to be clearly stated in the Label field of the data submission ("MOSAiC"). The MOSAiC Project ID (for the Polarstern expedition this is "AWI_PS122_00") is internally associated with the MOSAiC project as a grant number and does not need to be entered in the submission form as well.

The MOSAiC device operation ID(s) / events list is available after the end of each leg on the PANGAEA page https://www.pangaea.de/expeditions/byproject/MOSAiC (for viewing or download under the "Event list:" link).

Within the data table, parameters (table header) should be submitted with full names and units. Data submitted in the form of videos, photos, geoTIFF, shape files, netCDF, sgy, etc. will be archived as is (e.g., https://doi.org/10.1594/PANGAEA.865445).
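
As an illustration of a table header with full parameter names and units, the sketch below writes a small tab-delimited table in Python. The "Name [unit]" header layout, the parameter names and the data row are assumptions made up for this example; check the PANGAEA data submission guidelines for the conventions that actually apply.

```python
import csv
import io

# Hypothetical columns as (full parameter name, unit) pairs. The
# "Name [unit]" header layout is an assumption for illustration.
columns = [
    ("Date/Time", ""),
    ("Latitude", "deg N"),
    ("Sea ice thickness", "m"),
]


def header_label(name, unit):
    """Combine a parameter name and its unit into one header cell."""
    return f"{name} [{unit}]" if unit else name


# Made-up values, not real measurements.
rows = [("2019-10-05T12:00:00", 85.0, 0.8)]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow([header_label(name, unit) for name, unit in columns])
writer.writerows(rows)

table_text = buffer.getvalue()
print(table_text)
```

Keeping the unit in the header cell, rather than in a separate note, means each column stays self-describing when the table is separated from its documentation.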

More information on data submission can be found in https://wiki.pangaea.de/wiki/Data_submission.


If a published dataset needs to be updated, PANGAEA will upload a new version of the dataset with new documentation and complete metadata that clearly describe the changes between the versions. The versions can be linked, but each will have its own permanent DOI.


When do I submit and publish “my” MOSAiC data?

  • Submit your quality-controlled data sets as early as possible and before they are used for a paper. In PANGAEA, the data can be password protected, which means that only the metadata are accessible and the data themselves cannot be viewed or downloaded. PANGAEA editors can nevertheless provide a temporary access key for reviewers. While the data set status is “in review”, the content can still be changed, and the DOI is already provided for use in your manuscript.
  • Cite your data sets in any paper that uses them. Remember, data sets have full citations, which can be used just like any other reference.
  • Like paper publication, data publication involves editorial work, which takes time (sometimes up to several weeks). Do not leave the submission of data for publication until the last minute. No data citation is possible before the data are actually archived in the repository.

Data publishing workflow

The workflow outlined below includes publishing the data set in the repository before the paper is submitted. This enables correct cross-referencing of both.

Suggested data publication workflow

  • Submit your quality-controlled data sets as early as possible and before they are used for a paper. Including a reference to the data set at the paper submission stage allows the data to be considered in peer review, which is compulsory for some journals.
  • Like paper publication, data publication involves editorial work, which takes time (sometimes up to several weeks). Do not leave the submission of data for publication until the last minute. No data citation is possible before the data are actually archived in the repository.
  • In PANGAEA, the data can be password protected while the data set has "in review" status. This means that only the metadata are accessible and the data themselves cannot be viewed or downloaded. The data set can also be open access at this stage.
  • PANGAEA editors can already provide a temporary access key for reviewers or colleagues.
  • While the data set status is “in review”, the content can still be changed, but the DOI is already provided for use in your manuscript.
  • After final publication, followed by the DOI registration, no changes to the data sets are possible. A new version needs to be archived instead and linked to the previous version.
  • Even published MOSAiC data sets can remain under password protection until the publication of the associated paper or until the end of the MOSAiC moratorium in January 2023.
  • Cite your data sets in any paper that uses them. Remember, data sets have full citations, which can be used just like any other reference. An example of a data set citation is:

Timofeeva, Anna; Smolyanitsky, Vasily; Bessonov, Vladimir; Petrovskiy, Tomash (2020): Special sea ice observations aboard Akademik Fedorov MOSAiC leg 1, 2019-09-25 to 2019-10-20. PANGAEA, https://doi.org/10.1594/PANGAEA.912021
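
The parts of such a citation (authors, year, title, repository and DOI) can be assembled programmatically, for example when preparing a reference list. The sketch below reproduces the layout of the example above; the `format_citation` helper is hypothetical and not a PANGAEA function, and the pattern is inferred from this single example only.

```python
def format_citation(authors, year, title, publisher, doi):
    """Assemble a data set citation in the layout of the example above.

    The pattern "Authors (Year): Title. Publisher, DOI link" is inferred
    from that one example and is not an official specification.
    """
    author_part = "; ".join(authors)
    return f"{author_part} ({year}): {title}. {publisher}, https://doi.org/{doi}"


citation = format_citation(
    authors=[
        "Timofeeva, Anna",
        "Smolyanitsky, Vasily",
        "Bessonov, Vladimir",
        "Petrovskiy, Tomash",
    ],
    year=2020,
    title=(
        "Special sea ice observations aboard Akademik Fedorov "
        "MOSAiC leg 1, 2019-09-25 to 2019-10-20"
    ),
    publisher="PANGAEA",
    doi="10.1594/PANGAEA.912021",
)

print(citation)
```

In practice, the citation shown on the data set's landing page in the repository should be treated as authoritative.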

How do I approach data publication?

  • Raw data publication is going to be semi-automatic in the near future, and the responsible PIs will be informed about the process.
  • Primary data publication (calibrated data, data prepared for a paper publication) always needs to be initiated by the authors by opening a data submission ticket in PANGAEA or, if exceptions apply, in another designated repository.
  • If the raw data have not yet been published in PANGAEA at the time of primary data publication and are needed, contact the PANGAEA team to initiate the raw data publication.
  • During data publication, instruct the editors of your data repository to create links to other versions of the data (e.g., raw data), especially when these were or are being published in another repository.
  • All published data must include a funding acknowledgment of MOSAiC in the following form: "Multidisciplinary drifting Observatory for the Study of the Arctic Climate (MOSAiC)" with the tag "MOSAiC20192020". Additionally, the Project ID given for the specific expedition must be mentioned. For the Polarstern expedition this is "AWI_PS122_00". Additional attributions, such as specific award/grant numbers, may be added.
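
A simple string check can catch a missing element of the funding acknowledgment before submission. The following Python sketch assumes that the acknowledgment is available as plain text; the `missing_acknowledgment_parts` helper and the example sentences are hypothetical, and only the project name, the tag and the Polarstern Project ID are taken from the text above.

```python
# Required elements as stated in the text above.
MOSAIC_NAME = (
    "Multidisciplinary drifting Observatory for the Study of the "
    "Arctic Climate (MOSAiC)"
)
MOSAIC_TAG = "MOSAiC20192020"


def missing_acknowledgment_parts(text, project_id):
    """Return labels of the required elements that do not appear in `text`."""
    required = {
        "project name": MOSAIC_NAME,
        "tag": MOSAIC_TAG,
        "project ID": project_id,
    }
    return [label for label, value in required.items() if value not in text]


# Made-up acknowledgment text for the Polarstern expedition.
example = (
    "Data were produced as part of the Multidisciplinary drifting "
    "Observatory for the Study of the Arctic Climate (MOSAiC) with the "
    "tag MOSAiC20192020 and the Project ID AWI_PS122_00."
)

print(missing_acknowledgment_parts(example, "AWI_PS122_00"))
print(missing_acknowledgment_parts("Data from MOSAiC.", "AWI_PS122_00"))
```

The check is an exact substring match, so it flags an acknowledgment in which the project name is abbreviated or reworded. The exact sentence around the required elements is left to the authors.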



Attribution: group, peoples icon made by icon king from www.freeicons.io; password icon made by icon king from www.freeicons.io; document, content, article, letter, paper icon made by BECRIS from www.freeicons.io; edit, document, note, writing, review icon made by BECRIS from www.freeicons.io; engagement, customer, user, interaction, branding icon made by BECRIS from www.freeicons.io; essentials, sand, clock, time icon made by andriy matviychuk from www.freeicons.io.