Introduction

The HSM (Hierarchical Storage Management) at AWI provides

  1. (nearly) unlimited storage space,
  2. two replicas on tape (in different buildings for security),
  3. a third copy on disk (for selected data, smaller files for faster access) → VTL (Virtual Tape Library, Silent Brick),
  4. at a (comparatively) low cost.

However, there are two caveats:

  • Your (larger) data is archived on tape, so it will take a few minutes to get it back (unless it is online or has a disk copy).
  • You need a project to archive your data (apply for a project here: eResources).


HSM at AWI

Four domains for tape archiving at AWI


Domain A
  • Purpose: (mainly) PANGAEA and SENSOR
  • Path: /hs/platforms (disk copy: yes). Archive of sensor.awi.de.
  • Path: /hs/pangaea (disk copy: yes). PANGAEA. Note: the former /hs/usero was moved to /hs/pangaea/data/legacy/ on 2024-06-17.
  • Description: Permanent archive for irrecoverable data. Metadata is needed for this domain.
  • How to apply: Please contact Stefanie Schumacher or Janine Felden for more information about PANGAEA, and contact PANGAEA on how to submit your data.

Domain P
  • Purpose: Projects (open to all users)
  • Paths: /hs/D-P/projects, /hs/D-P/s3projects (disk copy: yes)
  • Description: Project data. Long-term project data with a predefined lifetime. The setgid bit on directories ensures that new (sub-)directories automatically belong to the project's (POSIX) group.
  • How to apply: Please use eResources (https://cloud.awi.de/#/projects) to create a project and request storage resources.

Domain C (deprecated)
  • Paths: /hs/usera, /hs/userc, /hs/userm
  • Description: Vanished in early 2020.

Domain I (IB, IP)
  • Purpose: Disaster recovery
  • Path: /hs/D-I/ (disk copy: no)
  • Description: Project replica from the Isilons (Bhv, Pot). Daily (or weekly) automatic replication of online project data from the Isilon in Bremerhaven and Potsdam, respectively (when selected in eResources). Used for disaster recovery only (= hopefully never).

Domain D
  • Purpose: Internal IT stuff
  • Path: /hs/backup (disk copy: yes). samfs dumps, ScoutFS dumps, log files, Veeam.
  • Path: /hs/store (disk copy: no). 10-year storage of expired user and project data.
  • Path: /hs/s3gateway. Experimental storage.
  • How to apply: IT internal use only.

A disk copy is available for specific (smaller) files on some file systems. This allows significantly faster access to offline files. The availability of this disk archive depends on current resources and usage and might change.
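The setgid behaviour mentioned for project directories can be tried out locally. The following is a minimal sketch, assuming a Linux-like POSIX system; the directory names and the helper functions `make_project_dir` and `inherits_group` are hypothetical and not part of the AWI setup.

```python
import os
import stat
import tempfile


def make_project_dir(path):
    """Create a directory and set the setgid bit on it (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # chmod is used because the mode passed to mkdir may be masked by the umask
    os.chmod(path, mode | stat.S_ISGID)


def inherits_group(parent):
    """Create a subdirectory and report whether it got the parent's group."""
    child = os.path.join(parent, "subdir")
    os.mkdir(child)
    return os.stat(child).st_gid == os.stat(parent).st_gid


with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "demo_project")  # stand-in for a project directory
    make_project_dir(project)
    print("setgid set:", bool(os.stat(project).st_mode & stat.S_ISGID))
    print("group inherited:", inherits_group(project))
```

On the real project paths the bit is already set, so users normally do not need to do this themselves.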

Concept

Basic Idea

  • HSM:  A (H)ierarchical (S)torage (M)anagement system consists of (at least) two storage systems: A cache speeds up access, and the hierarchy reduces cost. Based on a set of rules, data is stored on certain connected storage devices (tapes and optional disks).
  • ScoutFS has been used at AWI since 2024 and supersedes samfs (used since 2004): www.versity.com/products/scoutfs/, www.scoutfs.org
  • ScoutAM is used as the (A)rchive (M)anager and supersedes OHSM/VSM: www.versity.com/products/scoutam/

The Circle of Life

Archiving

  • When a file is created in ScoutFS (e.g., by rsync, scp, ftp, S3-PUT), its data is stored on a fast SSD cache system.
  • Depending on predefined policies (e.g., when the file has not been modified for a specific amount of time), the file is automatically archived on slower (and much cheaper) tapes (and optionally HDD).
  • A file that has just been created is online.
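An age-based archive policy of the kind described above can be sketched as follows. This is only an illustration: the threshold `ARCHIVE_AGE_SECONDS` and both function names are assumptions, and the actual ScoutAM policies at AWI are configured by the administrators and may use other criteria.

```python
import os
import time

ARCHIVE_AGE_SECONDS = 15 * 60  # hypothetical threshold, not the AWI setting


def is_archive_candidate(mtime, now, min_age=ARCHIVE_AGE_SECONDS):
    """A file qualifies once it has not been modified for at least min_age seconds."""
    return (now - mtime) >= min_age


def archive_candidates(directory, now=None):
    """List regular files in a directory that satisfy the age rule."""
    now = time.time() if now is None else now
    result = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            mtime = entry.stat(follow_symlinks=False).st_mtime
            if is_archive_candidate(mtime, now):
                result.append(entry.path)
    return sorted(result)
```

Waiting for a quiet period avoids writing tape copies of files that are still being changed.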

Releasing

  • The metadata (filename, size, ownership, permissions, etc.) of a file always remains in the cache system and is visible/accessible.
  • However, when the cache system fills up (e.g., to 90% capacity), the data of large files and of files that have not been touched for some time is released.
  • Once the data of a file has been released, the file is called offline.
  • The user does (in the first instance) not see any difference between an online and an offline file.
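The release step can be pictured with a small simulation. This is a sketch under stated assumptions: the 90% high-water mark follows the example above, while the 80% low-water mark, the ordering rule (least recently accessed first, larger files first on ties), and the restriction to files that already have an archive copy are assumptions made for illustration, not the documented ScoutAM behaviour.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int           # bytes held in the cache
    last_access: float  # seconds since epoch
    archived: bool      # True if an archive copy exists


def select_for_release(files, capacity, high_water=0.90, low_water=0.80):
    """Return names of files whose data would be released from the cache."""
    used = sum(f.size for f in files)
    if used < high_water * capacity:
        return []  # cache is not full enough, nothing to do
    # Assumption: only files with an archive copy may lose their cached data.
    candidates = sorted(
        (f for f in files if f.archived),
        key=lambda f: (f.last_access, -f.size),
    )
    released = []
    for f in candidates:
        if used <= low_water * capacity:
            break
        released.append(f.name)
        used -= f.size
    return released
```

The metadata of a released file is untouched in this model, which matches the point that users see no immediate difference between online and offline files.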

Staging

  • When offline data is accessed ScoutAM intercepts the call and automatically gathers the data from the archive media. ScoutAM uses information stored in metadata to find (and stage = "copy back") the data.
  • In the meantime, reads from this file block, so the process accessing the data is blocked, too.
  • When accessing more than a few files, prior staging is strongly recommended (see User commands)!
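The states described so far (online, archived, offline, staged) can be summarised in a toy model. The class `HsmFile` and its methods are hypothetical and only mirror the wording of this page; they do not correspond to any ScoutFS or ScoutAM interface.

```python
class HsmFile:
    """Toy model of one file's life cycle in an HSM (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.online = True      # data is present in the cache
        self.archived = False   # at least one archive copy exists

    def archive(self):
        """Write an archive copy; the cached data stays in place."""
        self.archived = True

    def release(self):
        """Drop the cached data; only safe once an archive copy exists."""
        if not self.archived:
            raise RuntimeError("cannot release data without an archive copy")
        self.online = False

    def stage(self):
        """Copy the data back from the archive media into the cache."""
        self.online = True

    def read(self):
        """Read the file; returns True if staging was needed first."""
        if self.online:
            return False
        self.stage()  # in reality the reader blocks here for minutes
        return True
```

In this model a read of an offline file always succeeds, only later, which is why staging many files in advance pays off.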

Recycling

  • If the content of a file changes, a new archive copy has to be produced. (You cannot modify just the relevant bits on the tape.)
  • The previous archive copy becomes useless (aside from having an additional backup of a previous version).
  • If a file is deleted, the archive copy becomes useless, too.
  • Both processes result in unused (invalid) sections on a tape. 
  • Eventually, only a small part of a tape contains relevant (up-to-date) information. The remaining valid data can be archived on other tapes; the old tape is then erased and can be used for future archive copies. This process is called recycling.
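The decision of when a tape is worth recycling can be sketched as a simple ratio check. The 25% threshold, the function names, and the tape labels are assumptions for illustration; the actual recycling criteria are set by the archive administrators.

```python
def valid_fraction(valid_bytes, written_bytes):
    """Share of the written data on a tape that is still up to date."""
    if written_bytes == 0:
        return 1.0  # an empty tape has nothing invalid on it
    return valid_bytes / written_bytes


def tapes_to_recycle(tapes, threshold=0.25):
    """tapes maps a tape label to (valid_bytes, written_bytes).

    Returns the labels whose valid share dropped below the threshold.
    """
    return sorted(
        label
        for label, (valid, written) in tapes.items()
        if written > 0 and valid_fraction(valid, written) < threshold
    )


# Hypothetical inventory, sizes in TB for readability
inventory = {"T1": (2, 18), "T2": (15, 18), "T3": (0, 0)}
print(tapes_to_recycle(inventory))
```

Only the valid part of a selected tape has to be copied elsewhere, so recycling gets cheaper the more of a tape has become invalid.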

Hardware

Tape Library

  • 2 TFinity Tape Libraries in two buildings
  • each has 3100 licensed slots for LTO tapes
  • 600 LTO-9 tapes in each library (600x ~20 TB → ~12 PB)
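The capacity figure above is plain arithmetic and can be reproduced as follows. The sketch assumes decimal units (1 PB = 1000 TB) and uses the rounded ~20 TB per tape from the list; the function name is hypothetical.

```python
TAPES_PER_LIBRARY = 600
TB_PER_TAPE = 20   # rounded per-tape figure from the list above
LIBRARIES = 2


def library_capacity_pb(tapes=TAPES_PER_LIBRARY, tb_per_tape=TB_PER_TAPE):
    """Capacity of one library in PB, using 1 PB = 1000 TB."""
    return tapes * tb_per_tape / 1000


per_library = library_capacity_pb()
print(f"per library: ~{per_library:.0f} PB")
print(f"both libraries: ~{per_library * LIBRARIES:.0f} PB")
```

Because each library holds one of the two tape replicas, the second library adds redundancy rather than additional space for unique data.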

TFin-E: Live View Frame #2 or Frame #3 (User: viewer, PW: Viewer123!)

TFin-D: Live View Frame #2 or Frame #3 (User: viewer, PW: Viewer123!)

Servers, Switches and Disks

  • Three Dell R760xd servers (hssrv2[a-c].dmawi.de)
  • 4 Brocade (32Gb) 6610 SAN switches
  • 270 TB SSD primary cache (Dell ME5084-SSD)
  • 488 TB extended HDD cache (Dell ME5084-HDD)
  • Planned for the end of 2024:
    1120 TB Virtual Tape Library (VTL) Silent Brick from Fast-LTA
    • Controller G5200
    • SAS Switch
    • 7x SilentBrick Max

...