Table of Contents
- HSM at AWI
- Tape Library
- TFin-E

The HSM (Hierarchical Storage Management) provides for your data
However, there are two caveats:
Servers and disks
- 3 Dell R760xd servers
- 4 Brocade (32Gb) 6610 SAN switches
- 270 TB SSD cache (Dell ME55084)
- 488 TB Extended HDD-cache (Dell ME5084)
- 1120 TB Virtual Tape Library (VTL): Silent Brick from Fast-LTA
  - Controller G5200
  - SAS Switch
  - 7x SilentBrick Max
Concept
Basic Idea
- HSM: A (H)ierarchical (S)torage (M)anagement system consists of (at least) two storage systems: a cache speeds up access, and the hierarchy reduces cost. Based on a set of rules, data is stored on certain connected storage devices (tapes and, optionally, disks).
- ScoutFS has been used at AWI since 2024 and supersedes samfs, which had been in use for decades before that. See www.versity.com/products/scoutfs/ and www.scoutfs.org .
- ScoutAM is used as the (A)rchive (M)anager. See www.versity.com/products/scoutam/
The Circle of Life
Archiving
- When a file is created in ScoutFS (e.g., via rsync, scp, ftp, or S3 PUT), its data is stored on a fast cache system (SSD).
- Depending on predefined policies (e.g., when the file has not been modified for a specific amount of time), the file is automatically archived to slower (and much cheaper) tapes (and optionally HDD).
- A file that has just been created is online.
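
The age-based policy mentioned above can be illustrated with a minimal sketch. This is not ScoutAM's actual policy engine or configuration syntax; the `FileInfo` type, the `due_for_archiving` function, and the one-hour threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    mtime: float             # last modification time (seconds)
    archive_copies: int = 0  # archive copies already written


def due_for_archiving(f: FileInfo, now: float, min_age_s: float = 3600.0) -> bool:
    """Hypothetical rule: archive a file that has no archive copy yet
    and has not been modified for at least min_age_s seconds."""
    return f.archive_copies == 0 and (now - f.mtime) >= min_age_s


now = 10_000.0
files = [
    FileInfo("fresh.nc", mtime=9_900.0),                     # modified 100 s ago
    FileInfo("settled.nc", mtime=1_000.0),                   # unmodified for 9000 s
    FileInfo("done.nc", mtime=1_000.0, archive_copies=1),    # already archived
]
to_archive = [f.path for f in files if due_for_archiving(f, now)]
print(to_archive)  # ['settled.nc']
```

Only the file that is both old enough and not yet archived is selected; the real policies may use additional or different criteria.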
Releasing
- The metadata (filename, size, ownership, permissions, etc.) of a file always remains in the cache system and is visible, but
- when the cache system fills up (e.g., to 90% capacity), the data of large files and of files that have not been touched for some time is released.
- A file whose data has been released is called offline.
- At first glance, the user does not see any difference between an online and an offline file.
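
As a rough sketch of how such a release decision could work: the 90% trigger is taken from the example above, while the 80% target level, the "size times idle time" ranking, and all names (`CachedFile`, `pick_for_release`) are assumptions made for illustration and do not describe ScoutAM's real algorithm.

```python
from dataclasses import dataclass


@dataclass
class CachedFile:
    path: str
    size: int       # bytes of data currently held in the cache
    atime: float    # last access time (seconds)
    archived: bool  # assumption: only archived files may be released


def pick_for_release(files, capacity, now, high=0.90, low=0.80):
    """Return the paths whose data would be released from the cache."""
    used = sum(f.size for f in files)
    if used < high * capacity:
        return []  # cache is not full enough, nothing to do
    # Assumed ranking: large and long-untouched files go first.
    candidates = sorted(
        (f for f in files if f.archived),
        key=lambda f: f.size * (now - f.atime),
        reverse=True,
    )
    released = []
    for f in candidates:
        if used <= low * capacity:
            break
        released.append(f.path)
        used -= f.size
    return released


cache = [
    CachedFile("a.dat", size=500, atime=0.0, archived=True),
    CachedFile("b.dat", size=300, atime=900.0, archived=True),
    CachedFile("c.dat", size=150, atime=0.0, archived=False),
]
print(pick_for_release(cache, capacity=1000, now=1000.0))  # ['a.dat']
```

In this toy run the cache is 95% full, so the largest, longest-idle archived file is released, which already brings usage below the assumed target level. The metadata of `a.dat` would stay visible; only its data leaves the cache.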
Staging
- When offline data is accessed, ScoutAM intercepts the call and automatically retrieves the data from the archive media. ScoutAM uses information stored in the metadata to find the data (from any archive copy) and bring it back.
- In the meantime, reads from this file are blocked, so the process accessing the data blocks, too.
- When accessing more than a few files, prior staging is strongly recommended (see User commands)!
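
One plausible motivation for prior staging can be shown with a toy model. The assumptions here are not taken from the ScoutAM documentation: a single tape drive, a made-up file-to-tape mapping (`location`), and the idea that a staging request known in advance can be served tape by tape, whereas on-demand reads are served in access order.

```python
from itertools import groupby


def count_mounts(tape_sequence):
    """Count tape mounts, assuming one drive: a mount is needed whenever
    the requested tape differs from the one requested just before."""
    return sum(1 for _ in groupby(tape_sequence))


# Hypothetical mapping of offline files to the tapes holding their copies.
location = {"a": "T1", "b": "T2", "c": "T1", "d": "T2"}
access_order = ["a", "b", "c", "d"]

# On-demand: each read triggers a stage in the order the files are accessed.
on_demand = count_mounts(location[f] for f in access_order)

# Prior staging: all requests are known up front and can be sorted by tape.
batched = count_mounts(sorted(location[f] for f in access_order))

print(on_demand, batched)  # 4 2
```

In this model, the reading process also waits only once for the whole batch instead of blocking on every single file.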
Recycling
- If the content of a file changes, a new archive copy has to be produced. (You cannot modify just the relevant bits on the tape.)
- The previous archive copy becomes useless (aside from having an additional backup of a previous version).
- If a file is deleted, the archive copy becomes useless, too.
- Both processes result in unused (invalid) sections on a tape.
- Eventually, only a small part of a tape contains relevant (up-to-date) information. The remaining valid data is then archived on other tapes, and the old tape is erased so that it can be used for future archive copies.
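
The recycling step can be sketched as follows. The `Tape` type, the `recycle` function, and the 20% threshold are hypothetical and serve only to illustrate the idea; the actual criteria used by the archive manager are not stated here.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    label: str
    capacity: int
    # path -> size of archive copies that are still up to date
    valid: dict = field(default_factory=dict)

    def valid_fraction(self) -> float:
        return sum(self.valid.values()) / self.capacity


def recycle(tape: Tape, target: Tape, threshold: float = 0.2) -> bool:
    """If only a small part of the tape is still valid, re-archive that part
    on the target tape and erase the old tape. Returns True if recycled."""
    if tape.valid_fraction() >= threshold:
        return False
    target.valid.update(tape.valid)  # copy the remaining valid data
    tape.valid.clear()               # old tape is erased and reusable
    return True


old = Tape("OLD", capacity=1000, valid={"x": 50, "y": 30})  # 8% valid
new = Tape("NEW", capacity=1000, valid={"z": 400})
print(recycle(old, new))  # True
print(sorted(new.valid))  # ['x', 'y', 'z']
```

After the call, `old` holds no data and can receive future archive copies, while `new` carries the copies that were still valid.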