TFin-E | The HSM (Hierarchical Storage Management) provides for your data
However, there are two caveats:
tar -cPf - $DIRECTORY | pigz -c | split -a3 -d -b500GB - $TOFILE.
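An archive split in this way has to be reassembled before it can be unpacked. The following is a minimal sketch of the round trip on a local scratch directory, not a prescribed procedure: it uses gzip instead of pigz (pigz writes the same gzip format), a 4 kB part size instead of 500 GB, relative paths via -C instead of -P, and hypothetical file names.

```shell
set -eu

scratch=$(mktemp -d)
mkdir -p "$scratch/data" "$scratch/restore"
# Hypothetical sample data: random bytes compress poorly, so several parts are written.
head -c 20000 /dev/urandom > "$scratch/data/sample.bin"

# Pack, compress, and split into parts named archive.tar.gz.000, .001, ...
tar -cf - -C "$scratch" data | gzip -c | split -a3 -d -b4k - "$scratch/archive.tar.gz."

# Reassemble: the shell sorts the glob, so the parts are concatenated in order.
cat "$scratch"/archive.tar.gz.* | gzip -dc | tar -xf - -C "$scratch/restore"

# Check that the restored file is byte-identical to the original.
if cmp -s "$scratch/data/sample.bin" "$scratch/restore/data/sample.bin"; then
    roundtrip="ok"
else
    roundtrip="mismatch"
fi
part_count=$(ls "$scratch"/archive.tar.gz.* | wc -l)
echo "parts: $part_count, round trip: $roundtrip"
```

The fixed-width numeric suffix (-a3 -d) is what keeps the parts in the right order when they are concatenated with a glob.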
Four domains of tape archiving at AWI

Domain | File Systems | Disk Copy | How to apply |
---|---|---|---|
A | /hs/platforms | Archive of sensor.awi.de | Only possible with sufficient metadata. |
 | /hs/usera | Archive of individual Projects | |
 | /hs/usero /hs/pangaea | PANGAEA | |
P | /hs/projects | Project data | Please use eResources to create a project. |
C | /hs/userc | No | Have vanished since early 2020 |
IB, IP | /hs/isirep-... | Project replica from the Isilons (Bhv, Pot) | Disaster recovery only |
D | /hs/backup | samfsdumps and logfiles | IT internal use only |
 | /hs/store | 10-year storing of expired user and project data | |
For some file systems, a disk archive is available for specific (smaller) files. It allows fast access to offline files. Whether this disk archive is available depends on current resources and usage and might change.
Files might be offline for several reasons. If you access an offline file, it is read from tape automatically, but this takes some time. If you want to read or copy more than one file, staging is strongly recommended (see the stage command in the command table).
Connect to \\hsm.dmawi.de\ in Windows Explorer (use the right mouse button and add a network device). You can either browse the shared directories or connect directly to, e.g., \\hsm.dmawi.de\projects\<project>.
All HSM file systems are shared and mounted automatically on most Linux clients. A simple ls /hs/projects/ should do. If a directory is missing, please contact hsm-support@awi.de
The HSM file systems are shared/mounted read-only. To write data into the tape archive, use one of the following options:
Command | Important Notes |
---|---|
rsync -e ssh -Pauv <file|dir> <username>@hsm.dmawi.de:<destination-dir> | rsync is the most versatile way of transferring data. E.g., it allows updates with the -u option. This ensures that only new or newer files are copied (and overwritten); existing, unchanged files are not touched. This is important to reduce tape access. You do not want to use -c (--checksum), because this would stage all files from tape to the disk cache for a complete file comparison. When copying directories you need -r (recursive, already included in -a). |
sftp, FileZilla | sftp provides a fast way of transferring large amounts of data. Use your favourite client that supports sftp. Note, however, that only two connections per user are allowed; if you request more, your connection will be terminated. sftp uses the secure SSH protocol and should be preferred over plain ftp. Use port 22 for sftp. |
scp <file|dir> <username>@hsm.dmawi.de:<destination-dir> | scp seems convenient, but it is slightly slower than sftp and/or rsync. It also simply overwrites existing files; an update (like rsync -u) is not possible. Overwriting creates new tape copies, which should be avoided. |
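The rsync call can be wrapped in a small helper so that the command is previewed before anything is transferred. This is only a sketch: hsm_upload, the DRY_RUN variable, the user name jdoe, and the project path are hypothetical and not part of the HSM setup.

```shell
set -eu

# Hypothetical helper: prints the rsync command by default (DRY_RUN=1)
# and only runs it when DRY_RUN=0 is set explicitly.
# $1 = user name, $2 = local file or directory, $3 = destination directory
hsm_upload() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "rsync -e ssh -Pauv $2 $1@hsm.dmawi.de:$3"
    else
        # -u skips files that are not newer than the copy on the HSM;
        # -c (checksum) is deliberately absent to avoid staging from tape.
        rsync -e ssh -Pauv "$2" "$1@hsm.dmawi.de:$3"
    fi
}

# Preview the upload of a hypothetical archive into a hypothetical project.
preview=$(DRY_RUN=1 hsm_upload jdoe run01.tar.gz /hs/projects/myproject/)
echo "$preview"
```

rsync also has its own preview mode (-n, --dry-run), which contacts the server and lists what would be transferred without copying anything.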
Note: If you have to archive many (>100 000) small (<100 MB) files this will stress the system more than necessary. Please zip or tar[.gz] your directories and upload these compressed files.
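As an illustration of the note above, the following sketch packs a directory of small files into a single compressed archive and checks its contents before upload. The directory and file names are hypothetical sample data created in a scratch directory.

```shell
set -eu

# Hypothetical example data: a scratch directory with a few small files.
workdir=$(mktemp -d)
mkdir -p "$workdir/run01"
for i in 1 2 3; do
    echo "sample $i" > "$workdir/run01/file_$i.txt"
done

# Pack the whole directory into one compressed archive.
# -C changes into the parent directory so the archive stores relative paths.
archive="$workdir/run01.tar.gz"
tar -czf "$archive" -C "$workdir" run01

# List the archive contents to confirm that every file was included.
file_count=$(tar -tzf "$archive" | grep -c 'file_.*\.txt$')
echo "archived files: $file_count"
```

The resulting single file can then be transferred with one of the commands from the table above.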
Direct access (login) to hsm.dmawi.de is not possible. However, you can execute remote commands on hsm.dmawi.de in a restricted shell to get information about your data. E.g., you can release and stage your data if necessary. See the command table for some useful commands. They are executed with: ssh <username>@hsm.dmawi.de <command>
Linux way:
As file systems are mounted read only, you have to execute a chmod on the server. As permissions for directories and files are slightly different (most files should not have the 'x') the following suggestion might be useful to restore the default permissions:
ssh hsm.dmawi.de "find /hs/<DIR> -type d -exec chmod 2775 {} \; " # directories get drwxrwsr-x (use 2770 for drwxrws---)
ssh hsm.dmawi.de "find /hs/<DIR> -type f -exec chmod 0664 {} \; " # files get -rw-rw-r-- (use 0660 for -rw-rw----)
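To see what these modes produce before touching archived data, the same two find commands can be tried on a local scratch tree. This is a sketch with hypothetical file names; it runs locally, not on hsm.dmawi.de.

```shell
set -eu

# Hypothetical scratch tree with deliberately unsuitable permissions.
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/sub/data.nc"
chmod 700 "$tree/sub"
chmod 755 "$tree/sub/data.nc"

# Same pattern as on the server: one mode for directories, another for files.
find "$tree" -type d -exec chmod 2775 {} \;
find "$tree" -type f -exec chmod 0664 {} \;

# Read back the permission strings (first ten characters of ls -l output).
dir_mode=$(ls -ld "$tree/sub" | cut -c1-10)
file_mode=$(ls -l "$tree/sub/data.nc" | cut -c1-10)
echo "directory: $dir_mode"   # 2775 normally shows as drwxrwsr-x
echo "file:      $file_mode"  # 0664 shows as -rw-rw-r--
```

The leading 2 sets the setgid bit on directories, so new entries inherit the directory's group; some systems drop that bit if the calling user is not a member of the group.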
Windows way:
File permissions can be changed with FileZilla or another SFTP program.
Your $HOME on hsm.dmawi.de is the standard UNIX home directory. You can use an SSH key for hsm.dmawi.de. Execute these commands in a terminal (e.g., PuTTY on Windows):
Note: Starting with the 7.0 release of OpenSSH, support for ssh-dss (DSA) keys has been disabled by default. You can re-enable support locally by updating your SSH client configuration (ssh_config in /etc/ssh or /opt/local/etc/ssh, or ~/.ssh/config) with:
PubkeyAcceptedKeyTypes=+ssh-dss
If you have problems please contact: hsm-support@awi.de
To be executed as
ssh hsm.dmawi.de <command>
A detailed explanation of the samfs-specific commands can be found here: https://docs.oracle.com/cd/E22586_01/html/E22976/glads.html#scrolltoc
User command | Description |
---|---|
mkdir | Create a new directory, e.g., ssh hsm.dmawi.de mkdir /hs/projects/<project>/newdir |
stage | You can (and should) stage a file before you access it. If you use stage -w <file>, the command ends when the file is online. If you want to access more than one file (e.g., a complete directory) you should use stage -r <dir> (recursive) and additionally (optional) stage -r -w <dir> if you want the terminal to wait until all files are staged. NOTE: Never use '-w' in your first stage-command, because you would prevent samfs from optimizing the tape access. https://docs.oracle.com/cd/E22586_01/html/E22976/glajx.html#scrolltoc |
sls -D (or sls -2) | equivalent to ls, shows detailed information about a file and its archive status, e.g., online or offline: https://docs.oracle.com/cd/E22586_01/html/E22976/glaic.html |
sls -E | like sls -D, but shows the md5 checksum, too. Can be used to validate that the file was correctly archived. |
sdu -h | equivalent to du, shows detailed information about a file and its archive status. |
release | release disk space of archived files if online quota reaches the limit. Try release -r <dir> to recursively release files in (sub-)directories. https://docs.oracle.com/cd/E22586_01/html/E22976/glaip.html#scrolltoc |
sfind | equivalent to find, shows correct information about the file size on tape (and not only on the disk cache). It can also answer questions such as: is my file online? https://docs.oracle.com/cd/E22586_01/html/E22976/glaia.html#scrolltoc |
saminfo.sh -q | Get quota of all groups on HSM |
saminfo.sh -s | Show staging status |
saminfo.sh -t | Show tape drive status |
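The two-step staging described for the stage command (first without -w, then optionally with -w) can be sketched as a small helper. stage_dir, the DRY_RUN variable, and the project path are hypothetical; by default the helper only prints the ssh calls it would make.

```shell
set -eu

# Hypothetical helper: first requests staging of the whole directory without
# waiting (so tape access can be optimized), then issues the waiting call.
# With DRY_RUN=1 (the default) the commands are only printed.
stage_dir() {
    dir=$1
    for opts in "-r" "-r -w"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh hsm.dmawi.de stage $opts $dir"
        else
            ssh hsm.dmawi.de "stage $opts $dir"
        fi
    done
}

# Preview the calls for a hypothetical project directory.
plan=$(stage_dir /hs/projects/myproject/run01)
echo "$plan"
```

Setting DRY_RUN=0 runs the calls for real; the second call then returns only when all files are online.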
A useful combination to bring all NetCDF files in a specific directory online would be something like:
ssh hsm.dmawi.de "sfind <absolute-path> -offline -name '*.nc' -exec stage {} \;"
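In a command like this, the -name pattern should reach sfind unexpanded, which is why it is usually wrapped in single quotes inside the double-quoted ssh argument. The sketch below shows the difference locally, with plain find and sh -c standing in for sfind and the remote shell. It assumes the server-side shell expands file names like a standard shell, which this page does not confirm.

```shell
set -eu

# Hypothetical scratch tree: one NetCDF file at the top, one in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/a.nc" "$demo/sub/b.nc"
cd "$demo"

# sh -c stands in for the shell that ssh starts on the server.
# Unquoted: the inner shell expands *.nc to a.nc before find runs,
# so only files literally named a.nc are matched.
unquoted=$(sh -c "find . -name *.nc" | sort | tr '\n' ' ')

# Quoted: the pattern reaches find unchanged and matches in all subdirectories.
quoted=$(sh -c "find . -name '*.nc'" | sort | tr '\n' ' ')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With more than one matching file in the working directory, the unquoted form would typically fail outright, because find receives several names after -name.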