
Suggestions/good to know

  1. Storing millions of files in one directory slows down any filesystem. We therefore suggest limiting the number of files whenever possible (by design, workflow, zip, tar, ...).
  2. We suggest limiting the size of each individual file to a maximum of 500 GB, e.g., with
    tar -cPf - $INDIRECTORY | pigz -c | split -a3 -d -b500GB - $OUTFILE.
  3. If you access (read) more than one file from the HSM, please stage all files you plan to access in one go beforehand. This significantly decreases the overall access time by reducing the necessary robotic accesses to a minimum. However, please limit yourself to about a dozen TB at a time.
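The splitting command from suggestion 2 and its inverse can be sketched as a pair of shell functions. This is a minimal sketch, assuming GNU tar, GNU split, and either pigz or gzip; the function names pack_dir and unpack_chunks are hypothetical, and the chunk prefix is taken to end in a dot so that chunks are named OUTFILE.000, OUTFILE.001, and so on.

```shell
#!/bin/sh
# Hypothetical helpers; the pipeline mirrors the tar | pigz | split command above.

# Use pigz when installed, otherwise fall back to gzip (same stream format).
if command -v pigz >/dev/null 2>&1; then COMPRESSOR=pigz; else COMPRESSOR=gzip; fi

# pack_dir INDIRECTORY OUTFILE [CHUNK_SIZE]
# Writes OUTFILE.000, OUTFILE.001, ... each at most CHUNK_SIZE (default 500GB).
pack_dir() {
    tar -cPf - "$1" | "$COMPRESSOR" -c | split -a3 -d -b"${3:-500GB}" - "$2."
}

# unpack_chunks OUTFILE
# Concatenates the chunks in numeric order and extracts them again.
# -P keeps the path names exactly as they were stored in the archive.
unpack_chunks() {
    cat "$1".[0-9][0-9][0-9] | "$COMPRESSOR" -dc | tar -xPf -
}
```

For example, pack_dir /data/run42 /scratch/run42.tar.gz would produce chunks of at most 500 GB each (both paths are placeholders), and unpack_chunks /scratch/run42.tar.gz would restore the directory from them.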

File Storage (POSIX, smb, nfs)

Read only Access

The Windows Way (smb/cifs)

Connect to \\hssrv2.dmawi.de\ in Windows Explorer (use the right mouse button and add a network device). You can either browse for shared directories or connect directly with, e.g., \\hssrv2.dmawi.de\projects\<project>.

The Linux Way (nfs)

The HSM file systems are shared and mounted automatically on most Linux clients (but not on laptops). A simple ls /hs/projects/ should do. If a directory is missing, please contact hsm-support@awi.de.

Read/Write Access

The HSM system is shared/exported/mounted read-only. To write data into the tape archive, use one of the following options:

Suggestion: Best choice
Command: rsync -e ssh -Pauv <file|dir> <username>@hssrv2.dmawi.de:<destination-dir>
Important notes: rsync is the most versatile way of transferring data. For example, the -u option allows updates: only new or updated files are copied (and overwritten), while existing, unchanged files are not touched. This is important to avoid wasting tape.
Note: Do not use -c, because it would stage all files from tape to the disk cache for a complete file comparison.
When copying directories you need -r (recursive, already included in -a).

Suggestion: Fast choice
Command: sftp, filezilla
Important notes: sftp provides a fast way of transferring large amounts of data. Use your favorite ftp client. Note, however, that only two connections per user (per HSM server) are allowed; if you request more, your connection will terminate. sftp uses the secure ssh protocol and should be preferred. Use port 22 for sftp.

Suggestion: Do not use!
Command: scp <file|dir> <username>@hssrv2.dmawi.de:<destination-dir>
Important notes: scp seems convenient, but it is slightly slower than ftp and/or rsync. It also simply overwrites existing files; no update (like rsync -u) is possible. This would also create new tape copies, which you do not want.
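Because every unnecessary overwrite creates a new tape copy, it can help to preview an rsync transfer before running it. The following is a minimal sketch that adds the standard rsync option -n (--dry-run) to the options recommended above; the function name preview_upload is hypothetical and not part of the HSM tooling.

```shell
#!/bin/sh
# preview_upload SRC DEST
# Lists what rsync would transfer with the recommended options (-Pauv),
# without copying anything: -n is rsync's --dry-run switch.
preview_upload() {
    rsync -e ssh -Pauvn "$1" "$2"
}
```

A call such as preview_upload mydir "<username>@hssrv2.dmawi.de:<destination-dir>" prints the files that would be sent; files that are unchanged on the destination do not appear in the list. Running the same command without -n then performs the actual transfer.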

Note: Archiving many (>100 000) small (<100 MB) files stresses the system more than necessary. Please zip or tar[.gz] your directories and upload these compressed files instead.
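Whether a directory falls under this note can be checked before uploading. The sketch below counts the small files and bundles the directory only when the count exceeds a limit. The function names count_small_files and bundle_if_many are hypothetical, and 100 MB is interpreted here as 10^8 bytes, which is an assumption.

```shell
#!/bin/sh
# count_small_files DIR
# Prints the number of regular files below 100 MB (taken as 10^8 bytes, an assumption).
count_small_files() {
    find "$1" -type f -size -100000000c | wc -l | tr -d ' '
}

# bundle_if_many DIR ARCHIVE [LIMIT]
# Creates ARCHIVE (tar.gz) when DIR holds more than LIMIT small files
# (default 100000); otherwise it leaves everything untouched.
bundle_if_many() {
    n=$(count_small_files "$1")
    if [ "$n" -gt "${3:-100000}" ]; then
        tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
        echo "bundled $n small files into $2"
    else
        echo "only $n small files, no bundling needed"
    fi
}
```

The resulting archive can then be uploaded as a single file with the rsync command from the table above.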

Object Storage (S3)

  • ScoutAM provides an S3 gateway to access data.
  • This is currently in the testing phase.
  • We plan to provide S3 storage via eResources in the future (speculation: Q4/2024).
  • Only TLS connections are possible. Please use this certificate: HSM_S3gw.cert.pem
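Before configuring an S3 client to trust the certificate, it can be inspected locally. This is a minimal sketch, assuming the openssl command-line tool is installed and that HSM_S3gw.cert.pem has been downloaded in PEM format; show_cert_info is a hypothetical helper name.

```shell
#!/bin/sh
# show_cert_info CERT_FILE
# Prints subject, issuer and validity period of a PEM-encoded certificate.
show_cert_info() {
    openssl x509 -in "$1" -noout -subject -issuer -dates
}
```

Calling show_cert_info HSM_S3gw.cert.pem shows who issued the certificate and until when it is valid, which helps to spot an expired or wrong file before a TLS connection fails.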
     