Please note: Although this information about the usage of S3 is correct, the S3 gateway at AWI is still under construction and not yet available to users. However, if you have use cases, you may contact Pavan Kumar Siligam, who collects use cases.
Clients
There is a variety of S3 clients to choose from. The selection below covers both command-line interaction with an S3 bucket and Python-scripting-based interaction.
The clients covered here are:
- aws
- s3cmd
- s3fs
- boto3
Configuration differs slightly between these tools, mainly in the naming conventions used for the credentials.
First, set up the software stack using conda:
```
conda create -y -n s3 python=3.12
conda activate s3
pip install aws-shell
pip install s3cmd
conda install -y -c conda-forge s3fs boto3 python-magic pyyaml
```
It is not required to install everything listed above; installing only the tools you need also works.
Credentials
Let's say the following information is provided by the system administrator:
```
URL:PORT        => https://hssrv2.dmawi.de:635
region/location => bhv
ACCESS_KEY      => HPC_user
SECRET_KEY      => t1H13sOUBD/H7NuL
CERTS_FILE      => https://spaces.awi.de/download/attachments/494210152/HSM_S3gw.cert.pem
```
These credentials have to be adapted to each client's own configuration format, as shown below.
Please make sure to download the certificate file.
aws
Use `aws configure` to enter the credentials for this tool, or create the following files by hand.
`~/.aws/credentials`:
```
[default]
aws_access_key_id=HPC_user
aws_secret_access_key=t1H13sOUBD/H7NuL
```
`~/.aws/config`:
```
[default]
region = bhv
endpoint_url = https://hssrv2.dmawi.de:635
ca_bundle = /Users/pasili001/Downloads/HSM_S3gw.cert.pem  # < CORRECT ME >
```
Note: using tilde (`~`) or `$HOME` in the `ca_bundle` path does *not* work.
Listing the buckets
```
> aws s3 ls
2024-04-06 01:11:30 testdir
> aws s3 ls s3://testdir
2024-04-06 01:11:30     385458 tmp.csv
```
s3cmd
`s3cmd` is a free command-line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol.

`s3cmd` looks for credentials at `${HOME}/.s3cfg`. Create the config file as follows:
```
[default]
host_base = hssrv2.dmawi.de:635
host_bucket = hssrv2.dmawi.de:635
bucket_location = bhv
access_key = HPC_user
secret_key = t1H13sOUBD/H7NuL
use_https = Yes
ca_certs_file = /Users/pasili001/Downloads/HSM_S3gw.cert.pem  # < CORRECT ME >
```
Listing the buckets
```
> s3cmd ls
2024-04-06 01:11  s3://testdir
> s3cmd ls s3://testdir
2024-04-06 01:11    385458  s3://testdir/tmp.csv
```
upload a directory
```
> s3cmd sync --stats demo-airtemp/ s3://testdir/demo-airtemp/
Done. Uploaded 5569414 bytes in 62.8 seconds, 86.61 KB/s.
Stats: Number of files transferred: 306 (5569414 bytes)
> s3cmd ls s3://testdir/demo-airtemp
                 DIR  s3://testdir/demo-airtemp/
> s3cmd ls s3://testdir/demo-airtemp/
                 DIR  s3://testdir/demo-airtemp/air/
                 DIR  s3://testdir/demo-airtemp/lat/
                 DIR  s3://testdir/demo-airtemp/lon/
                 DIR  s3://testdir/demo-airtemp/time/
2024-04-07 15:57      307  s3://testdir/demo-airtemp/.zattrs
2024-04-07 15:57       24  s3://testdir/demo-airtemp/.zgroup
2024-04-07 15:57     3969  s3://testdir/demo-airtemp/.zmetadata
```
Note: the trailing forward slash (`/`) matters both when listing objects and when transferring files (`sync`) to S3.