...
Note: a trailing forward slash (`/`) matters both when listing objects and when transferring files (`sync`) to S3.
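One way to see why the trailing slash matters: S3 keys are flat strings, and listing works by matching a prefix, optionally grouped by a delimiter. The sketch below mimics that behaviour over an in-memory list of keys and does not talk to S3. The helper `list_prefix` and the key `testdir2/other.txt` are hypothetical and serve only as illustration.

```python
def list_prefix(keys, prefix, delimiter="/"):
    """Mimic a prefix + delimiter listing over an in-memory list of keys.

    Returns (common_prefixes, files): keys with a further delimiter after
    the prefix are grouped into "directories", the rest are plain files.
    """
    files, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Group everything up to the next delimiter, like a directory.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            files.append(key)
    return sorted(common_prefixes), sorted(files)


# Hypothetical bucket content; "testdir2/other.txt" is made up for the example.
keys = ["testdir/tmp.csv", "testdir/demo-airtemp/.zattrs", "testdir2/other.txt"]

no_slash = list_prefix(keys, "testdir")     # also matches "testdir2/"
with_slash = list_prefix(keys, "testdir/")  # only the content of "testdir/"
print(no_slash)
print(with_slash)
```

Without the slash the prefix also matches the sibling `testdir2/`, and only the directory names are returned; with the slash the listing descends into `testdir/`.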
s3fs
- `s3fs` is a Python library to talk to S3.
- It builds on top of `botocore`.
- Parts of the library use `fsspec` to map to S3.
It provides the following:
- `s3fs.S3FileSystem` for file system operations (ls, remove, du, ...)
- `s3fs.S3Map` for Python dictionary-like access (key --> value)
- `s3fs.S3File` for file-like objects (read, write, seek, ...)
`s3fs` is flexible about both the naming convention and the file format of the config file. Users are free to store their credentials as YAML, JSON, or any other format that is convenient for them to read and load. Here the credentials are shown in YAML format simply because it is a bit more reader-friendly.
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
key: HPC_user
secret: t1H13sOUBD/H7NuL
client_kwargs:
  endpoint_url: https://hssrv2.dmawi.de:635
  verify: /Users/pasili001/Documents/HSM_S3gw.cert.pem
  region_name: bhv |
Write a utility function that reads the config file and returns the file system object
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
import os
import yaml
import s3fs
def get_fs():
    with open(os.path.expanduser("~/.s3fs")) as fid:
        credentials = yaml.safe_load(fid)
    return s3fs.S3FileSystem(**credentials) |
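Since the credentials can be stored in other formats as well, the same helper could be written for a JSON file. The following is a minimal sketch under stated assumptions: the file name `~/.s3fs.json` and the function names `load_credentials` and `get_fs_from_json` are hypothetical, and the JSON file is assumed to hold the same keys as the YAML example (`key`, `secret`, `client_kwargs`).

```python
import json
import os


def load_credentials(path="~/.s3fs.json"):
    """Read S3 credentials from a JSON file and return them as a dict.

    The default path is an assumption for this sketch. The file is expected
    to look like:
    {"key": "...", "secret": "...",
     "client_kwargs": {"endpoint_url": "...", "verify": "...", "region_name": "..."}}
    """
    with open(os.path.expanduser(path)) as fid:
        return json.load(fid)


def get_fs_from_json(path="~/.s3fs.json"):
    """Build an s3fs file system from the JSON credentials file."""
    # Imported lazily so that load_credentials works without s3fs installed.
    import s3fs

    return s3fs.S3FileSystem(**load_credentials(path))
```

Keeping the loading step in its own function means the file format can be swapped without touching the code that creates the file system object.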
Listing a bucket
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
>>> fs = get_fs()
>>> fs.ls('testdir')
['testdir/demo-airtemp', 'testdir/tmp.csv']
>>>
>>> fs.ls('testdir/demo-airtemp')
['testdir/demo-airtemp/.zattrs',
'testdir/demo-airtemp/.zgroup',
'testdir/demo-airtemp/.zmetadata',
'testdir/demo-airtemp/air',
'testdir/demo-airtemp/lat',
'testdir/demo-airtemp/lon',
'testdir/demo-airtemp/time'] |
Downloading a file
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
>>> fs.get("testdir/demo-airtemp/.zattrs", "zattrs")
[None]
>>>
>>> # read the local file `zattrs` to check that all bytes were transferred
>>> import json
>>> with open("zattrs") as fid:
...     content = json.load(fid)
...
>>> print(content)
{'Conventions': 'COARDS',
'description': 'Data is from NMC initialized reanalysis\n'
'(4x/day). These are the 0.9950 sigma level values.',
'platform': 'Model',
'references': 'http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.html',
'title': '4x daily NMC reanalysis (1948)'}
>>> |
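A cheaper way to check that all bytes arrived is to compare the size S3 reports with the size of the local file. The sketch below assumes `fs` is the file system object returned by `get_fs()` and relies on `fs.info(path)` returning a dictionary with a `size` entry, as `fsspec`-based file systems do. The function name `download_and_check` is hypothetical. A size match does not prove the content is identical; it only catches truncated transfers.

```python
import os


def download_and_check(fs, remote_path, local_path):
    """Download remote_path to local_path and compare the byte counts."""
    fs.get(remote_path, local_path)
    remote_size = fs.info(remote_path)["size"]
    local_size = os.path.getsize(local_path)
    if remote_size != local_size:
        raise IOError(
            f"size mismatch for {remote_path}: "
            f"remote={remote_size} bytes, local={local_size} bytes"
        )
    return local_size
```

For example, `download_and_check(fs, "testdir/demo-airtemp/.zattrs", "zattrs")` would return the number of bytes written, or raise if the sizes differ.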
Reading a file directly from S3
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
>>> with fs.open("testdir/demo-airtemp/.zattrs", mode="rb") as f:
...     content = f.read().decode()
...     content = json.loads(content)
...
>>> print(content)
{'Conventions': 'COARDS',
'description': 'Data is from NMC initialized reanalysis\n'
'(4x/day). These are the 0.9950 sigma level values.',
'platform': 'Model',
'references': 'http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.html',
'title': '4x daily NMC reanalysis (1948)'}
>>> |
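If several JSON files need to be read this way, the steps above could be wrapped in a small helper. This is only a sketch: `read_json` is a hypothetical name, and `fs` is assumed to be any object with an `open(path, mode)` method that returns a binary file-like object, such as the one returned by `get_fs()`.

```python
import json


def read_json(fs, path):
    """Open path on the given file system and parse its content as JSON."""
    with fs.open(path, mode="rb") as f:
        # json.load reads the bytes from the file object and decodes them.
        return json.load(f)
```

With this in place, `read_json(fs, "testdir/demo-airtemp/.zattrs")` would return the same dictionary as shown above.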
Further documentation:
See the `s3fs` API reference for function signatures and the `s3fs` documentation for more examples.