Getting Started
swmr_tools is a python package for making live data processing of hdf5_files easy.
swmr_tools can be installed from conda-forge using:
conda install -c conda-forge swmr-tools
It can also be installed from PyPi:
pip install swmr_tools
Alternatively you can clone the git repository containing swmr_tools using:
git clone https://github.com/DiamondLightSource/python-swmrtools.git
HDF5 File Requirements
To live process HDF5 data using the swmr_tools package there are a few requirements on the file structure.
The file must be created in swmr mode (see https://docs.h5py.org/en/stable/swmr.html)
The file must have one (or more) key datasets (see below)
(Optional) The file can have a finished dataset (see below)
Key Datasets
Although swmr allows HDF5 to be read while being written, it can be difficult to determine whether a slice of the data has been written to or is just the fill data HDF5 uses when a dataset is expanded. To determine whether real data is actually written, swmr_tools needs a key dataset. The key dataset is usually an integer dataset, with a fill value of zero, which is flushed with a non-zero integer value after the corresponding frame of the main dataset is flushed. By monitoring these key datasets, swmr_tools can determine when each data frame is readable.
Finished Dataset
Since HDF5 datasets can be expanded it can be difficult to tell whether a file is complete or whether more data is likely to be written. The swmr_tools library uses a time out to determine when to finish, but this can also be paired with a finished dataset. The finished dataset is a single integer dataset, with a value zero when the file is still being written to and non-zero when the file is complete. This allows a long time out to be used without wasting time waiting when the file is complete.