Creating the master file sometimes fails
The current workaround is to re-run the integration: it skips the files that already exist and creates the missing ones.
[...]
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/integrator/distributed_integration.py", line 329, in _create_output_masterfile
    merge_hdf5_files(
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/nabu/io/writer.py", line 251, in merge_hdf5_files
    virtual_layout = create_virtual_layout(files_or_pattern, h5_path, base_dir=base_dir, axis=axis)
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/nabu/io/writer.py", line 181, in create_virtual_layout
    with HDF5File(fname, "r", swmr=True) as fid:
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/tomoscan/io.py", line 78, in __init__
    super().__init__(filename, mode=mode, swmr=swmr, **kwargs)
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/h5py/_hl/files.py", line 424, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/esrf/paleo/.venv/py38_ubuntu20/lib/python3.8/site-packages/h5py/_hl/files.py", line 190, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 96, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = 'ACL9011001b_0022.h5/ACL9011001b_0022.h5_05000.h5', errno = 2, error message = 'No such file or directory', flags = 40, o_flags = 0)
Shutting down client
Yet the number of files in the output directory is correct when checked afterwards:
paleo@slurm-nice-devel2903:$ ls out/ACL9011001b_0022.h5 | wc -l
125
I suspect latency between (a) the creation of a file on a SLURM node and (b) that file becoming visible from other hosts, so the master file is assembled before every partial file can be opened.
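If that suspicion is right, one possible mitigation is to wait until all expected partial files are visible before building the master file. The sketch below is only an illustration of that idea: `wait_for_files`, its `timeout` and `poll_interval` parameters, and their default values are hypothetical and not part of integrator, nabu or tomoscan.

```python
import os
import time


def wait_for_files(paths, timeout=60.0, poll_interval=1.0):
    """Block until every path in `paths` exists, or raise TimeoutError.

    Hypothetical helper: it only polls for existence and does not
    check that the files are complete or valid HDF5.
    """
    deadline = time.monotonic() + timeout
    missing = list(paths)
    while True:
        # Keep only the files that are still not visible from this host.
        missing = [p for p in missing if not os.path.exists(p)]
        if not missing:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "%d file(s) still not visible after %.1f s, first one: %s"
                % (len(missing), timeout, missing[0])
            )
        time.sleep(poll_interval)
```

Such a helper could be called with the list of expected partial files just before `merge_hdf5_files` is invoked. Where exactly that call belongs in `_create_output_masterfile` is an assumption. If the timeout is reached even though `ls` shows all files, the latency hypothesis would need to be revisited.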