[writer] make file lock exceptions more explicit
If you open the scan file with silx<0.12.0 or any other third-party software that doesn't explicitly disable file locking, your scan will not start, and BLISS shows this message:
RuntimeError: Data writer is in FAULT state due to "Unable to create file (unable to open file: name = '/tmp/scans/nexus_writer_config/data_external.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)"
In the writer logs you can see which software is locking the file, followed by the h5py exception:
ERROR:nexus_writer_service.subscribers.session_writer: [nexus_writer_config-3 (RUNNING)] [2_ct-3 (INIT)]
### Files matching '.+data_external.h5$' opened by 1 process
File '/tmp/scans/nexus_writer_config/data_external.h5':
Opened by psutil.Process(pid=26979, name='silx', started='09:53:10')
'/usr/bin/python3 /usr/bin/silx view data_external.h5'
###
ERROR:nexus_writer_service.subscribers.session_writer: [nexus_writer_config-3 (RUNNING)] [2_ct-3 (FAULT)] Unable to create file (unable to open file: name = '/tmp/scans/nexus_writer_config/data_external.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
ERROR:nexus_writer_service.subscribers.session_writer: [nexus_writer_config-3 (RUNNING)] [2_ct-3 (FAULT)] Stop listening due to exception:
Traceback (most recent call last):
File "/data/id21/inhouse/wout/dev/virtualenvs/rnice8/bliss/py37/lib/python3.7/site-packages/h5py/_hl/files.py", line 182, in make_fid
fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mntdirect/_data_id21_inhouse/wout/dev/blissmain/nexus_writer_service/subscribers/base_subscriber.py", line 318, in __greenlet_main
self._listen_event_loop()
File "/mntdirect/_data_id21_inhouse/wout/dev/blissmain/nexus_writer_service/subscribers/scan_writer_base.py", line 214, in _listen_event_loop
with self.nxroot() as nxroot:
File "/data/id21/inhouse/wout/dev/virtualenvs/rnice8/bliss/py37/lib/python3.7/contextlib.py", line 112, in __enter__
return next(self.gen)
File "/mntdirect/_data_id21_inhouse/wout/dev/blissmain/nexus_writer_service/subscribers/scan_writer_base.py", line 453, in nxroot
with nexus.nxRoot(filename, **self._nxroot_kwargs) as nxroot:
File "/mntdirect/_data_id21_inhouse/wout/dev/blissmain/nexus_writer_service/io/nexus.py", line 898, in __init__
super().__init__(name, creationlocks=creationlocks, mode=mode, **kwargs)
File "/mntdirect/_data_id21_inhouse/wout/dev/blissmain/nexus_writer_service/io/nexus.py", line 852, in __init__
super().__init__(name, mode=mode, swmr=swmr, **kwargs)
File "/data/id21/inhouse/wout/dev/virtualenvs/rnice8/bliss/py37/lib/python3.7/site-packages/h5py/_hl/files.py", line 394, in __init__
swmr=swmr)
File "/data/id21/inhouse/wout/dev/virtualenvs/rnice8/bliss/py37/lib/python3.7/site-packages/h5py/_hl/files.py", line 184, in make_fid
fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 105, in h5py.h5f.create
OSError: Unable to create file (unable to open file: name = '/tmp/scans/nexus_writer_config/data_external.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
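The "opened by" report in the log above refers to a `psutil.Process`. As a rough illustration of how such a lookup could work (not the writer's actual code), here is a minimal sketch. The function name `processes_with_open_file` and its `processes` parameter are hypothetical; only `psutil.process_iter()` and `Process.open_files()` are real psutil calls. Note that this can only see processes on the local machine.

```python
import re


def processes_with_open_file(pattern, processes=None):
    """Yield (process, path) for each open file whose path matches `pattern`.

    `processes` is a hypothetical hook for passing in process objects;
    by default all local processes reported by psutil are inspected.
    """
    if processes is None:
        import psutil  # third-party, imported lazily

        processes = psutil.process_iter()
    regex = re.compile(pattern)
    for proc in processes:
        try:
            open_files = proc.open_files()
        except Exception:
            # e.g. psutil.AccessDenied or psutil.NoSuchProcess: skip this process
            continue
        for open_file in open_files:
            if regex.match(open_file.path):
                yield proc, open_file.path
```

Called as `list(processes_with_open_file(r".+data_external.h5$"))`, this would list the local processes holding the scan file open, provided the current user is allowed to inspect them.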
This needs to be simplified: in the writer I will catch h5py OSError
exceptions with e.errno == errno.EAGAIN
and replace them with
RuntimeError(f"Scan file '{filename}' is locked by process {process}. Please close the process and only open the file in read-only mode with file locking disabled.")
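A minimal sketch of this proposal, under a few assumptions: the names `explicit_lock_errors`, `is_lock_error` and `error_number` are hypothetical, and the fallback that reads `errno = N` from the message text is only there in case the h5py exception does not carry `errno`. Because the traceback above shows the lock failure (errno 11) as the context of the final "File exists" error (errno 17), the sketch also follows `__context__`.

```python
import errno
import re
from contextlib import contextmanager

# Fallback (assumption): read "errno = N" from the message if OSError.errno is unset.
_ERRNO_IN_MESSAGE = re.compile(r"errno = (\d+)")


def error_number(exc):
    """Return the errno of an exception, or None if it cannot be determined."""
    number = getattr(exc, "errno", None)
    if number is None:
        match = _ERRNO_IN_MESSAGE.search(str(exc))
        number = int(match.group(1)) if match else None
    return number


def is_lock_error(exc):
    """True if exc, or an exception it was raised while handling, failed with EAGAIN."""
    while exc is not None:
        if isinstance(exc, OSError) and error_number(exc) == errno.EAGAIN:
            return True
        exc = exc.__context__
    return False


@contextmanager
def explicit_lock_errors(filename, locked_by="another process"):
    """Re-raise file-lock failures inside the block as a readable RuntimeError."""
    try:
        yield
    except OSError as e:
        if not is_lock_error(e):
            raise  # unrelated OSError: leave it untouched
        raise RuntimeError(
            f"Scan file '{filename}' is locked by {locked_by}. "
            "Please close the process and only open the file in read-only "
            "mode with file locking disabled."
        ) from e
```

The file-opening call in the writer would then sit inside `with explicit_lock_errors(filename, locked_by=...):`, so that BLISS reports the short message while the original h5py exception stays attached as the cause.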
@matias.guijarro @meyer @pithan @sole @andy.gotz Is there a way to figure out the locking process when it is not running on the same machine as the writer? Is this error message understandable? Should I mention silx>=0.12.0?