Scan number mixup
Corrupted scan numbering still occurs occasionally. Here is an example from ID22 (Nexus writer log):
INFO 2020-05-26 16:12:51 nexus_writer_service.subscribers.session_writer: [exp-3 (RUNNING)] [13_fscan-3 (ON)] Start writing to '/data/id22/inhouse/id222003/id22/Laurent_Tests_260520/Laurent_Tests_260520_0002/Laurent_Tests_260520_0002.h5'
...
INFO 2020-05-28 10:45:33 nexus_writer_service.subscribers.session_writer: [exp-3 (RUNNING)] [13_fscan-3 (ON)] Start writing to '/data/id22/inhouse/id222003/id22/Laurent_Tests_260520/Laurent_Tests_260520_0002/Laurent_Tests_260520_0002.h5'
ERROR 2020-05-28 10:45:36 nexus_writer_service.subscribers.session_writer: [exp-3 (RUNNING)] [13_fscan-3 (FAULT)] Scan 13.1 already exists in /data/id22/inhouse/id222003/id22/Laurent_Tests_260520/Laurent_Tests_260520_0002/Laurent_Tests_260520_0002.h5
As you can see: same file, same scan number. This should never be possible.
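To check whether other sessions hit the same problem, the writer log can be searched for this error. Below is a minimal sketch that assumes the log lines look like the excerpt above. The regular expression and the `find_duplicate_scans` helper are hypothetical and not part of the nexus writer.

```python
import re

# Matches writer errors such as:
#   ERROR 2020-05-28 10:45:36 ... Scan 13.1 already exists in /path/to/file.h5
# The pattern is an assumption derived from the log excerpt in this issue.
ALREADY_EXISTS = re.compile(
    r"^ERROR (?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"Scan (?P<scan>\d+\.\d+) already exists in (?P<filename>\S+)$"
)


def find_duplicate_scans(lines):
    """Return (timestamp, scan, filename) for every 'already exists' error."""
    hits = []
    for line in lines:
        match = ALREADY_EXISTS.match(line.strip())
        if match:
            hits.append((match["timestamp"], match["scan"], match["filename"]))
    return hits
```

Feeding it the lines of a writer log file (for example `find_duplicate_scans(open(path))`, with `path` pointing at wherever the log is stored) lists every scan the writer refused to overwrite.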
It is hard to figure out post mortem what happened. Bliss tries to get the last scan number from Redis, and if that does not work (e.g. the database has been flushed) it falls back to looking at the file:
- Determine next scan number from Redis
node = SCAN_SAVING.get_parent_node()
last = int(node.connection.hget(node.db_name, "last_scan_number"))
- Determine next scan number from HDF5 file
last = max(int(i.split(".")[0]) for i in SCAN_SAVING.writer_object.get_scan_entries())
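The fallback described above can be sketched without a running Bliss session. This is only an illustration, not the Bliss implementation: the function names are hypothetical, `raw` stands for whatever `hget` returned (`None` when the key is missing), `entries` stands for the list returned by `get_scan_entries()`, and "next = last + 1" is an assumption.

```python
from typing import Iterable, Optional, Union


def last_scan_from_redis(raw: Optional[Union[bytes, str]]) -> Optional[int]:
    """Parse the Redis value; None means the key is missing (e.g. db flushed)."""
    if raw is None:
        return None
    if isinstance(raw, bytes):
        raw = raw.decode()
    return int(raw)


def last_scan_from_entries(entries: Iterable[str]) -> int:
    """Highest scan number among entry names like '13.1' (0 for an empty file)."""
    return max((int(name.split(".")[0]) for name in entries), default=0)


def next_scan_number(raw, entries) -> int:
    """Prefer Redis; fall back to the file when Redis has no value."""
    last = last_scan_from_redis(raw)
    if last is None:
        last = last_scan_from_entries(entries)
    return last + 1  # assumption: scan numbers are incremented by one


def counter_is_behind_file(raw, entries) -> bool:
    """True when Redis would hand out a scan number the file already contains."""
    last = last_scan_from_redis(raw)
    return last is not None and last < last_scan_from_entries(entries)
```

A `True` result from `counter_is_behind_file` would correspond to the situation in the log above: Redis still answers, so the file is never consulted, but the next number it yields already exists in the file.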
So if this happens to someone again, run both commands so the two values can be compared. @claustre @matias.guijarro @sebastien.petitdemange
Edited by Wout De Nolf