SKIP option does not immediately quit the process
The master file should be checked before launching the integration.
(2021.1) slurm-nice-devel2904:ihma109/id15/test_dec2021 % integrate-slurm integrator.conf
/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/dask_jobqueue/core.py:20: FutureWarning: tmpfile is deprecated and will be removed in a future release. Please use dask.utils.tmpfile instead.
from distributed.utils import tmpfile
Will process the following datasets:
[ID15Dataset(fname=/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0002/ACL9011001b_0002.h5, entry=1.1),
ID15Dataset(fname=/data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0003/ACL9011001b_0003.h5, entry=1.1)]
Spawning workers
Spawning integrators
New dataset /data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0002/ACL9011001b_0002.h5
Will process dataset: /data/id15/inhouse4/ihma109/id15/ACL9011001b/ACL9011001b_0002/ACL9011001b_0002.h5 into output file: /data/id15/inhouse4/ihma109/id15/test_dec2021/ACL9011001b/ACL9011001b_0002/azint_ACL9011001b_0002_scan0001.h5
4/125 - ETA 606 secs
6/125 - ETA 597 secs
Traceback (most recent call last):
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/integrator/app/integrate_slurm.py", line 105, in integrate_slurm_cli
wait(futures, timeout=healthcheck_period)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/client.py", line 4329, in wait
result = client.sync(_wait, fs, timeout=timeout, return_when=return_when)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/client.py", line 865, in sync
return sync(
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/utils.py", line 327, in sync
raise exc.with_traceback(tb)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/utils.py", line 310, in f
result[0] = yield future
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
value = future.result()
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/client.py", line 4300, in _wait
await future
File "/usr/lib/python3.8/asyncio/tasks.py", line 501, in wait_for
raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/bin/integrate-slurm", line 8, in <module>
sys.exit(integrate_slurm_cli())
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/integrator/app/integrate_slurm.py", line 109, in integrate_slurm_cli
DI.get_eta()
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/integrator/distributed_integration.py", line 570, in get_eta
eta = (len(self._tasks) - n_finished)/speed
ZeroDivisionError: float division by zero
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x7fe26a3103a0>>, <Task finished name='Task-1445' coro=<Cluster._sync_cluster_info() done, defined at /scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/deploy/cluster.py:104> exception=CommClosedError("Exception while trying to call remote method 'set_metadata' before comm was established.")>)
Traceback (most recent call last):
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/comm/tcp.py", line 205, in read
frames_nbytes = await stream.read_bytes(fmt_size)
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/core.py", line 787, in send_recv_from_rpc
result = await send_recv(comm=comm, op=key, **kwargs)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/core.py", line 640, in send_recv
response = await comm.read(deserializers=deserializers)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/comm/tcp.py", line 221, in read
convert_stream_closed_error(self, e)
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/comm/tcp.py", line 128, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) rpc.set_metadata local=tcp://160.103.228.95:46068 remote=tcp://160.103.228.95:36087>: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/tornado/ioloop.py", line 741, in _run_callback
ret = callback()
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
future.result()
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/deploy/cluster.py", line 105, in _sync_cluster_info
await self.scheduler_comm.set_metadata(
File "/scisoft/tomotools_env/integrator/ubuntu20.04/x86_64/2021.1/lib/python3.8/site-packages/distributed/core.py", line 790, in send_recv_from_rpc
raise type(e)(
distributed.comm.core.CommClosedError: Exception while trying to call remote method 'set_metadata' before comm was established.
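
In the log above, the first fatal error is the `ZeroDivisionError` raised in `get_eta()` when `speed` is zero, which can happen when no task has finished between two health checks. A possible guard is sketched below. It is only an illustration, not the actual code in `distributed_integration.py`: the function names `estimate_eta` and `format_eta` and their parameters are hypothetical, and only the progress line format is copied from the log.

```python
import math
from typing import Optional


def estimate_eta(n_tasks: int, n_finished: int, speed: float) -> Optional[float]:
    """Return the estimated remaining time in seconds.

    Hypothetical helper: `speed` is assumed to be in tasks per second.
    Returns None when no estimate is possible (zero, negative or
    non-finite speed), instead of raising ZeroDivisionError.
    """
    remaining = n_tasks - n_finished
    if remaining <= 0:
        # Nothing left to do, whatever the measured speed is.
        return 0.0
    if not math.isfinite(speed) or speed <= 0:
        # No progress measured yet: the ETA is unknown.
        return None
    return remaining / speed


def format_eta(n_tasks: int, n_finished: int, speed: float) -> str:
    """Build a progress line in the same style as the log above."""
    eta = estimate_eta(n_tasks, n_finished, speed)
    if eta is None:
        return f"{n_finished}/{n_tasks} - ETA unknown"
    return f"{n_finished}/{n_tasks} - ETA {eta:.0f} secs"


if __name__ == "__main__":
    print(format_eta(125, 6, 0.0))   # speed is zero: no exception raised
    print(format_eta(125, 25, 2.0))  # 100 tasks left at 2 tasks/s
```

With a guard of this kind, a health-check timeout with no newly finished task would print an "ETA unknown" line rather than abort the run.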