Add CUDA stream support to CudaProcessing, SinoFilter, Backprojector, etc.
Adding a stream to CudaProcessing and derived objects would allow several tasks to run in parallel, e.g.:
- one stream to upload data
- one stream to process
- one stream to download data
I use this in PyNX to swap data: https://gitlab.esrf.fr/favre/PyNX/-/blob/devel/pynx/holotomo/cu_operator.py#L2006
It would 'only' need an optional stream parameter in the CudaProcessing constructor, and all CUDA calls in derived classes would have to use the supplied stream. Setting up the stream(s) would be up to the calling process.
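To make the idea concrete, here is a minimal sketch of the pattern, not the actual code. All class and method names (`FakeStream`, `StreamAwareProcessing`, `StreamAwareFilter`, `StreamAwareBackprojector`, `_launch`) are hypothetical, and `FakeStream` is only a stand-in for a real CUDA stream object so that the example runs without a GPU. The real constructor signatures may well differ.

```python
class FakeStream:
    """Stand-in for a CUDA stream object, so this sketch runs without a GPU."""

    def __init__(self, name):
        self.name = name
        self.calls = []  # names of the operations enqueued on this stream


class StreamAwareProcessing:
    """Hypothetical base class: stores an optional stream given by the caller."""

    def __init__(self, stream=None):
        # stream=None keeps today's behaviour (default stream).
        self.stream = stream

    def _launch(self, op_name):
        # Stand-in for a kernel launch or an asynchronous copy. A real
        # implementation would forward self.stream to the underlying CUDA call.
        if self.stream is not None:
            self.stream.calls.append(op_name)
        return self.stream


class StreamAwareFilter(StreamAwareProcessing):
    """Hypothetical derived class: every call goes through the stored stream."""

    def filter_sino(self):
        return self._launch("filter")


class StreamAwareBackprojector(StreamAwareProcessing):
    """Hypothetical derived class owning a sub-object that shares its stream."""

    def __init__(self, stream=None):
        super().__init__(stream=stream)
        # Sub-objects receive the same stream, so the whole chain stays on it.
        self.sino_filter = StreamAwareFilter(stream=stream)

    def fbp(self):
        self.sino_filter.filter_sino()
        return self._launch("backproj")


# The caller creates and owns the streams, one per task.
upload_stream = FakeStream("upload")
process_stream = FakeStream("process")
download_stream = FakeStream("download")

backprojector = StreamAwareBackprojector(stream=process_stream)
backprojector.fbp()
```

With this layout, existing code that does not pass a stream is unaffected, and the caller decides how the upload, processing and download streams are synchronised.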
Besides speeding up regular tomo processing, it could also be useful if I want to add a tomo regularisation to PyNX reconstructions (this would be very time-consuming and limited to problems that fit in the main computer memory, but with 512 GB there's already a lot to do).
Once this is possible, we can complain again that the network is too slow :-)