
Cuda context handling for Cuda >= 11

Pierre Paleo requested to merge cuda_ctx into master

This MR is an attempt at fixing #247 (closed).

A reconstructor may instantiate several pipeline objects during a reconstruction, since each pipeline is tied to a certain chunk/group size. It is critical to release the Cuda memory when re-initializing a pipeline.
In theory, pycuda transparently releases the resources when deleting the object (RAII model), but Python's multiple references and garbage collection make things more difficult. Therefore, doing self.pipeline.d_radios = None is not enough to actually release the memory.
The only working method was found to be self.pipeline = None; import gc; gc.collect().
Prior to Cuda 11, this worked well. With Cuda >= 11, however, we get "invalid context handle" errors for some reason.
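The reference problem described above can be reproduced without a GPU. The following is only a minimal sketch: `FakeDeviceArray`, `FakePipeline` and `FakeReconstructor` are hypothetical stand-ins (not nabu or pycuda classes), and a weak reference is used to observe when the buffer is actually freed. It assumes the pipeline holds a second reference to the buffer and is part of a reference cycle, which is one plausible way for the behavior to arise.

```python
import gc
import weakref


class FakeDeviceArray:
    """Hypothetical stand-in for a GPU array that frees device memory on deletion."""


class FakePipeline:
    def __init__(self):
        self.d_radios = FakeDeviceArray()
        # A second reference to the same buffer, e.g. held by a processing step.
        self.steps = {"input": self.d_radios}
        # Storing a bound method on the instance creates a reference cycle,
        # so the pipeline is only reclaimed by the garbage collector.
        self.callback = self.process

    def process(self):
        return self.d_radios


class FakeReconstructor:
    def __init__(self):
        self.pipeline = FakePipeline()


reconstructor = FakeReconstructor()
buffer_ref = weakref.ref(reconstructor.pipeline.d_radios)

# Clearing one attribute is not enough: `steps` still references the buffer.
reconstructor.pipeline.d_radios = None
still_alive_after_attribute_reset = buffer_ref() is not None

# Dropping the whole pipeline and forcing a collection breaks the cycle.
reconstructor.pipeline = None
gc.collect()
released_after_collect = buffer_ref() is None
```

In this sketch the buffer survives the attribute reset and is only freed once the pipeline itself is dropped and a collection is forced.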

For now, the workaround is to manually handle cuda contexts:

  • Context is created (or pushed) at each new pipeline instantiation (through nabu.cuda.utils.get_cuda_context())
  • When a pipeline has to be re-initialized, the previous instance is destroyed
    • For Cuda < 11, this is fine
    • For Cuda >= 11, we also have to re-push the cuda context
  • Create a new pipeline instance
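The sequence above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code of this MR: `FakeContext`, `FakePipeline`, `ReconstructorSketch`, `reset_pipeline` and the `cuda_major_version` parameter are all hypothetical, and `_get_context` merely plays the role of nabu.cuda.utils.get_cuda_context(). The fake context only counts pushes, so the sketch runs without a GPU.

```python
import gc


class FakeContext:
    """Hypothetical stand-in for a Cuda context; it only counts pushes."""

    def __init__(self):
        self.push_count = 0

    def push(self):
        self.push_count += 1


class FakePipeline:
    """Hypothetical pipeline tied to one chunk size and one context."""

    def __init__(self, chunk_size, ctx):
        self.chunk_size = chunk_size
        self.ctx = ctx


class ReconstructorSketch:
    def __init__(self, cuda_major_version):
        self.cuda_major_version = cuda_major_version
        self.ctx = None
        self.pipeline = None

    def _get_context(self):
        # Create and push the context on first use, then reuse it.
        if self.ctx is None:
            self.ctx = FakeContext()
            self.ctx.push()
        return self.ctx

    def reset_pipeline(self, chunk_size):
        if self.pipeline is not None:
            # Destroy the previous pipeline and force a collection.
            self.pipeline = None
            gc.collect()
            # Assumption taken from the description: only Cuda >= 11
            # needs the context to be pushed again at this point.
            if self.cuda_major_version >= 11:
                self.ctx.push()
        self.pipeline = FakePipeline(chunk_size, self._get_context())
        return self.pipeline


# Two successive pipelines with different chunk sizes, for each Cuda generation.
new_cuda = ReconstructorSketch(cuda_major_version=11)
new_cuda.reset_pipeline(chunk_size=100)
new_cuda.reset_pipeline(chunk_size=50)

old_cuda = ReconstructorSketch(cuda_major_version=10)
old_cuda.reset_pipeline(chunk_size=100)
old_cuda.reset_pipeline(chunk_size=50)
```

With these stand-ins, the Cuda >= 11 branch pushes the context once more per re-initialization, while the older branch pushes it only at creation.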

The drawback is that the context handling is now done by Reconstructor.

Update: new workaround found - see !150 (comment 152136)

Close #247 (closed)

Edited by Pierre Paleo
