Adapt chunks not fitting in GPU
Nabu processes the data by chunks, as does PyHST.
The current design defines a "maximum possible chunk height" based on GPU memory, on the premise that the whole chunk should fit in GPU memory.
However, some steps might put a constraint on the minimum chunk height. For example, the Paganin filter might have a large kernel (see MARGE in PyHST). In this case, we might have min_required_chunk_height > max_gpu_chunk_height.
Nabu should be able to handle this case, i.e. process chunks whose height exceeds what fits in GPU memory:
- Use a chunk height that satisfies the "Phase retrieval requirement" (min_required_chunk_height).
- For radio-processing, process by individual radios (as is the case now!). These (sub-)radios might be bigger than max_gpu_chunk_height, but they are processed individually.
- For sino-processing, use sub-chunks (i.e. sub-stacks here). This entails memory transfers.
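The strategy above can be illustrated with a minimal sketch. This is not Nabu's implementation: the function names (choose_chunk_height, iter_subchunks, process_sinos_by_subchunks), the max_cpu_chunk_height parameter and the (n_angles, chunk_height, width) array layout are assumptions made for illustration, and the process_substack callback stands in for the host-to-device transfer, GPU processing and device-to-host transfer.

```python
import numpy as np


def choose_chunk_height(min_required_chunk_height, max_gpu_chunk_height, max_cpu_chunk_height):
    """Pick a chunk height that satisfies the phase retrieval requirement.

    All three parameters are numbers of detector rows (hypothetical names).
    """
    if min_required_chunk_height > max_cpu_chunk_height:
        raise ValueError("Required chunk height does not fit in host memory")
    # Prefer a GPU-sized chunk, but never go below the minimum required height
    gpu_friendly_height = min(max_gpu_chunk_height, max_cpu_chunk_height)
    return max(min_required_chunk_height, gpu_friendly_height)


def iter_subchunks(chunk_height, max_gpu_chunk_height):
    """Yield (start, end) row ranges of at most max_gpu_chunk_height rows."""
    for start in range(0, chunk_height, max_gpu_chunk_height):
        yield start, min(start + max_gpu_chunk_height, chunk_height)


def process_sinos_by_subchunks(chunk, max_gpu_chunk_height, process_substack):
    """Process a (n_angles, chunk_height, width) chunk by sub-stacks of sinograms.

    process_substack is a placeholder for "upload, process on GPU, download".
    """
    result = np.empty_like(chunk)
    for start, end in iter_subchunks(chunk.shape[1], max_gpu_chunk_height):
        # Each sub-stack is small enough for the GPU; this copy mimics a transfer
        substack = np.ascontiguousarray(chunk[:, start:end, :])
        result[:, start:end, :] = process_substack(substack)
    return result


if __name__ == "__main__":
    chunk = np.ones((4, 7, 3), dtype=np.float32)
    out = process_sinos_by_subchunks(chunk, 3, lambda sub: 2 * sub)
    print(out.shape, list(iter_subchunks(7, 3)))
```

In this sketch, the number of transfers grows as the ratio of the chunk height to max_gpu_chunk_height, which is the memory transfer cost mentioned for sino-processing.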
With this approach, the limiting factor for the maximum chunk height becomes the host (CPU) memory rather than the GPU memory.
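As a rough illustration of how a memory budget translates into a maximum chunk height, the following sketch estimates the number of detector rows that fit in a given amount of memory. The function name, the n_buffers parameter (how many copies of the chunk are held at once) and the float32 default are assumptions for this example, not Nabu's actual estimation.

```python
def max_chunk_height_from_memory(available_bytes, n_angles, width, itemsize=4, n_buffers=2):
    """Estimate how many detector rows fit in a memory budget.

    Hypothetical model: each detector row costs n_angles * width * itemsize bytes,
    and n_buffers copies of the chunk are held in memory at the same time.
    """
    bytes_per_row = n_angles * width * itemsize * n_buffers
    return available_bytes // bytes_per_row


if __name__ == "__main__":
    # The same formula applied to a GPU budget and a (larger) host budget
    gpu_rows = max_chunk_height_from_memory(16 * 1024**3, n_angles=1000, width=2048)
    cpu_rows = max_chunk_height_from_memory(256 * 1024**3, n_angles=1000, width=2048)
    print(gpu_rows, cpu_rows)
```

Applying the same estimate to the host memory instead of the GPU memory gives the new upper bound on the chunk height.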
A positive side-effect is that it is a step toward heterogeneous computing, where workers have different computing resources.
Edited by Pierre Paleo