When max_shape is set, h5py will guess the chunk size when not provided

Wout De Nolf requested to merge nexus_writer_chunk_fix into master

guess_dataset_config guesses the appropriate chunk shape for a particular dataset. When no chunking is needed (e.g. because the dataset is small), the chunk shape is set to None. However, when compression or resizing is enabled, h5py ignores the None and guesses a chunk shape itself. That guess can be very inappropriate and make the file very large (this happened in #3450 (closed)).

The solution is to set the chunk size not to None but to the total (expected) dataset shape.
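
A minimal sketch of the idea, not the actual guess_dataset_config code: the helper name explicit_chunks and its signature are hypothetical, and it assumes the dataset is small enough to fit comfortably in a single chunk. When chunking is forced by compression or resizing and no chunk shape was chosen, it returns the full (expected) shape so that h5py has nothing left to guess.

```python
from typing import Optional, Tuple

Shape = Tuple[int, ...]
MaxShape = Tuple[Optional[int], ...]


def explicit_chunks(
    shape: Shape,
    chunks: Optional[Shape] = None,
    compression: Optional[str] = None,
    maxshape: Optional[MaxShape] = None,
) -> Optional[Shape]:
    """Hypothetical helper: pick the chunk shape to hand to h5py.

    `shape` is the total (expected) dataset shape.
    """
    if chunks is not None:
        # A chunk shape was already chosen: keep it.
        return chunks
    if compression is None and maxshape is None:
        # Nothing forces chunking: None keeps the dataset contiguous.
        return None
    # Chunking is forced, so passing None would let h5py guess.
    # Use the whole dataset as one chunk instead. Chunk dimensions
    # must be at least 1, hence the clamp for empty dimensions.
    return tuple(max(n, 1) for n in shape)
```

The result would then be passed as the chunks argument of h5py's create_dataset, alongside the same compression and maxshape values.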

Requires a 1.10.x backport.

guess_dataset_config does not exist in Bliss 1.9.x and lower, so a different fix would be needed there.

Edited by Wout De Nolf
