payno requested to merge cast_volumes into master Jul 07, 2022

TODO

add a nabu-cast application
add nabu.io.cast_volume module. Quick overview:
- get_default_output_volume: function generating a default output volume according to the expected type. It will be used by the nabu-cast application and tomwer.
- cast_volume: function performing the volume cast.
- find_histogram: looks for a histogram saved alongside the volume and retrieves data_min and data_max from it.
- clamp_and_rescale_data: clamps and rescales the data.
add tests
add doc
test on (more) raw data
add some 'helpers' for the application, e.g. to ease volume creation or to define the output data type.
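
As an illustration of the clamp-and-rescale step listed in the module overview above, here is a minimal NumPy sketch. It is not the nabu implementation: the function name `clamp_and_rescale_sketch`, its parameters and the rounding policy are assumptions made for this example only.

```python
import numpy as np


def clamp_and_rescale_sketch(data, new_min, new_max,
                             data_min=None, data_max=None, dtype=np.uint8):
    """Hypothetical helper: clamp `data` to [data_min, data_max], then map
    that range linearly onto [new_min, new_max] and cast to `dtype`."""
    data = np.asarray(data, dtype=np.float64)
    # Fall back to the actual data range when no bounds are given
    # (in the MR, such bounds may come from a saved histogram).
    if data_min is None:
        data_min = float(data.min())
    if data_max is None:
        data_max = float(data.max())
    if data_max <= data_min:
        # Degenerate range: map everything to the lower output bound.
        return np.full(data.shape, new_min, dtype=dtype)
    clamped = np.clip(data, data_min, data_max)
    scaled = (clamped - data_min) / (data_max - data_min)
    scaled = scaled * (new_max - new_min) + new_min
    if np.issubdtype(dtype, np.integer):
        # Round before casting so values are not simply truncated.
        scaled = np.rint(scaled)
    return scaled.astype(dtype)
```

For example, casting float data to uint8 would call `clamp_and_rescale_sketch(volume, 0, 255, data_min, data_max)`, with `data_min` and `data_max` taken from the histogram when one is available.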

extra information

CLI API:

usage: nabu-cast [-h] [--output-data-type OUTPUT_DATA_TYPE] [--output_volume_url OUTPUT_VOLUME_URL] [--output_type OUTPUT_TYPE] [--overwrite] input_volume_url

positional arguments:
  input_volume_url      input volume url like: 
                        - EDFVolume      : edf:volume:/path/to/my/my_folder ; edf:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)
                        - HDF5Volume     : hdf5:volume:/path/to/file_path?data_path=entry0000
                        - JP2KVolume     : jp2k:volume:/path/to/my/my_folder ; jp2k:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)
                        - MultiTIFFVolume: tiff3d:volume:/path/to/tiff_file.tif
                        - TIFFVolume     : tiff:volume:/path/to/my/my_folder ; tiff:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)

optional arguments:
  -h, --help            show this help message and exit
  --output-data-type OUTPUT_DATA_TYPE
                        output data type. Valid value are numpy default types name like (uint8, uint16, int8, int16, int32, float32, float64)
  --output_volume_url OUTPUT_VOLUME_URL
                        output volume url. Must be provided if 'output_type' isn't. Must looks like: 
                        - EDFVolume      : edf:volume:/path/to/my/my_folder ; edf:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)
                        - HDF5Volume     : hdf5:volume:/path/to/file_path?data_path=entry0000
                        - JP2KVolume     : jp2k:volume:/path/to/my/my_folder ; jp2k:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)
                        - MultiTIFFVolume: tiff3d:volume:/path/to/tiff_file.tif
                        - TIFFVolume     : tiff:volume:/path/to/my/my_folder ; tiff:volume:/path/to/my/my_folder?file_prefix=basename (if myname != folder name)
  --output_type OUTPUT_TYPE
                        output type. Must be provided if 'output_volume_url' isn't. Valid values are ('h5', 'hdf5', 'nexus', 'nx', 'npy', 'npz', 'tif', 'tiff', 'jp2', 'jp2k', 'j2k', 'jpeg2000', 'edf')
  --overwrite           Overwrite file or dataset if exists
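
To make the volume url layout shown in the help text more concrete (`<scheme>:volume:<path>?<key>=<value>`), here is a small standard-library sketch that splits such a url into its parts. It is only an illustration: `split_volume_url` is a hypothetical helper, not the parser used by tomoscan or nabu.

```python
from urllib.parse import parse_qs


def split_volume_url(url):
    """Hypothetical helper: split '<scheme>:volume:<path>?<query>' into
    (scheme, path, params). Not the tomoscan parser."""
    # Split on the first two colons only, so the path may contain colons.
    scheme, kind, rest = url.split(":", 2)
    if kind != "volume":
        raise ValueError(f"not a volume url: {url!r}")
    path, _, query = rest.partition("?")
    # parse_qs returns lists of values; keep the first value of each key.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return scheme, path, params
```

With the urls from the help text, `edf:volume:/path/to/my/my_folder?file_prefix=basename` yields the scheme `edf`, the folder path, and a `file_prefix` parameter.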

related to

tomoscan!84 (merged)

usage examples

nabu-cast hdf5:volume:./bambou_hercules_0001slice_1080.hdf5?data_path=entry0000/reconstruction --output-data-type uint8 --output_volume_url tiff:volume:./cast_volume?file_prefix=bambou_hercules_0001slice_1080

discussions

Should we check the output format against the requested output data type to prevent some errors (e.g. tiff vs. float)?
Could we propose a default output data type from the output format (e.g. uint16 or uint8 for tiff)?
Regarding usage, I guess we should be able to simplify it by automatically deducing some volumes from a raw file path (at least for hdf5?). But the question is how far we want to go this way: it can quickly complicate the code and add corner cases. Still, I think we should go one step further to ease usage. On this point, we could add a simple util at the tomoscan level (like guess_volume in tomoscan.esrf.volume.utils) that could:
- if a file is provided, check the extension: if tiff, expect a multitiff; if hdf5, look for a "reconstruction/results" dataset
- if a folder is provided, check which file format is the most represented and try to deduce a volume from it (but the basename can quickly become an issue)
This has been done in tomoscan!85 (merged).
In the hdf5 volume case: for now we expect the user to provide the equivalent of a url value. As a user, I guess providing the data dataset (where the volume is) could be more convenient. Maybe we should also handle this use case?
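
The guessing heuristic discussed above could look roughly like the sketch below. It is only an illustration of the idea, not the `guess_volume` function from tomoscan: the name `guess_volume_kind`, the extension table and the "most frequent format wins" rule for folders are assumptions, and the hdf5 dataset lookup is left out.

```python
from collections import Counter
from pathlib import Path

# Assumed mapping from file extension to the volume schemes of the CLI help.
_EXT_TO_KIND = {
    ".tif": "tiff", ".tiff": "tiff",
    ".edf": "edf",
    ".jp2": "jp2k", ".jp2k": "jp2k", ".j2k": "jp2k",
    ".h5": "hdf5", ".hdf5": "hdf5", ".nx": "hdf5",
}
# Schemes that store one file per slice inside a folder.
_FOLDER_KINDS = {"tiff", "edf", "jp2k"}


def guess_volume_kind(path):
    """Hypothetical helper: return a volume scheme name, or None."""
    path = Path(path)
    if path.is_file():
        kind = _EXT_TO_KIND.get(path.suffix.lower())
        # A single tiff file is assumed to be a multi-page tiff volume.
        return "tiff3d" if kind == "tiff" else kind
    if path.is_dir():
        counts = Counter(
            _EXT_TO_KIND[f.suffix.lower()]
            for f in path.iterdir()
            if f.is_file() and _EXT_TO_KIND.get(f.suffix.lower()) in _FOLDER_KINDS
        )
        # Pick the most frequent per-slice format found in the folder.
        return counts.most_common(1)[0][0] if counts else None
    return None
```

This sketch deliberately ignores the basename issue raised above: a folder holding several volumes of the same format would still need a `file_prefix` to disambiguate.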

Edited Jul 26, 2022 by payno

add volume casting
