Spring cleaning: refactoring, API changes, consistency checks
Nabu has gained features at a steady pace since its beginning. However, some of the design choices made along the way turned out not to be the best. This issue lists the things that can be improved.
The ultimate goal is a module whose architecture is easy to understand: external people should be able to dive into the code base without too much hassle.
This "spring cleaning" WILL result in API changes, although we should try to minimize the impact. It is best done now rather than later, with a larger user base.
Modules to be renamed/moved
Currently the processing pipelines (FullField, FullRadios, ...) are put in
nabu.app. The "app" name might suggest an unstable API, which we would like to avoid.
nabu.pipeline would be better. (This name was avoided in the past to make it clear that no generic pipeline engine would be implemented).
nabu.distributed is to be removed
nabu.cuda contains general-purpose processing modules (GPP) like
unsharp_cuda which is also a GPP module. So either we put all the GPP cuda modules in
nabu.cuda, or we put only the specific helpers (kernels, etc.) in nabu.cuda.
nabu.io.writer might be better split into specific files for clarity (e.g.
nabu.io.nxwriter), of course all of them being accessible from
PaganinPhaseRetrieval, while CTF has its own file. So
Paganin classes might better be put in dedicated files, and
nabu.preproc.phase would "redirect" to the correct classes.
nabu.preproc.ccd might be too generic
nabu.resources looks like a "misc" module. It contains too many modules with different purposes. Its scope should be re-defined. For example
nabu.resources.nxflatfield should be moved to
The CLI tools should not be in
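To limit the impact of these renames, one possible approach (not something decided in this issue) is to keep the old module name as a thin redirect that forwards attribute lookups to the new location and emits a DeprecationWarning. The sketch below uses the module-level `__getattr__` hook from PEP 562. Everything named `nabu_demo`, as well as `FullFieldPipeline`, is a hypothetical stand-in so that the example runs on its own without touching a real Nabu installation.

```python
import types
import warnings

# Stand-in for the NEW location (hypothetical name, built in memory for the demo).
pipeline = types.ModuleType("nabu_demo.pipeline")


class FullFieldPipeline:
    """Placeholder for a processing pipeline class (hypothetical name)."""


pipeline.FullFieldPipeline = FullFieldPipeline

# Stand-in for the OLD location, kept only as a redirect.
app = types.ModuleType("nabu_demo.app")


def _redirect(name):
    """Module-level __getattr__ (PEP 562): forward lookups to the new module."""
    try:
        target = getattr(pipeline, name)
    except AttributeError:
        raise AttributeError(
            f"module 'nabu_demo.app' has no attribute {name!r}"
        ) from None
    warnings.warn(
        f"nabu_demo.app.{name} has moved to nabu_demo.pipeline.{name}",
        DeprecationWarning,
        stacklevel=2,
    )
    return target


# In a real package this would simply be "def __getattr__(name)" in app.py.
app.__getattr__ = _redirect
```

With this in place, `app.FullFieldPipeline` still resolves to the class defined in the new module, and existing user scripts keep working for a transition period while being told where the class now lives.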
Don't add useless constraints to the API.
In the beginning, main classes like
SinoProcessing were created. They were supposed to provide a clear architecture and avoid boilerplate code. However, it turns out that these classes cause trouble:
- Multiple inheritance + "diamond problem". Example: CudaSinoProcessing and SinoDeringer both inherit from SinoProcessing, and CudaSinoDeringer should inherit from both of them.
- There is actually almost no "boilerplate" code that can be factored in a general-purpose parent class
- Too many constraints due to the API. For example the shape might not have to be fixed at class instantiation.
SinoProcessing are bound to be removed. Actually few classes inherit from them, and none (?) do in a meaningful way.
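The diamond described above can be reproduced in a few lines. The class names are the ones from this issue, but the constructor signatures and bodies are assumptions made up for illustration, not the actual Nabu code. The last class, `StandaloneDeringer`, is a hypothetical flat alternative with no common parent and no shape fixed at instantiation; its column-mean subtraction is only a placeholder operation.

```python
class SinoProcessing:
    def __init__(self, sinos_shape):
        self.sinos_shape = sinos_shape  # shape fixed at instantiation


class CudaSinoProcessing(SinoProcessing):
    def __init__(self, sinos_shape, cuda_options=None):
        super().__init__(sinos_shape)
        self.cuda_options = cuda_options or {}


class SinoDeringer(SinoProcessing):
    def __init__(self, sinos_shape, sigma=1.0):
        super().__init__(sinos_shape)
        self.sigma = sigma


class CudaSinoDeringer(CudaSinoProcessing, SinoDeringer):
    """Inherits from both branches: SinoProcessing is reached twice."""


# Method resolution order of the diamond.
mro_names = [cls.__name__ for cls in CudaSinoDeringer.__mro__]

# The inherited constructor is CudaSinoProcessing.__init__, which does not
# know about "sigma", so the deringer option cannot be passed through.
try:
    CudaSinoDeringer((4, 8), sigma=2.0)
except TypeError as exc:
    init_error = exc
else:
    init_error = None


class StandaloneDeringer:
    """Hypothetical flat alternative: no parent class, no stored shape."""

    def remove_rings(self, sino):
        # The shape is read from the data on each call.
        n_angles, n_cols = len(sino), len(sino[0])
        # Placeholder operation (not Nabu's algorithm): subtract column means.
        means = [sum(row[j] for row in sino) / n_angles for j in range(n_cols)]
        return [[row[j] - means[j] for j in range(n_cols)] for row in sino]
```

Making the diamond work would require every constructor to cooperate through `**kwargs`, which is exactly the kind of API constraint this issue wants to avoid for almost no shared code.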
Building blocks (processing classes and functions) should primarily act on arrays
Data processing and data reading must be decoupled.
For example in
FlatField, it makes no sense to pass
DataUrl objects to a processing class.
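A minimal sketch of this decoupling, assuming NumPy arrays and the textbook flat-field formula (radios - dark) / (flat - dark). The function names `read_frames` and `flatfield_normalize`, and the dictionary used as fake storage, are hypothetical; the real FlatField class may do more than this.

```python
import numpy as np


def read_frames(storage, keys):
    """Reading side (hypothetical): turn storage entries into one array.

    "storage" is a plain dict standing in for files or URLs.
    """
    return np.stack([np.asarray(storage[key], dtype=np.float32) for key in keys])


def flatfield_normalize(radios, flat, dark):
    """Processing side: acts on arrays only, knows nothing about files or URLs."""
    radios = np.asarray(radios, dtype=np.float32)
    flat = np.asarray(flat, dtype=np.float32)
    dark = np.asarray(dark, dtype=np.float32)
    denom = flat - dark
    # Guard against division by zero on dead pixels.
    denom = np.where(denom == 0, np.float32(1), denom)
    return (radios - dark) / denom


# Usage: the caller reads first, then hands arrays to the processing function.
storage = {
    "radio_0": np.full((3, 4), 6.0),
    "radio_1": np.full((3, 4), 6.0),
    "flat": np.full((3, 4), 5.0),
    "dark": np.full((3, 4), 1.0),
}
radios = read_frames(storage, ["radio_0", "radio_1"])
normalized = flatfield_normalize(radios, storage["flat"], storage["dark"])
```

Because the processing function only sees arrays, it can be tested with synthetic data and reused whatever the file format or reader is.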
This module will be split into two modules with different scopes.
Purpose: provide various utility tools: logger, gpu, machinesdb, ...
Purpose: make the link between a
(config_file, dataset) pair and the processing pipeline.
The principle is:
- Nabu ingests a user configuration (file) and a dataset
- This module generates an internal pipeline description (processing steps and options)
A dedicated page in the documentation will be written on these steps (see comment below).
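The principle above can be sketched as follows. This is only an illustration of the (config_file, dataset) to pipeline description idea: the section names, keys, the `dataset_info` dictionary and the function `build_pipeline_description` are assumptions for the example, not the actual Nabu configuration format or API.

```python
import configparser

# Hypothetical user configuration (inline string instead of a file).
CONFIG_TEXT = """
[preproc]
flatfield = 1

[phase]
method = paganin
delta_beta = 100

[reconstruction]
method = FBP
"""


def build_pipeline_description(config_text, dataset_info):
    """Return an ordered list of (step_name, options) tuples."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    steps = []

    if parser.getboolean("preproc", "flatfield", fallback=False):
        # Consistency check between the configuration and the dataset.
        if not dataset_info.get("has_flats", False):
            raise ValueError("flat-field requested but the dataset has no flats")
        steps.append(("flatfield", {}))

    phase_method = parser.get("phase", "method", fallback="none").lower()
    if phase_method != "none":
        delta_beta = parser.getfloat("phase", "delta_beta", fallback=1.0)
        steps.append(("phase", {"method": phase_method, "delta_beta": delta_beta}))

    rec_method = parser.get("reconstruction", "method", fallback="FBP")
    steps.append(
        ("reconstruction", {"method": rec_method, "n_angles": dataset_info["n_angles"]})
    )
    return steps


# Hypothetical dataset summary, normally extracted from the dataset itself.
dataset_info = {"n_angles": 1800, "has_flats": True}
description = build_pipeline_description(CONFIG_TEXT, dataset_info)
```

The resulting description is plain data, so the processing pipeline can consume it without knowing anything about configuration files.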