Commit 153b8efd authored by Pierre Paleo

Remove nabu_tasks.md

parent 6b257dec
......@@ -66,7 +66,6 @@ Advanced documentation
tests.md
pipeline.md
nabu_tasks.md
validators.md
......
# Nabu tasks representation
This page explains how Nabu represents the lists of processing steps internally.
``` note:: This page is intended for Nabu developers.
```
## Tasks representation in a nutshell
Defining the processing pipeline requires two pieces of information:
- The processing steps: "what to do?"
- The options of each step: "how to do it?"
If tasks are to be distributed (across the processors of the local machine, or on a computing cluster), one more piece of information is needed:
- How to distribute the computations: "which tasks are done by each worker?"
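As an illustration of the three pieces of information above, here is a minimal sketch of what such a representation could look like. The class names (`Step`, `WorkerTask`), the step names and the option keys are hypothetical and are not taken from the Nabu code base.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Step:
    """Hypothetical container for one processing step."""
    name: str                                               # "what to do"
    options: Dict[str, Any] = field(default_factory=dict)   # "how to do it"


@dataclass
class WorkerTask:
    """Hypothetical description of the work assigned to one worker."""
    worker_id: int
    chunk: Tuple[int, int]   # (start, end) indices of the data subset, end exclusive
    steps: List[Step]        # linear list of steps, each done exactly once


# Illustrative linear pipeline (step names and options are assumptions)
pipeline = [
    Step("flatfield"),
    Step("phase_retrieval", {"method": "paganin"}),
    Step("reconstruction", {"method": "fbp"}),
]

# "What are the tasks done by each worker": same steps, different chunks
tasks = [
    WorkerTask(worker_id=0, chunk=(0, 100), steps=pipeline),
    WorkerTask(worker_id=1, chunk=(100, 200), steps=pipeline),
]
```

Every worker receives the same ordered list of steps and differs only in the chunk it processes, which matches the "linear" pipeline restriction described in the limitations.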
## Limitations
### Simple processing pipeline
In its current form, the nabu tasks representation only allows "linear" pipelines, where each step is done exactly once. It could, however, be extended to more complex pipelines if needed.
### Computations distribution
Nabu uses the [dask.distributed](https://distributed.readthedocs.io) Python module to distribute the computations with an [RPC](https://en.wikipedia.org/wiki/Remote_procedure_call) approach, following the paradigm "move the computing resources, not the data".
By design of Nabu, and thanks to the synchrotron parallel beam geometry, each worker handles a subset ([chunk](definitions.md)) of the data. No synchronisation or data exchange between workers is needed.
However, if at some point the workers need to exchange a significant amount of data, this approach becomes less practical (though still achievable), and message-passing solutions like MPI would be more appropriate.
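The "one chunk per worker, no data exchange" scheme can be sketched as follows. This is not Nabu's implementation: it uses the standard-library `concurrent.futures` as a stand-in for dask.distributed so that the example runs without a cluster, and the helper names `split_into_chunks` and `process_chunk` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(n_rows, n_workers):
    """Split range(n_rows) into n_workers contiguous (start, end) chunks."""
    base, extra = divmod(n_rows, n_workers)
    chunks = []
    start = 0
    for i in range(n_workers):
        # The first `extra` chunks get one additional row
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


def process_chunk(chunk):
    """Placeholder for the real processing: returns the number of rows handled."""
    start, end = chunk
    return end - start


chunks = split_into_chunks(2048, 3)

# Each worker handles its own chunk; no synchronisation between workers is needed
with ThreadPoolExecutor(max_workers=3) as pool:
    rows_done = list(pool.map(process_chunk, chunks))
```

Because each call to `process_chunk` depends only on its own chunk, the mapping can be handed to any executor-like interface without changing the function itself.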
### Chunk processing
The current module [nabu.app](apidoc/nabu.app) is designed to process the data by [chunks](definitions.md). If the detector is very wide (horizontally) and there are many projections, this approach could require more memory than is available.
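A rough estimate shows how quickly this limit can be reached. The sketch below assumes that a chunk is a set of detector rows taken across all projections and that values are stored as 32-bit floats. It counts only the input data, not intermediate buffers, and the function names and figures are illustrative.

```python
def chunk_memory_bytes(n_projections, chunk_rows, detector_width, bytes_per_value=4):
    """Memory needed to hold one chunk of projections (input data only)."""
    return n_projections * chunk_rows * detector_width * bytes_per_value


def max_chunk_rows(available_bytes, n_projections, detector_width, bytes_per_value=4):
    """Largest number of detector rows per chunk that fits in available_bytes."""
    bytes_per_row = n_projections * detector_width * bytes_per_value
    return available_bytes // bytes_per_row


# Illustrative values: 4000 projections, detector 16384 pixels wide, 100 rows per chunk
needed = chunk_memory_bytes(4000, 100, 16384)

# With 16 GiB available, how many rows can a chunk contain?
rows = max_chunk_rows(16 * 2**30, 4000, 16384)
```

With these illustrative values a 100-row chunk needs about 24.4 GiB, so a worker with 16 GiB would have to use chunks of at most 65 rows.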
## Why use an additional tasks representation?
The [ProcessConfig object](apidoc/nabu.resources.processconfig) contains all the necessary information for the tomography processing. Therefore, one might ask why another data structure is used to represent the processing steps. There are two main reasons:
- This data structure is better suited to distributing tasks ("what to do" and "how to do it"), while the `ProcessConfig` object is a translation of the user configuration file, and can therefore be seen as a user interface.
- The nabu configuration file sections/keys/default values [might change when deemed appropriate](nabu_config_file.md#compatibility-policy). To ensure that a nabu version is compatible with a former one, this data structure serves to "absorb" the changes between user interfaces and internal components.
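The "absorbing" role described in the second point can be sketched with a small translation layer. The renamed key below (`algorithm` becoming `method`) is invented for illustration and does not describe an actual change in the Nabu configuration file. The function names are hypothetical as well.

```python
# Hypothetical renames between configuration versions:
# (old section, old key) -> (current section, current key)
RENAMED_KEYS = {
    ("reconstruction", "algorithm"): ("reconstruction", "method"),
}


def normalize_config(config):
    """Return a copy of config where old key names are mapped to current ones."""
    normalized = {section: dict(options) for section, options in config.items()}
    for (old_sec, old_key), (new_sec, new_key) in RENAMED_KEYS.items():
        if old_key in normalized.get(old_sec, {}):
            value = normalized[old_sec].pop(old_key)
            # Do not overwrite a value already given under the current name
            normalized.setdefault(new_sec, {}).setdefault(new_key, value)
    return normalized


def config_to_steps(config):
    """Translate a user configuration into an internal list of (step, options)."""
    return [(section, options) for section, options in normalize_config(config).items()]


old_style = {"reconstruction": {"algorithm": "fbp"}}
new_style = {"reconstruction": {"method": "fbp"}}
```

Both `old_style` and `new_style` translate to the same internal list of steps, so the components that consume that list do not need to know which configuration version the user wrote.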
\ No newline at end of file