Handling heavy scan metadata in Blissdata
Related to #4163
Exploring some approaches...
Embed a path to an external file into the scan's JSON
- pros:
  - no modification to the blissdata protocol
  - data type can be anything, as long as a file format exists for it
  - possibility to link a single file to multiple scans
- cons:
  - not using Redis exposes us to slow filesystem problems
  - files must be managed rigorously for blissdata to remain reliable (creation order, lifetime, no overwriting)
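A minimal sketch of this approach (function and key names are made up, not actual blissdata API): the scan's JSON carries only a small reference, and a checksum lets the reader detect the overwriting problem mentioned above.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def write_artifact(data: bytes, directory: str) -> dict:
    """Write the heavy payload to disk, return a small JSON-able reference."""
    path = Path(directory) / "artifact.bin"
    path.write_bytes(data)
    return {
        "path": str(path),
        "sha256": hashlib.sha256(data).hexdigest(),  # detects overwriting
    }

def load_artifact(ref: dict) -> bytes:
    """Resolve the reference; fail loudly if the file changed or vanished."""
    data = Path(ref["path"]).read_bytes()
    if hashlib.sha256(data).hexdigest() != ref["sha256"]:
        raise ValueError("artifact changed on disk since the scan was published")
    return data

with tempfile.TemporaryDirectory() as tmp:
    ref = write_artifact(b"\x00" * 4096, tmp)
    # The scan JSON stays small: it only embeds the reference.
    scan_json = json.dumps({"name": "my_scan", "heavy_metadata": ref})
    loaded = load_artifact(json.loads(scan_json)["heavy_metadata"])
    assert loaded == b"\x00" * 4096
```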
Embed directly into the scan's JSON and mark that part as disposable
- pros:
  - no need for a type definition if the content is human readable
- cons:
  - JSON is not suited for large binaries (it should stay human readable, and parsers don't like heavy payloads)
  - making the JSON bigger slows down scan.load() and puts extra load on Redis
  - need to modify the blissdata protocol (including the memory tracker)
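The binary-in-JSON cost can be illustrated with a quick back-of-the-envelope check: bytes must be text-encoded (e.g. base64), which inflates the payload by about a third, and every consumer then parses and decodes it whether it needs the metadata or not.

```python
import base64
import json

payload = bytes(range(256)) * 4096  # ~1 MiB of binary metadata

# JSON cannot carry raw bytes, so the payload must be text-encoded first.
doc = json.dumps({"scan": "my_scan", "heavy": base64.b64encode(payload).decode()})

# base64 alone adds ~33% on top of the original size...
assert len(doc) > len(payload) * 4 / 3
# ...and every scan.load() would have to parse and decode it, even when unused.
```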
Use a dedicated stream and publish to it once the scan is running
- pros:
  - can reuse ndarray streams for typing
- cons:
  - can only be published while the scan is running, otherwise it requires blissdata protocol modifications (and potentially a new intermediate scan state)
  - need a new type of stream if the data cannot fit in an ndarray
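To make the typing point concrete, here is an illustrative (not blissdata) encoding of one typed array chunk as the flat field/value pairs a Redis stream entry accepts; the stream name in the comment is hypothetical.

```python
import struct

def encode_entry(values: list) -> dict:
    """Encode a 1D float64 chunk as flat fields, as Redis XADD expects."""
    return {
        "dtype": "float64",
        "shape": str(len(values)),
        "data": struct.pack(f"<{len(values)}d", *values),
    }

def decode_entry(fields: dict) -> list:
    """Reverse of encode_entry: rebuild the typed values from the fields."""
    assert fields["dtype"] == "float64"
    n = int(fields["shape"])
    return list(struct.unpack(f"<{n}d", fields["data"]))

entry = encode_entry([1.0, 2.5, -3.0])
assert decode_entry(entry) == [1.0, 2.5, -3.0]
# With a live Redis this entry could be published during the scan as e.g.:
#   redis.Redis().xadd("scan:<id>:heavy_metadata", entry)  # name hypothetical
```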
Use additional Redis keys and make the memory tracker clean them up
- pros:
  - ...
- cons:
  - need explicit type definitions, since neither the stream encoder, JSON nor a file format will handle them (no pickle); msgpack?
  - need to modify the blissdata protocol (including the memory tracker)
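A sketch of what the explicit typing could look like, using a plain dict to stand in for Redis (all names hypothetical): each payload lives under its own key with a self-describing type header instead of pickle, and the key is registered so the memory tracker could garbage-collect it.

```python
import json

store = {}            # stands in for Redis key/value storage
tracked_keys = set()  # what the memory tracker would have to clean up

def publish_heavy(scan_key: str, name: str, type_tag: str, payload: bytes) -> str:
    """Store a payload under its own key with an explicit type header (no pickle)."""
    key = f"{scan_key}:heavy:{name}"
    header = json.dumps({"type": type_tag}).encode() + b"\n"
    store[key] = header + payload
    tracked_keys.add(key)  # the memory tracker must learn about this key
    return key

def load_heavy(key: str):
    """Split the type header from the raw payload."""
    header, _, payload = store[key].partition(b"\n")
    return json.loads(header)["type"], payload

key = publish_heavy("scan:42", "flatfield", "uint16_image", b"\x01\x02")
assert load_heavy(key) == ("uint16_image", b"\x01\x02")
```

msgpack would be a natural replacement for the ad-hoc header here, at the cost of one more dependency in the protocol.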
Use a scan sequence with an extra channel containing the large metadata artifact alongside the actual scans
- pros:
  - no modification at all
  - can link a single artifact to multiple scans
- cons:
  - makes scan access more complex
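An illustrative layout only (the key names below do not come from the blissdata protocol): the sequence carries the artifact in an extra channel, the child scans stay untouched, and readers must now resolve the artifact through the sequence, which is the added complexity noted above.

```python
# Hypothetical structures: a sequence with an extra channel and unmodified children.
artifact = {"kind": "calibration_matrix", "data": [[1.0, 0.0], [0.0, 1.0]]}

sequence = {
    "name": "my_sequence",
    "channels": {"heavy_metadata": [artifact]},  # extra channel in the sequence
    "children": ["scan_0001", "scan_0002"],      # actual scans, left as-is
}

def artifact_for(scan_name: str, seq: dict) -> dict:
    """Readers must go through the sequence to reach a scan's artifact."""
    if scan_name not in seq["children"]:
        raise KeyError(scan_name)
    return seq["channels"]["heavy_metadata"][0]

assert artifact_for("scan_0002", sequence)["kind"] == "calibration_matrix"
```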
Edited by Lucas Felix