Wout De Nolf requested to merge optimize_streamline into main Mar 20, 2024

Streamline overheads considered

QR-reader tuning: 5 seconds
Exposure/attenuator optimization: 3 sec per sample when using a ct (0.8 sec when using an ascan with save=False)
ID31 scan overhead (p3 initialization + metadata gathering): 1 to 2 sec for an sct
In addition, overheads in newsample and enddataset were discovered (see below; ~1 sec each once there are ~2500 datasets)
Is the P3 saving temporary EDF files?

The current situation

QR-reader tuning whenever we read "QRCODE_NOT_READABLE" (for example when there is no QR code)
exposure/attenuator optimization for each sample individually with a ct (and making sure we base the calculation on a pixel value that is within the dynamic range of the camera and not too low)
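
The per-sample optimization described above can be sketched as follows. This is only an illustration under assumptions: it presumes that the maximum pixel value scales linearly with exposure time, and the function name, thresholds and target value are hypothetical placeholders, not taken from the streamline code.

```python
def scale_exposure(test_time, max_pixel, target=50_000, low=100, high=60_000):
    """Return (exposure_time, attenuator_change) from a short test exposure.

    attenuator_change is +1 (more attenuation), -1 (less attenuation) or 0.
    All numeric defaults are hypothetical placeholders.
    """
    if max_pixel >= high:
        # Too close to saturation: a linear estimate cannot be trusted.
        return None, +1
    if max_pixel <= low:
        # Too few counts: the estimate would be dominated by noise.
        return None, -1
    # Within the usable range: scale the time linearly to reach the target.
    return test_time * target / max_pixel, 0


if __name__ == "__main__":
    print(scale_exposure(0.2, 10_000))
    print(scale_exposure(0.2, 65_000))
```

When the test exposure falls outside the usable range, the sketch returns no exposure time and asks for an attenuator change instead, which is the reason a per-sample optimization may need more than one ct.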

This MR

Individual tuning and exposure/attenuator optimization is the most robust.

This MR adds streamline_scanner options to exchange robustness for speed when desired:

SIXC [5]: streamline_scanner
    Robust vs. speed:
        verify_qrcode           False     
        autotune_qrreader_per   'baguette'
        optimize_exposure_per   'baguette'

    Testing:
        dryrun   False

autotune_qrreader_per:
- None: never tune
- "sample": tune when a sample QR code is "QRCODE_NOT_READABLE"
- "baguette": tune once after loading, when the QR code of the first sample is "QRCODE_NOT_READABLE" (or of the second, third, ... sample: it looks for the first successful read)
optimize_exposure_per:
- None: keep the current attenuator and measure for the requested time (1 sec by default)
- "sample": determine optimal attenuator and exposure time for each sample individually to reach a certain max pixel value
- "baguette": determine optimal attenuator and exposure time for all samples with a single scan at fixed attenuator position
verify_qrcode:
- False: read QR code only once (when moving to a hole)
- True: verify the QR-code before and after each measurement. QR-codes are read 3 times per sample when enabled (autotune_qrreader_per applies to the two additional reads as well)
dryrun:
- False: scan normally (i.e. save data + trigger workflows)
- True: scan without the actual measurement (i.e. without the sct that saves data + triggers a workflow) to check the total overhead
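
The option semantics listed above can be summarized in a small sketch. The class and function names below (StreamlineOptions, should_tune) are hypothetical and only mirror the option names; the real streamline_scanner object may be organized differently, and the "baguette" rule is simplified to "tune at most once per loaded baguette".

```python
from dataclasses import dataclass
from typing import Optional

NOT_READABLE = "QRCODE_NOT_READABLE"
_ALLOWED = (None, "sample", "baguette")


@dataclass
class StreamlineOptions:
    """Hypothetical container mirroring the options listed above."""

    verify_qrcode: bool = False
    autotune_qrreader_per: Optional[str] = "baguette"
    optimize_exposure_per: Optional[str] = "baguette"
    dryrun: bool = False

    def __post_init__(self):
        for name in ("autotune_qrreader_per", "optimize_exposure_per"):
            if getattr(self, name) not in _ALLOWED:
                raise ValueError(f"{name} must be one of {_ALLOWED}")

    def qr_reads_per_sample(self) -> int:
        # One read when moving to the hole, plus one before and one after
        # the measurement when verification is enabled.
        return 3 if self.verify_qrcode else 1


def should_tune(options, qr_result, baguette_tuned):
    """Decide whether to tune the QR reader after a read (simplified)."""
    if qr_result != NOT_READABLE:
        return False
    if options.autotune_qrreader_per == "sample":
        return True
    if options.autotune_qrreader_per == "baguette":
        # Tune at most once per loaded baguette.
        return not baguette_tuned
    return False
```

With the defaults shown, a sample costs a single QR read and the reader is tuned at most once per baguette, which is where most of the saved overhead comes from.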

Profiling

For the tests I used a baguette with one missing QR code and measured the average time per sample for a full baguette run.

Profiling commands are

SIXC [1]: user_script_load("/users/blissadm/local/xrpd/blissprofile/id31_streamline.py")
SIXC [2]: user.timeit_holder_scan()  # takes 2.5h to complete
SIXC [3]: user.timeit_holder_scan_original(dryrun=False)
SIXC [4]: user.timeit_holder_scan_new(dryrun=False)
SIXC [5]: user.profile_holder_scan()
SIXC [6]: user.timeit_optimize_exposure()  # ascan vs. individual ct optimization

REMARK: as time goes on, the system becomes slower. For ~2500 datasets a newdataset() call takes almost 1 sec (see blissstats_opid31_pstat_newdataset.pyprof). So do not look at the absolute times but at the differences between settings.

These are the most important profiling results

dryrun    verify_qrcode    autotune_qrreader_per    optimize_exposure_per    time (sec/sample)
--------  ---------------  -----------------------  -----------------------  -------------------
False     True             sample                   sample                   12.038 ± 0.593 (original)
False     False            baguette                 baguette                 6.766 ± 0.037  (new)

True      True             sample                   sample                   6.444 ± 0.127  (original overhead only)
True      False            baguette                 baguette                 2.373 ± 0.051  (new overhead only)

True      False            -                        -                        1.014 ± 0.003  (raw overhead)

original: like it was before
new: like it is with the new default options
raw: just baguette moving, QR-code reading and data policy commands

So the optimization reduced the time per sample from about 12.0 to 6.8 sec (roughly 1.8x faster) while still having baguette-wise tuning and exposure optimization.
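
The speedup follows directly from the first table. The sketch below redoes the arithmetic; combining the uncertainties in quadrature assumes the two measurements are independent, which is an assumption of this sketch and not something the profiling script guarantees.

```python
import math


def speedup(old, new):
    """Ratio of the old time to the new time."""
    return old / new


def difference(old, old_err, new, new_err):
    """Time saved per sample, with uncertainties added in quadrature."""
    return old - new, math.hypot(old_err, new_err)


# Values copied from the table above (sec/sample).
full = speedup(12.038, 6.766)        # original vs. new, real scan
overhead = speedup(6.444, 2.373)     # original vs. new, dryrun (overhead only)
saved, saved_err = difference(12.038, 0.593, 6.766, 0.037)

if __name__ == "__main__":
    print(f"full scan speedup: {full:.2f}x")
    print(f"overhead speedup:  {overhead:.2f}x")
    print(f"time saved:        {saved:.3f} ± {saved_err:.3f} sec/sample")
```

This gives roughly 1.8x for the full scan and 2.7x for the overhead alone, with about 5.3 ± 0.6 sec saved per sample.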

As said before, the absolute times seem to vary a lot. Here is another run:

dryrun    verify_qrcode    autotune_qrreader_per    optimize_exposure_per    time (sec/sample)
--------  ---------------  -----------------------  -----------------------  -------------------
False     True             sample                   sample                   12.837 ± 0.378 (original)
False     False            baguette                 baguette                 6.923 ± 0.026  (new)

True      True             sample                   sample                   6.993 ± 0.042  (original overhead only)
True      False            baguette                 baguette                 3.462 ± 0.068  (new overhead only)

True      False            -                        -                        1.955 ± 0.042  (raw overhead)

All permutations in a fresh proposal

dryrun    verify_qrcode    autotune_qrreader_per    optimize_exposure_per    time (sec/sample)
--------  ---------------  -----------------------  -----------------------  -------------------
True      True             -                        -                        1.312 ± 0.115
True      True             -                        sample                   4.376 ± 0.069
True      True             -                        baguette                 2.654 ± 0.052
True      True             sample                   -                        2.951 ± 0.192
True      True             sample                   sample                   6.444 ± 0.127
True      True             sample                   baguette                 4.303 ± 0.067
True      True             baguette                 -                        1.256 ± 0.021
True      True             baguette                 sample                   4.771 ± 0.043
True      True             baguette                 baguette                 2.651 ± 0.041
True      False            -                        -                        1.014 ± 0.003
True      False            -                        sample                   4.522 ± 0.067
True      False            -                        baguette                 2.495 ± 0.071
True      False            sample                   -                        1.535 ± 0.005
True      False            sample                   sample                   4.975 ± 0.010
True      False            sample                   baguette                 2.993 ± 0.066
True      False            baguette                 -                        1.020 ± 0.008
True      False            baguette                 sample                   4.540 ± 0.050
True      False            baguette                 baguette                 2.373 ± 0.051
False     True             -                        -                        4.569 ± 0.017
False     True             -                        sample                   10.409 ± 0.288
False     True             -                        baguette                 6.248 ± 0.093
False     True             sample                   -                        6.361 ± 0.116
False     True             sample                   sample                   12.038 ± 0.593
False     True             sample                   baguette                 7.901 ± 0.154
False     True             baguette                 -                        4.880 ± 0.014
False     True             baguette                 sample                   10.931 ± 0.292
False     True             baguette                 baguette                 6.508 ± 0.088
False     False            -                        -                        4.799 ± 0.022
False     False            -                        sample                   10.622 ± 0.234
False     False            -                        baguette                 6.236 ± 0.077
False     False            sample                   -                        5.446 ± 0.016
False     False            sample                   sample                   11.411 ± 0.249
False     False            sample                   baguette                 7.059 ± 0.070
False     False            baguette                 -                        5.157 ± 0.040
False     False            baguette                 sample                   11.156 ± 0.142
False     False            baguette                 baguette                 6.766 ± 0.037
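
To read the table above more easily, the effect of a single option can be averaged over all otherwise identical settings. The sketch below does this with the times listed above (uncertainties omitted). The helper names parse and mean_effect are made up for this illustration and are not part of the profiling script.

```python
from statistics import mean

# Columns: dryrun, verify_qrcode, autotune_qrreader_per,
# optimize_exposure_per, time (sec/sample). Copied from the table above.
TABLE = """
True True - - 1.312
True True - sample 4.376
True True - baguette 2.654
True True sample - 2.951
True True sample sample 6.444
True True sample baguette 4.303
True True baguette - 1.256
True True baguette sample 4.771
True True baguette baguette 2.651
True False - - 1.014
True False - sample 4.522
True False - baguette 2.495
True False sample - 1.535
True False sample sample 4.975
True False sample baguette 2.993
True False baguette - 1.020
True False baguette sample 4.540
True False baguette baguette 2.373
False True - - 4.569
False True - sample 10.409
False True - baguette 6.248
False True sample - 6.361
False True sample sample 12.038
False True sample baguette 7.901
False True baguette - 4.880
False True baguette sample 10.931
False True baguette baguette 6.508
False False - - 4.799
False False - sample 10.622
False False - baguette 6.236
False False sample - 5.446
False False sample sample 11.411
False False sample baguette 7.059
False False baguette - 5.157
False False baguette sample 11.156
False False baguette baguette 6.766
"""

OPTIONS = ("dryrun", "verify_qrcode",
           "autotune_qrreader_per", "optimize_exposure_per")


def parse(table):
    """Map (dryrun, verify, autotune, exposure) -> time in sec/sample."""
    rows = {}
    for line in table.strip().splitlines():
        *key, time = line.split()
        rows[tuple(key)] = float(time)
    return rows


def mean_effect(rows, option, a, b):
    """Mean of time(option=a) - time(option=b), other settings identical."""
    i = OPTIONS.index(option)
    diffs = []
    for key, time_a in rows.items():
        if key[i] == a:
            other = key[:i] + (b,) + key[i + 1:]
            diffs.append(time_a - rows[other])
    return mean(diffs)


if __name__ == "__main__":
    rows = parse(TABLE)
    gain = mean_effect(rows, "optimize_exposure_per", "sample", "baguette")
    print(f"sample -> baguette exposure optimization saves {gain:.2f} sec/sample")
```

Averaged over all settings, switching optimize_exposure_per from "sample" to "baguette" saves about 3.2 sec/sample; the gain is larger in real scans than in dry runs.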

Related

https://jira.esrf.fr/browse/DPDEV-203

profiling: dau/devops/bliss/blisshelpers/blissprofile!1
streamline_changer: streamline_changer!28 (merged)

Reference

QR-code reader

SR700NL20.autotuning: starts from the last tuned bank or bank 3 and lets the reader tune itself until it succeeds or fails
SR700NL20.read(autoTuningAllowed=True): reads the QR code, calls SR700NL20.autotuning on failure
SampleChanger.tune_qrreader: takes about 5 seconds with force=True
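
The read-then-tune fallback described above can be sketched with a stand-in reader. FakeReader and read_with_autotune are hypothetical names used only for this illustration; they do not reproduce the SR700NL20 or SampleChanger API.

```python
NOT_READABLE = "QRCODE_NOT_READABLE"


class FakeReader:
    """Stand-in QR reader: unreadable until it has been tuned successfully."""

    def __init__(self, code, tunable=True):
        self._code = code
        self._tunable = tunable
        self._tuned = False
        self.tune_calls = 0

    def raw_read(self):
        return self._code if self._tuned else NOT_READABLE

    def autotune(self):
        # In reality this is the slow step (about 5 seconds per tuning).
        self.tune_calls += 1
        self._tuned = self._tunable
        return self._tuned


def read_with_autotune(reader, allow_tuning=True):
    """Read once; on failure tune the reader and read once more."""
    result = reader.raw_read()
    if result != NOT_READABLE or not allow_tuning:
        return result
    if reader.autotune():
        return reader.raw_read()
    return NOT_READABLE


if __name__ == "__main__":
    reader = FakeReader("SAMPLE42")
    print(read_with_autotune(reader), reader.tune_calls)
```

A hole without a QR code corresponds to tunable=False here: every read pays for a tuning attempt and still fails, which is why tuning per sample is expensive on a baguette with missing codes.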

optimize exposure/attenuator

limatake and ct take the same time (about 1 sec overhead each) but limatake does not print the table of counters, so use that one

SIXC [56]: with bench():
      ...:     limatake(0.2)
acquisition chain
└── p3
    └── roi_counters

Scan 52 2024-03-20T15:00:19.478835+01:00 None sixc user = opid31
limatake 0.2000 1
p3 acq #1
Finished (took 0:00:01.006782)

Execution time: 1s 145ms 128μs

SIXC [57]: with bench():
      ...:     ct(0.2)
ct: elapsed 1.012 s  (abort with Ctrl-c)
Execution time: 1s 100ms 795μs
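
For readers without a BLISS session, the timing pattern used above can be reproduced with a minimal stand-in context manager. This is not the BLISS bench implementation, only a sketch with the same usage shape; the yielded dictionary is an addition of this sketch so the elapsed time can be inspected afterwards.

```python
import time
from contextlib import contextmanager


@contextmanager
def bench(label="Execution time"):
    """Time the enclosed block and print the elapsed wall-clock time."""
    result = {}
    start = time.perf_counter()
    try:
        yield result
    finally:
        # Runs even when the block raises, so the time is always reported.
        result["elapsed"] = time.perf_counter() - start
        print(f"{label}: {result['elapsed']:.6f} s")


if __name__ == "__main__":
    with bench() as timing:
        time.sleep(0.2)  # placeholder for limatake(0.2) or ct(0.2)
    overhead = timing["elapsed"] - 0.2
    print(f"overhead on top of the 0.2 s exposure: {overhead:.3f} s")
```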

Id31StreamlineScanner._optimize_sample_exposure takes about 3 sec which is the sum of
- setup_globals.att(position) takes about 1 sec
- moving the blades from 14 to 31 or vice versa takes between 0.5 and ... sec (we do it at least 2 times)
- limatake takes about 1 sec
Determining all attenuator/exposure conditions with an ascan at fixed attenuator position takes ... sec/sample, and with individual ct's at variable attenuator position (when counts are too high or low) it takes ... sec/sample.

SIXC [1]: user_script_load("/users/blissadm/local/xrpd/blissprofile/id31_streamline.py")
SIXC [2]: user.timeit_optimize_exposure()
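
As a consistency check on the "about 3 sec" figure, the components listed above can be added up. The sketch uses only the lower bound quoted for a blade move, so the result is a minimum; the constant and function names are made up for this illustration.

```python
ATT_MOVE = 1.0        # setup_globals.att(position), about 1 sec
BLADE_MOVE_MIN = 0.5  # lower bound quoted for one blade move (14 <-> 31)
LIMATAKE = 1.0        # limatake, about 1 sec


def min_optimization_time(n_blade_moves=2):
    """Lower bound in seconds for one per-sample exposure optimization."""
    return ATT_MOVE + n_blade_moves * BLADE_MOVE_MIN + LIMATAKE


if __name__ == "__main__":
    print(f"at least {min_optimization_time():.1f} sec per sample")
```

With the minimum of two blade moves this already amounts to 3 sec, in line with the measured time for Id31StreamlineScanner._optimize_sample_exposure.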

Edited Mar 26, 2024 by Wout De Nolf
