minor doc stuff

d58460ce · Mauro Rovezzi · 4d63973c · d58460ce
Commit d58460ce authored 5 months ago by Mauro Rovezzi
--- a/notebooks/data_reduction.ipynb
+++ b/notebooks/data_reduction.ipynb
@@ -8,8 +8,10 @@
    "\n",
    "- This notebook is based on a **work in progress project**, called `famewoks` and may be affected by bugs\n",
    "- `famewoks` is publicly available at [https://gitlab.esrf.fr/F-CRG/fames/famewoks](https://gitlab.esrf.fr/F-CRG/fames/famewoks)\n",
-    "- Subscribe to the `fame-data-analysis@esrf.fr` mailing list to be kept updated about bug fixes and new features\n",
-    "- Report bugs and features requests in the [famewoks issues tracker](https://gitlab.esrf.fr/F-CRG/fames/famewoks/-/issues) or by directly sendin an email to [mauro.rovezzi@esrf.fr](mailto:mauro.rovezzi@esrf.fr)\n",
+    "- Subscribe to the [`fame-data-analysis@esrf.fr` mailing\n",
+    "  list](https://sympa.esrf.fr/sympa/info/fame-data-analysis) to be kept updated\n",
+    "  about bug fixes and new features\n",
+    "- Report bugs and features requests in the [famewoks issues tracker](https://gitlab.esrf.fr/F-CRG/fames/famewoks/-/issues) or by directly sending an email to [mauro.rovezzi@esrf.fr](mailto:mauro.rovezzi@esrf.fr)\n",
    "- To run this notebook at BM16:\n",
    "    - (production/stable) go to [http://jade:8000](http://jade:8000) and run the notebook with the `sloth (v2312)` kernel\n",
    "    - (development) go to [http://bm16ctrl:8000](http://bm16ctrl:8000) and run the notebook with the `sloth` kernel\n"

 %% Cell type:markdown id: tags:

 # Data reduction and evaluation workflow

 - This notebook is based on a **work in progress project**, called `famewoks` and may be affected by bugs
 - `famewoks` is publicly available at [https://gitlab.esrf.fr/F-CRG/fames/famewoks](https://gitlab.esrf.fr/F-CRG/fames/famewoks)
- Subscribe to the `fame-data-analysis@esrf.fr` mailing list to be kept updated about bug fixes and new features
- Report bugs and features requests in the [famewoks issues tracker](https://gitlab.esrf.fr/F-CRG/fames/famewoks/-/issues) or by directly sendin an email to [mauro.rovezzi@esrf.fr](mailto:mauro.rovezzi@esrf.fr)
+- Subscribe to the [`fame-data-analysis@esrf.fr` mailing
+  list](https://sympa.esrf.fr/sympa/info/fame-data-analysis) to be kept updated
+  about bug fixes and new features
+- Report bugs and features requests in the [famewoks issues tracker](https://gitlab.esrf.fr/F-CRG/fames/famewoks/-/issues) or by directly sending an email to [mauro.rovezzi@esrf.fr](mailto:mauro.rovezzi@esrf.fr)
 - To run this notebook at BM16:
    - (production/stable) go to [http://jade:8000](http://jade:8000) and run the notebook with the `sloth (v2312)` kernel
    - (development) go to [http://bm16ctrl:8000](http://bm16ctrl:8000) and run the notebook with the `sloth` kernel

 %% Cell type:markdown id: tags:

 First, set the main imports and global settings for the experiment (session).

 %% Cell type:code id: tags:

 ``` python
 %load_ext autoreload
 %autoreload 2

 # Uncomment the following two lines if the plots do not show in the notebook
 # import plotly.io as pio
 # pio.renderers.default = "iframe"

 import os
 from dataclasses import asdict
 from IPython.display import display
 from famewoks import __version__ as wkflver
 from famewoks.datamodel import ExpSession, CNTS_FLUO_XMAP, CNTS_FLUO, CNTS_CAS
 from famewoks.tests.datainfos import DATAINFOS, get_exp_session
 from famewoks.plots import plot_data, plot_eshift
 from famewoks.bliss2larch import (
    search_samples,
    show_samples_info,
    search_datasets,
    load_data,
    get_scans,
    get_group,
    set_enealign,
    apply_eshift,
    set_bad_fluo_channels,
    set_bad_scans,
    set_bad_samples,
    merge_data,
    save_data,
 )
 from famewoks.bliss2larch import _logger


 # adjust the logger level:
 # "DEBUG" -> show all messages
 # "INFO" -> useful messages
 # "WARNING" -> warnings only
 # "ERROR" -> only errors
 _logger.setLevel("DEBUG")

 # show the workflow version (and the famewoks branch information)
 branch = os.popen("cd ~/devel/famewoks; git branch --show-current").read().strip()
 _logger.info(f"--> Using famewoks version: {wkflver} (branch: {branch})")
 ```

 %% Cell type:markdown id: tags:

 Initialize the `ExpSession` object, which represents the whole experimental session (data and metadata), and display it.

 %% Cell type:code id: tags:

 ``` python
 # define the experimental session metadata manually
 session = ExpSession(
    flag=1,
    datadir="/data/visitor/es1423/bm16/20240716",
    proposal="es1423",
    session="20240716",
    proposer="Sanchez",
    lc="MR, JLH",
    elem="As",
    edge="K",
    comment="",
    counters=CNTS_FLUO_XMAP, #other possible choices: CNTS_FLUO, CNTS_CAS
    samples=[],
    bad_samples=[],
    bad_fluo_channels=None,
    enealign=None,
 )

 # (testing) alternatively, get the session from a collection of previous experiments
 # NOTE: this overwrites the `session` defined manually above
 session = get_exp_session(DATAINFOS, proposer="Biswas",session="20240924")
 #session = get_exp_session(DATAINFOS, proposer="Sanchez",session="20240716")
 #session = get_exp_session(DATAINFOS, proposer="Stellato",session="20240625")
 #session = get_exp_session(DATAINFOS, proposer="Bertrand",session="20240910")

 # display the session metadata
 display(asdict(session))
 ```

 %% Cell type:markdown id: tags:

 ## Search samples and datasets

 Search for the sample names available in the given experimental session.

 Use the parameter `ignore_names = ["list", "of", "strings"]` to ignore the sample names containing any of those strings.
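
As a rough illustration of this kind of filtering, here is a minimal sketch that drops every name containing one of the ignored strings. The helper `filter_sample_names` and the sample names are hypothetical; this is not the `famewoks` implementation, and whether `search_samples()` matches case-sensitively is an assumption.

```python
def filter_sample_names(names, ignore_names=None):
    """Keep only the names that contain none of the ignored substrings.

    Hypothetical helper: a plain, case-sensitive substring match is assumed.
    """
    ignore_names = ignore_names or []
    return [
        name
        for name in names
        if not any(word in name for word in ignore_names)
    ]


# made-up sample names, for illustration only
names = ["rack_1", "As_ref_foil", "bl_align", "soil_A", "sample_test"]
kept = filter_sample_names(names, ignore_names=["rack", "align", "bl_", "sample"])
print(kept)
```

With an empty or missing `ignore_names`, the sketch returns the input list unchanged.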

 %% Cell type:code id: tags:

 ``` python
 samples = search_samples(session, ignore_names=["rack", "mount", "align", "bl_", "sample"])
 ```

 %% Cell type:markdown id: tags:

 To skip some samples, mark them as bad (`flag = 0`):

 %% Cell type:code id: tags:

 ``` python
 set_bad_samples(session, [15])
 ```

 %% Cell type:markdown id: tags:

 To show all the samples, including those marked as bad (`flag = 0`):

 %% Cell type:code id: tags:

 ``` python
 show_samples_info(session, all=True)
 ```

 %% Cell type:markdown id: tags:

 Select a sample and search for its datasets. The function shows the index of each dataset and the number of XAS scans available.

 %% Cell type:code id: tags:

 ``` python
 sel_sample = 5
 datasets = search_datasets(session, sample=sel_sample)
 ```

 %% Cell type:markdown id: tags:

 ## Load data

 Select a dataset and load the scan data into the session (e.g. load from disk to memory).

 **Parameters for `load_data()`**

 - `skip_scans`: the scans that are not going to be loaded (e.g. bad scans); it can be a list of numbers `[1,2,3]` or a string `"1:4, 7"`
 - `use_fluo_corr`: if True, uses the dead-time corrected fluorescence channel (**NOTE** this correction usually fails at low count rates; check with and without the correction to see which configuration has the lower noise)
 - `iskip`: the index of the initial data points to skip (by default, the first point is skipped)
 - `merge`: set to True to automatically merge the scans in a dataset
 - `calc_eshift`: fit the energy shift using the first scan of the dataset as reference (*NOTE* this slows down the loading)
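
To make the two accepted forms of `skip_scans` concrete, here is a minimal sketch of how a string such as `"1:4, 7"` could be expanded into a list of scan numbers. The helper `parse_scan_selection` is hypothetical and is not the parser used by `famewoks`; in particular, whether the end of a `start:stop` range is included is an assumption, exposed here as the `inclusive` flag.

```python
def parse_scan_selection(spec, inclusive=True):
    """Expand a scan selection into a sorted list of unique scan numbers.

    `spec` is either an iterable of integers or a string such as "1:4, 7".
    Hypothetical helper: the `inclusive` flag only makes the assumption
    about the range end explicit.
    """
    if not isinstance(spec, str):
        return sorted({int(scan) for scan in spec})
    scans = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate empty items, e.g. a trailing comma
        if ":" in token:
            start, stop = (int(part) for part in token.split(":"))
            scans.update(range(start, stop + 1 if inclusive else stop))
        else:
            scans.add(int(token))
    return sorted(scans)


print(parse_scan_selection("1:4, 7"))
print(parse_scan_selection("1:4, 7", inclusive=False))
print(parse_scan_selection([3, 1, 2]))
```

If the exact range convention matters for your data, passing an explicit list of numbers avoids the ambiguity.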

 %% Cell type:code id: tags:

 ``` python
 sel_dataset = 0
 dataset = datasets[sel_dataset]
 load_data(
    session,
    dataset,
    use_fluo_corr=False,
    iskip=1,
    calc_eshift=True,
    merge=True,
    skip_scans=[],
 )
 ```

 %% Cell type:markdown id: tags:

 ## Data plotter

 Plot the data loaded in the dataset

 **Parameters for `plot_data()`**

 - `data` can be:
    - `"fluos"`: to show all fluorescence channels
    - `"fluo"`: sum of active fluorescence channels (use `set_bad_fluo_channels()` for excluding bad ones)
    - `"trans"`: sample transmission (*muT1*)
    - `"ref"`: reference "foil" transmission (*muT2*)
 - `ynorm`: *None* or `"area"` (shows the y data normalized by their area)
 - `show_slide`: if True, shows one scan at a time with a slider
 - `show_i0`: *True* shows the I0 signal (*NOTE* for `data = "ref"` it is the I1 signal)
 - `show_e0`: *True* shows E0 (as found by the `pre_edge()` function of Larch)
 - `show_merge`: *True* shows the merged signal (sum of the channels for the current scan)
    - if `"rebin"`, it shows the rebinned version of the merge (*NOTE* the single scans are never rebinned: they are meant to be merged first, and the merge is then rebinned)
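
For reference, area normalization can be sketched as dividing each curve by its integral over the energy axis. The example below uses a plain trapezoidal rule and made-up numbers; the helper names are hypothetical and the integration method used by `plot_data()` is an assumption.

```python
def trapezoid_area(x, y):
    """Integrate y over x with the trapezoidal rule."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def normalize_by_area(x, y):
    """Scale y so that the area under the curve is 1."""
    area = trapezoid_area(x, y)
    if area == 0:
        raise ValueError("cannot normalize: the area under the curve is zero")
    return [value / area for value in y]


# made-up data, for illustration only
energy = [0.0, 1.0, 2.0, 3.0]
signal = [0.0, 2.0, 2.0, 0.0]
normalized = normalize_by_area(energy, signal)
print(normalized, trapezoid_area(energy, normalized))
```

Normalizing by area makes scans with different overall intensity directly comparable in shape, which is useful when looking for outliers with the slider.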

 %% Cell type:code id: tags:

 ``` python
 fig = plot_data(
    dataset,
    data="ref",
    ynorm="area",
    show_slide=True,
    show_i0=False,
    show_e0=False,
    show_merge=False,
 )
 ```

 %% Cell type:markdown id: tags:

 The plot function lets you dig into all the data of a single dataset. The following functions act on the data (you may go back in the workflow and plot the data again).

 %% Cell type:markdown id: tags:

 ## Energy alignment

 First set a reference group for the energy alignment

 %% Cell type:code id: tags:

 ``` python

 energy_reference_group = get_group(dataset, scanint=2, data="ref")
 set_enealign(session, energy_reference_group)
 ```

 %% Cell type:markdown id: tags:

 Now go back and use `calc_eshift=True` in the `load_data()` function
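
To illustrate what an energy shift between a scan and the reference means, here is a deliberately simple sketch: the edge position is taken as the energy of the maximum of the first derivative, and the shift is the difference between the two edge positions. This is a swapped-in simplification, not the fit performed by `famewoks`; the helper names, the synthetic arctangent edges and the sign convention (shift to be added to the scan energy) are all assumptions.

```python
import math


def edge_position(energy, mu):
    """Return the energy at the maximum of the finite-difference derivative.

    The derivative is evaluated at the midpoint of each energy interval.
    """
    midpoints = [0.5 * (energy[i] + energy[i + 1]) for i in range(len(energy) - 1)]
    dmude = [
        (mu[i + 1] - mu[i]) / (energy[i + 1] - energy[i])
        for i in range(len(energy) - 1)
    ]
    return midpoints[dmude.index(max(dmude))]


def arctan_edge(energy, e0, width=2.0):
    """Synthetic absorption edge centered at e0 (illustration only)."""
    return [0.5 + math.atan((e - e0) / width) / math.pi for e in energy]


# synthetic data: the scan edge is 1 eV above the reference edge
energy = [11850.0 + 0.5 * i for i in range(81)]
mu_ref = arctan_edge(energy, e0=11867.25)
mu_scan = arctan_edge(energy, e0=11868.25)

# assumed convention: the shift is added to the scan energy
shift = edge_position(energy, mu_ref) - edge_position(energy, mu_scan)
aligned_energy = [e + shift for e in energy]
print(shift)
```

On real, noisy data the maximum of the derivative is sensitive to noise and to the energy step, which is one reason to fit the shift rather than read it off a single point.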

 %% Cell type:markdown id: tags:

 Plot the energy shifts

 %% Cell type:code id: tags:

 ``` python
 efig = plot_eshift(session=session, dset=dataset, show_e0=False, array="dmude")
 ```

 %% Cell type:code id: tags:

 ``` python
 apply_eshift(dataset)
 ```

 %% Cell type:markdown id: tags:

 ## Data cleaning

 Set bad scans

 %% Cell type:code id: tags:

 ``` python
 set_bad_scans(dataset, scan=4)
 ```

 %% Cell type:markdown id: tags:

 Exclude the bad fluorescence channels. If `scan=None`, the `channels` are excluded for all scans.
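
The effect of excluding channels can be sketched as summing, point by point, only the channels that are not marked as bad. The helper `sum_active_channels`, the 0-based channel numbering and the counts below are hypothetical and only illustrate the idea.

```python
def sum_active_channels(channels, bad_channels=()):
    """Sum the fluorescence channels point by point, skipping the bad ones.

    `channels` is a list of per-channel count lists of equal length;
    `bad_channels` holds the (assumed 0-based) indices to exclude.
    """
    bad = set(bad_channels)
    active = [counts for index, counts in enumerate(channels) if index not in bad]
    if not active:
        raise ValueError("all channels are marked as bad")
    return [sum(values) for values in zip(*active)]


# made-up counts: channel 2 has a spike on the last point
channels = [
    [10, 12, 15],
    [11, 13, 14],
    [0, 0, 90],
    [9, 11, 16],
]
fluo_all = sum_active_channels(channels)
fluo_clean = sum_active_channels(channels, bad_channels=[2])
print(fluo_all, fluo_clean)
```

A single misbehaving channel can dominate the sum, so it is worth inspecting all channels with `data="fluos"` before deciding which ones to exclude.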

 %% Cell type:code id: tags:

 ``` python
 set_bad_fluo_channels(dataset, channels="", scan=None)
 ```

 %% Cell type:code id: tags:

 ``` python
 ```

 %% Cell type:code id: tags:

 ``` python
 ```

 %% Cell type:markdown id: tags:

 ## Merge data

 Merge the scans in the dataset

 (*NOTE* `pre_edge` and `rebin_xafs` are applied on the merged group).

 *Default parameters for the Larch `rebin_xafs()` function*:

 - `pre1`: `pre_step*int((min(energy)-e0)/pre_step)`
 - `pre2`: -30
 - `pre_step`: 2
 - `exafs1`: -15
 - `exafs2`: `max(energy)-e0`
 - `xanes_step`: `e0/25000`, rounded down to 0.05
 - `method`: centroid
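
As a worked example of the computed defaults, the sketch below evaluates `pre1`, `exafs2` and `xanes_step` from the formulas listed above. The helper `default_rebin_parameters` is hypothetical (it does not call Larch), the energy range is made up, and the lower bound of 0.05 on `xanes_step` is an assumption added so that the step can never be zero.

```python
import math


def default_rebin_parameters(energy, e0, pre_step=2.0):
    """Evaluate the energy-dependent defaults listed above (energies in eV)."""
    # int() truncates toward zero, as in the formula for pre1
    pre1 = pre_step * int((min(energy) - e0) / pre_step)
    exafs2 = max(energy) - e0
    # e0/25000 rounded down to a multiple of 0.05 (minimum 0.05: an assumption)
    xanes_step = 0.05 * max(1, math.floor(e0 / 25000 / 0.05))
    return {"pre1": pre1, "exafs2": exafs2, "xanes_step": xanes_step}


# made-up scan range around the As K edge
energy = [11666.3, 11867.0, 12867.0]
params = default_rebin_parameters(energy, e0=11867.0)
print(params)
```

For this range, `(min(energy) - e0) / pre_step` is about -100.35, which truncates to -100 and gives `pre1 = -200`; `e0/25000` is about 0.475, which rounds down to a `xanes_step` of 0.45.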

 %% Cell type:code id: tags:

 ``` python
 merge_data(dataset)
 ```

 %% Cell type:markdown id: tags:

 ## Save data

 Save the data to an Athena project file (*NOTE*: the file is overwritten each time).

 To save the rebinned spectra, use the option `save_rebinned=True`

 %% Cell type:code id: tags:

 ``` python
 save_data(dataset, session.datadir, save_rebinned=False)
 ```

 %% Cell type:markdown id: tags:

 ## Related workflows (optional)

 This section shows some examples of related workflows.

 %% Cell type:markdown id: tags:

 ### Load data for the whole experiment

 Create a session and load all samples and all datasets of the experiment.
 (*NOTE* here the `session` is created from a stored list of experimental
 sessions; otherwise, create your own `ExpSession` object as shown at the
 beginning of this notebook).

 %% Cell type:code id: tags:

 ``` python
 do_merge_and_save = False #: to merge and save the data

 _logger.setLevel("WARNING")  # change the logger level to get more/less output
 session = get_exp_session(DATAINFOS, proposer="Stellato", session="20240625")
 allsamples = search_samples(session, verbose=False)
 errors = []
 for samp in allsamples:
    alldatasets = search_datasets(session, sample=samp.name, verbose=False)
    for dset in alldatasets:
        try:
            load_data(session, dset, skip_scans=[], merge=False, calc_eshift=False)
            if do_merge_and_save:
                merge_data(dset)
                save_data(dset, session.datadir)
        except Exception:
            errors.append(dset.name)
            continue
 _logger.warning(f"Found errors in {errors}")
 ```

 %% Cell type:markdown id: tags:

 Once everything is loaded into the session, grouped actions run faster (there is no need to reload the data into memory). For example, to plot I0 together with the fluorescence sum:

 %% Cell type:code id: tags:

 ``` python
 for samp in session.samples:
    for dset in samp.datasets:
        if dset.name in errors:
            continue
        _logger.info(f"--> DATASET: {dset.name}")
        plot_data(dset, data="fluo", show_i0=True, show_slide=True)
 ```

 %% Cell type:markdown id: tags:

 ## Sandbox

 This section is simply a sandbox (messy area) for testing/development or for adding your own workflow.

 %% Cell type:code id: tags:

 ``` python
 ```

 %% Cell type:code id: tags:

 ``` python
 ```

 %% Cell type:code id: tags:

 ``` python
 ```