Commit 232591e8 authored by Mauro Rovezzi

update example notebook for the users

parent a1dbbb54
%% Cell type:markdown id: tags:
# Data reduction and evaluation workflow
- This notebook is based on a **work-in-progress project** called `famewoks` and may be affected by bugs
- `famewoks` is publicly available at [https://gitlab.esrf.fr/F-CRG/fames/famewoks](https://gitlab.esrf.fr/F-CRG/fames/famewoks)
- Subscribe to the [`fame-data-analysis@esrf.fr` mailing
list](https://sympa.esrf.fr/sympa/info/fame-data-analysis) to be kept updated
about bug fixes and new features
- Report bugs and features requests in the [famewoks issues tracker](https://gitlab.esrf.fr/F-CRG/fames/famewoks/-/issues) or by directly sending an email to [mauro.rovezzi@esrf.fr](mailto:mauro.rovezzi@esrf.fr)
- To run this notebook at BM16:
  - (production/stable) go to [http://jade:8000](http://jade:8000) and run the notebook with the `sloth (v2312)` kernel
  - (development) go to [http://bm16ctrl:8000](http://bm16ctrl:8000) and run the notebook with the `sloth` kernel
%% Cell type:markdown id: tags:
First set the main imports and global settings for the experiment (session). **NOTE** to restart the notebook, you need to restart the underlying kernel process. Simply closing the notebook window/tab does not reload it.
%% Cell type:code id: tags:
``` python
%load_ext autoreload
%autoreload 2
# Uncomment the following two lines if the plots do not show in the notebook
# import plotly.io as pio
# pio.renderers.default = "iframe"
import os
from dataclasses import asdict
from IPython.display import display
from famewoks import __version__ as wkflver
from famewoks.datamodel import ExpSession, ExpCounters, CNTS_FLUO_XMAP, CNTS_FLUO, CNTS_CAS
from famewoks.tests.datainfos import DATAINFOS, get_exp_session
from famewoks.plots import plot_data, plot_eshift
from famewoks.bliss2larch import (
search_samples,
show_samples_info,
search_datasets,
load_data,
get_scans,
get_group,
set_enealign,
apply_eshift,
set_bad_fluo_channels,
set_bad_scans,
set_bad_samples,
merge_data,
save_data,
)
from famewoks.bliss2larch import _logger
# adjust the logger level:
# "DEBUG" -> show all messages
# "INFO" -> useful messages
# "WARNING" -> warnings only
# "ERROR" -> only errors
_logger.setLevel("INFO")
# show workflow version (and famewoks branch information)
branch = os.popen("cd ~/devel/famewoks; git branch --show-current").read()[:-1]
_logger.info(f"--> Using famewoks version: {wkflver} (branch: {branch})")
```
%% Cell type:markdown id: tags:
Initialize the `ExpSession` object, which represents the whole experimental session (data and metadata), and display it
%% Cell type:code id: tags:
``` python
# you can define your own counter names or use those already defined for BM16: CNTS_FLUO_XMAP, CNTS_FLUO, CNTS_CAS
MYCNTS = ExpCounters(
ene="energy_enc",
ix=["p201_1_bkg_sub", "p201_3_bkg_sub", "p201_5_bkg_sub"], #: I0, I1, I2
fluo_roi=["xpad_roi1"], # all detector names for ROI1
fluo_corr=["xpad_roi1"], # all detector names (DT corrected)
fluo_time=[
"sec"
], # elapsed time, which is different for the spikes
time="sec", # "musst_timer"
)
# define the experimental session metadata manually
session = ExpSession(
flag=1,
datadir="/data/visitor/es1423/bm16/20240716",
proposal="es1423",
session="20240716",
proposer="Sanchez",
lc="MR, JLH",
elem="As",
edge="K",
comment="",
counters=CNTS_FLUO_XMAP,  # *NOTE* give the correct counter names here!
samples=[],
bad_samples=[],
bad_fluo_channels=None,
enealign=None,
)
# (testing) otherwise it is possible to get the session from a collection of previous experiments
#session = get_exp_session(DATAINFOS, proposer="Bertrand",session="20241022")
#session = get_exp_session(DATAINFOS, proposer="Isaure",session="20241003")
session = get_exp_session(DATAINFOS, proposer="Rimondi",session="20241001")
#session = get_exp_session(DATAINFOS, proposer="Biswas",session="20240924")
#session = get_exp_session(DATAINFOS, proposer="Sanchez",session="20240716")
#session = get_exp_session(DATAINFOS, proposer="Stellato",session="20240625")
#session = get_exp_session(DATAINFOS, proposer="Bertrand",session="20240910")
# to display the session metadata
#display(asdict(session))
```
%% Cell type:markdown id: tags:
## Users workflow
A minimal/typical workflow for the users consists of (copy/paste the relevant functions from the "full documented workflow" below):
- `search_samples`
- select a sample
- `search_datasets`
- select a dataset
- `load_data`
- `plot_data`
- remove bad channels/scans
- `save_data`
- import the Athena project in Larix and continue your data analysis
%% Cell type:code id: tags:
``` python
samples = search_samples(session)
```
%% Cell type:code id: tags:
``` python
sel_sample = 2
datasets = search_datasets(session, sample=sel_sample)
```
%% Cell type:code id: tags:
``` python
sel_dataset = 0
dataset = datasets[sel_dataset]
load_data(
session,
dataset,
use_fluo_corr=False,
iskip=1, #: ignore the first point
istrip=1, #: ignore the last point
calc_eshift=False,
merge=True,
skip_scans=[],
)
```
%% Cell type:code id: tags:
``` python
#mydatadir = None #use this for testing, to save into a temporary directory `/tmp/PROCESSED_DATA/famewoks`
mydatadir = session.datadir #use this to save into `PROCESSED_DATA/famewoks`
save_data(dataset, data=["fluo"], datadir=mydatadir, save_rebinned=False)
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
# display the session metadata
display(asdict(session))
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## Full documented workflow
Here the whole data reduction and evaluation workflow currently implemented is explained in detail. You may copy/paste elsewhere only those cells of interest to you.
%% Cell type:markdown id: tags:
### Search samples and datasets
Search for the sample names available in the given experimental session. It is possible to use the parameter `ignore_names = ["list", "of", "strings"]` to ignore sample names containing these words.
%% Cell type:code id: tags:
``` python
samples = search_samples(session, ignore_names=["rack", "mount", "align", "bl_", "sample"])
```
%% Cell type:markdown id: tags:
If you want to skip some samples, mark them as bad (`flag = 0`):
%% Cell type:code id: tags:
``` python
set_bad_samples(session, [15])
```
%% Cell type:markdown id: tags:
To show all the samples, including those marked as bad (`flag = 0`):
%% Cell type:code id: tags:
``` python
show_samples_info(session, all=True)
```
%% Cell type:markdown id: tags:
Select a sample and search for the datasets. The function shows the index of each dataset and the number of XAS scans available.
%% Cell type:code id: tags:
``` python
sel_sample = 2
datasets = search_datasets(session, sample=sel_sample)
```
%% Cell type:markdown id: tags:
It is also possible to search/load the datasets for all samples:
%% Cell type:code id: tags:
``` python
session = search_datasets(session, verbose=False)
```
%% Cell type:markdown id: tags:
### Load data
Select a dataset and load the scan data into the session (i.e. load from disk into memory).
**Parameters for `load_data()`**
- `skip_scans`: the scans that are not going to be loaded (e.g. bad scans), it can be a list of numbers `[1,2,3]` or a string `"1:4, 7"`
- `use_fluo_corr`: if True, it uses the dead-time corrected fluorescence channels (**NOTE** this correction usually fails at low count rates; check with/without correction to see which gives the lower noise)
- `iskip`: the index of the initial data points to skip (None)
- `istrip`: the relative index with respect to the last data points to strip (None)
- `merge`: to automatically merge the scans in a dataset (True)
- `calc_eshift`: fit the energy shift using the first scan of the dataset as reference (*NOTE* this slows down the loading)
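As background to the `use_fluo_corr` option, a standard non-paralyzable dead-time correction has the form `true_rate = measured_rate / (1 - measured_rate * tau)`. The sketch below illustrates only this generic formula; it is **not** the famewoks implementation, and the dead time `tau` is an assumed value:

``` python
# Generic non-paralyzable dead-time correction (illustration only,
# NOT the famewoks implementation; tau is an assumed dead time).
def deadtime_correct(measured_rate, tau):
    """Return the estimated true count rate given the measured one."""
    return measured_rate / (1.0 - measured_rate * tau)

# at 100 kcounts/s with tau = 1 us, ~10% of the counts are recovered
print(deadtime_correct(100000.0, 1e-6))
```

At low count rates the correction factor approaches 1 and mostly amplifies noise, which is why comparing the corrected and uncorrected channels is recommended.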
%% Cell type:code id: tags:
``` python
sel_dataset = 0
dataset = datasets[sel_dataset]
load_data(
session,
dataset,
use_fluo_corr=False,
iskip=1, #: ignore the first point
istrip=1, #: ignore the last point
calc_eshift=True,
merge=True,
skip_scans=[],
)
```
%% Cell type:markdown id: tags:
### Data plotter
Plot the data loaded in the dataset
**Parameters for `plot_data()`**
- `data` can be:
- `"fluos"`: to show all fluorescence channels
- `"fluo"`: sum of active fluorescence channels (use `set_bad_fluo_channels()` for excluding bad ones)
- `"trans"`: sample transmission (*muT1*)
- `"ref"`: reference "foil" transmission (*muT2*)
- `ynorm`: *None*, `area` (shows y data normalized by their area), `flat` (show flattened) or *True* (show normalized)
- `show_slide`: if True shows one scan at time with a slider
- `show_i0`: *True* shows I0 signal (*NOTE* for `data = "ref"` it is I1 signal)
- `show_e0`: *True* shows E0 (as found by the `pre_edge()` function of Larch)
- `show_deriv`: *True* shows the derivative of the signal
- `show_merge`: *True* shows the merged signal (sum of the channels for the current scan)
- if `"rebin"` it shows the rebinned version of the merge (*NOTE* the single scans are never rebinned, as they are meant to be merged (and then rebinned))
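Conceptually, `ynorm="area"` divides each spectrum by its integrated area so that curves with different absolute intensities become comparable. A minimal pure-Python sketch of the idea (hypothetical helper names, not the famewoks code):

``` python
# Sketch of area normalization (illustration only, not famewoks code).
def trapezoid_area(x, y):
    """Trapezoidal integral of y over the grid x."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )

def area_normalize(x, y):
    """Divide y by its integrated area over x."""
    area = trapezoid_area(x, y)
    return [v / area for v in y]

energy = [11800.0 + i for i in range(201)]  # dummy 1 eV grid
mu = [2.0] * len(energy)                    # flat dummy spectrum
mu_norm = area_normalize(energy, mu)
print(trapezoid_area(energy, mu_norm))  # ~1.0 by construction
```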
%% Cell type:code id: tags:
``` python
fig = plot_data(
dataset,
data="fluo",
ynorm="area",
show_slide=True,
show_i0=False,
show_e0=False,
show_deriv=False,
show_merge=False,
)
```
%% Cell type:markdown id: tags:
The plot function permits digging into the whole data for a single dataset. The following functions permit acting on the data (you may go back in the workflow and plot the data again).
%% Cell type:markdown id: tags:
### Energy alignment
First set a reference group for the energy alignment
%% Cell type:code id: tags:
``` python
energy_reference_group = get_group(dataset, scanint=1, data="ref")
set_enealign(session, energy_reference_group)
```
%% Cell type:markdown id: tags:
Now go back and use `calc_eshift=True` in the `load_data()` function
%% Cell type:markdown id: tags:
Plot the energy shifts
%% Cell type:code id: tags:
``` python
efig = plot_eshift(session=session, dset=dataset, show_e0=False, array="dmude")
```
%% Cell type:code id: tags:
``` python
apply_eshift(dataset)
```
%% Cell type:markdown id: tags:
### Data cleaning
Set bad scans. The `scans` variable can be a list of integers or a string that is interpreted (*NOTE* always put spaces after commas). If `scans=None`, all scans are enabled.
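To illustrate how a scan-selection string such as `"1:4, 7"` could be expanded into a list of scan numbers, here is a hypothetical helper (the actual famewoks parser may interpret the syntax differently; ranges are assumed inclusive):

``` python
# Hypothetical expansion of a scan string like "1:4, 7" (assumed
# inclusive ranges); not the actual famewoks parser.
def expand_scan_string(scans):
    out = []
    for token in scans.split(","):
        token = token.strip()
        if ":" in token:
            start, stop = token.split(":")
            out.extend(range(int(start), int(stop) + 1))
        elif token:
            out.append(int(token))
    return out

print(expand_scan_string("1:4, 7"))  # → [1, 2, 3, 4, 7]
```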
%% Cell type:code id: tags:
``` python
set_bad_scans(dataset, scans="1, 10")
#set_bad_scans(dataset, scans=None) #: all marked as good
```
%% Cell type:markdown id: tags:
Exclude the bad fluorescence channels. If `scan=None`, the `channels` are excluded for all scans.
%% Cell type:code id: tags:
``` python
set_bad_fluo_channels(dataset, channels="", scan=None)
```
%% Cell type:markdown id: tags:
### Merge data
Merge the scans in the dataset. `pre_edge` and `rebin_xafs` are applied on the merged group.
*Default parameters for the Larch `rebin_xafs()` function*:
- `pre1`: `pre_step * int((min(energy) - e0) / pre_step)`
- `pre2`: -30
- `pre_step`: 2
- `exafs1`: -15
- `exafs2`: `max(energy) - e0`
- `xanes_step`: `e0/25000`, rounded down to 0.05
- `method`: centroid
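As a worked example of the formulas above, assuming an As K edge at `e0 = 11867` eV and a first energy point 150 eV below the edge (both numbers are illustrative, not famewoks output):

``` python
import math

# Worked example of the default rebin grid parameters (assumed e0 and
# energy range; illustration of the formulas, not famewoks output).
e0 = 11867.0
pre_step = 2.0
emin = e0 - 150.0  # assumed first energy point

pre1 = pre_step * int((emin - e0) / pre_step)
# e0/25000 rounded down to the nearest 0.05
xanes_step = round(math.floor(e0 / 25000.0 / 0.05) * 0.05, 2)

print(pre1)        # → -150.0
print(xanes_step)  # → 0.45
```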
%% Cell type:code id: tags:
``` python
merge_data(dataset)
```
%% Cell type:markdown id: tags:
### Save data
Save the data to an Athena project file (*NOTE*: the files are overwritten each time). To save the rebinned spectra, use the option `save_rebinned=True`. The data channel should be specified, e.g. `data = ["fluo", "ref", "trans"]`. An Athena project for each data channel is created. If you want to change the scans saved in the Athena project, simply use the `set_bad_scans()` function (first enable all scans and then select those to export, see example below).
%% Cell type:code id: tags:
``` python
set_bad_scans(dataset, scans=None) #: all marked as good
set_bad_scans(dataset, scans="1, 10")
save_data(dataset, data=["ref"], datadir=None, save_rebinned=False)
set_bad_scans(dataset, scans=None) #: all marked as good
set_bad_scans(dataset, scans="2:9")
save_data(dataset, data=["fluo"], datadir=None, save_rebinned=False)
```
%% Cell type:markdown id: tags:
## Related workflows (optional)
This section shows some examples of related workflows.
%% Cell type:markdown id: tags:
### Load data for the whole experiment
Create a session and load all samples and all datasets of the experiment.
(*NOTE* here the `session` is created from a stored list of experimental
sessions, otherwise create your own `ExpSession` object as shown at the
beginning of this notebook).
%% Cell type:code id: tags:
``` python
do_merge_and_save = False #: to merge and save the data
_logger.setLevel("WARNING") # change the logger level to get more/less output
session = get_exp_session(DATAINFOS, proposer="Stellato", session="20240625")
allsamples = search_samples(session, verbose=False)
errors = []
for samp in allsamples:
alldatasets = search_datasets(session, sample=samp.name, verbose=False)
for dset in alldatasets:
try:
load_data(session, dset, skip_scans=[], merge=False, calc_eshift=False)
if do_merge_and_save:
merge_data(dset)
save_data(dset, datadir=session.datadir)
except Exception:
errors.append(dset.name)
continue
_logger.warning(f"Found errors in {errors}")
```
%% Cell type:markdown id: tags:
Once everything is loaded into the session, it is possible to perform grouped actions faster (no need to reload the data into memory). For example, to plot I0 with the fluorescence sum:
%% Cell type:code id: tags:
``` python
for samp in session.samples:
for dset in samp.datasets:
if dset.name in errors:
continue
_logger.info(f"--> DATASET: {dset.name}")
plot_data(dset, data="fluo", show_i0=True, show_slide=True)
```
%% Cell type:markdown id: tags:
### Using the main experiment controller
**WARNING**: This feature is still under development
%% Cell type:code id: tags:
``` python
from famewoks.controller import ExpSessionController
exp = ExpSessionController(session)
exp.build_data_tree()
```
%% Cell type:code id: tags:
``` python
exp.samples
```
%% Cell type:code id: tags:
``` python
from famewoks.bliss2larch import show_datasets_info, show_samples_info
#show_datasets_info(exp.session.samples[2])
show_samples_info(exp.session, all=False, show_datasets=True)
```
%% Cell type:code id: tags:
``` python
exp.session.samples[2].datasets[3].scans[0]
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
## Sandbox
This section is simply a sandbox (messy area) for testing/development or for adding your own workflow.
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:code id: tags:
``` python
```