Commit 5dfb3fd2 authored by payno

doc.tutorials.publish_processed_data_to_data_portal: add a warning about esrf mounting points (like /gpfs/easy)
parent 10c97ac8
Pipeline #207308 passed
%% Cell type:markdown id:287b637a-4a01-4c11-81b8-68404d4c0727 tags:
# How to publish a reconstructed volume to the data portal?
This tutorial explains how to publish a volume reconstructed by nabu to the ESRF data portal.
The ESRF data portal catalog is now based on DRAC, the successor of ICAT. Because the switch is recent, both names are still in use; in this tutorial, ICAT and DRAC refer to the same system.
%% Cell type:markdown id:dae0f793-0a6b-440c-8bae-d6781b144bad tags:
## DRAC processed dataset
There are two types of datasets in DRAC: raw datasets (published automatically by Bliss-tomo) and processed datasets (the ones we will publish in this tutorial).
A **DRAC processed dataset** is related to:
* one or several raw datasets (usually one, but there can be several, for example in the case of a stitching) (*a*)
* a set of metadata keys (voxel size, phase retrieval options, etc.) (*b*)
* one beamline (*c*)
* one proposal (*d*)
* one dataset (*e*)
* one folder (the folder containing the reconstructed volume) (*f*)
%% Cell type:markdown id:868b2c98-0bb3-43de-b0ce-c14daa2fbbd2 tags:
## Retrieve all the data needed for ICAT
%% Cell type:markdown id:a7254799-34f7-474e-9674-9a02874cd1ba tags:
### (*a*) `raw` parameter
Path(s) to the raw dataset(s) from which the processed dataset was produced. It should be a tuple, even when it contains a single element.
You can get the original dataset path from an instance of `TomoScanBaseInstance` by calling `get_bliss_original_files()`.
**Warning**: this path can contain a '/mnt/multipath-shares' prefix that must not be passed to ICAT/DRAC. To remove it, you can use the `from_bliss_original_file_to_raw` helper function.
``` python
from tomoscan.esrf.scan.utils import from_bliss_original_file_to_raw
```
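
As an illustration of the prefix clean-up described above, here is a minimal sketch in plain Python. It does not reproduce the tomoscan helper: `strip_mount_prefix` and `build_raw` are hypothetical names, and the example path is made up. The sketch only shows the idea of removing the '/mnt/multipath-shares' prefix and wrapping the result in a tuple.

```python
# Illustrative sketch only: hypothetical helpers, not the tomoscan implementation.
MULTIPATH_PREFIX = "/mnt/multipath-shares"


def strip_mount_prefix(path: str, prefix: str = MULTIPATH_PREFIX) -> str:
    """Return `path` without the leading mount prefix, if present."""
    # Only strip when the prefix is a complete leading path component.
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return path


def build_raw(paths) -> tuple:
    """Build the `raw` tuple from one path (str) or from several paths."""
    if isinstance(paths, str):
        paths = (paths,)
    return tuple(strip_mount_prefix(p) for p in paths)


# Hypothetical path, for illustration only.
raw = build_raw("/mnt/multipath-shares/data/visitor/proposal/beamline/sample")
```

In real code, prefer the helper shipped with tomoscan, which may handle more cases than this sketch.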
%% Cell type:markdown id:f0b764b9-93bd-46f5-bdda-ae93f1a3af40 tags:
### (*b*) `metadata` parameter
The metadata to be published to ICAT can be obtained from an instance of `VolumeBase` by calling the `build_drac_metadata` function.
For example, for an `HDF5Volume` you can have:
``` python
from tomoscan.esrf.volume import HDF5Volume

# file_path and data_path locate the reconstructed volume
volume = HDF5Volume(
    file_path=...,
    data_path=...,
)
drac_metadata = volume.build_drac_metadata()
```
Note: there is a tutorial on volumes for more information.
**Warning**: at the moment, the DRAC metadata does not contain the 'Sample_name' field, which is mandatory (without it, no processing will be done). You need to add it yourself:
``` python
drac_metadata["Sample_name"] = ...
```
The sample name can be obtained from the `TomoScanBaseInstance` through `scan.sample_name`.
*Note*: Available DRAC keys are defined [here](https://gitlab.esrf.fr/icat/hdf5-master-config/-/blob/master/hdf5_cfg.xml?ref_type=heads) (see `Tomo` group, `reconstruction` section).
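
Because 'Sample_name' is mandatory, it can help to fail early when it is missing. The following is a minimal sketch under stated assumptions: `with_sample_name` is a hypothetical helper (it is not part of tomoscan or pyicat_plus), and the metadata key used in the example is a placeholder, not a real DRAC key.

```python
# Illustrative sketch only: `with_sample_name` is a hypothetical helper.
def with_sample_name(drac_metadata: dict, sample_name: str) -> dict:
    """Return a copy of the metadata with the mandatory 'Sample_name' set."""
    if not sample_name:
        raise ValueError("'Sample_name' is mandatory and cannot be empty")
    completed = dict(drac_metadata)  # shallow copy; the input is left untouched
    completed["Sample_name"] = sample_name
    return completed


# Placeholder metadata: "placeholder_key" is not a real DRAC key.
metadata = with_sample_name({"placeholder_key": "value"}, "my_sample")
```

In practice the first argument would be the dictionary returned by `build_drac_metadata()` and the second the value of `scan.sample_name`.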
%% Cell type:markdown id:2c10b63a-b52e-4cff-8dea-e3b6f9105f74 tags:
### (*c*) `beamline` parameter
This is the name of the beamline in lower case, for example 'bm05' or 'bm18'.
%% Cell type:markdown id:c7ceb617-b64f-46c9-bfa4-8fa888f50b6b tags:
### (*d*) `proposal` parameter
Name of the proposal.
%% Cell type:markdown id:45a211a4-e203-4517-8edb-1e7515beab01 tags:
### (*e*) `dataset` parameter
Name of the dataset. This is the (processed) dataset in the DRAC context.
At the DRAC level, the dataset name is combined with the folder path to form a key, which must be unique.
The default value we propose is 'reconstructed_volumes'.
%% Cell type:markdown id:0e6ce848-cc68-47bf-bb4f-65083bbf789c tags:
### (*f*) `path` parameter
This is the path to the folder containing the volume reconstructed by Nabu.
**Warning 1**: the path must be cleaned of any ESRF mounting point such as '/mnt/multipath-shares' or '/gpfs/easy'. If needed, you can use `filter_esrf_mounting_points` from `tomoscan.esrf.scan.utils`.
**Warning 2**: all files contained in this folder will be published to ICAT. There is no mechanism to publish a single file or a subset of files.
Here is the recommended structure if path == 'reconstructed_volumes' and for an HDF5 reconstruction:
```
reconstructed_volumes
|
|------ nabu_rec.hdf5                              - nabu reconstructed volume master file (1)
|------ nabu_rec
|         |---------- nabu_rec_0000_0256.hdf5      - nabu reconstructed volume sub file 1
|------ gallery                                    - gallery related to the processed dataset (2)
|         |------ screenshot_1.png
|         |------ screenshot_2.png
|------ nabu_cfg_files                             - folder containing nabu configuration files (3)
          |------ nabu_config.cfg
```
(1) The Nabu reconstruction. It can be replaced by a folder containing a volume stored as .tiff files.
(2) **Optional**. A set of images (.png or .jpg) linked to the reconstructed volume, for example 3 slices along each axis.
(3) nabu_cfg_files: location of the configuration used to obtain the volume(s). In the future, it should be used to reprocess a volume.
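
Since every file in the folder gets published, it can be useful to list the folder content before publishing. The sketch below uses only the standard library. `files_to_publish` and `make_example_layout` are hypothetical helpers, the example files are empty placeholders created in a temporary directory, and the recursive walk assumes that sub-folders are published too.

```python
import tempfile
from pathlib import Path


def files_to_publish(folder) -> list:
    """List every file under `folder`, relative to it, in sorted order."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


def make_example_layout(root: Path) -> None:
    """Create empty placeholder files following the recommended structure."""
    for relative in (
        "nabu_rec.hdf5",
        "nabu_rec/nabu_rec_0000_0256.hdf5",
        "gallery/screenshot_1.png",
        "gallery/screenshot_2.png",
        "nabu_cfg_files/nabu_config.cfg",
    ):
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


# Build the layout in a temporary directory and list what it contains.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "reconstructed_volumes"
    make_example_layout(folder)
    listing = files_to_publish(folder)
    for name in listing:
        print(name)
```

Any unexpected entry in the listing (temporary files, logs, unrelated volumes) is a hint that the folder should be cleaned before publication.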
%% Cell type:markdown id:28d6a1e3-2616-4112-a90d-736d6793ddd5 tags:
## Publication to DRAC / ICAT
To publish a **processed dataset** to ICAT, we use [pyicat_plus](https://gitlab.esrf.fr/icat/pyicat-plus).
%% Cell type:markdown id:eee28397-a05f-4aaa-a748-5bfa2d3e7944 tags:
### Instantiate the `IcatClient`
``` python
from pyicat_plus.client.main import IcatClient

icat_client = IcatClient(
    metadata_urls=("bcu-mq-01.esrf.fr:61613", "bcu-mq-02.esrf.fr:61613")
)
```
%% Cell type:markdown id:047debaa-bac2-431a-8c99-d1411aa73c71 tags:
### Publish to ICAT
``` python
icat_client.store_processed_data(
    raw=raw,  # (a) tuple of raw dataset paths
    metadata=drac_metadata,  # (b) must contain 'Sample_name'
    beamline="id16a",  # (c)
    proposal=proposal,  # (d)
    dataset="reconstructed_volumes",  # (e)
    path=path,  # (f) folder containing the reconstructed volume
)
```
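
To tie parameters (*a*) to (*f*) together, the keyword arguments can be assembled and checked before the call. This is a minimal sketch, not part of pyicat_plus: `build_store_kwargs` is a hypothetical helper, and every value in the example (paths, proposal, sample name) is a placeholder. Only the keyword names come from the call shown in this tutorial.

```python
# Illustrative sketch only: `build_store_kwargs` is a hypothetical helper.
def build_store_kwargs(raw, metadata, beamline, proposal, path,
                       dataset="reconstructed_volumes") -> dict:
    """Assemble the keyword arguments used by `store_processed_data`."""
    if isinstance(raw, str):
        raw = (raw,)  # (a) a single path still has to be given as a tuple
    if "Sample_name" not in metadata:
        raise ValueError("metadata must contain the mandatory 'Sample_name'")
    return {
        "raw": tuple(raw),             # (a)
        "metadata": dict(metadata),    # (b)
        "beamline": beamline.lower(),  # (c) beamline name in lower case
        "proposal": proposal,          # (d)
        "dataset": dataset,            # (e)
        "path": path,                  # (f)
    }


# Placeholder values, for illustration only.
kwargs = build_store_kwargs(
    raw="/data/visitor/proposal/beamline/sample",
    metadata={"Sample_name": "my_sample"},
    beamline="ID16A",
    proposal="proposal",
    path="/data/visitor/proposal/beamline/reconstructed_volumes",
)
# With a real client: icat_client.store_processed_data(**kwargs)
```

Keeping the checks in one place means a missing 'Sample_name' or a bare string for `raw` is caught before anything is sent to the catalog.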
%% Cell type:code id:1d4bee56-b109-4f38-9b34-66171c45b661 tags:
``` python
```