
Resolve "Dataset metadata without configuration"

Wout De Nolf requested to merge 2976-dataset-metadata-without-configuration into master

Closes #2976 (closed)

Simplify the dataset metadata API and configuration (i.e. ICAT metadata)

Full documentation can be found here:

  • doc/docs/data/config_data_policy.md
  • doc/docs/data/data_dataset_metadata.md

Although the aim was to remove the configuration entirely, a minimal configuration is still needed. It is much simpler than before, though.

Here are the most important changes:

The metadata object

Each Bliss session has a bliss.icat.metadata.ICATmetadata object which is used by Bliss to gather metadata from controllers that implement the HasMetadataForDataset protocol. It can also be used to browse the ICAT definitions from the shell with auto completion.

Configuration

The entire configuration of ICATmetadata is optional.

- class: Session
  name: demo_session
  setup-file: ./demo_session_setup.py
  config-objects: 
    ...
  icat-metadata:
    definitions: "https://gitlab.esrf.fr/icat/hdf5-master-config/-/raw/master/hdf5_cfg.xml"  # optional
    default:
      secondary_slit: $secondary_slits
      sample.positioners: [$sy, $sz]
      variables: $sx
      optics.positioners: [$robx, $roby]
      detector05: $lima_simulator
      detector06: $beamviewer
      detector07: $fluo_diode.counter
      detector08: $diode1
      detector09: $diode2
      attenuator01: $att2
    techniques:
      TOMO:
        detector01: $tomocam
      XRD:
        detector02: $diffcam
        attenuator01: $att1
      FLUO:
        detector03: $mca1  # metadata group provided by `HasMetadataForDataset.get_metadata()` of controller `mca1`
        detector04.name: $mca2.name  # metadata field provided by the `name` attribute of controller `mca2`

The keys secondary_slit, sample.positioners, etc. correspond to "metadata groups" and must be chosen from a fixed list (see "Available configuration keys" below).

There are several reasons why you would want to specify a controller explicitly under icat-metadata:

  • the controller is not listed in config-objects
  • the controller does not have any default metadata groups (HasMetadataForDataset.dataset_metadata_groups() == list())
  • the metadata groups are different from the default (e.g. ["secondary_slit"] instead of ["slits"])
  • you want to select specific controller attributes as metadata
  • the controller only needs to be included for a specific technique

The metadata of a dataset is a combination of metadata from:

  • the controllers under default
  • optionally one or more techniques (see SCAN_SAVING.dataset.techniques)
  • controllers in the BLISS session that are not specified explicitly and with HasMetadataForDataset.dataset_metadata_groups() != list()
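The combination described above can be illustrated with a minimal sketch. This is not the BLISS implementation: `collect_metadata_sources` is a hypothetical helper, controllers are represented by plain name strings, and two behaviours are assumptions rather than statements from this merge request: that a technique entry overrides a `default` entry with the same key, and that a controller named anywhere under `icat-metadata` counts as "specified explicitly".

```python
from typing import Dict, Iterable, List


def collect_metadata_sources(
    default: Dict[str, str],
    techniques: Dict[str, Dict[str, str]],
    active_techniques: Iterable[str],
    implicit_groups: Dict[str, List[str]],
) -> Dict[str, str]:
    """Map each metadata key to the name of the controller that provides it.

    Hypothetical helper for illustration only.
    `implicit_groups` maps a session controller name to the result of its
    dataset_metadata_groups() call.
    """
    # 1. Controllers listed under "default" always contribute.
    sources = dict(default)

    # 2. Controllers of the active techniques are added on top.
    #    ASSUMPTION: a technique entry replaces a default entry with the same key.
    for technique in active_techniques:
        sources.update(techniques.get(technique, {}))

    # ASSUMPTION: "specified explicitly" means named under default or any technique.
    explicit = set(default.values())
    for mapping in techniques.values():
        explicit.update(mapping.values())

    # 3. Remaining session controllers contribute their own default groups,
    #    provided they declare at least one group.
    for controller, groups in implicit_groups.items():
        if controller in explicit or not groups:
            continue
        for group in groups:
            sources.setdefault(group, controller)
    return sources


sources = collect_metadata_sources(
    default={"secondary_slit": "secondary_slits", "attenuator01": "att2"},
    techniques={
        "TOMO": {"detector01": "tomocam"},
        "XRD": {"detector02": "diffcam", "attenuator01": "att1"},
    },
    active_techniques=["XRD"],
    implicit_groups={
        "primary_slits": ["primary_slit"],  # not configured, has default groups
        "secondary_slits": ["slits"],       # configured explicitly, so skipped here
        "sx": [],                           # no default groups, so skipped
    },
)
print(sources)
```

In this sketch `secondary_slits` only appears under the explicitly configured key `secondary_slit`, and `tomocam` is left out because the TOMO technique is not active.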

Browse ICAT definitions

The metadata object bliss.icat.metadata.ICATmetadata has a definitions namespace for browsing all available ICAT fields, with auto-completion in the Bliss shell.

Description of a single field

To show all information (description, type, ...) of a single ICAT field (one ICAT database entry):

DEMO_SESSION [1]: demo_session.icat_metadata.definitions.instrument.detector01.elapsed_time
         Out [1]: IcatField(name='elapsed_time', field_name='InstrumentDetector01_elapsed_real_time', parent='instrument.detector01', nxtype='NX_FLOAT', description='Time elapsed between start and stop of the measurement', units=None)

To get/set the value of this field:

DEMO_SESSION [2]: SCAN_SAVING.dataset.metadata.instrument.detector01.elapsed_time = 2
DEMO_SESSION [3]: print(SCAN_SAVING.dataset.metadata.instrument.detector01.elapsed_time)
2

Description of a group for HasMetadataForDataset controllers

Show all available fields that can be returned by dataset_metadata of a controller that implements the HasMetadataForDataset protocol (for example bliss.controllers.motors.slits.Slits):

 DEMO_SESSION [5]: demo_session.icat_metadata.definitions.instrument.primary_slit
          Out [5]: Namespace contains:
                  .name              = IcatField(name='name', field_name='InstrumentSlitPrimary_name', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .vertical_gap      = IcatField(name='vertical_gap', field_name='InstrumentSlitPrimary_vertical_gap', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .vertical_offset   = IcatField(name='vertical_offset', field_name='InstrumentSlitPrimary_vertical_offset', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .horizontal_gap    = IcatField(name='horizontal_gap', field_name='InstrumentSlitPrimary_horizontal_gap', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .horizontal_offset = IcatField(name='horizontal_offset', field_name='InstrumentSlitPrimary_horizontal_offset', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .blade_up          = IcatField(name='blade_up', field_name='InstrumentSlitPrimary_blade_up', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .blade_down        = IcatField(name='blade_down', field_name='InstrumentSlitPrimary_blade_down', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .blade_front       = IcatField(name='blade_front', field_name='InstrumentSlitPrimary_blade_front', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)
                  .blade_back        = IcatField(name='blade_back', field_name='InstrumentSlitPrimary_blade_back', parent='instrument.primary_slit', nxtype='NX_CHAR', description=None, units=None)

A controller's dataset_metadata should therefore return a dictionary with the keys name, vertical_gap, vertical_offset, ...
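As a rough illustration of what such a controller could look like, here is a self-contained sketch. `DemoSlits` is a hypothetical stand-in that does not import BLISS; the method names `dataset_metadata` and `dataset_metadata_groups` are taken from the text above, but their exact signatures and the choice of the group name `primary_slit` are assumptions. The set of allowed field names is copied from the shell listing above.

```python
# Field names of instrument.primary_slit, copied from the listing above.
PRIMARY_SLIT_FIELDS = {
    "name",
    "vertical_gap",
    "vertical_offset",
    "horizontal_gap",
    "horizontal_offset",
    "blade_up",
    "blade_down",
    "blade_front",
    "blade_back",
}


class DemoSlits:
    """Hypothetical stand-in for a controller providing dataset metadata.

    A real controller would implement the HasMetadataForDataset protocol;
    the signatures below are assumed, not taken from the BLISS source.
    """

    def __init__(self, name, vertical_gap, vertical_offset,
                 horizontal_gap, horizontal_offset):
        self.name = name
        self.vertical_gap = vertical_gap
        self.vertical_offset = vertical_offset
        self.horizontal_gap = horizontal_gap
        self.horizontal_offset = horizontal_offset

    def dataset_metadata_groups(self):
        # ASSUMPTION: this controller fills the "primary_slit" group by default.
        return ["primary_slit"]

    def dataset_metadata(self):
        # Keys must be field names of the target metadata group.
        return {
            "name": self.name,
            "vertical_gap": self.vertical_gap,
            "vertical_offset": self.vertical_offset,
            "horizontal_gap": self.horizontal_gap,
            "horizontal_offset": self.horizontal_offset,
        }


slits = DemoSlits("primary_slits", 0.5, 0.0, 1.0, 0.1)
metadata = slits.dataset_metadata()

# Every returned key should be a known field of the group.
unknown = set(metadata) - PRIMARY_SLIT_FIELDS
print(metadata, unknown)
```

A controller does not have to provide every field of the group in this sketch; it only has to avoid keys that the group does not define.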

Available configuration keys

To show all available keys for the configuration, use demo_session.icat_metadata.available_icat_groups (for controllers) or demo_session.icat_metadata.available_icat_fields (for controller attributes):

DEMO_SESSION [6]: print(demo_session.icat_metadata.available_icat_groups)

    ['SAXS',
    'MX',
    'EM',
    'PTYCHO',
    'PTYCHO.Axis1',
    'PTYCHO.Axis2',
    'FLUO',
    'FLUO.measurement',
    'TOMO',
    'MRT',
    'HOLO',
    'WAXS',
    'sample',
    'sample.notes',
    'sample.positioners',
    'sample.patient',
    'sample.environment',
    'sample.environment.sensors',
    'instrument',
    'instrument.variables',
    'instrument.positioners',
    'instrument.monochromator',
    'instrument.monochromator.crystal',
    'instrument.source',
    'instrument.primary_slit',
    'instrument.secondary_slit',
    ...
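A list like this could be used to catch typos in the `icat-metadata` configuration. The sketch below is not part of BLISS: `unknown_group_keys` is a hypothetical helper, the list of groups is a hand-copied subset of the output above, and it assumes that a configuration key may omit a leading `instrument.` prefix, which the configuration example suggests (`secondary_slit`, `variables`) but the text does not state.

```python
from typing import Iterable, List


def unknown_group_keys(config_keys: Iterable[str],
                       available_groups: Iterable[str]) -> List[str]:
    """Return the configuration keys that match no available metadata group.

    Hypothetical helper for illustration only.
    """
    available = set(available_groups)
    unknown = []
    for key in config_keys:
        # ASSUMPTION: "secondary_slit" is shorthand for "instrument.secondary_slit".
        if key in available or f"instrument.{key}" in available:
            continue
        unknown.append(key)
    return unknown


# Subset of the groups printed above (the real list is longer).
available_groups = [
    "sample",
    "sample.positioners",
    "instrument",
    "instrument.variables",
    "instrument.positioners",
    "instrument.primary_slit",
    "instrument.secondary_slit",
]

config_keys = ["secondary_slit", "sample.positioners", "variables", "secondry_slit"]
bad_keys = unknown_group_keys(config_keys, available_groups)
print(bad_keys)
```

In a Bliss shell the second argument would presumably be `demo_session.icat_metadata.available_icat_groups` rather than a hand-written list.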
Edited by Wout De Nolf
