Segfault on init_device
For the controller and receiver devices we call init_device in the constructor, which sometimes segfaults:
https://gitlab.esrf.fr/bliss/bliss/-/jobs/2125316
[runner-f8unlx2w3-project-325-concurrent-5:9495 :0:9495] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x8)
==== backtrace (tid: 9495) ====
0 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libucs.so.0(ucs_handle_error+0x2fd) [0x709842da584d]
1 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libucs.so.0(+0x2fa3f) [0x709842da5a3f]
2 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libucs.so.0(+0x2fc0a) [0x709842da5c0a]
3 /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x709843ff9420]
4 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango12AttrPropertyC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_+0x27) [0x7098449fd977]
5 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZNSt6vectorIN5Tango12AttrPropertyESaIS1_EE17_M_realloc_appendIJRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESB_EEEvDpOT_+0xa4) [0x7098449fea34]
6 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango19MultiClassAttribute20init_class_attributeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEl+0xba9) [0x7098449ff709]
7 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango7DServer11init_deviceEv+0x315) [0x709844a81755]
8 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango12DServerClass14device_factoryEPKNS_17DevVarStringArrayE+0xb7) [0x709844a8e1f7]
9 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango12DServerClassC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x3b1) [0x709844a8e931]
10 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango12DServerClass4initEv+0x94) [0x709844a8ecd4]
11 /opt/conda/envs/bliss_lima2_simulator/bin/../lib/libtango.so.10.1(_ZN5Tango4Util11server_initEb+0x3f) [0x709844bb315f]
12 lima2_tango(main+0xf4c) [0x56c11394166c]
13 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x709843b69083]
14 lima2_tango(+0x22c9e) [0x56c113941c9e]
=================================
Cleaned up, the backtrace looks like this:
==== backtrace (tid: 9495) ====
0 libucs.so.0(ucs_handle_error+0x2fd)
1 libucs.so.0(+0x2fa3f)
2 libucs.so.0(+0x2fc0a)
3 libpthread.so.0(+0x14420)
4 libtango.so.10.1(Tango::AttrProperty::AttrProperty(...) +0x27)
5 libtango.so.10.1(std::vector<Tango::AttrProperty>::_M_realloc_append(...) +0xa4)
6 libtango.so.10.1(Tango::MultiClassAttribute::init_class_attribute(...) +0xba9)
7 libtango.so.10.1(Tango::DServer::init_device(...) +0x315)
8 libtango.so.10.1(Tango::DServerClass::device_factory(...) +0xb7)
9 libtango.so.10.1(Tango::DServerClass::DServerClass(...) +0x3b1)
10 libtango.so.10.1(Tango::DServerClass::init(...) +0x94)
11 libtango.so.10.1(Tango::Util::server_init(...) +0x3f)
12 lima2_tango(main+0xf4c)
13 libc.so.6(__libc_start_main+0xf3)
14 lima2_tango(+0x22c9e)
It seems that init_device does a lot of work, some of which may run too early when it is called from the constructor. Could that be what segfaults?
Edit: Reynald confirmed that calling init_device in the constructor and delete_device in the destructor is the normal thing to do.
https://gitlab.esrf.fr/limagroup/lima2/-/blob/develop/tango/include/lima/tango/control.inl#L27
For example:
attribute_lock_guard lock(this->get_device_attr()->get_attr_by_name("acq_state"));
this->push_change_event("acq_state", dev_state, 1, 0, true);
If get_device_attr() or get_attr_by_name(...) returns nullptr, attribute_lock_guard will dereference it and blow up.
Also, the callback (registered with m_ctrl->register_on_state_change(...)) can fire from another thread at any time. If it runs before Tango has finished setting up the attributes, it can crash.
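One possible mitigation for the early-callback case, as a minimal sketch only: gate the callback behind a readiness flag that is set at the very end of init_device and cleared at the start of delete_device. Everything here is hypothetical and not taken from the Lima2 sources: GuardedDevice, mark_ready, mark_not_ready and the recorded vector are stand-ins, and the place where the real code would take attribute_lock_guard and call push_change_event is only marked with a comment. Whether this is related to the backtrace above has not been verified.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the Tango device. It records state changes
// instead of pushing real Tango events, so the sketch compiles on its own.
class GuardedDevice {
public:
    // Callback to hand to the controller; may be invoked from any thread.
    std::function<void(const std::string&)> make_state_callback() {
        return [this](const std::string& state) { on_state_change(state); };
    }

    // Assumed to be called as the last step of init_device,
    // once the attributes exist.
    void mark_ready() { m_ready.store(true, std::memory_order_release); }

    // Assumed to be called as the first step of delete_device,
    // before the attributes are torn down.
    void mark_not_ready() { m_ready.store(false, std::memory_order_release); }

    // Number of state changes ignored because the device was not ready.
    int dropped() const { return m_dropped.load(); }

    // Copy of the state changes that were accepted.
    std::vector<std::string> pushed() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_pushed;
    }

private:
    void on_state_change(const std::string& state) {
        // Ignore events that arrive before init is complete or after
        // teardown has started.
        if (!m_ready.load(std::memory_order_acquire)) {
            ++m_dropped;
            return;
        }
        // The real device would lock the attribute and push the change
        // event here; this sketch only records the state.
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pushed.push_back(state);
    }

    std::atomic<bool> m_ready{false};
    std::atomic<int> m_dropped{0};
    mutable std::mutex m_mutex;
    std::vector<std::string> m_pushed;
};
```

Note that the flag alone does not cover a callback that is already past the check when teardown begins; unregistering the callback before deleting the device, or holding a lock across both paths, would still be needed for that case.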