MPI / OpenCL issue
Hello, while working on the hermes ptycho code, I found that the OpenCL MPI support does not work out of the box.
I identified the issue in pynx/processing_unit/__init__.py:
@@ -425,12 +425,13 @@ class ProcessingUnit(object):
r = mpi.scatter(local_ranks, root=0)
# Before assigning the GPU, sort the found devices by their address, because the device
# order can be different on two process on the same node (is that also true for opencl?)
- benchmark_results = [(r[0], r[1], r[0].int_ptr) for r in benchmark_results]
+ import pyopencl
+ benchmark_results = [(r[0], r[1], r[0].get_info(pyopencl.device_info.PCI_BUS_ID_NV)) for r in benchmark_results]
benchmark_results = list(sorted(benchmark_results, key=lambda t: t[2]))
if verbose:
print("select_gpu using MPI: node=%s mpi_rank=%d, using GPU #%d/%d ptr:" %
(platform.node(), mpi.Get_rank(), r % nb, nb), benchmark_results[r % nb][0].int_ptr)
- return self.set_device(benchmark_results[gpu_rank][r % nb], test_fft=False, verbose=verbose)
+ return self.set_device(benchmark_results[r % nb][0], test_fft=False, verbose=verbose)
if ranking in ['fft', 'bandwidth']:
benchmark_results = sorted(benchmark_results, key=lambda t: -t[1])
There are two issues here:
- the int_ptr value is not unique across processes, so two processes on the same node can end up selecting the same GPU card (see the reproducer sketch after this list).
- benchmark_results is not indexed the right way. We can see that the CUDA code uses the correct form (the one marked with + in the diff above).
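To make (1) easy to check, here is a minimal reproducer sketch (hypothetical, not part of PyNX), assuming mpi4py and pyopencl are installed and the script is launched with e.g. mpirun -n 2 python repro.py: each rank prints the int_ptr of every device it sees, so one can verify that these raw pointers do not provide a consistent device ordering across ranks.

# Hypothetical reproducer for issue (1); run with: mpirun -n 2 python repro.py
from mpi4py import MPI
import pyopencl as cl

rank = MPI.COMM_WORLD.Get_rank()
for plat in cl.get_platforms():
    for dev in plat.get_devices():
        # int_ptr is the raw cl_device_id value as seen by this process;
        # if it differs between ranks, sorting on it cannot give a
        # consistent device order across processes.
        print("rank=%d platform=%s device=%s int_ptr=0x%x"
              % (rank, plat.name, dev.name, dev.int_ptr))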
In this version of the code, I did a quick fix for (1), but it is not great: if we have an AMD card, PCI_BUS_ID_NV is presumably not available. So we should find a way to extract a unique identifier for each processing unit, one that does not depend on the process.
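As a possible direction, here is a hedged sketch of such a helper (hypothetical, untested): it tries the vendor-specific PCI queries exposed through the cl_nv_device_attribute_query and cl_amd_device_topology OpenCL extensions, and falls back on platform/device names otherwise. The AMD branch in particular is an assumption: how pyopencl wraps the cl_device_topology_amd struct should be checked on the target version.

import pyopencl as cl

def unique_device_key(dev):
    """Return a string key for dev that should be identical in every
    process on the node, unlike dev.int_ptr."""
    if "cl_nv_device_attribute_query" in dev.extensions:
        # NVIDIA: the PCI bus/slot identify the card within the node
        return "nv:%04d:%04d" % (dev.get_info(cl.device_info.PCI_BUS_ID_NV),
                                 dev.get_info(cl.device_info.PCI_SLOT_ID_NV))
    if "cl_amd_device_topology" in dev.extensions:
        # AMD: cl_device_topology_amd carries the PCIe bus/device/function.
        # NOTE: assumed wrapping; verify what pyopencl returns for this
        # query before relying on it.
        return "amd:%s" % dev.get_info(cl.device_info.TOPOLOGY_AMD)
    # Weak fallback: names are not unique when a node has several identical
    # cards, and the enumeration order is precisely what we cannot trust.
    return "name:%s|%s" % (dev.platform.name, dev.name)

benchmark_results could then be built as [(r[0], r[1], unique_device_key(r[0])) for r in benchmark_results] before sorting, which keeps the behaviour of the quick fix on NVIDIA hardware while not breaking on AMD cards.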
Cheers