Setting a motor position "too large" means you can no longer move it

Please don't hate me for this.

Feel free to label this as wontfix and point me to some documentation. It comes down to the way floats are converted to integers and sent to controllers.

I am missing three things: a proof that step errors are bounded by 1 step; docs stating that the consequences of the "+/-1" step error are intentionally allowed; and a guard saying that certain motor positions (>1e16) are just not allowed and will not work in bliss.
How should we configure our hardware to get exactly correct step numbers ?

In the icepapcms software we have the option to give an integer number of steps for an integer number of turns. So all the rational numbers can be used and we get a mathematically well behaved system (except overflow, wraparound, etc).

For motor diffrz in iceid111 it is currently configured in bliss with 17493.3 steps per degree. In icepapcms we have it (hopefully correctly) with 52480 steps for 3 motor turns. There is probably one motor turn per degree. In spec we had 17493.33333 (and were not allowed more due to the formatting in the config file).

Currently that looks like a 12 step error after a 360 degree movement. With the old spec configuration we needed to make 1000 full turns to get a 1 step error. With the new EBS source and XRDCT scans we would like to have motors turning continuously for long periods of time, and could exceed 1000 turns.

I could just put .3333333333333333 and live with it but I would like to have the exact rational value 52480/3 and stop worrying.
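To make the numbers above concrete, here is a small sketch comparing the decimal configurations with the exact rational 52480/3. It assumes the position sent to the controller is the float product rounded to the nearest integer step; the helper name step_error is made up for illustration and is not part of bliss.

```python
from fractions import Fraction

EXACT = Fraction(52480, 3)  # steps per degree, from the icepapcms configuration

def step_error(steps_per_degree, degrees):
    """Exact step count minus the step count obtained from a float config value."""
    exact_steps = EXACT * degrees                   # exact rational arithmetic
    sent_steps = round(steps_per_degree * degrees)  # assumed: nearest integer step
    return exact_steps - sent_steps

print(step_error(17493.3, 360))             # 12 steps after one turn
print(step_error(17493.33333, 360))         # 0 steps after one turn
print(step_error(17493.33333, 360 * 1000))  # 1 step after 1000 turns
```

Adding more decimals only pushes the problem further out; the rational value removes it for whole numbers of motor turns.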

mvr does not round any more:

In olden times, motor positions were always rounded to integer steps. Bliss aims to overcome the "problem" of mvr accumulating errors by using a double precision float. This is a change for me and so I am trying to understand the implications and how to use it safely. I had understood that rounding to the controller's integer positions was done for good reasons, and I learned to use absolute moves instead of relative moves in macros.

This has some implications. I don't know which ones are bugs and which ones are intentional design decisions.

I expect the following two actions to be equivalent, but they give different scan motor positions:

# lbroll as 1 step per unit
umv(lbroll,0.5)
dscan lbroll -2 2 4 0.1
           #         dt[s]        lbroll         pico4
           0             0            -2       291.038
           1      0.202723            -1       392.902
           2      0.404451             0        305.59
           3      0.600001             1       349.246
           4       0.80452             2       392.902

ascan lbroll -1.5 2.5 4 0.1
           #         dt[s]        lbroll         pico4
           0             0            -2       422.006
           1      0.217583             0       436.558
           2      0.351288             0       451.109
           3      0.578239             2        378.35
           4      0.709806             2       407.454

The second scan is a bit ugly as you are losing a factor of 2 compared to the hardware resolution. I guess it will happen any time a target lands exactly on a half step and gets rounded to the nearest even step?
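The pattern in the ascan matches round-half-to-even, which is what Python's built-in round does. I have not checked that this is the function bliss actually calls, so take this as a sketch of the effect rather than of the implementation:

```python
import math

targets = [-1.5, -0.5, 0.5, 1.5, 2.5]  # the ascan points, in steps

half_even = [round(x) for x in targets]           # ties go to the even neighbour
half_up = [math.floor(x + 0.5) for x in targets]  # alternative: ties always go up

print(half_even)  # [-2, 0, 0, 2, 2], same as the lbroll column of the ascan
print(half_up)    # [-1, 0, 1, 2, 3]
```

With ties going to even, two neighbouring half-step targets collapse onto the same step, which is where the factor of 2 comes from.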

the _set_position is hidden state. I understood it should work like this:

# m has 1 step per unit
OH2 [588]: mv(m,0.);mvr(m,0.4);mvr(m,0.4)                                       
OH2 [589]: m.steps_per_unit,m.position,m.dial,m._set_position                   
Out [589]: (1.0, 1.0, 1.0, 0.8)

So the two moves of 0.4 steps add up to 0.8 steps and this rounds to 1. Using a "dscan" or "m.sync_hard()" resets m._set_position to the current position, here 0:

OH2 [590]: mv(m,0.);mvr(m,0.4);dscan(m,-1,1,2,0.1);mvr(m,0.4)                   
OH2 [591]: m.steps_per_unit,m.position,m.dial,m._set_position                   
Out [591]: (1.0, 0.0, 0.0, 0.4)

So the hidden state changes, sometimes, and users can still accumulate the mvr rounding error in their macros if they included a sync().
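A toy model reproduces both results. It is only my reading of the behaviour seen above, not the bliss Axis code: the class ToyAxis and its attributes are invented for illustration, with 1 step per unit and rounding to the nearest step assumed.

```python
class ToyAxis:
    """Toy model only: not the bliss Axis class."""

    def __init__(self):
        self.set_position = 0.0  # float target, plays the role of _set_position
        self.steps = 0           # integer position held by the controller

    def mv(self, target):
        self.set_position = float(target)
        self.steps = round(self.set_position)

    def mvr(self, delta):
        # relative moves start from the float target, not from the hardware
        self.mv(self.set_position + delta)

    def sync_hard(self):
        # re-reading the hardware discards the fractional part of the target
        self.set_position = float(self.steps)


a = ToyAxis()
a.mv(0.0); a.mvr(0.4); a.mvr(0.4)
print(a.steps, a.set_position)  # 1 0.8

b = ToyAxis()
b.mv(0.0); b.mvr(0.4); b.sync_hard(); b.mvr(0.4)
print(b.steps, b.set_position)  # 0 0.4
```

The same two relative moves end on different steps depending on whether a sync happened in between.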

If you use mvr(0.5) then you might see it always rounds up, or down, or nearest even, depending on the initial value in _set_position.

Floating point numbers in computers are not like any number system in maths. Addition is not associative. Rounding modes depend on a hidden cpu setting. Etc.

The final position in steps depends on the sequence of moves. Artificially:

OH2 [566]: 0.1+0.3+0.2-0.6              
Out [566]: 1.1102230246251565e-16

OH2 [567]: 0.2+0.3+0.1-0.6              
Out [567]: 0.0

Take a real example of a stepper motor with 200 steps per turn. We configure it with 200./360. steps per degree. Now we do 360 relative moves of 1 degree each and at the end come back with a single relative move of -360 degrees:

OH2 [547]: sum([200/360,]*360)-(200/360.)*360                                   
Out [547]: -1.3358203432289883e-12

Confirm this on a real motor configured with 1 step per unit:

OH2 [552]: m.steps_per_unit,m.position,m.dial,m._set_position                    
Out [552]: (1.0, 0.0, 0.0, 0.0)

OH2 [559]: for i in range(360): 
      ...:     mvr(m,200./360.) 
      ...: mvr(m,-200) 
      ...: m.steps_per_unit,m.position,m.dial,m._set_position 
[snip]
OH2 [558]: m.steps_per_unit,m.position,m.dial,m._set_position                   
Out [558]: (1.0, 0.0, 0.0, -1.3358203432289883e-12)

So the _set_position has an error that grows over time if you do a large number of relative moves. The 1e-12 is much larger than you would expect for double precision (2e-16). What is a "large number" of relative moves depends on the motor position too. If we first do 1000 turns on that motor we finish with:

OH2 [614]: m.steps_per_unit,m.position,m.dial,m._set_position-m.position        
Out [614]: (1.0, 200000.0, 0.0, 2.3283064365386963e-09)
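The drift comes from rounding after every single addition. A sketch of the difference, using an explicit loop for the repeated relative moves (I believe the built-in sum in Python 3.12 and later compensates for float rounding, so it may no longer show the effect) and math.fsum as a reference:

```python
import math

increment = 200 / 360   # steps per 1 degree move
moves = [increment] * 360

running = 0.0
for step in moves:      # one rounding per addition, like repeated mvr
    running += step

reference = math.fsum(moves)  # correctly rounded sum of the same floats
single = increment * 360      # one multiplication, one rounding

print(running - single)    # small but non-zero drift
print(reference - single)  # 0.0
```

So the error is a property of how the target is accumulated, not of the individual increments.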

If someone sets the motor position to 2e14 then this number system breaks:

OH2 [619]: sum([2e14,] + [200/360,]*360)-(200/360.)*360-2e14                    
Out [619]: 2.5

Also happens on a motor:

OH2 [621]: m.position = 2e14
OH2 [622]: for i in range(360): 
      ...:     mvr(m,200./360.) 
      ...: mvr(m,-200) 
...
Moving lbroll from 200000000000202 to 200000000000002.5
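The 2.5 steps can be explained by the spacing between adjacent doubles near 2e14. A short check (math.ulp needs Python 3.9 or later):

```python
import math

print(math.ulp(2e14))             # 0.03125: spacing between adjacent doubles here

print((2e14 + 200 / 360) - 2e14)  # 0.5625, i.e. 18 * 0.03125, not 0.5555...

pos = 2e14
for _ in range(360):
    pos += 200 / 360              # each addition lands on the 0.03125 grid
print(pos - 2e14)                 # 202.5 rather than 200
```

Every 1 degree move is effectively rounded up to 0.5625 steps, and 360 of them give 202.5.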

So now we "lost" 2.5 steps. I can make it arbitrarily worse:

OH2 [626]: m.position=1e20              
Resetting 'lbroll` position from 200000000000002.0 to 1e+20 (new offset: 1e+20)
OH2 [627]: umvr(m,2)
Moving lbroll from 100000000000000000000 to 100000000000000000000

 lbroll 
100000000000000000000.000

So now 2 full steps were completely lost. Given that m.dial is currently 0, this would not happen if mvr were implemented by sending relative step movements to the controller.
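For reference, a quick check of where whole steps stop being representable in a double. This is plain Python float behaviour, independent of bliss, and it is where the roughly 1e16 limit mentioned at the top comes from:

```python
import math

print(math.ulp(1e20))    # 16384.0: adjacent doubles are 16384 steps apart
print(1e20 + 2 == 1e20)  # True: a 2 step move is rounded away

limit = 2 ** 53          # about 9.007e15
print(float(limit) + 1 == float(limit))  # True: single steps are lost from here on
print(float(limit - 2) + 1 == limit - 1) # True: still exact just below the limit
```

Any user position whose step count exceeds 2**53 cannot track single steps, whatever the dial position is.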