C float comparison based on "extended precision" floats fails unexpectedly

The Debian integration tests revealed an interesting bug. It showed up on the i386 platform using Debian unstable with gcc 12.2. The bottom line is that rationalize/1 failed to do its job. Much simplified, this performs

int num, den;
double target;

...
  while((double)num/(double)den != target)
     <update num and den>

Trying this with target 0.1, the second iteration nicely comes with num = 1, den = 10, but the while loop just continues (until it decides the numbers got too big, causing it to use rational/1).

So, 1/10 == 0.1 fails using gcc-12 on this platform. Why? Well, we can “fix” this using

volatile double f = (double)num/(double)den;
if ( f == target ) ...

The volatile keyword tells the compiler to put f in memory and thus forces the intermediate result into a proper 64-bit double. In the original code (double)num/(double)den stays in an 80-bit extended-float register and target is promoted to this type before comparing, so we compare the 80-bit approximation of 0.1 to the 64-bit one and fail.

In fact, with enough optimization gcc decides there is no point in the volatile declaration and the comparison fails again. For this reason I now put the double in a union with a 64-bit int and do the equality test by comparing the integers. That forces the compiler to put the intermediate result in memory.
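For illustration, a minimal sketch of that trick (illustrative names, not the actual SWI-Prolog code):

#include <stdint.h>

/* Compare two doubles through their 64-bit memory representation.
 * Going through the union forces a store, so the comparison happens
 * on 64-bit doubles rather than in 80-bit x87 registers.  Note that
 * bitwise equality treats -0.0 and 0.0 as different and never matches
 * NaN, which is acceptable for this use. */
typedef union
{ double   f;
  uint64_t i;
} fpattern;

static int
double_eq64(double a, double b)
{ fpattern pa = { .f = a };
  fpattern pb = { .f = b };

  return pa.i == pb.i;
}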

Patch is in PORT: Fixed rationalize/1 and cmpFloatNumbers() to fail due to gcc · SWI-Prolog/swipl-devel@3e8df9b · GitHub

This works. Now two questions remain:

  • Is what gcc does actually correct? Maybe that is not the right question, as the gcc notes on float processing acknowledge that the default prefers performance over strict compliance with the C standard. As far as I know, though, no previous version of gcc did this.
  • Is there a more elegant way to get the result we want? One option is probably to use gcc flags to tweak its float handling. The downside is that other compilers may do the same, and these are the only two places where we want to compare floats at exactly 64 bits.

It is a very long-standing issue with compilers and the i387-style FPU; see the -fexcess-precision option in the GCC manual.


I guess there is another issue with the rationalize/1 function. Namely, it uses the argument x and does the reduction by computing with x as a float:

File: pl-gmp.c, routine: mpq_set_double (nice float routine)

    double xi = floor(x);
    double xf = x - xi;
    x = 1.0/xf;

But you get better results if you avoid doing the computation with x as a float, and instead use rational/1 first to obtain a rational number r and do the floor/1 and the reciprocal on the rational number r during the iteration.

With rational numbers the reciprocal is exact; in the above, the reciprocal 1.0/xf is not exact since it uses floats. Here is a test case where you can see that this is a problem for the “nice float routine”. I get, with the newest release of SWI-Prolog on the Windows platform:

/* SWI-Prolog 9.1.2 */
?- Y is 7.0e-15, X is rationalize(Y).
Y = 7.0e-15,
X = 13r1857142857142857.

On the other hand, with my own implementation, which computes with r = rational(x) instead of x (a quite simple pure Prolog routine, in fact), I get the better result: a smaller denominator, i.e. the iteration stops earlier:

/* Jekejeke Prolog 1.5.5 */
?- use_module(library(standard/approx)).

?- Y is 7.0e-15, X is rationalize(Y).
Y = 7.0E-15, X = 7#1000000000000000.

?- 1000000000000000 < 1857142857142857.
true.

Hope this helps! If your algorithm is borrowed from ECLiPSe Prolog, ECLiPSe Prolog might have the same issue. Disclaimer: the float x is only a code smell; the issue could also be rooted in some other problem.

Thanks. I’ll leave that to @ridgeworks (please pull first as I did a small cleanup by using modf() rather than floor() and subtraction). This doesn’t come from ECLiPSe. One of the related routines did, but was later replaced and I think there is no longer any relation between the two at the implementation level. There was an older implementation for rationalize, but @ridgeworks replaced that to properly support big integers.

Float limits are indeed the problem. Using long double for x, xi and xf gets the right result. Possibly that is the simple fix?
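For illustration, a minimal sketch of that change (variable names follow the pl-gmp.c excerpt above; this is not the actual patch):

#include <math.h>

/* One reduction step done in long double rather than double (sketch only):
 * the extra precision keeps 1.0/xf closer to the exact reciprocal of the
 * 64-bit double input. */
static long double
reduce_step(long double x)
{ long double xi = floorl(x);		/* integer part */
  long double xf = x - xi;		/* fractional part */

  return 1.0L/xf;			/* reciprocal with extended precision */
}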

I don’t recall doing anything to rationalize but I’ll put this suggestion in my queue. (I’m not in a position to do any C development until after the holiday.)

You don’t need to go long double. Just use rational numbers; start with r = rational(x). You have them and praise them all the time and mourn that ISO doesn’t have them, so please use them.

While the float-based algorithm does the reduction with doubles (floor, subtract, reciprocal, as shown above), the rational-based algorithm does the following, where a rational is just a pair r = (r(0), r(1)):

    bigint xi = r(0) div r(1);
    rational xf = (r(0) - xi*r(1), r(1));
    r = (xf(1), xf(0))     /* reciprocal is swap */

Pretty easy. One can show that r(0) and r(1) get smaller and smaller, just like in the GCD algorithm. So there is no problem of needing a larger precision, such as long double. The problem is solved by more accuracy, not by more precision. The swap is 100% accurate: we can take the reciprocal of a rational number exactly, whereas the float 1.0/xf always has some error:

?- X is 3, Y is 1 rdiv X.
X = 3,
Y = 1r3. /* exact 100% accurate */

?- X is 3, Y is 1/X.
X = 3,
Y = 0.3333333333333333. /* inexact, also when we would use long double */

Edit 20.12.2022

One footnote in the documentation of rationalize/1 says:

The implementation of rationalize as well as converting a rational number into a float is copied from ECLiPSe and covered by the Cisco-style Mozilla Public License Version 1.1.

Maybe this is outdated?

Here you see that computing with float causes a problem:

/* SWI-Prolog 9.1.0 Windows */
nextf(X, X).
nextf(X, Y) :-
    XI is floor(X), XF is X-XI,
    Z is 1/XF, nextf(Z, Y).

?- nextf(7.0e-15, X).
X = 7.0e-15 ;
X = 142857142857142.84 ;
X = 1.1851851851851851 ;
X = 5.400000000000002 ;
X = 2.4999999999999867 ;
X = 2.0000000000000533 ;
X = 18764998447377.066 ;
X = 15.058823529411764 ;
X = 17.00000000000006 ;
X = 16557351571215.059 ;
X = 17.066666666666666 ;
X = 15.000000000000053 ;
X = 18764998447377.066 ; /* here we have hit an infinite loop */

The problem does not occur when computing with rationals:

nextr(R, R).
nextr(R, S) :-
    rational(R, A, B),
    T is B rdiv (A mod B),
    nextr(T, S).

?- R is rational(7.0e-15), nextr(R, S), X is float(S).
X = 7.0e-15.
X = 142857142857142.84.
X = 1.1708344846911298. /* very different value than in nextf */
X = 5.8536190852099255.
X = 1.1714827108792627.
X = 5.831491669758346.
X = 1.2026578694295595.
X = 4.934424716961625.
[...]
X = 4.144001095065362.
X = 6.944391634980988.
X = 1.0588827377956718.
X = 16.982905982905983.
X = 1.017391304347826.
X = 57.5.
X = 2.0. /* nicely terminates */

The full rationalize/1 algorithm based on nextr/2 will stop earlier and find 7r1000000000000000 as a viable approximation to 4436777100798803r633825300114114700748351602688 (7.0e-15), inside an epsilon/2 error (right?), since it uses the mathematically correct reciprocal via rational numbers. On the other hand, rationalize/1 based on nextf/2 will produce nonsense. It will have the wrong numerator and denominator because of incorrect values such as 1.1851851851851851 after a few iterations, where the correct value is 1.1708344846911298. You need to check with nextr/2 what it is as a rational number. I displayed float numbers so that one can compare nextf/2 and nextr/2, but nextr/2 produces rationals.

Edit 20.12.2022
Here is a simulation of what a 128-bit decimal (quadruple precision) would do. It doesn’t help either; it also produces a nonsense value after very few iterations. It’s the same function nextf/2, but called with a 128-bit decimal:

/* Jekejeke Prolog 1.5.5 */
?- nextf(0m7.0e-15, X).
X = 0m7E-15;
X = 0m142857142857142.8571428571428571429;
X = 0m1.166666666666666666608333333333333 /* different value than in nextr */

So both double and long double (quadruple precision) produce nonsense, not what an exact rational number computation, starting with r = rational(x) from a double x, would demand.


You are right. Sorry. There was an original version that was later replaced by a copy from ECLiPSe.

It also seems the rational way is the way to go. I guess it is a little slower. That is probably acceptable.

Not sure. Maybe? Maybe not? The floor(x)-and-subtract operation boils down to divmod/4, since you cannot throw away xi; you need both xi and xf. The operation 1.0/xf costs practically nothing in the rational setting. If the algorithm improves by finding smaller, better approximations, it gets faster, since it stops earlier. Maybe there is a positive trade-off? Here is my condensed code. I have a built-in rational/3 in my Prolog system with mode (-,+,+), which avoids any GCD calculation; the GCD is not necessary in the Stern-Brocot tree, since those rationals are already in lowest terms. I can’t find the same mode in SWI-Prolog; is any workaround available?

% rat_rationalize(+Float, -Rational)
:- private rat_rationalize/2.
rat_rationalize(F, S) :-
   R is rational(F),
   rational(R, V, W),
   rat_iter(V#W, 1#0, 0#1, F, A#B),
   rational(S, A, B).

% rat_iter(+Rational, +Rational, +Rational, +Float, -Rational)
:- private rat_iter/5.
rat_iter(_, X, _, Y, X) :-
   X \= 1#0,
   float(X) =:= Y, !.
rat_iter(_#0, X, _, _, X) :- !.
rat_iter(V#W, M#N, P#Q, Y, X) :-
   divmod(V, W, D, U),
   user: *(D, M, H),
   user: +(H, P, A),
   user: *(D, N, J),
   user: +(J, Q, B),
   rat_iter(W#U, A#B, M#N, Y, X).

The algorithm also has a defect: sometimes it nevertheless overshoots and misses the minimal solution with respect to the float/1 stopping condition, since it is leap frogging along the Stern-Brocot tree. In the Stern-Brocot tree one computes a mediant. This would be a single step:

  • Let L = a/b and H = c/d; compute the mediant M = (a + c)/(b + d).

The algorithm now does leap frogging, in that it computes multiple steps k at once, where k is the D from divmod/4 (a short derivation follows after the list):

  • compute the left-iterated mediant M^k = (k * a + c)/(k * b + d).
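For clarity, a short derivation of the leap-frogged formula (standard Stern-Brocot reasoning, added here as a sketch, not from the original post). Keep the endpoint a/b fixed and take the mediant with the current approximation k times, starting from c/d:

\[
M^1 = \frac{a+c}{b+d}, \qquad
M^{k+1} = \frac{a + (k\,a + c)}{b + (k\,b + d)} = \frac{(k+1)\,a + c}{(k+1)\,b + d},
\]

so by induction \(M^k = (k\,a + c)/(k\,b + d)\).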

Edit 21.12.2022
You get the same leap frogging in SWI-Prolog, there is a comment:

/* compute a = a*xi + b for both numerator and denominator */

So in SWI-Prolog k = xi, in my solution k = D from divmod/4.

Based on @j4n_bur53’s thorough analysis, I’ve been trying to get my head around the rationalize algorithm and came up with an SWIP implementation that seems to work on a few samples. Because SWIP rationals are primitive, the “#” notation (with vars) isn’t supported and the equivalent of “1#0” isn’t a legal rational at all. So I replaced all the X#Y with simple tuples, i.e., (X,Y). I also re-implemented rat_iter/5 as a single clause as a model for a C loop. The end result:

rat_rationalize(Flt, Rat) :-
   R is rational(Flt), rational(R, N, D),
   rat_iter((N,D), (1,0), (0,1), Flt, Rat).

rat_iter((V,W), (M,N), (P,Q), Flt, Rat) :-
	divmod(V, W, D, U),
	A is D*M+P,
	B is D*N+Q,
	Try is A rdiv B,
	( (float(Try) =:= Flt ; U == 0)  % terminating conditions
	  -> Rat = Try
	  ;  rat_iter((W,U), (A,B), (M,N), Flt, Rat)
	).

Other than the termination test, it just uses (unbounded) integer arithmetic.

Examples:

?- rat_rationalize(7e-15,S).
S = 7r1000000000000000.

?- rat_rationalize(-pi,S).
S = -245850922r78256779.

I assume this is still subject to “leap-frogging” (uses divmod for D steps) but I don’t understand the end effect. Does it just result in a less than optimal answer or is non-termination a possibility?

Other corner cases to try? Or suggestions?

Without leap frogging it would be something along the lines of the following (homework: translate the Haskell to Prolog and find some examples where the version without leap frogging gives the better rationalize result):

-- Type synonyms assumed here to make the sketch self-contained;
-- the original post leaves Ratio and Interval implicit.
type Ratio    = (Integer, Integer)   -- (numerator, denominator)
type Interval = (Ratio, Ratio)       -- (low bound, high bound)

mediant :: Interval -> Ratio
mediant ((m, n), (m', n')) = (m+m', n+n')

approximate :: (Ratio -> Ordering) -> [Ratio]
approximate c = h ((0,1),(1,0)) where
    h interval@(lo, hi) = mid : case c mid of
        LT -> h (mid, hi)
        GT -> h (lo, mid)
        EQ -> []
        where mid = mediant interval

https://de.wikipedia.org/wiki/Stern-Brocot-Folge#Stern-Brocot-Baum

In the above, a real number is represented as a function (Ratio → Ordering), and the above doesn’t stop for irrational real numbers; I guess it generates an infinite list then. Adding a float/1 stopping condition would give rationalize/1 without leap frogging. You should then find examples where the rationalize/1 without leap frogging gives slightly better results than the rationalize with leap frogging. But the rationalize without leap frogging also needs more iterations, since it takes smaller steps. The difference between the two algorithms is the same difference as between GCD based on subtraction and GCD based on modulo. Like here:

/* no leap frogging */
function gcd(a, b)
    while a ≠ b 
        if a > b
            a := a − b
        else
            b := b − a
    return a

/* leap frogging */
function gcd(a, b)
    while b ≠ 0
        t := b
        b := a mod b
        a := t
    return a

https://en.wikipedia.org/wiki/Euclidean_algorithm#Implementations

But for rationalize there is an additional stopping condition, which leap frogging might overrun. If you dig deeper you indeed find more relationships between GCD and the Stern-Brocot tree; just try googling it. Digging even further into the past, you might end up at geometric anthyphairesis (Euclid’s Elements X.2), which is from 300 BC, and there are earlier sources as well.

Edit 24.12.2022
I was playing with the idea of a combined leap frogging and non-leap frogging algorithm. It would first do leap frogging and, in the end phase, non-leap frogging. But this would possibly only pay off if the examples where non-leap frogging gives a better result aren’t that rare, so that there is a pay-off for the complication of a hybrid algorithm. But I didn’t have time yet to explore this space empirically, i.e. to count, for example, the cases where non-leap frogging gives a better result.

BTW: We have recently learnt that there are more ways to compute GCD; for example there is Stein’s algorithm. What kind of rationalize/1 would this GCD algorithm give? That’s also an open question on my side.

My understanding of rationalize is that it only attempts to produce a “nice” result equivalent to the input float; it doesn’t have to be the optimal result, i.e., the one with the smallest denominator. Using the “leap frogging” algorithm with rational arithmetic seems to achieve that objective, i.e., nice, and certainly better than the current version. I’ve tested with thousands of random floats in various ranges, and haven’t hit any issues so far.

Not in Prolog, that I’m aware of. Looking at an implementation in C, I believe GMP lets you set a rational without “canonicalization” but I need to test it. Probably won’t make much of a difference in speed, but I don’t think speed is the primary concern.
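Something along these lines is what I have in mind (a sketch, untested; it assumes the parts are already coprime and the denominator is positive):

#include <gmp.h>

/* Build a rational from separately computed numerator and denominator
 * without the GCD done by mpq_canonicalize().  GMP's rational arithmetic
 * assumes canonical form, so this is only safe when num and den are known
 * to be coprime and den > 0. */
static void
set_rational_no_canon(mpq_t r, const mpz_t num, const mpz_t den)
{ mpq_set_num(r, num);
  mpq_set_den(r, den);
  /* deliberately no mpq_canonicalize(r) */
}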

I tried your 903/(pi+903) test case with the rational prototype (SWIP version) and got the same result (222529257r223303450), so that was promising.

To be honest, I don’t quite know what the point of rationalize even is. It doesn’t produce the precise equivalent rational. It does produce a smaller (bitsize) rational that converts back to the same float, but to what end? In effect, you’re just re-introducing rounding error. So why not use the float? A “nice” rational may be slightly more readable than the precise rational, but, again, for readability why not use the float?

In any case, an implementation of rationalize using rational arithmetic seems “nicer” than one using floating point (in general). How much “nicer” it has to be is the open question, but seems to me to be largely an academic issue.

Big SWIP picture - the same test using rational (precise conversion), current rationalize, and leap frog algorithm:

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), R is rational(F), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 21129,
C = 201,
AVG = 105.11940298507463.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), R is rationalize(F), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 14217,
C = 201,
AVG = 70.73134328358209.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), rat_rationalize(F,R), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 11038,
C = 201,
AVG = 54.91542288557214.

So “leap frog” is a significant improvement over the current rationalize and represents an improvement of about 50% over the precise conversion. According to your results, the non-leap-frog version (optimal?) represents a further saving of about 0.5% on this test, but it can’t be used as the only mechanism. Are there any “corner cases” where it’s significantly more than 0.5%?

So my question remains: when does a further improvement of ~1% in rationalize matter? What is the motivating use case for rationalize ?

I probably didn’t express myself well. My understanding is that the two implementations, original rationalize and rat_rationalize, both use leap frogging. The improvement in rat_rationalize is due to the use of rational rather than floating point arithmetic. Right?

Now you have a version that doesn’t leap frog. When I compare that with a leapfrogging version using your numbers, I only see a 0.5% improvement on the “pi” test.

And that’s much appreciated, although I’m still looking for a motivating use case for rationalize that would justify the extra effort.

I am completely ignorant of most things in this thread and in the world in general. Maybe this gives me an unexpected advantage :slight_smile:

The way I understood it (Jan has mentioned this in some of his posts here, just keep in mind I am guessing!), the motivation was to be able to use Prolog numbers in Prolog programs, but use those for currency.

In Prolog code, I can write the literal 0.1; and it will be parsed as a floating-point number.

?- float(0.1).
true.

At this point this is no longer 1/10, it is something else (what exactly I don’t know).

What if I wanted to simply say that I have 10 cents though? If my unit was a Euro, then 10 cents is €0.1. I see at least two obvious ways of getting out of this:

  1. Don’t use the Prolog floating point parser for this at all, instead make certain all currency literals are parsed as fixed decimals.
  2. Make it possible to guess what the (rational) number represented by the floating literal was supposed to be.

It seems that option 2 was chosen, and now we have:

?- X is rational(2.1).
X = 4728779608739021r2251799813685248.

?- X is rationalize(2.1).
X = 21r10.

The other option is to introduce another literal type. In C at least we can append some letter at the end of numbers to help the parser I guess, like “10L” for a long int?

Keep in mind that rational() and rationalize() predate rationals like 21r10. At the time you couldn’t type rational literals in Prolog code (right?), you had to use one of the functions, and either way 21r10 is not exactly the same as typing “€2.10”.

I now notice that the docs for rational and rationalize are outdated, it still shows:

?- A is rationalize(0.25).

A = 1 rdiv 4
?- A is rationalize(0.1).

A = 1 rdiv 10

when in fact we now get the rationals printed as 1r10.

And is the number of the footnote a joke/easter egg?

(screenshot of the documentation footnote)


Thanks for spotting. Updated.

Otherwise, I don’t know. As the docs say, these functions come from Common Lisp. Proper handling of currency probably requires decimal floats, fixed-point decimals or simply using big integers to represent cents (or other fractions, depending on the currency). Within a certain domain, rationalize probably does what is needed. One would surely need to prove this though. Near the edges of the domain it starts finding denominators that are smaller than the “pretty” decimal denominator. I guess you could make a variation on rationalize that aims at “pretty decimals”.

Getting a small denominator looks attractive. Still, I have my doubts whether turning floats into rationals has any sensible use at all. I never encountered a situation where it was needed, I fail to imagine one, and nobody else did (so far).

I did mention a use case before - converting float to a precise rational for comparison with another rational. This enables separating intervals which are points (bounds are mathematically equal) from intervals which contain more than 1 real value. However this uses rational, not rationalize.

Yes, that’s a valid use case - converting floats in input data to an “intended” rational, i.e., 0.1 to 1r10 as opposed to 3602879701896397r36028797018963968, the precise equivalent of 0.1 as a floating point value.

The reason for my question was whether the result of rationalize has to be the optimal rational value (smallest possible denominator) rather than just “good enough”. I think for the use case above, “good enough” is sufficient. (I believe @j4n_bur53 is investigating whether an optimal value can be efficiently produced.)

I’ve implemented a version of rationalize using the rational “leap frog” algorithm: seems to behave as expected:

?- R is rationalize(7e-15).
R = 7r1000000000000000.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
    F is N/(pi+N), R is rationalize(F), rational(R,A,B), 
    L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 11038,
C = 201,
AVG = 54.91542288557214.

The price to be paid for superior quality is a performance drop of about 30% using GMP. I’ve tested a libbf build, but haven’t compared it with a libbf build of the current release; it’s about 8 times slower than the GMP build of the current release.

Once successfully compiled, the C code worked first time, a testament to using Prolog to build a rapid prototype. It was fairly straightforward to map the Prolog to GMP function calls. The Prolog prototype of the iteration loop:

rat_iter((V,W), (M,N), (P,Q), Flt, Rat) :-
	divmod(V, W, D, U),
	A is D*M+P,  %  A = Pnxt
	B is D*N+Q,  %  B = Qnxt
	Try is A rdiv B,
	( (float(Try) =:= Flt ; U == 0)  % terminating conditions
	  -> Rat = Try
	  ;  rat_iter((W,U), (A,B), (M,N), Flt, Rat)
	).

and the C loop:

  for(;;)
  { mpz_fdiv_qr(d,u,v,w);
    mpz_addmul(p,d,m);
    mpz_addmul(q,d,n);
    if ((mpz_fdiv(p,q) == f) || (mpz_sgn(u) == 0)) {  // terminating conditions
      mpq_set_num(r,p); mpq_set_den(r,q);             // final answer
      mpq_canonicalize(r);
      break;
    } else {
      mpz_set(v,w);  mpz_set(w,u);        
      mpz_swap(m,p); mpz_swap(n,q);
    }
  } // for(;;)

The C code is somewhat simpler than the current implementation because it doesn’t have to compensate for floating point rounding errors, e.g., checking for infinity or the size of a prospective rational result.

I had to extend the set of supported GMP routines in libbf, notably mpz_addmul. In doing so, it appears that the existing mpz_addmul_ui and mpz_submul_ui do not conform to the GMP semantics, which state:

Function: void mpz_addmul (mpz_t rop, const mpz_t op1, const mpz_t op2)
Function: void mpz_addmul_ui (mpz_t rop, const mpz_t op1, unsigned long int op2)
Set rop to rop + op1 times op2.

I’m hesitant to change anything without independent verification for fear of breaking something (?), but I can add fixes to a future PR if that’s agreed.
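A minimal check of the documented semantics might look like this (a sketch to run against both real GMP and the libbf emulation; not part of any PR):

#include <assert.h>
#include <gmp.h>

int
main(void)
{ mpz_t rop, op1;

  mpz_init_set_ui(rop, 10);
  mpz_init_set_ui(op1, 3);
  mpz_addmul_ui(rop, op1, 4);		/* documented: rop = 10 + 3*4 = 22 */
  assert(mpz_cmp_ui(rop, 22) == 0);
  mpz_clear(rop);
  mpz_clear(op1);
  return 0;
}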

Most likely you don’t need to pay this price:

What was the test suite? Does mpq_canonicalize() perform a GCD? I find a gcd variable in its source code, so I guess it does perform some GCD, which is unnecessary in the present case:

mpq_canonicalize (mpq_t op)
{
  mpz_t gcd;

https://chromium.googlesource.com/native_client/nacl-gcc/+/refs/heads/main/gmp-4.3.1/mpq/canonicalize.c

As I have already mentioned, this is not necessary. One can show that p and q are already co-prime by the algorithm. If you don’t believe me, you can run your test suite and log the GCD, and see that you always have gcd(p,q)=1 already before calling mpq_canonicalize(). A more reliable method would be to perform some mathematical reasoning and show that for all p,q in the Stern-Brocot tree we always have gcd(p,q)=1. Using a google search it should not be so difficult to find a paper that shows exactly that. It is sometimes a subject in computer science lectures, since the whole matter has quite some fame beyond the fact that LISP might have a rationalize function; one might remember Edsger W. Dijkstra’s curious function fusc (EWD570 from 1976). You will most likely regain some speed if you find something else than mpq_canonicalize(). Can it not simply be omitted? This GCD possibly consumes nearly half of the time, so you can cut the cost roughly in half if you omit it. And it is safe to omit, since one can show gcd(p,q)=1 anyway.
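A minimal sketch of that argument (standard continued-fraction reasoning; the indexing is an assumption matching the divmod iteration, with p_k = d_k p_{k-1} + p_{k-2} and q_k = d_k q_{k-1} + q_{k-2}, started from (1,0) and (0,1)):

\[
p_k\,q_{k-1} - p_{k-1}\,q_k
  = (d_k\,p_{k-1} + p_{k-2})\,q_{k-1} - p_{k-1}\,(d_k\,q_{k-1} + q_{k-2})
  = -(p_{k-1}\,q_{k-2} - p_{k-2}\,q_{k-1}),
\]

so the determinant only flips sign at each step and stays ±1. Any common divisor of p_k and q_k divides it, hence gcd(p_k, q_k) = 1 and mpq_canonicalize() has nothing left to reduce.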

Edit 31.12.2022
Actually, simply google searching might not be that easy. Other trees besides the Stern-Brocot tree might pop up, like for example the Calkin–Wilf tree. So you might literally not see the forest for the trees.

But if you find a paper reference, you can later put it in the code,
so that people understand why mpq_canonicalize() was omitted.
I should do the same in my code, but was busy with other stuff.

Meanwhile I extended my rationalize/1 implementation so that it can detect and handle quadruple precision. Interestingly, the pi test case is still one that has a short rationalize/1 result. The new qp/2 built-in is needed so that I can fetch pi in decimal128:

?- X is 903/(pi+903).
X = 0.9965330002738426.

?- X is 903/(qp(pi)+903).
X = 0m0.9965330002738426362794241482909284.

qp/2 stands for quadruple precision, I now get:

?- X is rational(903/(qp(pi)+903)).
X = 2491332500684606590698560370727321#2500000000000000000000000000000000.

?- X is rationalize(903/(qp(pi)+903)).
X = 161172659700524787#161733389316997319.