C float comparison based on "extended precision" floats fails unexpectedly

The Debian integration tests revealed an interesting bug. It showed up on the i386 platform using Debian unstable with gcc 12.2. The bottom line is that rationalize/1 failed to do its job. Much simplified, it performs

int num, den;
double target;

...
  while((double)num/(double)den != target)
     <update num and den>

Trying this with target 0.1, the second iteration nicely comes up with num = 1, den = 10, but the while loop just continues (until it decides the numbers have grown too big, at which point it falls back to rational/1).

So, 1/10 == 0.1 fails using gcc-12 on this platform. Why? Well, we can “fix” this using

volatile double f = (double)num/(double)den;
if ( f == target ) ...

The volatile keyword tells the compiler to put f in memory and thus forces the intermediate result into a proper 64-bit double. In the original code the (double)num/(double)den stays in an 80-bit extended float register and target is promoted to this type before comparing, so we compare the 80-bit approximation of 1/10 to the 64-bit approximation of 0.1 and fail.

In fact, using enough optimization gcc sees there is no point in the volatile declaration and it fails again. For this reason I now put the double in a union with a 64-bit int and do the equality test by comparing the integers. That forces the compiler to put the intermediate result in memory.
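As a sketch of that idea (the names below are illustrative, not the ones in the actual patch), assuming a double and a 64-bit unsigned integer have the same size:

#include <stdint.h>

typedef union
{ double   d;
  uint64_t i;
} dbl_bits;

/* Storing into the union forces the value out of any extended-precision
   register; comparing the 64-bit patterns compares the rounded doubles. */
static int
same_double_bits(double a, double b)
{ dbl_bits ba, bb;

  ba.d = a;
  bb.d = b;
  return ba.i == bb.i;
}

Note that bitwise equality differs subtly from ==: it distinguishes -0.0 from 0.0 and treats two identical NaN patterns as equal. For comparing two finite quotients, as here, that does not matter.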

Patch is in PORT: Fixed rationalize/1 and cmpFloatNumbers() to fail due to gcc · SWI-Prolog/swipl-devel@3e8df9b · GitHub

This works. Now two questions remain:

  • Is what gcc does actually correct? Maybe that is not the right question, as the gcc notes on float processing acknowledge that the default prefers performance over strict compliance with the C standard. As far as I know though, no previous version of gcc did this.
  • Is there a more elegant way to get the result we want? One is probably to use gcc flags to tweak its float handling. The downside is that other compilers may behave the same way, and these are the only two places where we want to compare floats at 64 bits.

It is a very long-standing issue with compilers and the i387-style FPU; see the option -fexcess-precision in the GCC manual.
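For illustration, a minimal stand-alone reproduction of the effect (my own sketch, not code from the patch): on an i386 build using the x87 FPU the direct comparison may fail, while the volatile store forces the quotient to be rounded to 64 bits first.

#include <stdio.h>

int main(void)
{ int num = 1, den = 10;
  double target = 0.1;

  if ( (double)num/(double)den != target )    /* may hold on i386/x87 */
    printf("direct comparison: differ\n");

  volatile double f = (double)num/(double)den;
  if ( f == target )                          /* store rounds to 64 bits */
    printf("via volatile store: equal\n");

  return 0;
}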

Thanks. I’ll leave that to @ridgeworks (please pull first as I did a small cleanup by using modf() rather than floor() and subtraction). This doesn’t come from ECLiPSe. One of the related routines did, but was later replaced and I think there is no longer any relation between the two at the implementation level. There was an older implementation for rationalize, but @ridgeworks replaced that to properly support big integers.

Float limits are indeed the problem. Using long double for x, xi and xf gets the right result. Possibly that is the simple fix?

I don’t recall doing anything to rationalize but I’ll put this suggestion in my queue. (I’m not in a position to do any C development until after the holiday.)

You are right. Sorry. There was an original version that was later replaced by a copy from ECLiPSe.

It also seems the rational approach is the way to go. I guess it is a little slower; that is probably acceptable.

Based on @anon95304481’s thorough analysis, I’ve been trying to get my head around the rationalize algorithm and came up with an SWIP implementation that seems to work on a few samples. Because SWIP rationals are primitive, the “#” notation (with vars) isn’t supported and the equivalent of “1#0” isn’t a legal rational at all. So I replaced all the X#Y with simple tuples, i.e., (X,Y). I also re-implemented the rat_iter/5 as a single clause as a model for a C loop. The end result:

rat_rationalize(Flt, Rat) :-
   R is rational(Flt), rational(R, N, D),
   rat_iter((N,D), (1,0), (0,1), Flt, Rat).

rat_iter((V,W), (M,N), (P,Q), Flt, Rat) :-
	divmod(V, W, D, U),
	A is D*M+P,
	B is D*N+Q,
	Try is A rdiv B,
	( (float(Try) =:= Flt ; U == 0)  % terminating conditions
	  -> Rat = Try
	  ;  rat_iter((W,U), (A,B), (M,N), Flt, Rat)
	).

Other than the termination test, it just uses (unbounded) integer arithmetic.

Examples:

?- rat_rationalize(7e-15,S).
S = 7r1000000000000000.

?- rat_rationalize(-pi,S).
S = -245850922r78256779.

I assume this is still subject to “leap-frogging” (uses divmod for D steps) but I don’t understand the end effect. Does it just result in a less than optimal answer or is non-termination a possibility?

Other corner cases to try? Or suggestions?

My understanding of rationalize is that it only attempts to produce a “nice” result equivalent to the input float; it doesn’t have to be the optimal result, i.e., the one with the smallest denominator. Using the “leap frogging” algorithm with rational arithmetic seems to achieve that objective, i.e., nice, and certainly better than the current version. I’ve tested with thousands of random floats in various ranges, and haven’t hit any issues so far.

Not in Prolog, that I’m aware of. Looking at an implementation in C, I believe GMP lets you set a rational without “canonicalization” but I need to test it. Probably won’t make much of a difference in speed, but I don’t think speed is the primary concern.
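If that works out, the GMP calls involved would presumably look something like this (an untested sketch; num and den are assumed to already hold the computed numerator and denominator):

#include <gmp.h>

/* Assemble an mpq_t from a separately computed numerator and denominator.
   mpq_canonicalize() divides out any common factor and fixes the sign of
   the denominator; GMP only requires it when num/den may not already be
   in canonical form. */
static void
set_rational(mpq_t r, const mpz_t num, const mpz_t den)
{ mpq_set_num(r, num);
  mpq_set_den(r, den);
  mpq_canonicalize(r);    /* can be skipped when gcd(num, den) == 1 */
}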

I tried your 903/(pi+903) test case with the rational prototype (SWIP version) and got the same result (222529257r223303450), so that was promising.

To be honest, I don’t quite know what the point of rationalize even is. It doesn’t produce the precise equivalent rational. It does produce a smaller (bitsize) rational that converts back to the same float, but to what end? In effect, you’re just re-introducing rounding error. So why not use the float? A “nice” rational may be slightly more readable than the precise rational, but, again, for readability why not use the float?

In any case, an implementation of rationalize using rational arithmetic seems “nicer” than one using floating point (in general). How much “nicer” it has to be is the open question, but seems to me to be largely an academic issue.

Big SWIP picture - the same test using rational (precise conversion), current rationalize, and leap frog algorithm:

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), R is rational(F), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 21129,
C = 201,
AVG = 105.11940298507463.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), R is rationalize(F), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 14217,
C = 201,
AVG = 70.73134328358209.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
F is N/(pi+N), rat_rationalize(F,R), rational(R,A,B), 
L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 11038,
C = 201,
AVG = 54.91542288557214.

So “leap frog” is a significant improvement over the current rationalize and represents an improvement of about 50% over the precise conversion. According to your results, the non-leap-frog version (optimal?) represents a further saving of about 0.5% on this test, but can’t be used as the only mechanism. Are there any “corner cases” where it’s significantly more than 0.5%?

So my question remains: when does a further improvement of ~1% in rationalize matter? What is the motivating use case for rationalize?

I probably didn’t express myself well. My understanding is that the two implementations, original rationalize and rat_rationalize, both use leap frogging. The improvement in rat_rationalize is due to the use of rational rather than floating point arithmetic. Right?

Now you have a version that doesn’t leap frog. When I compare that with a leapfrogging version using your numbers, I only see a 0.5% improvement on the “pi” test.

And that’s much appreciated, although I’m still looking for a motivating use case for rationalize that would justify the extra effort.

I am completely ignorant of most things in this thread and in the world in general. Maybe this gives me an unexpected advantage 🙂

The way I understood it (Jan has mentioned this in some of his posts here, just keep in mind I am guessing!), the motivation was to be able to write ordinary Prolog number literals in Prolog programs, but use them for currency.

In Prolog code, I can write the literal 0.1, and it will be parsed as a floating-point number.

?- float(0.1).
true.

At this point this is no longer 1/10, it is something else (what exactly I don’t know).

What if I wanted to simply say that I have 10 cents though? If my unit was a Euro, then 10 cents is €0.1. I see at least two obvious ways of getting out of this:

  1. Don’t use the Prolog floating point parser for this at all, instead make certain all currency literals are parsed as fixed decimals.
  2. Make it possible to guess what the (rational) number represented by the floating literal was supposed to be.

It seems that option 2 was chosen, and now we have:

?- X is rational(2.1).
X = 4728779608739021r2251799813685248.

?- X is rationalize(2.1).
X = 21r10.

The other option is to introduce another literal type. In C at least we can append some letter at the end of numbers to help the parser I guess, like “10L” for a long int?

Keep in mind that rational() and rationalize() predate rationals like 21r10. At the time you couldn’t type rational literals in Prolog code (right?), you had to use one of the functions, and either way 21r10 is not exactly the same as typing “€2.10”.

I now notice that the docs for rational and rationalize are outdated; they still show:

?- A is rationalize(0.25).
A = 1 rdiv 4

?- A is rationalize(0.1).
A = 1 rdiv 10

when in fact we now get the rationals printed as 1r10.

And is the number of the footnote a joke/easter egg?

[screenshot of the footnote number in the documentation]

Thanks for spotting. Updated.

Otherwise, I don’t know. As the docs say, these functions come from Common Lisp. Proper handling of currency probably requires decimal floats, fixed-point decimals or simply using big integers to represent cents (or other fractions, depending on the currency). Within a certain domain, rationalize probably does what is needed. One would surely need to prove this though. Near the edges of the domain it starts finding denominators that are smaller than the “pretty” decimal denominator. I guess you could make a variation on rationalize that aims at “pretty” decimals.

Getting a small denominator looks attractive. Still, I have my doubts whether turning floats into rationals has any sensible use at all. I have never encountered a situation where it was needed, I fail to imagine one, and (so far) nobody else has either.

I did mention a use case before - converting a float to a precise rational for comparison with another rational. This enables separating intervals which are points (bounds are mathematically equal) from intervals which contain more than one real value. However, this uses rational, not rationalize.

Yes, that’s a valid use case - converting floats in input data to an “intended” rational, i.e., 0.1 to 1r10 as opposed to 3602879701896397r36028797018963968, the precise equivalent of 0.1 as a floating point value.

The reason for my question was whether the result of rationalize has to be the optimal rational value (smallest possible denominator) rather than just “good enough”. I think for the use case above, “good enough” is sufficient. (I believe @anon95304481 is investigating whether an optimal value can be efficiently produced.)

I’ve implemented a version of rationalize using the rational “leap frog” algorithm; it seems to behave as expected:

?- R is rationalize(7e-15).
R = 7r1000000000000000.

?- aggregate_all((sum(L),count), (between(800,1000,N), 
    F is N/(pi+N), R is rationalize(F), rational(R,A,B), 
    L is msb(A)+msb(B)+2), (S,C)), AVG is float(S/C).
S = 11038,
C = 201,
AVG = 54.91542288557214.

The price to be paid for superior quality is a performance drop of about 30% using GMP. I’ve tested a libbf build, but haven’t compared it with a libbf build of the current release; it’s about 8 times slower than the current release using GMP.

Once successfully compiled, the C code worked first time, a testament to using Prolog to build a rapid prototype. It was fairly straightforward to map the Prolog to GMP function calls. The Prolog prototype of the iteration loop:

rat_iter((V,W), (M,N), (P,Q), Flt, Rat) :-
	divmod(V, W, D, U),
	A is D*M+P,  %  A = Pnxt
	B is D*N+Q,  %  B = Qnxt
	Try is A rdiv B,
	( (float(Try) =:= Flt ; U == 0)  % terminating conditions
	  -> Rat = Try
	  ;  rat_iter((W,U), (A,B), (M,N), Flt, Rat)
	).

and the C loop:

  for(;;)
  { mpz_fdiv_qr(d,u,v,w);
    mpz_addmul(p,d,m);
    mpz_addmul(q,d,n);
    if ((mpz_fdiv(p,q) == f) || (mpz_sgn(u) == 0)) {  // terminating conditions
      mpq_set_num(r,p); mpq_set_den(r,q);             // final answer
      mpq_canonicalize(r);
      break;
    } else {
      mpz_set(v,w);  mpz_set(w,u);        
      mpz_swap(m,p); mpz_swap(n,q);
    }
  } // for(;;)

The C code is somewhat simpler than the current implementation because it doesn’t have to compensate for floating point rounding errors, e.g., checking for infinity or the size of a prospective rational result.

I had to extend the set of supported GMP routines in libbf, notably mpz_addmul. In doing so it appears that the existing mpz_addmul_ui and mpz_submul_ui do not conform to the GMP semantics, which state:

Function: void mpz_addmul (mpz_t rop, const mpz_t op1, const mpz_t op2)
Function: void mpz_addmul_ui (mpz_t rop, const mpz_t op1, unsigned long int op2)
Set rop to rop + op1 times op2.

I’m hesitant to change anything without independent verification for fear of breaking something (?), but I can add fixes to a future PR if that’s agreed.

Thanks. If it is true that the mpq_canonicalize() can be omitted, the performance may actually be better.

I’m looking at the definition of mpz_addmul_ui(). I assume I’m missing something obvious, but it seems to be doing what the GMP docs state!?

Anyway, please have a look at the need for mpq_canonicalize() and make a PR.

I’m happy to take your word that canonicalization is unnecessary; I removed it and indeed there appears to be no issue. However, there’s only a minimal effect on performance since it is only done once on the final result.

The test set is minimal: rationalize(pi). While absolute performance is dependent on the data, I was only looking for relative performance, so didn’t pursue it.

The code in bf_gmp.h:

static inline void
mpz_addmul_ui(mpz_t r, const mpz_t n1, unsigned long n2)
{ mpz_t add;

  mpz_init(add);
  bf_mul_ui(add, n1, n2, BF_PREC_INF, BF_RNDN);
  bf_add(r, n1, add, BF_PREC_INF, BF_RNDN);
  mpz_clear(add);
}

As I read it, this calculates r = n1+n1*n2; my interpretation of the spec says it should calculate r = r+n1*n2.
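If that reading is right, a conforming version would differ in a single operand (a sketch; it assumes bf_add accepts its destination as one of the source operands, otherwise a second temporary would be needed):

static inline void
mpz_addmul_ui(mpz_t r, const mpz_t n1, unsigned long n2)
{ mpz_t add;

  mpz_init(add);
  bf_mul_ui(add, n1, n2, BF_PREC_INF, BF_RNDN);
  bf_add(r, r, add, BF_PREC_INF, BF_RNDN);   /* r = r + n1*n2, per the spec */
  mpz_clear(add);
}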

However, looking at the code base I don’t think this ever gets called. The only references are in function ar_add_si in pl-arith.c for adding a long to a Number when that Number is a rational (V_MPQ). However, ar_add_si is only called from builtin predicates between/3 and bounded_number/3 where Number is always an integer. This probably makes it safe to fix (if necessary) but hard to test.

On SWIP, the differences appear to be small:

?- X is pi, time((between(1,10000,_), rat_rationalize(X, Y), fail; true)).
% 310,004 inferences, 0.298 CPU in 0.301 seconds (99% CPU, 1039093 Lips)
X = 3.141592653589793.

?- X is pi, time((between(1,10000,_), rat_rationalize2(X, Y), fail; true)).
% 310,000 inferences, 0.291 CPU in 0.291 seconds (100% CPU, 1066010 Lips)
X = 3.141592653589793.

But you’re probably correct in stating that the termination test for float equivalence comes at a significant cost. mpz_fdiv is a fairly beefy function entailing special cases, rounding modes, etc.

But the C version is fairly quick IMO, particularly using GMP:

?- X is pi, time((between(1,10000,_), Y is rationalize(X), fail; true)).
% 20,000 inferences, 0.037 CPU in 0.037 seconds (100% CPU, 541697 Lips)
X = 3.141592653589793.

That’s less than 4 microseconds (on my 7-year-old hardware) to rationalize pi, which if I recall takes about 14 iteration steps. This seems more than adequate for the identified use case, so I’m not sure further effort to improve it is warranted. Tuning the Prolog versions is a different matter, and dependent on the Prolog implementation, but I’m not sure what the point would be outside academic curiosity.

Always thought SWIP = SWI-Prolog.

Anyway SWI-Prolog, V 9.1.0. MacOS 10.13.6, 3Ghz, 8-Core Intel Xeon E5

Yes, I dropped the ball on this one - unoptimized arithmetic:

?- X is pi, time((between(1,10000,_), rat_rationalize(X, Y), fail; true)).
% 1,010,000 inferences, 0.376 CPU in 0.376 seconds (100% CPU, 2683651 Lips)
X = 3.141592653589793.

?- X is pi, time((between(1,10000,_), rat_rationalize2(X, Y), fail; true)).
% 880,000 inferences, 0.222 CPU in 0.223 seconds (100% CPU, 3956888 Lips)
X = 3.141592653589793.

and optimized (“arithmetic inferences” compiled away):

?- X is pi, time((between(1,10000,_), rat_rationalize(X, Y), fail; true)).
% 310,000 inferences, 0.308 CPU in 0.308 seconds (100% CPU, 1007144 Lips)
X = 3.141592653589793.

?- X is pi, time((between(1,10000,_), rat_rationalize2(X, Y), fail; true)).
% 310,000 inferences, 0.165 CPU in 0.165 seconds (100% CPU, 1877332 Lips)
X = 3.141592653589793.

The prefer_rationals flag changes the semantics of A/B, so that also has an impact.

I believe per-iteration canonicalization is avoided if the test is A/B =:= Flt and the prefer_rationals flag is false. I also think this doesn’t affect the C version, since there is no rational generated until the final result and hence no per-iteration canonicalization, i.e., mpz_fdiv(a,b) is equivalent (I think) to A/B with prefer_rationals = false.

Don’t disagree with anything you’ve said. I think this current exercise is a good example of using Prolog to develop/verify the algorithm, which then serves as an executable specification for the C function. That doesn’t mean that the low-level details of data representation or operations must be equivalent. The new C code (not much in common with the existing version) for the iteration structurally resembles the Prolog “specification” (see earlier post), but you’ve already noted some differences. The surrounding setup/cleanup code is quite different; for simplicity everything (even 0 and 1) is a “bigint”, and I leave it to GMP to sort out any optimizations at the expense of a function call. I think that’s the right call in the near term for the available use cases (rationalize(pi) < 4 microseconds), but new requirements may yet come to light.

Yes, the value of pi being used is the closest floating point value to real pi. Given that, the sequence is a match up to the final termination value:

?- rat_rationalize(pi,R).
3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 3, 
R = 245850922r78256779.