Arithmetic functions rational/1 and rationalize/1

Ref: SWI-Prolog -- Arithmetic, rational/1 states:

For every normal float X the relation X =:= rational(X) holds.

and rationalize/1 states:

For every normal float X the relation X =:= rationalize(X) holds.

which seems to imply that for every normal float X, rational(X) =:= rationalize(X). I’m pretty sure that isn’t the case, otherwise you wouldn’t need both.
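For a concrete illustration (using Python's `fractions` module as a stand-in, since `Fraction` offers both an exact conversion and a denominator-limiting approximation roughly analogous to rational/1 and rationalize/1), the two conversions of 0.1 do indeed differ:

```python
from fractions import Fraction

# Exact conversion (analogue of rational/1): the precise binary value of 0.1.
exact = Fraction(0.1)
print(exact)                      # 3602879701896397/36028797018963968

# Approximating conversion (rough analogue of rationalize/1): the closest
# small-denominator fraction found by limit_denominator().
approx = Fraction(0.1).limit_denominator()
print(approx)                     # 1/10

print(exact == approx)            # False: the two rationals differ...
print(float(exact) == float(approx) == 0.1)  # True: ...yet both round-trip to 0.1
```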

Now there’s currently a rational test case:

test(roundtrip_rationalize) :-
    forall(between(1,1000,_),
           ( rfloat(F),
             F =:= rationalize(F))).

which currently passes. Since rationalize is supposed to approximate the float value, that would suggest that arithmetic comparison isn’t exact either, which would be a bug IMO.

Another way of looking at it is that there are many different rational values that convert to the same float (a down conversion, since there are more rationals than floats), but that shouldn't make them equal (IMO).
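A minimal sketch of that down conversion, in Python (which compares int and float by exact value, so the distinction is visible):

```python
# Two distinct integers (hence two distinct rational values)...
a = 10**16 - 1
b = 10**16

print(a == b)                # False: as exact values they differ
print(float(a) == float(b))  # True: both round to the same double, 1e16
```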

So bug or feature?


No comments yet, so let me propose the following:

Change the documentation for rationalize/1 to

For every normal float X the relation X =:= float(rationalize(X)) holds.

and modify the unit test case to reflect this:

test(roundtrip_rationalize) :-
    forall(between(1,1000,_),
           ( rfloat(F),
             F =:= float(rationalize(F)))).

That leaves an issue with numeric comparisons. The current SWI-Prolog implementation orders the numeric types as follows: integer, big_integer, rational, float. When two numbers of different types are compared, the lower-order number is converted to the type of the higher-order one, which means anything that is compared to a float is converted to a float (if necessary). But float conversion is subject to rounding error, so the comparison result may be incorrect, e.g.,

?- 10^16-1 < 1e16.
false.

?- 2^1024 =:= inf.
true.
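For comparison, Python compares an (unbounded) int and a float by exact value rather than promoting the int to a float, so it gives the mathematically expected answer for both of the queries above:

```python
import math

# Python's mixed int/float comparison is exact, with no rounding of the int:
print(10**16 - 1 < 1e16)      # True: 9999999999999999 really is less than 10^16
print(2**1024 == math.inf)    # False: a finite integer is never equal to infinity
print(2**1024 < math.inf)     # True
```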

The BIGNUM libraries provide sufficient support for comparison of different types to avoid the float conversion, and numeric comparison (i.e., function cmpNumbers()) should be modified to use them.
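As a rough sketch of the idea (in Python; `cmp_numbers` is a hypothetical stand-in for cmpNumbers(), and a real fix would use the GMP/BIGNUM primitives rather than Python rationals): handle non-finite floats specially, convert finite floats exactly to rationals, and compare without rounding.

```python
import math
from fractions import Fraction

def cmp_numbers(a, b):
    """Return -1, 0, or 1. Hypothetical sketch of an exact mixed-type compare:
    finite floats are converted *exactly* to rationals, never the reverse."""
    for x in (a, b):
        if isinstance(x, float) and math.isnan(x):
            raise ValueError("NaN is unordered")
    # Any finite value is below +inf and above -inf.
    if a == math.inf or b == -math.inf:
        return 0 if a == b else 1
    if a == -math.inf or b == math.inf:
        return 0 if a == b else -1
    ea = Fraction(a)   # exact: every finite float is a rational
    eb = Fraction(b)
    return (ea > eb) - (ea < eb)

print(cmp_numbers(10**16 - 1, 1e16))    # -1: correct, no rounding error
print(cmp_numbers(2**1024, math.inf))   # -1: finite < inf
```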

Alternatively, it’s possible to re-order the numeric types, e.g., integer, float, big_integer, rational, but that may have an adverse effect on other arithmetic functions supporting mixed types.

👍

I don’t know whether this is “incorrect”. 10^16-1 =:= 1e16 holds. If we interpret float comparison as comparing equal when we cannot be certain the values are different, this appears correct. If we translate 1e16 to a rational, there are an infinite number of rationals that could be considered “correct”, one being equal to 10^16-1, while others are either smaller or larger. That also holds for inf, which just means it is larger than the largest float that can be represented, rather than mathematical infinity. I’m not an expert in this field. Smells like a can of worms.

Disagree. This only holds when 10^16-1 is converted to a float before comparison with 1e16. Mathematically it’s incorrect. However it may be true that float(10^16-1) =:= 1e16.

Again I disagree. 1e16 is a precise real value and is equal to exactly one rational representing the same real number, and that rational is 10^16, not 10^16-1.

I see no reason why the float value inf, like any other float, can’t be compared with rational values; any finite rational is less than inf and greater than -inf. In this sense it is equivalent to mathematical infinity and, yes, it’s also “larger than the largest float that can be represented”.

What smells to me is converting a rational value to a float before comparing it with another float, when floats are a subset of rationals. I.e., any normal float value can be converted to a unique equivalent rational value, but the reverse is not true. And that’s why the current cmpNumbers() is mathematically incorrect, but I don’t see why it has to be that way.

Smells like an earlier discussion on what floats are, where we decided there are two views: (1) they are a set of precise numbers that are a subset of the reals (and also of the rational numbers), or (2) they represent an interval (possibly half-open for ±inf) of real numbers.

I consider your approach a bit dangerous. As 2^1024 < inf, I guess you will agree that 2^1025 < inf as well, no? But now, after

?- set_prolog_flag(float_overflow, infinity).

we get

?- A is 2.0**1024.
A = 1.0Inf.

So far, so good, but then we would get the result below, which I think is rather undesirable.

?- A is 2.0**1024, 2^1025 < A.
true

While currently, we have the result below. This states (I think) that in the domain of IEEE754 doubles we cannot tell they are different.

?- A is 2.0**1024, 2^1025 =:= A.
A = 1.0Inf.

I still prefer the second view on floats: as an interval of reals or, if you prefer, an approximation of some real number. In this world, the current behavior makes (I think) perfect sense. Before we reconsider, I’d like to know how other systems supporting rational numbers or unbounded integers together with floats handle this.


If you choose this view what is the result of float < rational when float “interval” includes rational?

The interval of a float is (float - 1ulp, float + 1ulp) or (float - 0.5ulp, float + 0.5ulp)? In the latter case which boundary is included? Current rounding mode is relevant?
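For what it’s worth, the candidate boundaries can be inspected directly (a Python sketch; note that the spacing of floats is not even symmetric around a power of two, which makes the interval view trickier still):

```python
import math

x = 1.0
up = math.nextafter(x, math.inf)     # next representable float above 1.0
down = math.nextafter(x, -math.inf)  # next representable float below 1.0

print(up - x)    # 2.220446049250313e-16  (one ulp of 1.0, i.e. 2^-52)
print(x - down)  # 1.1102230246251565e-16 (only half that: spacing shrinks below 2^0)
```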

IMHO this view is a can of worms.

Many systems have different kinds of numbers, and normally they convert the less precise one to the more precise one before doing the math. In this case that is perfectly doable, as floats are a subset of rationals (except for the special values, which have to be handled as special cases).

Of course, this is incorrect as well:

?- 2.0**1024 =:= inf.
true.

But as long as float overflows are converted to float infinities, not much else can be done. In fact:

?- 2.0**1024 =:= 2.0**1025.
true.

Both are undesirable, but they’re due to a floating point overflow so both comparisons are invalid. If you want to compare two finite values outside the range of 64 bit floats, they have to be rationals. And if you want to compare any finite float to a rational the same applies.
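The same collapse is reproducible in any IEEE754 language; a Python sketch (using multiplications that overflow, since Python raises OverflowError for 2.0**1024 itself rather than returning inf):

```python
import math

# Once an operation overflows to +inf, distinct finite results become indistinguishable:
x = 1e308 * 10    # overflows to +inf
y = 1e308 * 100   # also overflows to +inf
print(x == y)         # True, although the exact products differ by a factor of 10
print(x == math.inf)  # True
```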

But I don’t think that’s what IEEE754 says and it’s not what’s actually (yet) implemented in hardware or any computer language supporting arithmetic that I’m aware of. And trust me that arithmetic on floating point values is quite different from arithmetic on sets of reals. (Think about division by 0.0.)

+1. i.e., keep it simple.

That sounds simple. Effectively this boils down to one of these goals.

101 ?- 1r10 =:= rational(0.1).
false.

102 ?- 1r10 =:= rationalize(0.1).
true.

So, what does “convert the less precise one to the more precise one” mean when converting a float to a rational? Note that rational/1 provides the exact conversion, while rationalize/1 provides a simplified conversion that maintains equality within float rounding.

For now I mostly see undesirable results regardless of the approach taken. That is not a surprise 🙁 It seems mostly a matter of standards. The ISO standard dictates promotion to float AFAIK, and all Prolog systems I tried agree that ?- 10^16-1 < 1.0e16. is false. Now, there are more places where SWI-Prolog doesn’t follow ISO. We should still have a good reason though. That reason could be in other standards or in math.


It seems to me that the problem is only due to the presence of rationalize (whose exact semantics and behavior sound very confusing to me).

What about using this instead:

?- float(1r10) =:= 0.1.
true.

AFAIK rationalize/1 is present in various systems that define both floats and rational numbers (at least, the demand came up quite early when working on rational number support). The idea is to create a rational number with the smallest possible denominator that satisfies this relation for any normal float X:

X =:= float(rationalize(X)).

So, rationalize(0.1) is 1r10 while rational(0.1) is 3602879701896397r36028797018963968 because binary floats cannot represent 0.1.
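A sketch of the rationalize idea in Python (`rationalize` here is a hypothetical helper, not a library function; `Fraction.limit_denominator` finds the closest fraction under a denominator bound, and doubling the bound until the float round-trip holds approximates, though does not strictly guarantee, the smallest-denominator requirement):

```python
from fractions import Fraction

def rationalize(f):
    """Sketch of rationalize/1: a small-denominator rational r with float(r) == f."""
    exact = Fraction(f)  # the analogue of rational/1: exact conversion
    bound = 1
    while True:
        candidate = exact.limit_denominator(bound)
        if float(candidate) == f:   # round-trip holds: accept this approximation
            return candidate
        bound *= 2                  # otherwise allow a larger denominator

print(rationalize(0.1))  # 1/10
print(Fraction(0.1))     # 3602879701896397/36028797018963968
```

The loop always terminates: once the bound reaches the exact denominator, the candidate is the exact conversion, which round-trips by definition.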

That is how it works now if we run 1r10 =:= 0.1. But apparently @ridgeworks is unhappy with that. He is surely right that this leads to unexpected results, but that also seems to hold when doing the comparison after converting floats to rationals. Floats are hard 🙁

I guess this is due to confusion between representation and value: in any language using floats the representation 0.1 is not equal to 1/10.

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 0.1 + 0.1 + 0.1 - 0.3
5.551115123125783e-17

I guess we all agree on that. The issue is what happens on

Rational compare float

where compare is one of the arithmetic comparison operators. Is this the same as

float(Rational) compare float

or

Rational compare rational[ize](float)

If we simplify Rational to (unbounded) integer, ISO and all Prolog systems I tried do “float(Integer) compare float”, @ridgeworks claims “Rational compare rational(float)” is better.

The case Integer op Float is theoretically more troublesome, as the Integer domain is not a subset of the Float domain, nor is the Float domain a subset of the Integer domain.

Luckily enough, this has historically been solved in favor of conversion to float (an arbitrary choice, but fine enough). Compare this to same-size unsigned vs. signed conversion in C/C++, where the preference toward unsigned is arbitrary.

The case Rational op Float does not have this ambiguity, as the Float domain is a subset of the Rational domain, so conversion to the more expressive domain is an obvious and lossless choice.

If we need to do an automatic conversion of integral Rationals to Integers, this forces us to consider the Integer vs. Float case differently (i.e., to handle an Integer always as if it were a Rational).

IOW, a binary operation should always be done in the common available super type of its operands (if one exists), which, for integer and float, is rational.

And you accept this would cause

?- 1r10 =:= 0.1.
false.

And this, which always used to be true now must become false:

?- 18446744073709551614 =:= 18446744073709551614.0.
false.

We also have some special floats like -0 and ±Inf that have no rational equivalent.

The problem I still have with this is that, while any float can indeed be represented as a rational number, the float itself is often merely an approximation of either some “normal” rational number that cannot be expressed as an IEEE double float (0.1, 18446744073709551614.0) or some real number, e.g., pi.

Not only is it better, it’s correct. As @abramobagnara pointed out, if you only have 64 bit integers and doubles, you do have a problem: neither is a strict subset of the other. In this case the only pragmatic choice is to convert the integer to a float and compare, and that’s what most languages define when faced with this issue. This will be correct if the magnitude of the integer is =< 2^53, but may not be otherwise.

But if you do have unbounded integers and rationals, as in SWIP with GMP or equivalent, you can implement mathematically correct comparisons: convert both the integer and the float to an unbounded integer/rational and compare. I can’t say that I’m swayed by the argument that other Prolog implementations aren’t rigorous in this regard.

I should have mentioned GMP in this regard; in particular functions mpz_cmp_d, mpq_cmp_d and mpq_cmp_z. If there are any concerns about implementation, GMP does all the heavy lifting and no explicit conversion is necessary. (And I believe bf_gmp can be extended to do the same.)

This is not surprising, as neither the written representation “0.1” nor “18446744073709551614.0” has the shown value when stored as a 64-bit binary float. This is true in any programming language.

As I wrote, special float values need special handling. My suggestion is to throw an exception when mixing special values and rationals, as it is likely not deliberate.

When in real life we measure something as 13.5 cm, we don’t say that 13.5 encapsulates the real-world measure of the object, but that 13.5 is an approximate measure of the object with a given precision.

With floats it is similar: we don’t say that our PI float variable encloses the real pi value, but that PI is near pi, i.e., PI - 0.5ulp <= pi <= PI + 0.5ulp.

Compare this with saying that PI represents something (an interval?) that includes pi (what you call the “engineer” approach, although I’m not sure I understand what you mean by this word).

Again, this is a representation error, like 0.1. Just because I can write a (decimal) fp number in acceptable syntax doesn’t mean it can be represented in IEEE754 binary format. And it should fail, IMO.

No, but we can (correctly) compare them with rational values.

And this is where I disagree. A floating point value is not an approximation. In computer memory it is a precise value and the computer instruction set treats it as such. I do agree that only a minuscule subset of reals (or rationals) can be represented precisely as a float. And I can even write numbers that look like floating point values that aren’t precise floating point values (another example: 1e309). But that’s a choice you make when using floating point arithmetic.

When I’m strictly using fp arithmetic I’m pretty much restricted to what IEEE754 allows. But when I’ve got mixed mode arithmetic I think we can and should do better than reducing everything to floating point with all its known issues.

I do (now) see the points. At the same time, I still very much have my doubts that this would be a good move. After all, we are still faced with:

  • Incompatibility wrt mixed integer/float, both to other systems and the ISO standard.
  • Dealing with non-normal floats. Raising an error as @abramobagnara suggests has its own problems. I.e., the SWI-Prolog standard order of terms handles all numbers in the same class, only resolving clashes (same value, different type) based on the type. Standard order of terms cannot raise exceptions.
  • The fact that, unlike integers and rational numbers, most floats do not have the value intended by the user. There seems to be a lot of confusion about what to call that but, I think, the fact remains. I always intended to keep these worlds as separated as possible. In the rational world one can typically represent the user’s intended value, and the math is precise. In the float world, neither holds.

For comparison there is a fair argument, and 1r10 =:= 0.1 failing also has some value, as it indicates something fishy is going on. Checking the Common Lisp story, we see they compare rationals and floats as rationals. Luckily they do state that combining the two in a function (+, -, *, …) does convert the rational to a float. This seems a bit inconsistent to me. If our point of view is that floats are precise numbers and rationals are a superset of floats, we can do the math as rational numbers. But then, Common Lisp seems to state:

The types rational and float are disjoint subtypes of type real.

This is all getting pretty confusing 🙁 Breaking compatibility seems a bad idea. Adding yet another flag is always a bad idea… It would be nice if there were a bigger group deciding on these matters…

I guess I was not clear: I meant that mixing rationals and special float values (i.e., infinity and NaN; subnormals and signed zeroes have nothing special about them) in an arithmetic binary operation should lead to an exception (as very likely it was not intended).

In case it needs emphasis, I’m only concerned here with comparison of numbers of mixed types. Outside this scope is mixed mode arithmetic and converting external representations (syntax) to floating point values.

Specifically, what are we talking about here? ISO only has basic integers and floats. There is a choice to be made when comparing these. Sticking with standard practice, just convert the integer to a float (double(int)) and compare floats. For all practical purposes this is fine, since the magnitude of most integers will be less than 2^53 (9007199254740992). However, with unbounded integer support, the option exists to convert larger 64-bit integers to GMP “bignums” and close this “loophole”. I really doubt whether this will be a big source of incompatibility, but I’m not overly fussed about it (perhaps I should be).
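The 2^53 boundary is easy to demonstrate (a Python sketch; Python compares int and float exactly, so it can detect the rounding that float promotion would hide):

```python
# Below 2^53 every integer has an exact double representation; above it, gaps appear:
n = 2**53
print(float(n) == n)          # True: 2^53 is exactly representable
print(float(n + 1) == n)      # True: 2^53 + 1 rounds back down to 2^53
print(n + 1 == float(n + 1))  # False: an exact compare detects the rounding
```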

So I’m looking for incompatibilities with systems with extended arithmetic such as that supported by GMP as used in SWIP. Specifically what is the set of Prolog programs which will break if ported to a version of SWIP with mathematically correct comparison semantics?

And, FWIW, the current implementation is incompatible with any program using GMP in the context we’re talking about here.

The existing doc on standard term ordering is incomplete in that it doesn’t really say how rationals are treated. But I would assume by treating “numbers” as a group for standard ordering purposes, it would inherit arithmetic comparison for ordering numbers in that group. I don’t see that as a practical problem.

I don’t see why any errors need be generated by comparison per se. (Generating a value for the comparison is a different matter, but that has nothing to do with standard term ordering.)

But enforcing separation is impossible - that train has left the station. IMO, that’s a programmer’s decision and hopefully it’s an informed one. I really don’t see why that should be an excuse for not implementing mathematically correct, mixed mode comparison when it’s pretty easy to do so.

Performing mathematically correct arithmetic over mixed modes is a separate, and complicated, topic. Suffice it to say that I think Common Lisp got comparison right. But you’re right we could dispense with floats and just do rational number arithmetic. That would present an even greater incompatibility exposure IMO, not to mention a significant performance hit. So I’m not proposing that should be done, just get comparison right and leave the decision of whether to use floats or rationals up to the programmer. He has all the tools (including a configurable tripwire for auto-converting rationals which get too “big” to floats) at his disposal to do as he sees fit. However:

I don’t understand this at all unless they’re talking about the Lisp implementation rather than in the mathematical sense. I wonder what they mean by type real.

So I’m still not seeing the significant compatibility issue. It’s a bug IMO, and bugs should be fixed even if the fix is “incompatible”. (Not sure what program would rely on a mathematically incorrect numeric comparison.)

And yes, no new flags please, and yes, it would be nice if there was a bigger group. But there doesn’t appear to be one, so I’d rather be a leader than a follower in this respect. I’m happy to take my cues from elsewhere if anybody can point me at them.

Also, FWIW, I’ve built a mathematically correct version of cmpNumbers and tested it to some degree using GMP and an augmented version of libbf. I’ve also tested it on some significant clpBNR examples, which support mixed-mode interval bounds. The only “incompatibility” I’ve found so far is the rationalize test case noted earlier. (It was the issue of detecting when an interval is a point value (lower_bound == upper_bound) that led me down this rabbit hole.)

Or maybe it was intended; I have examples of that. In any case “there’s a flag for this”.