Adding mathematically correct comparison of numbers (reals)

As discussed at length in a previous thread (Arithmetic functions rational/1 and rationalize/1), the arithmetic comparisons in most Prologs tested are mathematically incorrect, largely due, I suspect, to the ISO standard which says mixed numeric type comparisons are to be done by converting non-float values to floats before comparing with a float. Rounding errors can result in numbers appearing to be equal which are not, leading to logical inconsistencies (depending on your interpretation of floating point numbers), e.g.,

?- 1.1 =:= 11r10.
true.

?- 1.1*1.1 =:= 11r10*11r10.
false.

?- 2^1024 < inf.
false.

It was deemed unacceptable to change the current arithmetic comparison semantics since that could jeopardize compatibility with other Prologs and possibly(?) adversely affect the semantics of currently working code. However, it was suggested that adding arithmetic functions to support mathematically correct comparisons as an alternative would be acceptable. So a proposal:

Add a compare(N1,N2) arithmetic function where N1 and N2 are numbers, or expressions which evaluate to numbers, and are interpreted as point values on the (ordered) real number line. This function returns -1 if N1 is less than N2, 0 if N1 equals N2 and 1 if N1 is greater than N2; all comparisons being mathematically correct, i.e., not subject to rounding errors. Simple examples:

% 2 < pi
?- N1 = 2,  -1 =:= compare(N1,pi).
N1 = 2.

% 1.1 =\= 11r10
?- N1 = 1.1, N2 = 11r10, 0 =\= compare(N1,N2).
N1 = 1.1,
N2 = 11r10.

If either N1 or N2 is nan, compare/2 either evaluates to nan or and error is generated depending on the value of the float_undefined global environment flag.

An additional use of this function is the reification of comparison operations, i.e., the conversion of the boolean result of a comparison to a number.

In addition, functions maxreal/2 and minreal/2, similar to max/2 and min/2, using mathematically correct compare would be supported. These two functions would also differ from the current max/2 and min/2 in that they would favour any rational (precise) value in the case of equality, so as to avoid unnecessary “pollution” of precise arithmetic with floats.

Note that any mixed mode arithmetic is unaffected by these functions, i.e. rounding errors will occur when floats are involved. But they do provide equivalent functionality to that found in other languages, e.g., CommonLisp and Python, without affecting existing Prolog compatibility, and without introducing any new global arithmetic flags (which are already complicated enough).

Sounds good. I do have some problems with the names. maxreal refers to real numbers, while the thing is about rational numbers, considering floats to be concrete rational numbers. I’d also like a consistent naming between compare and the min/max functions with this behavior. My preference would be rcompare, rmin and rmax, referring to the r in rdiv and 1r3. That also gets them nicely together in the alphabetical list of functions :slight_smile:

1 Like

I guess I have a problem thinking about this as a specifically as a rational number issue - that’s more of an implementation perspective, and IMO the same trap as the ISO standard fell into, i.e., converting integers to floats without specifying the range of either.

I do think of this as comparing all representable numbers (floats, integers, rationals) as values in the ordered set of real numbers, hence my choice of names. It’s as much about comparing floats and big ints, for example, as comparing rationals with floats.

In terms of logically grouping the list of functions, maxreal and minreal belong with max and min, and compare is a bit of a new animal, since all the existing comparisons are predicates. These new functions are not closely related to rdiv or rational literals.

While I’m more interested in the functionality, the names are important. I’m OK with alternative names as long as they reflect the semantics of the functions rather than implementation.

I don’t know. Having a function compare confuses easily with the predicate compare/3, also because predicate indicators are often also used for functions (which is a pity).

The name maxreal suggests it is about real numbers which can’t be the case as finite computers cannot represent these (other than symbolically). As is, we have floats and rationals (and integers as a subclass thereof but SWI-Prolog considers them just as much a type as code is an integer in the domain 0..0x10ffff). Floats are mathematically rationals and that is now you like to see them (I believe). So, I see no problem hinting at rationals in the names.

Any support or other ideas?

Just to point out that there are already arithmetic functions that are also predicates: integer/1, float/1 and rational/1, and they even have the same arity. If the compare/3 predicate didn’t exist I don’t think there would be any confusion about what the “compare” function should be called so I’m not sure your argument is particularly compelling. Maybe something like comparing might be more acceptable but it feels a little excessive and artificial to me.

All instances of the numeric types in SWIP are subsets of real numbers, so I think real is a valid label for the inclusion, even if the currently supported set of numbers is only a partial set of the reals. If somehow a new numeric type representing a different subset of reals was added, it would be included in the semantics of these functions.

I would not be adverse to calling them all rationals, if the existing system was consistent in this respect, but rational(1.1) is false. The r syntax is specifically associated with SWIP rational functionality and the proposed functions are more general IMO. And rightly or wrongly, I think many see rationals as fractions, even though they specifically include integers. So I would prefer function names that didn’t potentially mislead by implying any particular “type” of number.

As I said, I’m not particularly wedded to the proposed names but I would like the names to be semantically descriptive and somewhat neutral.

A couple more points:

  1. Two real numbers can always be compared even if the full set of real numbers can’t be represented in finite memory because comparison does not require the generation of any numbers, i.e., those that couldn’t be represented in finite memory.

  2. In SWIP the IEEE infinities are floats that are not convertible to rationals, i.e., SWIP floats are not a strict subset of SWIP rationals. However, the IEEE infinities can still be considered to be values in the (extended) set of reals for comparison purposes.

You are right (and they are also confusing :frowning: )

That might be the solution for rcompare, rmin and rmax. You consider them real compare, etc., I consider them rational compare, etc. As long as we do not document either interpretation we are all happy :slight_smile:

One of these things were we should also agree to disagree. To you, Inf is larger than anything concrete, to me, Inf is merely larger than the largest number that can be represented as a float.

As a passenger A on this thread, Inf is larger than any real numbers clearly by definition.

Your view is incompatible with IEEE float arithmetic and also it would imply to be forced to not accept Inf as an arithmetic operand (as it would not have a fixed value).

?- A is 1.0/inf.
A = 0.0.
?- A is 1e308*10.0.
ERROR: Arithmetic: evaluation error: `float_overflow'
ERROR: In:
ERROR:   [10] _1526 is 1.0e+308*10.0
ERROR:    [9] toplevel_call(user:user: ...) at /usr/lib/swi-prolog/boot/toplevel.pl:1173
?- A is 1.0/1e308/10.
A = 1.0e-309.

As a side note, the float overflow here above is surprising for me.

That is one view, but you get Inf as a result of a computation results in a number that is larger than the maximum number that can be represented. Also addressing @abramobagnara’s issue:

?- set_prolog_flag(float_overflow, infinity).
true.

?- A is 1e308*10.0.
A = 1.0Inf.

Now to claim that any concrete (rational) number is smaller than this number is rather dubious to me. Note that rational numbers can be much bigger than what can be represented by floats.

Floats are weird. -1.0/inf → -0.0, one of these other funny values.

Bottom line, I consider a computation tainted when floats come in. Results are only correct under the weird rules of floating point arithmetic but not in the world where these numbers come from, i.e. “real” real numbers (pi, e, log, sin, …) or rational numbers.

Anyway, let us decide on the name of these three functions and forget about it :slight_smile:

I wanted to say the following obvious mathematical common sense

1 does not imply 2:

  1. x > y for all y which is representable.
  2. x is INF.

Anyway this is a passenger’s comment on common usage of INF.

1 Like

Back to names.

I think compare logically belongs in the same arithmetic function group as sign. Both take any numeric value(s) as arguments and evaluate to -1, 0, or 1. So while it wouldn’t be my first choice, something like cmp_sign or comparesign wouldn’t be too bad.

maxreal/minreal is a bit more problematical. Perhaps just maxR/minR (less emphasis on real) or just maximum/minimum might work.

Hope someone has a better idea.

Not to ignore the other issues.

Not quite. To me the infinities are members of the extended set of reals, which is ordered, and thus they can be compared with any other member. The fact that infinities are internally represented in SWIP by IEEE floating point values is an implementation detail which, I hope, is outside the scope of this discussion.

Or you can just write 1.0Inf (or inf in an arithmetic expression); no computation involved.

I interpret this result as due to a normal floating point rounding error in multiply:

?- A is roundtoward(1e308*10,to_nearest).
A = 1.0Inf.

?- A is roundtoward(1e308*10,to_negative).
A = 1.7976931348623157e+308.

and yes any finite rational number (or float) is smaller than infinity regardless of how that infinity was generated. Again the focus of this discussion is on comparison, not arithmetic in general (like * above).

The funny value here is -0.0. Consider the limit of -1.0/X as X approaches infinity; isn’t it -0.0? I think IEEE defines arithmetic with the infinities pretty well, consistent with standard mathematical interpretations.

Somewhat agree with the emphasis on computation vs. comparison. But comparison is not subject to rounding errors which taint. (Well only if you don’t insist on converting anything non-float to a float before the compare is done.)

Rounding errors mean that IEEE floating point is only an approximation of real numbers, but one that can be pragmatically implemented and seems to work in the vast majority of applications. Hard to argue with success but be aware of the limitations.

I found a few examples where a function named cmp had similar semantics (GMP, Python) to compare.

And for the gory details of arithmetic on the extended set of reals:

The following lines in the wiki page sounds strange for me.

“Without allowing functions to take on infinite values, such essential results as the monotone convergence theorem and the dominated convergence theorem would not make sense.”

I learned in Lebesgue integral on R, if f is defined a.e. (almost everywhere) in R then the integral of any extention of f to a tatal one on R also exists keeping the same value with f. Similar but less weak property holds also for Riemann integeral. I think INF convention has nothing to do with the two theorems, though something like notion of monotone a.e. would be be necessary.

Good. We don’t want capitals in function names. We could do with max_r. cmp_sign seems reasonable, but I would like to connect the comparison function name to the min/max names as they are based on the same comparison. That still brings me to rcompare, rmin, rmax or cmp_r, min_r, max_r, or some variation thereof.

1 Like

I could live with cmp_r, min_r, max_r but I would prefer cmp, minr, maxr.
My reasons:

  • cmp simpler and is already in use in other languages/libraries for similar functionality. I don’t really see the need to connect the three functions using a naming convention; that’s what documentation is for.
  • maxr/minr slightly simpler/ easier to type, but I’m not strongly opinionated. We just need to associate them with max/min semantics, yet have different names.

Anybody needing any of these functions will surely understand the semantics without requiring a somewhat contrived (IMO) naming convention. But I could live with the ‘_r’ suffix if that’s what it takes. (What’s really unfortunate is that we require two numeric compare functions in the first place.)

I await the final word.

“Fixing” naming using documentation does not have my preference and these are linked. What about cmpr, minr and maxr? That is easy to type, connects
them to each other and does not connect them to rdiv.

As @ridgeworks pointed out, this does not deal with Inf because rational(Inf) is not defined. Now standard Prolog only has normal floats, so this is not an issue if you stick with standard Prolog.

The nasty thing is that max() is supposed to evaluate to one of its arguments without converting it. Note that the current min/max preserve the “tainted” float if both are equal as floats, e.g.

?- A is max(1r10, 0.1).
A = 0.1.

?- A is min(1r10, 0.1).
A = 0.1.

For minr/maxr, one will return the float and the other the rational number.

Consider the general problem of comparing two numbers, N1 and N2, e.g., test if N1 =:= N2. If I don’t know the “types” the only “correct” way to test equality is rational(N1) =:= rational(N2). But that means that all floats get converted to rationals before comparison which can be quite inefficient. To compensate you have to break it down using filters, e.g., rational/1 and float/1 to cover the various cases, resulting in considerably more complex code IMO. And if you want to avoid that, you wrap it in another predicate resulting in even more inefficiency.

Finally as @jan pointed out, the ininities are not convertable to rationals, so they would require additional filters and cases.

Efficiency of numeric comparison may not matter for most applications, but interval arithmetic requires intensive use of these functions (think domain intersection).

All cmp does is bury this detail into a C implemented arithmetic function so that filters are unnecessary and rational conversion is done only when necessary. IEEE infinities and nan’s are also handled efficiently. maxr and minr provide efficient corresponding functionality to max and min.

I still prefer cmp but we’re into the territory of diminishing returns so cmpr, maxr and minr it is barring some last minute inspiration.