Rounding transcendentals broken on MacOS (Intel)

peter.ludemann · April 14, 2022, 7:03pm

There’s also the mechanism that’s used for libarchive, which uses some kind of cmake discovery process for what’s available and sets #define’s in a config.h accordingly. (I haven’t looked at it in detail; and it appears to miss a few things - which I intend to investigate a bit more when I have time.)

jan · April 14, 2022, 7:47pm

An alternative is do do do something along these lines (not tested). Like that, the remainder of
the code needs no adjustments and you can use an alternative declaration for the functions for which this generic rewrite does not work.

#define CRLIBM_FUNC(func) \
static double cr_##func(double in) \
{ double f; \
  int roundMode = fegetround(); \
  if ( roundMode != FE_TONEAREST ) fesetround(FE_TONEAREST); \
  switch( roundMode ) {  \
	case FE_TONEAREST  : f = func##_rn(in); break; \
	case FE_UPWARD     : f = func##_ru(in); break; \
	case FE_DOWNWARD   : f = func##_rd(in); break; \
	case FE_TOWARDZERO : f = func##_rz(in); break; \
  } ;\
  if ( roundMode != FE_TONEAREST ) fesetround(roundMode);
  return f;
}

CRLIBM_FUNC(sin)
#define sin(x) cr_sin(x)

Especially using the above it is quite easy. You create drop in replacements for sin(), etc. and you tell Prolog to use these rather than the libm versions.

This is quite different. Libarchive is a bit of a complicated beast as it has had several major revisions in its API and you need to figure out which formats are supported by a particular library. To Prolog, it is just a foreign extension. In this case we want to replace the built-in arithmetic functions sin(), etc. with versions that are more precise. This affects the core system. As we are dealing with a poorly maintained library that also has a different license it gets complicated. That is what makes me consider allowing to place this extension outside the core. The proposed array of functions is a simple solution. Alternatively we could design a different API for arithmetic in C altogether that allows even loading stuff such as GMP dynamically. That is a much more involved project though. A simple array of function pointers for the various float functions is easy. This keeps the core clean and small while it simplifies the license issues. Considering the state of crlibm it still requires a significant amount of work to make it easy to distribute. My current vision is:

Add an array of float function pointers to the C interface (easy)
Add proper CMake support to crlibm (or some alternative)
Turn it into a new GIT submodule below packages
For the user, there would be use_module(crlibm) to get this stuff. This indirection allows users to choose dynamically for correctness, conciseness and/or performance.

This isn’t really hard, bur requires quite a bit of notably cmake knowledge. Volunteers?

ridgeworks · April 15, 2022, 1:45am

I can find little information about the SUN library apart from the github repo which actually contains very little in terms of documentation, even in terms of its basic characteristics other than “correctly rounded”, whatever they mean by that. crlibm is miles ahead in this regard even if it is “orphaned”.

Two obvious deficiencies with libmcr are lack of any rounding mode control that I can see (maybe it’s implicit but that’s not spelled out) and even a smaller subset of supported functions. It also has only one commit 8 years ago (2014) so I don’t think it addresses the ongoing maintenance question. In fact it’s possibly worse because there don’t seem to be any active users compared to crlibm (e.g.,Python and Julia). And SUN no longer exists, do they?

Based on what I can find, I don’t think it’s worth pursuing.

ridgeworks · April 15, 2022, 1:58am

Definitely worth exploring if I understand it right. I’ll work on it and get back to you.

But it’s all moot if the larger library config issues can’t be sorted out.

jan · April 15, 2022, 7:09am

It seems indeed as orphaned as crlibm From an integration point of view it has two advantages: a liberal license and barely any configuration, i.e., simply compile the source using a C compiler and you are done. crlibm figures out stuff about the environment including the CPU. For Intel/AMD it uses bits of assembly. That complicates providing and maintaining a correct and portable build. SUN is bought by Oracle. We can ask them

Noted.

That is an issue. I suspect it is possible to modernize the code and build environment. That does require someone with some experience with cmake, C and portability issues. Sure I can do that. It is too much for me though. So, you’ll have to do it or find someone to do it/help you.

ridgeworks · April 19, 2022, 9:17pm

It really helps to get advice from somebody who knows what they’re doing. This restructuring results in a huge improvement IMO, ending up in single conditional block of code (~120 lines) re-defining all the applicable standard functions to use correctly rounded versions. The rest of the pl-arith.c code is unchanged except for small conditional blocks in initArith(), to initialize crlibm and the float_correct flag, and get_arith_flag to read the new flag. Obviously more testing is required, but I don’t see any need for more changes/additions other than bug fixes to the “custom” correctly rounded versions (tanh, atan2, and pow).

I don’t like to swim against the current, but that wouldn’t be my vision. I really don’t see any advantage to providing a dynamic choice, assuming no additional requirements above what the existing prototype provides. The only reason to provide a choice at all is the licensing issue. Consider:

Correctness - there is no “more correct”, it either it is or it isn’t. In general, no other library option that I’m aware of provides any guarantees of correctness, let alone guarantees over all the IEEE rounding modes. And if you have a correct option, why would you choose to use a “less correct” option. Further, I think the assessment of “poorly maintained” is a bit harsh, since I wouldn’t anticipate much, if any, need for ongoing maintenance. To what degree are the other versions of libm maintained? WiIl they fix bugs, e.g., rounding transcendentals in the Xcode library, in a timely fashion? Are they evolving to provide better correctness guarantees? In our lifetime? Seems to me they’ve got some catching up to do.
Conciseness - not exactly sure what this means or why the end user would care, but the new implementation model based on your suggestion looks remarkably clean and concise to me. It’s certainly a vanishingly small and isolated piece of the 5000+ lines of code in pl-arith.c. Making it more configurable and putting control in the end user’s hands doesn’t seem to improve this aspect. The current C-preprocessor solution could be replaced by a table filled at initialization time, but it’s not clear to me how that helps. And then there’s the developer documentation required to use this new API, should that ever be needed.
Performance - by the time you wrap these functions with the Prolog VM and C arithmetic function layers, I doubt the raw performance of the library functions has a measurable effect. In any case crlibm was competitive with other libm versions when it was being developed and I doubt the situation has changed much since.
Precision - not on your list, but it may come up for discussion. I think the number applications that require more than IEEE double precision is vanishingly small and those that are looking for it are almost definitely not looking at Prolog for a solution. Implementation of higher precision floats would have ramifications well beyond the small set of elementary arithmetic functions; a solution using external libraries and blobs might well apply for this. But overall, the rationale for higher precision Prolog floats seems dubious, as does a dynamic solution to support them.

So I would argue that what I’ve already done is sufficient, at least for the foreseeable future. Assuming a successful outcome (more below), it should be the default configuration, just like GMP - why would you settle for less? But also like GMP, it can be omitted from the build if desired for licensing reasons. That would be my vision.

This indeed is the sticking point. It appears that I can do all the necessary platform specific build and test using the existing static library. But I don’t have the requisite C/Cmake skills to advance beyond that, although I’m happy to contribute whatever I can if somebody else is leading the charge.

If we can’t solve this impasse at some point (doesn’t have to be a short term solution) SWIP will have to continue to rely on the various existing platform implementations of libm, bugs and all. (I’ll raise an issue against the rounding sensitive bug on MacOS so it’s captured even if there’s not much SWIP can do to fix it.)

ridgeworks · April 22, 2022, 9:01pm

Sorry for the long post but here’s a Plan B which, while not quite as good as Plan A, may be feasible in the short term. To recap:

As most know, unlike arithmetic using rational numbers (which include integers), arithmetic using floating point values does not obey the standard mathematical laws (associativity, distributivity, etc.) due to rounding errors. But arithmetic over reals can be made sound using sets of real values defined by floating point bounds; various schemes include interval arithmetic, ball arithmetic, and enum’s, but they’re all based on the same principle. However this requires control over rounding as defined by the IEEE floating point standard. This control is only mandated to be correct over the normal arithmetic operations and sqrt and this means there are no such guarantees over other real functions like exp, log, trig. functions, etc. and this can be easily demonstrated using a simple predicate like:

round_value(Exp,[Rn,Rd,Ru,Rz]) :-
	Rn is roundtoward(Exp,to_nearest),
	Rd is roundtoward(Exp,to_negative),
	Ru is roundtoward(Exp,to_positive),
	Rz is roundtoward(Exp,to_zero).

A few examples using SWISH:

?- X is pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
Rd = Rn, Rn = Ru, Ru = Rz, Rz = 0.7071067811865475,
X = 0.7853981633974483

?- X is -pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
Rd = Rn, Rn = Ru, Ru = Rz, Rz = -0.7071067811865475,
X = -0.7853981633974483

?- X is pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
Rd = Rn, Rn = Ru, Ru = Rz, Rz = 1.0,
X = 1.5707963267948966

?- X is -pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
Rd = Rn, Rn = Ru, Ru = Rz, Rz = -1.0,
X = -1.5707963267948966

?- X is pi,round_value(X**X,[Rn,Rd,Ru,Rz]).
Rd = Rn, Rn = Ru, Ru = Rz, Rz = 36.4621596072079,
X = 3.141592653589793

So the math library used by SWIP on SWISH just seems to ignore the rounding mode. At least it’s consistent, although it’s not correct because, except for exact values (which these are not), rounding up and rounding down should never produce the same value.

The other platform I have access to is MacOS which generates quite different results:

?- X is pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 0.7853981633974483,
Rn = Rd, Rd = Rz, Rz = 0.7071067811865475,
Ru = 0.7071067811865476.

?- X is -pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -0.7853981633974483,
Rn = Rd, Rd = Ru, Ru = -0.7071067811865475,
Rz = -0.7071067811865476.

?- X is pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 1.5707963267948966,
Rn = 1.0,
Rd = Rz, Rz = 0.9999999999999999,
Ru = 1.000000000251173.

?- X is -pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -1.5707963267948966,
Rn = Ru, Ru = -1.0,
Rd = -1.000000000251173,
Rz = -0.9999999999999999.

?- X is pi, round_value(X**X,[Rn,Rd,Ru,Rz]).
X = 3.141592653589793,
Rn = Rd, Rd = Rz, Rz = 36.4621596072079,
Ru = 36.46215960720791.

The result for pi/4 is actually correct, but -pi/4 is definitely wrong since rounding towards zero actually produces a more negative value than any of the other modes. And, as in the incorrect SWISH results, rounding up and rounding down produces the same value. The tests around ±pi/2 demonstrates a significant bug on the order of the 6 least significant digits and produces an answer outside the ±1.0 range of the sin function so definitely broken there.

Using crlibm on MacOS produces correct results:

?-  X is pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 0.7853981633974483,
Rn = Rd, Rd = Rz, Rz = 0.7071067811865475,
Ru = 0.7071067811865476.

?- X is -pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -0.7853981633974483,
Rn = Ru, Ru = Rz, Rz = -0.7071067811865475,
Rd = -0.7071067811865476.

?- X is pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 1.5707963267948966,
Rn = Ru, Ru = 1.0,
Rd = Rz, Rz = 0.9999999999999999.

?- X is -pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -1.5707963267948966,
Rn = Rd, Rd = -1.0,
Ru = Rz, Rz = -0.9999999999999999.

?-  X is pi, round_value(X**X,[Rn,Rd,Ru,Rz]).
X = 3.141592653589793,
Rn = 36.4621596072079,
Rd = Rz, Rz = 36.462159607207894,
Ru = 36.46215960720791.

So here’s three different implementations of the floating point elementary function, all with different answers. And other than crlibm the answers are wrong depending on the rounding mode. But the thing to note is they all agree for the default “to_nearest” rounding mode which is almost universally used except by those of us interested in correctness.

So if we can make the (perhaps dubious) assumption that the various math libraries get the correct “to_nearest” answer for these functions within 1ulp (unit of least precision), rounding can be implemented using the nexttoward C function. Such rounding is not as precise as possible (normally 2ulp’s between round up and round down values instead of 1) and some care must be taken to ensure rounding doesn’t violate the mathematical definition of the function (e.g., sin values > 1.0) or to round infinite values to finite ones, but that’s pretty manageable. It only requires an addition to pl-arith.c using the redefinition of library functions as for the latest crlibm version (no new libraries to import or build config issues) and produces the following results:

?- X is pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 0.7853981633974483,
Rn = 0.7071067811865475,
Rd = Rz, Rz = 0.7071067811865474,
Ru = 0.7071067811865476.

?- X is -pi/4, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -0.7853981633974483,
Rn = -0.7071067811865475,
Rd = -0.7071067811865476,
Ru = Rz, Rz = -0.7071067811865474.

?- X is pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = 1.5707963267948966,
Rn = Ru, Ru = 1.0,
Rd = Rz, Rz = 0.9999999999999999.

?- X is -pi/2, round_value(sin(X),[Rn,Rd,Ru,Rz]).
X = -1.5707963267948966,
Rn = Rd, Rd = -1.0,
Ru = Rz, Rz = -0.9999999999999999.

?- X is pi,round_value(X**X,[Rn,Rd,Ru,Rz]).
X = 3.141592653589793,
Rn = 36.4621596072079,
Rd = Rz, Rz = 36.462159607207894,
Ru = 36.46215960720791.

The good:

produces correctly rounded results with a fairly straight forward addition to pl-arith.c in one place; no other code changes, and no conditional compilation, required
no changes to build process (i.e., the cmake “quagmire” - my term)
no software licensing issues

The bad:

depends on the possibly dubious assumption that the math libraries generate answers within 1ulp in (default) to_nearest mode. (Some simple testing similar to the above examples would help with this determination.)
produce slightly larger range of rounding values (2ulp’s instead of 1)

The ugly:

nothing really; the vast majority of users don’t go near rounding modes and the executed code in those cases remains the practically the same other than a quick check to see what rounding is in effect when these elementary functions are called.

Summary - not as good a solution as Plan A using crlibm but probably good enough, and much better than the status quo.

ridgeworks · April 27, 2022, 6:45pm

As there’s been no followup to the discussion, I’ve generated the following PR impelmenting Plan B for consideration.

github.com/SWI-Prolog/swipl-devel

Wrap elementary functions to correct issues with rounding mode support in various libm's

SWI-Prolog:master ← ridgeworks:master

opened 06:43PM - 27 Apr 22 UTC

ridgeworks

+179 -1

Various platform libm's produce incorrect results when used with IEEE float roun…ding modes other than 'to_nearest'. This PR wraps these functions with additional code to ensure rounding mode correctness. See comment in 'pl-arith.c' for further details. A few additional test cases are included to detect incorrect rounding, but these are by no means exhaustive. Also see <https://swi-prolog.discourse.group/t/rounding-transcendentals-broken-on-macos-intel/5166> for lengthy discussion preceding this PR. Note that this fix does not address any precision issues in the various libraries. A proven implementation, such as 'crlibm', is required to address this issue.

jan · May 1, 2022, 9:50am

Thanks. Merged.

ridgeworks · May 1, 2022, 7:53pm

Just FYI: I found a recent (Feb., 2022) comparing accuracies of various math libraries: https://hal.inria.fr/hal-03141101 ; in particular the first page and Table 3 are relevant.

This impacts our basic assumption that round to nearest is accurate to 1 ulp. The situation is better than I had thought but all have one or more issues with the suite of SWIP supported elementary functions. And that’s not including full rounding mode support.

Even so, I think this PR, while not a perfect solution, considerably improves the situation.

jan · May 3, 2022, 12:00pm

Pushed an update to the docs pointing at this publication. Thanks for sharing. I guess we should keep the crlibm option in mind and make sure the work done on it doesn’t get lost.

ridgeworks · May 4, 2022, 12:30am

Good idea to reference the paper from the docs. Wider dissemination may increase the chances of getting improvements to the various libraries, although that’s probably a long shot.

For me, the major effort with crlibm will be getting it in a suitable form to use in the builds. The changes to the source code are fairly small and similar to this PR, so that part shouldn’t be a major issue. I’ll also try and track what Rust is doing as they migrate from crlibm to Metalibm.

jan · May 4, 2022, 8:14am

Considering that the paper uses the GMP mpf_* functions to validate the rounding, can’t we do the same? That avoids the whole issue of (new) dependencies. Now they seem to run the mpf_* functions at a higher precision. We would have to figure out what precision is required and what the impact is on performance.

Another angle might be to look at ECLiPSe. As they have support for real intervals they probably thought about these issues.

ridgeworks · May 4, 2022, 2:24pm

I assume you mean the GNU MPFR library referred to in the paper which isn’t the same as GMP multi-precision floats, e.g., the latter has no support for the elementary functions.

Admittedly, I haven’t looked at MPRF in depth, but first impressions:

Library is built from source using make, so it appears to me (not an expert) that it has the same build issues as crlibm.
It uses a different floating point format (not IEEE754 doubles) so the API is similar GMP rationals, i.e. the Prolog float values must be converted to/from the MPFR format. This adds another layer of complexity to the elementary functions (you certainly wouldn’t want to use it for simple arithmetic).
From the crlibm document (they’re the experts):

When stricter guarantees are needed, some multiple-precision packages like MPFR [33] offer correct rounding in all rounding modes, but are several orders of magnitude slower than the usual mathematical libraries for the same precision.

and

MPFR is a multiprecision package safer than libultilm as it uses arbitrary multiprecision. It provides most of elementary functions for the four rounding modes defined by the IEEE-754 standard. However this library is not optimized for double precision arithmetic. In addition, as its exponent range is much wider than that of IEEE-754, the subtleties of subnormal numbers are difficult to handle properly using such a multiprecision package.

So my current assessment would be that crlibm would still be the favoured approach if it could be justified. But I could work with any suitable library that could be incorporated into the SWIP build process.

Topic		Replies	Views
Careful with the floats Help!	11	885	July 29, 2021
Another floating point casualty: round/integer arithmetic functions Predicate bug	7	829	March 24, 2022
Bug in nexttoward/2 and/or float/1 Help!	4	395	September 28, 2022
Correct floating point arithmetic Discussion	57	1839	October 26, 2022
Rounding error in (/)/2 Help!	18	678	October 14, 2022

Rounding transcendentals broken on MacOS (Intel)

Related topics