I took some rainy Sunday afternoon time to have a closer look at LibBF as alternative for LibGMP. Summarizing the potential benefits:
- Cleaner embedding interface can simplify and speedup the binding to Prolog
- Uniform data representation for big integers, floats and decimal floats means we need to do the tricky work for converting the data structures to/from Prolog only once.
- Provides arbitrary precision as well as decimal floats.
- Claims correct rounding with many stateless rounding modes.
- Small
- Simple plain C code
- MIT license. Together with small, this means there is no reason anymore to make bignum support a build-time option.
- Written by Fabrice Bellard. Find his name, he has quite a track record
There are also some problems
- Not as widespread as libGMP. That may imply maintenance problems. On the other hand, the code is plain C and C as well as math principles are a rock solid basis.
- Lacks quite a few functions, notably rationals.
- Slower than LibGMP.
I wanted to understand the options, so I’ve written a partial LibGMP emulation using LibBF. You can find that in a branch libbf
of swipl-devel.git.
So far, the status is:
- Identified the 92 functions we use from LibGMP and put them in
bf_gmp.h
- Implemented 52 of them. Except for 3, all are short inline functions
- Implemented exchange to the Prolog stacks and recorded database (used for toplevel variable reuse).
The basic big integer (+,-,*,/,^) stuff runs. Whenever you touch a function that is not implemented it will tell you (and often crash). Findings:
- Full emulation seems not very hard. Quite a few of the 40 remaining functions are simple one-liners, some require some work. I wouldn’t expect completing the emulation to take more than a week.
- Performance of ^ is poor. Seems it is using the general approach for floats here (LibBF is primarily a floating point library). Seems there are integer related functions in the non-public part of the library. Otherwise it is not too hard to write.
- LibBF only stores the bits between lsb and msb. That means it can deal more efficiently with bitmaps than LibGMP. It lacks most functions to do so, but these are easy to implement.
- Performance of basic bigint +,-,*,/ is about twice as slow as LibGMP. As the core distribution indicates, one can gain about 40% on Intel/AMD using some platform dependent optimization flags. If we could leverage the better API we are likely to win on bigints of just a few limbs.
A possible line to continue is
- Complete the emulation. This would setup the system to (probably by default) use LibBF and as a build option use LibGMP. The result would be somewhat slower, but actually realizing permissive licensing for the complete system. Right now, anyone who cannot handle the LGPL license need to drop big integers.
- Abstract several of the operations we need and implement them independently on LibGMP and LibBF to based them on the optimal combination of primitives.
- Abstract the interface such that we can actually profit from LibBF’s better embedding interface. As we are now emulating LibGMP’s
void
functions, use a complicated setjpm/longjmp and allocation wrapper we cannot profit.
After that we could consider if/how to support arbitrary precision and/or decimal floats.
Comments? Volunteers? Commercial support to make this happen?