See Porting the SWI-Prolog benchmark suite: comparing 8 Prolog systems - #31 by jan for updated figures for queens_clpfd, improved timing, updated Ciao Prolog and Scryer Prolog.
I did some work on GitHub - SWI-Prolog/bench: Prolog benchmarks (`van Roy' set). This suite is used as the training workload for “Profile Guided Optimization” as well as to assess the impact of changes. It used to run on SWI-Prolog, SICStus and YAP. As the driver depended heavily on conditional compilation and a Quintus Prolog derived module system, portability was limited.
I added a portability layer that relies on rewriting the original benchmark programs. This allows us to generate something that can work easily on different Prolog systems (using SWI-Prolog for the rewrite process as well as scripting the workflow). After some refinement, adding a system is now doable in less than an hour. Currently it supports SWI-Prolog in several configurations and 7 other systems:
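To give an idea of what such a rewrite step can look like, here is a minimal sketch in SWI-Prolog. It is not the code used in the bench repository: the predicates rewrite/2, rewrite_file/2 and copy_terms/2 and the two rewrite rules are assumptions made up for illustration. The idea is to read each term from the original program, map it to zero or more terms for the target system, and write the result as plain clauses.

```prolog
% Hypothetical rewrite rules: map one source term to a list of
% terms for the target system. These two rules are illustrative only.
rewrite((:- module(_, _)), []) :- !.              % drop the module header
rewrite((:- use_module(library(_))), []) :- !.    % drop library imports
rewrite(Term, [Term]).                            % keep everything else

% Copy all terms from file In to file Out, applying rewrite/2.
rewrite_file(In, Out) :-
    setup_call_cleanup(
        open(In, read, InS),
        setup_call_cleanup(
            open(Out, write, OutS),
            copy_terms(InS, OutS),
            close(OutS)),
        close(InS)).

copy_terms(InS, OutS) :-
    read_term(InS, Term, []),
    (   Term == end_of_file
    ->  true
    ;   rewrite(Term, Terms),
        forall(member(T, Terms), portray_clause(OutS, T)),
        copy_terms(InS, OutS)
    ).
```

Because the output contains only plain clauses and directives, the target system does not need to understand conditional compilation or the module system of the source.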
Identifier Description
――――――――――――――――――――――――――――――――――――――――――――――――――
swipl SWI-Prolog (-O,PGO) 9.1.19
swipl-no-pgo SWI-Prolog (-O,no PGO) 9.1.19
swipl-no-O SWI-Prolog (PGO) 9.1.19
swipl-st SWI-Prolog (no threads,-O) 9.1.19
swipl-wasm SWI-Prolog (WASM,-O) 9.1.19
swipl-win64 SWI-Prolog (Win64,-O) 9.1.19
gprolog GNU-Prolog 1.6.0
yap YAP 6.3.4
sicstus SICStus Prolog 4.8.0
scryer Scryer Prolog 0.8.127
xsb XSB 5.0.0
trealla Trealla Prolog 1.6.18
ciao Ciao Lang 1.21.0
For SWI-Prolog, “-O” means running swipl -O (optimized arithmetic, which matters because many of the benchmarks do a lot of arithmetic). “PGO” means compiled using Profile Guided Optimization, “no threads” means the single-threaded version, “WASM” is the WebAssembly version running under node.js and “Win64” is the Windows binary running under Wine. SICStus Prolog is the official binary version. The other systems were built from source with default options.
Hardware and OS: All benchmarks are executed on an AMD 3950X with 128GB memory, running Ubuntu 22.04.
Benchmarks that use more than minimal ISO:
- perfect – requires unbounded integers.
- queens_clpfd – requires clp(fd)
- pingpong – requires tabling
- fib – requires tabling and unbounded integers (XSB result is wrong)
- moded_path – requires tabling with answer subsumption
- det – benchmarks single sided unification (=> rules)
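For readers not familiar with these extensions, here is a minimal sketch in SWI-Prolog syntax of two of them. It is not taken from the benchmark suite; the predicates fib/2 and max_ssu/3 are made up for illustration. The first shows why a tabled Fibonacci needs unbounded integers, the second shows what a single sided unification rule looks like.

```prolog
% Tabling: answers for fib/2 are memoized, so the doubly
% recursive definition runs in linear time.
:- table fib/2.

fib(0, 0).
fib(1, 1).
fib(N, F) :-
    N > 1,
    N1 is N-1,
    N2 is N-2,
    fib(N1, F1),
    fib(N2, F2),
    F is F1+F2.

% Single sided unification: the head is matched without binding
% variables in the goal, and the first rule whose head and guard
% succeed is committed to.
max_ssu(X, Y, Z), X >= Y => Z = X.
max_ssu(_, Y, Z) => Z = Y.
```

With this, fib(100, F) yields F = 354224848179261915075, which does not fit in a 64-bit integer, and max_ssu(1, 2, Z) yields Z = 2.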
Below I’ll share some charts. A missing bar means the system could not run the benchmark. In some cases this can probably be fixed, in others it is due to a lack of features, such as unbounded integers, clp(fd), tabling, etc. The command to generate each chart is included; it requires SWI-Prolog (GIT version) and GNU-plot (for the charts). The number of iterations of all benchmarks was calibrated in 2021 such that each benchmark took about 1 second on SWI-Prolog 8.5.2.
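As an illustration of what calibrating the iteration count means, here is a minimal sketch in SWI-Prolog. It is an assumption about how one could do this, not the procedure actually used for the suite; the predicates run/2, cpu_time/3 and calibrate/3 are hypothetical. It doubles the iteration count until running the goal that many times takes at least the target CPU time.

```prolog
% Run Goal N times, ignoring failure and undoing bindings.
run(Goal, N) :-
    forall(between(1, N, _), ignore(Goal)).

% CPU time in seconds needed to run Goal N times.
cpu_time(Goal, N, Time) :-
    statistics(cputime, T0),
    run(Goal, N),
    statistics(cputime, T1),
    Time is T1 - T0.

% Find an iteration count N (a power of two) such that running
% Goal N times takes at least Target seconds of CPU time.
calibrate(Goal, Target, N) :-
    calibrate(Goal, Target, 1, N).

calibrate(Goal, Target, N0, N) :-
    cpu_time(Goal, N0, Time),
    (   Time >= Target
    ->  N = N0
    ;   N1 is N0*2,
        calibrate(Goal, Target, N1, N)
    ).
```

For example, calibrate(my_benchmark, 1.0, N) would bind N to a count that takes at least one second, where my_benchmark stands for any benchmark goal. A fixed count chosen this way on one system and version is then reused everywhere, so the bars in the charts compare the same amount of work.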
Some classic systems
swipl compare.pl -o classics.svg swipl gprolog yap sicstus xsb ciao
The “modern” systems vs. SWI-Prolog
swipl compare.pl -o modern.svg --ymax 50 swipl trealla scryer
- The sieve benchmark tests the dynamic database. It is, AFAIK, fully ISO, but not accepted by Scryer. Trealla accepts it, but didn’t finish in an hour. Trealla seems to get slower on this benchmark with each iteration.
SWI-Prolog using different compilers and setup
This comparison uses gcc 11.4 and clang 15. For gcc we compile with and without PGO (Profile Guided Optimization) and we also build a version without multi-thread support. With Clang, using PGO makes the system slower, so we stopped using that.
swipl compare.pl -o c-compiler.svg swipl swipl-no-pgo swipl-st swipl-clang-15
SWI-Prolog native vs WASM
This chart compares the native AMD64 binaries with the WASM version compiled using Emscripten 3.1.44 (based on Clang) running on node 18.17.1.
All data
Below is the CSV (well, the separator is | as GNU-Plot does not like , in quoted fields) with all raw data. Generated using
swipl compare.pl -o all.svg swipl swipl-no-pgo swipl-clang-15 swipl-no-O swipl-st swipl-wasm swipl-win64 gprolog yap sicstus scryer xsb trealla ciao
all.csv.log (5.0 KB)
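For those who want to process the raw data themselves, here is a minimal sketch of reading |-separated data using SWI-Prolog's library(csv). The predicates parse_bar_csv/2 and read_bar_csv/2 are hypothetical names, and I make no assumptions here about the column layout of the attached file.

```prolog
:- use_module(library(csv)).

% Parse text that uses '|' as field separator into row/N terms.
parse_bar_csv(Text, Rows) :-
    string_codes(Text, Codes),
    phrase(csv(Rows, [separator(0'|)]), Codes).

% Same, reading directly from a file.
read_bar_csv(File, Rows) :-
    csv_read_file(File, Rows, [separator(0'|)]).
```

Each record becomes a row/N term with one argument per field, so the data can be queried with ordinary unification, e.g. using member/2 on Rows.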
Discussion
There is a lot to say about these figures. At the same time, benchmarks are always highly debatable. It is quite possible that the really poor results on some individual benchmarks in particular are caused by a trivial oversight in the implementation. Some benchmarks failing on some implementations are probably easily fixed and some are probably caused by me not being patient enough to find a workaround.
Porting turned out to be hard. I was surprised how hard it is to load one file from another Prolog file. Some systems take the name of the file to load relative to the file in which the directive appears. Some use a search path and do not like the extension (XSB). For Trealla I failed to figure out the rules and the only solution that worked was including all code in one file except for the benchmarks themselves and running the benchmarks from the directory holding the source. Porting to the “classic” systems was a lot easier because error messages are much clearer, documentation is better and I experienced no crashes on any of these. My respect for Paulo Moura, making Logtalk run on these systems, has grown a lot!
A few highlights
- SICStus Prolog is unbeatable on this benchmark set. Congrats!
- While “Scryer Prolog” is advertised as “A modern Prolog implementation written mostly in Rust”, its GIT history goes back 7 years. It has a long way to go wrt. performance, some context to error messages and even simply loading a file (include/1 does not work, only using use_module/1 I managed to load non-module files). “Trealla” is faster, but has some weird outliers (poly_10 and sieve) and I had to work around many crashes.
- WASM is always claimed to provide “near native performance”. On average the WASM version of SWI-Prolog is over 5 times slower. Ok, about 30% of this goes to Clang vs. GCC.
Future work
If you think some system is handled unfairly, please let me know.
- Include more systems. Please provide a PR for your system.
- Include benchmarks that cover other areas. Think about GC, built-ins (sort/2, findall/3, etc.), program load time, indexing large databases, coroutining, constraints, tabling.
- Include memory usage, possibly monitoring using time(1) on Linux?
- Improve the GNU plot output. E.g. show the value on hover, somehow break exceptionally long bars into two. Both seem to be possible. Any GNU plot wizards around here?
Acknowledgements
@j4n_bur53 brought benchmarking to my attention.
I would like to thank the SICStus team for providing me with a permanent license for SICStus Prolog.