I have redone performance comparison in the classical van Roybenchmark suite extended with some more recent benchmarks (this suite is used for profile guided optimization). First made the benchmark run again on YAP 6.5.0 and disabled the additions for yap except sieve which needed no porting.
For a long time the performance on naive reverse was the holy grail in Prolog country . I never bothered. Nrev basically evaluates the performance of append/3. Quite likely some systems have dedicated recognition and VM support to speedup what append needs. SWI-Prolog has some of that too: dedicated indexing for predicates with two clauses, one having [] as first argument and the other [_|_].
Updated the above. Where I claimed the comparison against the PGO compiled one, this was actually not the case. Added a column using the really PGO compiled version of SWI-Prolog. This gives a seriously better picture
Sure, but I do not have access to SICStus. If anyone has, extend the portability of the benchmark and run it. I’ve created the charts the lazy way with Libre Office calc
Some more insight comes from the Pereira benchmarks, which test specific aspects of Prolog. I’ve created a plot comparing the same YAP and SWI-Prolog (with PGO).
Having a closer look at this I (tentatively) come to these observations:
The arg(N) test stands out. It doesn’t really measure arg/3 but a 100 fold chain of calls like this arg5(N, T, R) :- arg(N, T, X), arg6(N, X, R).
only a small part of this is spent on arg/3 itself.
Also considering the args(N) tests, the main bottleneck wrt YAP seems to be pushing the arguments for the next call, in particular for calls subject to LCO (Last Call Optimization).
The shallow_backtracking test assumes no multi-argument indexing. As both YAP and SWI-Prolog have this we get 0/0
I was more or less aware of this. I’ve made similar comparisons long ago (around version 5.4 I think). Lots of things have improved since. I have the impression that the attention Vitor and colleagues have put in writing the C code such that the C compiler gets everything right is no longer required due to improvements of the C compiler technology, in particular when helping the compiler using profile guiding.
What remains mostly the consequence of a register based design (the WAM) vs. the stack based design of the ZIP VM. AFAIK this is not fundamental. B-Prolog (now Picat) does get last calls efficient AFAIK and also uses a stack based design.
Update for the classical benchmark set (see first post). Versions:
SWI-Prolog 8.3.6-9-g47015b5c4 (upgraded from 8.3.5)
YAP 6.5.0
SICStus 4.6.0 (new)
SWI
YAP
SICSTUS
boyer
0.116
0.200
0.044
browse
0.163
0.140
0.040
chat_parser
0.297
0.208
0.065
crypt
0.107
0.242
0.023
fast_mu
0.159
0.252
0.037
flatten
0.170
0.270
0.038
meta_qsort
0.157
0.148
0.032
mu
0.191
0.153
0.041
nreverse
0.101
0.094
0.020
poly_10
0.150
0.222
0.031
prover
0.228
0.192
0.044
qsort
0.166
0.170
0.029
queens_8
0.114
0.348
0.019
query
0.129
0.236
0.034
reducer
0.220
0.214
0.042
sendmore
0.127
0.192
0.032
simple_analyzer
0.201
0.229
0.043
tak
0.142
0.275
0.023
zebra
0.197
0.162
0.070
sieve
0.208
1.546
0.697
queens_clpfd
0.238
-
-
pingpong
0.216
0.073
-
fib
0.220
0.055
-
Or, as chart (shorter bar is better)
The performance of SICStus 4.6.0 on these benchmarks is truly impressive. Note that SICStus 4 compiles to native code for the AMD64 instruction set.
Now the question is what is the value of all this? I’m doing some work on ALE, notably the not-yet-public version 4. ALE compiles a grammar into Prolog program, for ALE4 to a constraint program. Now we get
SWI
SICStus
105 sec
143 sec
Running the grammar is still a bit dubious to compare as there is still a porting issue due to which not all sentences are correctly parsed by the SWI-Prolog port. The current result on a few sentences is below.
SWI
SICStus
3.3 sec
1.9 sec
Note that the code is written for SICStus 3 and ported to SICStus 4 by the developers and to SWI-Prolog by me. It is not unlikely the choices are not optimal for each Prolog. For example the code quite heavily uses clause/2 for facts holding huge structures in the head arguments rather than calling the dynamic predicate which is claimed to be faster in SICStus 3, but surely slower in SWI-Prolog. The benchmark set indicates SWI-Prolog is only faster than SICStus on the dynamic database. Profiling the grammar compilation says only 2.1% of the time is spent on assert/1.
Another interesting observation is that while the recent VM changes to SWI-Prolog for faster last calls and in the current GIT version for inline arg/3 calls (these are heavily used by ALE) have significant impact on some dedicated benchmarks such as the Peirera set, but hardly affect larger programs.
It won’t make much difference for this. SWI-Prolog does have specific code to deal with two clauses, one of which has [] and the other [_|_] as first argument.
This code would be analyzed as something like the following (the compiler probably does something slightly different, but effectively equivalent for this situation):
As discussed before, SWI-Prolog’s hash based indexes are for many clauses. Except for a few special cases and the classical first argument indexing, nothing happens for predicates with just a few clauses.
It would be nice if SWI-Prolog did this transformation automatically (and a lot of other transformations). I think that Peter Van Roy’s Aquarius compiler did such things, but it’s a non-trivial amount of work to write such a compiler.