Performance SWI-Prolog 8.3.5 vs YAP 6.5.0

I have redone the performance comparison on the classical van Roy benchmark suite, extended with some more recent benchmarks (this suite is used for profile-guided optimization). I first made the benchmarks run again on YAP 6.5.0 and disabled the additions for YAP, except sieve, which needed no porting.

Details:

  • YAP @3298a680300b350a8c56634d020efae252d09422 (Jan 5, 2020)
    • Default ./configure && make && make install
  • SWI-Prolog V8.3.5-18-g7f94d2ebf
    • cmake -G Ninja .. && ninja
  • SWI-Prolog V8.3.5-18-g7f94d2ebf (PGO)
    • cmake -G Ninja .. && ../scripts/pgo-compile.sh && ninja
  • System
    • Ubuntu 20.04, gcc 9.3.0, AMD Ryzen 9 3950X, 64GB memory

| Benchmark (time in ms) | yap | swipl | swipl -O | swipl -O (PGO) | swipl / yap | swipl -O / yap | swipl -O (PGO) / yap |
|---|---|---|---|---|---|---|---|
| boyer | 199 | 273 | 275 | 189 | 137.19% | 138.19% | 94.97% |
| browse | 150 | 258 | 259 | 176 | 172.00% | 172.67% | 117.33% |
| chat_parser | 219 | 390 | 393 | 298 | 178.08% | 179.45% | 136.07% |
| crypt | 255 | 265 | 131 | 109 | 103.92% | 51.37% | 42.75% |
| fast_mu | 266 | 317 | 239 | 175 | 119.17% | 89.85% | 65.79% |
| flatten | 295 | 316 | 279 | 185 | 107.12% | 94.58% | 62.71% |
| meta_qsort | 152 | 243 | 234 | 165 | 159.87% | 153.95% | 108.55% |
| mu | 160 | 320 | 327 | 211 | 200.00% | 204.38% | 131.88% |
| nreverse | 92 | 269 | 273 | 122 | 292.39% | 296.74% | 132.61% |
| poly_10 | 231 | 271 | 209 | 159 | 117.32% | 90.48% | 68.83% |
| prover | 186 | 334 | 337 | 225 | 179.57% | 181.18% | 120.97% |
| qsort | 179 | 378 | 306 | 195 | 211.17% | 170.95% | 108.94% |
| queens_8 | 373 | 303 | 163 | 117 | 81.23% | 43.70% | 31.37% |
| query | 247 | 298 | 146 | 132 | 120.65% | 59.11% | 53.44% |
| reducer | 219 | 337 | 331 | 229 | 153.88% | 151.14% | 104.57% |
| sendmore | 204 | 364 | 140 | 120 | 178.43% | 68.63% | 58.82% |
| simple_analyzer | 238 | 343 | 313 | 222 | 144.12% | 131.51% | 93.28% |
| tak | 285 | 299 | 207 | 136 | 104.91% | 72.63% | 47.72% |
| zebra | 170 | 332 | 340 | 190 | 195.29% | 200.00% | 111.76% |
| sieve | 1524 | 277 | 239 | 195 | 18.18% | 15.68% | 12.80% |

Percentages are relative to the YAP time.

table generated using https://www.tablesgenerator.com

As a chart (shorter bar is better):

5 Likes

Interesting that nreverse is 3 times slower in swi.
Do you have an idea why this is?

EDIT: Since it is only 30 integers, probably most of the cost is in some kind of initial setup for the call.

For a long time the performance on naive reverse was the holy grail in Prolog country :slight_smile:. I never bothered. Nrev basically evaluates the performance of append/3. Quite likely some systems have dedicated recognition and VM support to speed up what append needs. SWI-Prolog has some of that too: dedicated indexing for predicates with two clauses, one having [] as first argument and the other [_|_].
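
For reference, here is a minimal sketch of the usual textbook definitions (the benchmark's actual code may differ in detail). Nearly all of the work in nrev is done by append/3, which has exactly the two-clause []/[_|_] shape mentioned above:

% Naive reverse: reversing a list of length N performs O(N^2) append/3 steps.
nrev([], []).
nrev([H|T], Reversed) :-
    nrev(T, ReversedTail),
    append(ReversedTail, [H], Reversed).

% The benchmark defines its own append/3 rather than using library(lists);
% one clause has [] and the other [_|_] as first argument.
append([], List, List).
append([H|T], List, [H|Rest]) :-
    append(T, List, Rest).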

1 Like

Updated the above. Where I claimed the comparison was against the PGO-compiled version, this was actually not the case. I have added a column using the actually PGO-compiled version of SWI-Prolog. This gives a considerably better picture :slight_smile:

1 Like

Does it make sense to have such a comparison with one of the commercial Prologs, such as SICStus?

thanks,

Dan

Sure, but I do not have access to SICStus. If anyone has, please extend the portability of the benchmarks and run them. I’ve created the charts the lazy way with LibreOffice Calc :frowning:

Some more insight comes from the Pereira benchmarks, which test specific aspects of Prolog. I’ve created a plot comparing the same YAP and SWI-Prolog (with PGO).

Having a closer look at this I (tentatively) come to these observations:

  • The arg(N) test stands out. It doesn’t really measure arg/3 but a 100-fold chain of calls like this:
    arg5(N, T, R) :- arg(N, T, X), arg6(N, X, R).
    Only a small part of this is spent on arg/3 itself (see the sketch after this list).
  • Also considering the args(N) tests, the main bottleneck wrt YAP seems to be pushing the arguments for the next call, in particular for calls subject to LCO (Last Call Optimization).
  • The shallow_backtracking test assumes no multi-argument indexing. As both YAP and SWI-Prolog have this we get 0/0 :slight_smile:
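
To make the first point concrete, here is a hypothetical, shortened sketch of the kind of call chain meant above (the predicate names and the example term are mine, not the actual Pereira code). Each link performs one arg/3 call and then has to push all arguments for the next call, which is the part that dominates; that next call is also the last call of the clause and thus subject to LCO:

% Shortened chain; the real test chains about 100 such predicates.
arg1(N, T, R) :- arg(N, T, X), arg2(N, X, R).
arg2(N, T, R) :- arg(N, T, X), arg3(N, X, R).
arg3(N, T, R) :- arg(N, T, X), arg4(N, X, R).
arg4(N, T, R) :- arg(N, T, R).              % end of the shortened chain

% Example query on a nested term:
% ?- arg1(1, f(f(f(f(done)))), R).
% R = done.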

I was more or less aware of this. I’ve made similar comparisons long ago (around version 5.4 I think). Lots of things have improved since. I have the impression that the attention Vitor and colleagues have put in writing the C code such that the C compiler gets everything right is no longer required due to improvements of the C compiler technology, in particular when helping the compiler using profile guiding.

What remains mostly the consequence of a register based design (the WAM) vs. the stack based design of the ZIP VM. AFAIK this is not fundamental. B-Prolog (now Picat) does get last calls efficient AFAIK and also uses a stack based design.


SWI-Prolog vs YAP on the Pereira benchmarks. Below 100% means SWI-Prolog is faster, above 100% YAP is faster

Thanks for the benchmarks, what does PGO stand for?

Profile Guided Optimization.

There is a section on it in here: https://github.com/SWI-Prolog/swipl-devel/blob/7c37bf00d2195ca25a32bca855c0daaad88780b5/CMAKE.md

2 Likes

Update for the classical benchmark set (see first post). Versions:

  • SWI-Prolog 8.3.6-9-g47015b5c4 (upgraded from 8.3.5)
  • YAP 6.5.0
  • SICStus 4.6.0 (new)

| Benchmark (time in seconds) | SWI | YAP | SICStus |
|---|---|---|---|
| boyer | 0.116 | 0.200 | 0.044 |
| browse | 0.163 | 0.140 | 0.040 |
| chat_parser | 0.297 | 0.208 | 0.065 |
| crypt | 0.107 | 0.242 | 0.023 |
| fast_mu | 0.159 | 0.252 | 0.037 |
| flatten | 0.170 | 0.270 | 0.038 |
| meta_qsort | 0.157 | 0.148 | 0.032 |
| mu | 0.191 | 0.153 | 0.041 |
| nreverse | 0.101 | 0.094 | 0.020 |
| poly_10 | 0.150 | 0.222 | 0.031 |
| prover | 0.228 | 0.192 | 0.044 |
| qsort | 0.166 | 0.170 | 0.029 |
| queens_8 | 0.114 | 0.348 | 0.019 |
| query | 0.129 | 0.236 | 0.034 |
| reducer | 0.220 | 0.214 | 0.042 |
| sendmore | 0.127 | 0.192 | 0.032 |
| simple_analyzer | 0.201 | 0.229 | 0.043 |
| tak | 0.142 | 0.275 | 0.023 |
| zebra | 0.197 | 0.162 | 0.070 |
| sieve | 0.208 | 1.546 | 0.697 |
| queens_clpfd | 0.238 | - | - |
| pingpong | 0.216 | 0.073 | - |
| fib | 0.220 | 0.055 | - |

Or, as chart (shorter bar is better)

[chart: swi-yap-sicstus]

The performance of SICStus 4.6.0 on these benchmarks is truly impressive. Note that SICStus 4 compiles to native code for the AMD64 instruction set.

Now the question is: what is the value of all this? I’m doing some work on ALE, notably the not-yet-public version 4. ALE compiles a grammar into a Prolog program; for ALE4, into a constraint program. Now we get:

| SWI | SICStus |
|---|---|
| 105 sec | 143 sec |

Running the grammar is still a bit dubious to compare, as there is a porting issue due to which not all sentences are parsed correctly by the SWI-Prolog port. The current result on a few sentences is below.

| SWI | SICStus |
|---|---|
| 3.3 sec | 1.9 sec |

Note that the code was written for SICStus 3 and ported to SICStus 4 by the developers and to SWI-Prolog by me. It is not unlikely that the choices are not optimal for each Prolog. For example, the code quite heavily uses clause/2 on facts holding huge structures in the head arguments, rather than calling the dynamic predicate directly; this is claimed to be faster in SICStus 3, but is surely slower in SWI-Prolog. The benchmark set indicates SWI-Prolog is only faster than SICStus on the dynamic database. Profiling the grammar compilation says only 2.1% of the time is spent on assert/1.
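
To make the contrast concrete, here is a hypothetical sketch of the two access styles (the predicate names are made up; ALE's real code differs):

% Hypothetical dynamic predicate holding large terms in its arguments.
:- dynamic grammar_entry/2.

% Style used in the ported code: enumerate matching facts via clause/2,
% which is claimed to be faster in SICStus 3.
lookup_via_clause(Key, Term) :-
    clause(grammar_entry(Key, Term), true).

% Plain call to the dynamic predicate, which is the faster route in SWI-Prolog.
lookup_via_call(Key, Term) :-
    grammar_entry(Key, Term).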

Another interesting observation is that the recent VM changes to SWI-Prolog for faster last calls, and in the current GIT version for inline arg/3 calls (these are heavily used by ALE), have a significant impact on some dedicated benchmarks such as the Pereira set, but hardly affect larger programs.

The overall picture is rather unclear to me :frowning:

1 Like

This is looking better and better; thanks!

Is this with PGO enabled for the benchmarks?

Yes. SWI-Prolog was compiled on Ubuntu 20.04, gcc 9.3 on AMD 3950X using

../scripts/pgo-compile.sh
ninja
1 Like

It won’t make much difference for this. SWI-Prolog does have specific code to deal with two clauses, one of which has [] and the other [_|_] as first argument.

This code would be analyzed as something like the following (the compiler probably does something slightly different, but effectively equivalent for this situation):

sorted([X|YZ]) :-
    YZ=[Y|Z],
    after(X,Y,T), sat(T), sorted([Y|Z]).
sorted([_|_]).

Both clauses’ heads match [|], so there’s a choice point.

As discussed before, SWI-Prolog’s hash-based indexes are for many clauses. Except for a few special cases and the classical first-argument indexing, nothing happens for predicates with just a few clauses.

Nope, I saw this in Kuniaki Mukai’s code :spider::

inner_product([X], [Y], X*Y):-!.
inner_product([X|Xs], [Y|Ys], #((X*Y),Z)):- inner_product(Xs, Ys, Z).

Since [] only appears at the end of the list, and deep indexing is not even possible because the second clause has a variable as tail, it will probe the first clause for every element it traverses. Changing the code could also lead to a slight performance improvement (a possible rewrite is sketched below).
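
For example, a possible rewrite (my sketch, not Mukai's code) peels off the first element so that the worker predicate gets plain []/[_|_] first-argument indexing, needs no cut and leaves no choice point, while building the same right-nested #/2 term:

inner_product([X|Xs], [Y|Ys], P) :-
    inner_product_(Xs, Ys, X*Y, P).

% First argument is now [] or [_|_], so clause selection is deterministic.
inner_product_([], [], P, P).
inner_product_([X|Xs], [Y|Ys], P0, #(P0, P)) :-
    inner_product_(Xs, Ys, X*Y, P).

Whether this actually pays off would of course have to be measured.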

I don’t understand … [_|_] is the same as [_]:

?- [A|B] = [C].
A = C,
B = [].

Anyway, to get full indexing, you’d need to rewrite like the following, I think (code might contain typos):

sorted([X|YZ]) :- sorted2(YZ, X).

sorted2([Y|Z], X) :-
   after(X, Y, T),
   sat(T),
   sorted([Y|Z]).
sorted2([], _).

It would be nice if SWI-Prolog did this transformation automatically (and a lot of other transformations). I think that Peter Van Roy’s Aquarius compiler did such things, but it’s a non-trivial amount of work to write such a compiler.

1 Like

Oops. I meant: [_] is the same as [_|Zs], Zs=[] (or [_] is the same as [_|[]]).