Benchmarking [CLP(B) libraries and other stuff]

Well, almost. In your old ZDD post, you did XOR representation counting
by converting into OR representation; you didn’t have a primitive
zdd_xor, so you had to do it like this:

gf2_sat_count(P, C, As, U, V, S):- cofact(P, t(_-J, L, R), S),
	C0 is C * (2^J-1),
	gf2_sat_count(R, C0, [L-C|As], U, V, S).

I don’t know whether this is correct. Is the GF(2) system already converted
to a ZDD? Otherwise I don’t see this as correct for an XOR representation.
You can find counting based on the XOR representation in my tree2.p.
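As a point of reference (a sketch of a standard technique, not the tree2.p code; the bit-vector encoding below is my own assumption): when the GF(2) system is purely linear, i.e. only XOR clauses, the model count needs no ZDD at all. Gaussian elimination over GF(2) gives 2^(n − rank) solutions for a consistent system, and 0 otherwise.

```python
def gf2_solution_count(rows, n):
    """Count solutions of a linear system over GF(2).
    Each equation is an (n+1)-bit integer: bit i (i < n) is the
    coefficient of variable i, bit n is the right-hand side."""
    rows = list(rows)
    rank = 0
    for col in range(n):
        # find a pivot row with a 1 in this column
        piv = next((i for i in range(rank, len(rows)) if rows[i] >> col & 1), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        # eliminate this column from every other row
        for i in range(len(rows)):
            if i != rank and rows[i] >> col & 1:
                rows[i] ^= rows[rank]
        rank += 1
    if any(r == 1 << n for r in rows):   # a reduced row "0 = 1": inconsistent
        return 0
    return 2 ** (n - rank)

# x1 XOR x2 = 1 and x2 XOR x3 = 1 over three variables: 2 solutions
print(gf2_solution_count([0b1011, 0b1110], 3))  # -> 2
```

This only covers the purely linear case; products of variables (as in the GF(2) polynomials discussed below) need the ZDD-based counting.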

I am also using this as a benchmark for Dogelog Player now. Dogelog Player
currently has a garbage collection problem in these test cases, so they give
me a nice playground to experiment with improvements:


GNU Prolog is blazingly fast, but you need to increase the GLOBALSZ memory setting.

It’s good to publish benchmarks, but I cannot find a link to the source. That makes them useless, even more so if you do not include data for the Prolog system this forum is about.

There are quite different claims in various places, and your results also suggest GNU Prolog is top (speed). It is still widely believed that SWI-Prolog is (very) slow, and I think that assumption should be corrected where fair.

Adding a time (wall) that is a bit easier to read and a total, I get (AMD3950X, Ubuntu 22.04)

swipl -O -l samples/common/17_bench/crypto/tests/suite.p time.pl

5 ?- suite, total.
sator18 0ms
sat2or18 162ms
satxor18 422ms
sat2xor18 0ms
countor18 0ms
count2or18 115ms
countxor18 106ms
count2xor18 129ms
Total: 934ms
true.

What makes you think I share this belief? I wrote:

But to have a test that says something, since you test on
a different machine, you would need to create a test report
like I did, with SWI-Prolog and GNU Prolog side by side.

Maybe you can use my library(runner) (*) and library(report)
for that. Or roll your own report generator, or juggle manually
from the command line; I also still use the command line from time
to time to get quick comparisons. Maybe you can also change
people’s belief systems if you show side-by-side results?
Doing such reports is a lot of initial investment, but
once you have them, it’s payback time.

Edit 19.11.2023
(*) A funny approach I am using now: I started with a kind of PlUnit,
which counts successes and failures, and additionally turned it into a
benchmark utility that does not count successes and failures but sums
measured time. A lot of code reuse is possible! Plus you need to make the
PlUnit more flexible so that it can generate and integrate results from
different runs, e.g. with different Prolog systems or on different platforms.
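The reuse described above can be sketched language-neutrally (hypothetical names, not library(runner)'s actual API): the same driver that tallies pass/fail also sums the measured wall time per case, so one harness serves both roles.

```python
import time

def run_suite(cases):
    """Run (name, goal) pairs; tally pass/fail AND sum wall time,
    so a unit-test style driver doubles as a benchmark harness."""
    passed, failed, total_ms = 0, 0, 0.0
    for name, goal in cases:
        t0 = time.perf_counter()
        ok = bool(goal())                      # a test case is just a goal
        total_ms += (time.perf_counter() - t0) * 1000
        passed += ok
        failed += not ok
    return passed, failed, total_ms

# the same driver gives both a test verdict and a timing total
p, f, ms = run_suite([("ok_case", lambda: True), ("bad_case", lambda: False)])
```

Integrating results from different Prolog systems would then just mean merging several such (passed, failed, total) records into one report.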

I tried to generalize GitHub - SWI-Prolog/bench: Prolog benchmarks (the “van Roy” set). I managed to revive the YAP and SICStus ports. We know these systems are fast. This results in:

[bench chart]

Note that SWI-Prolog did make some progress. These timings were calibrated to 1 second for version 8.5.2 on the same hardware. In part, this is due to improvements in GCC. The missing bars are not really fast systems, but benchmarks that do not run on the target system; queens_clpfd can surely run on SICStus with some porting.

Up to zebra, these are the classical “van Roy” benchmarks. Others were added to get wider coverage:

  • sieve – Sieve of Eratosthenes using assert/1 and retract/1 (to benchmark these)
  • queens_clpfd – as the name says, for clp(fd). By Markus Triska.
  • pingpong – Tabling test
  • fib – Tabling and big integer test
  • moded_path – Tabling with answer subsumption
  • det – Single sided unification and det/1 evaluation.
  • eval – Evaluate large expressions (from discussion here)
  • average – Not a benchmark, but the average over all of them.

I tried to include

  • Scryer – Says caught: error(syntax_error(inconsistent_entry),use_module/1) on anything it doesn’t like, which makes debugging a little hard.
  • Ciao – Probably possible, but I could not find a way to load a non-module file into a module.
  • Trealla – Crashes on a segmentation fault when loading the driver (current GIT, passes tests)
  • GNU-Prolog – Needs a way to deal with Prolog systems without modules.
  • XSB – Doesn’t like nested conditional compilation (there is a bit of :- if(...).)

Please contribute benchmarks or ports. Unfortunately some old benchmarks have unknown copyright. New ones must have an open source license that is accepted by the Linux distributions.


For example this .sh here: what shell does it even expect?

https://github.com/SWI-Prolog/bench/blob/f7f951651b4b5658835bc4060f7b9cd1914c20dd/all.sh

If I want to run these tests under Windows 10, I have to create my own .bat, right?

There can also be a lot of trouble with dump files created during
a benchmark run, especially if your test cases are kept in a Git, SVN,
or whatever repository. You are creating .csv files.

On the other hand, I have the location where to put the dumps
configurable, and likewise the location where to put the reports, so
that the test cases do not get polluted with this stuff. Otherwise, for
example in Git, you would need to put them on ignore, but you
might still see them in an IDE, creating a big mess. Or, to avoid the
mess, would you remove them after a run? Well, I have already become
comfortable with having the dump folder under the local history of the IDE,
so I can see changes there, which is also quite convenient.

I am afraid that it sounds like “馬の耳に念仏” in Japanese (“reciting the nenbutsu into a horse’s ear”). In fact, it is too much for me to digest. Anyway, I notice that I should learn from you and the ZDD experts.

I have managed to write a couple of pieces of code to count the number of solutions (assignments) for a given polynomial or
a system of polynomials over GF(2). Both are below, but only the main parts;
the rest is untouched from the previous version. One of the new ones uses the XOR operation.
Both of them work well compared with my previous one, but neither is drastically more efficient.

I borrowed the same benchmark on the unitary matrices over GF(2).

Codes on counting directly.
gf2_card(_,  0, 0, _):-!.
gf2_card(Vs, 1, N, _):-!, length(Vs, K), N is 2^K.
gf2_card(Vs, X, N, S):- memo(gf2_card(X, Vs)-N, S),
	(	nonvar(N) -> true
	; 	cofact(X, t(A, L, R), S),
		divide_at(A, Vs, H, T),
		gf2_card(T, L, I, S),
		gf2_join(L, R, U, S),
		gf2_card(T, U, J, S),
		N0 is I + J,
		length(H, K),
		N is N0*(2^K)
	).
Codes on building the assignment space first.
%
gf2_solve(_,  0, 0, _):-!.
gf2_solve(Vs, 1, P, S):-!, zdd_power(Vs, P, S).
gf2_solve(Vs, X, P, S):- memo(gf2_unit(X, Vs)-P, S),
	(	nonvar(P) -> true
	; 	cofact(X, t(A, L, R), S),
		divide_at(A, Vs, H, T),
		gf2_solve(T, R, R0, S),
		zdd_insert(A, R0, R1, S),
		(	L = 0 -> P0 = R1
		;	gf2_solve(T, L, L0, S),
			cofact(L1, t(A, L0, L0), S),
			zdd_xor(L1, R1, P0, S)
		),
		zdd_power_rev(H, P0, P, S)
	).
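As a correctness cross-check for gf2_card and gf2_solve on small inputs (hedged: a brute-force reference that enumerates all assignments, not the ZDD algorithm above), a GF(2) polynomial can be taken as a set of monomials, each monomial a set of variables, and the solutions of poly = 1 counted directly:

```python
from itertools import product

def eval_gf2(monomials, assignment):
    """Evaluate a GF(2) polynomial given as an iterable of monomials
    (each monomial a frozenset of variable names; frozenset() is the
    constant 1). Addition is XOR, so we sum mod 2."""
    return sum(all(assignment[v] for v in m) for m in monomials) % 2

def count_solutions(monomials, variables):
    """Count assignments with poly = 1 by exhaustive enumeration
    (exponential, only for cross-checking small cases)."""
    return sum(
        eval_gf2(monomials, dict(zip(variables, bits)))
        for bits in product((0, 1), repeat=len(variables)))

# 1 + x + x*y over {x, y}: the assignments giving value 1
poly = [frozenset(), frozenset({"x"}), frozenset({"x", "y"})]
print(count_solutions(poly, ["x", "y"]))  # -> 3
```

Comparing such brute-force counts against gf2_card for, say, the small orth-matrix instances would confirm both agree before trusting the large runs.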
Benchmark on unitary matrix over GF(2)

% ?- between(1, 7, I), orth_matrix(I, M),
%	time((zdd gf2_poly_solve_list(M, Sols), card(Sols, C))),
%	format("Num of sols for orth-matrix(~w) = ~w~n~n", [I, C]),
%	fail; true.
%@ % 308 inferences, 0.000 CPU in 0.000 seconds (91% CPU, 4812500 Lips)
%@ Num of sols for orth-matrix(1) = 1
%@
%@ % 3,942 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 10950000 Lips)
%@ Num of sols for orth-matrix(2) = 2
%@
%@ % 27,493 inferences, 0.002 CPU in 0.002 seconds (98% CPU, 14439601 Lips)
%@ Num of sols for orth-matrix(3) = 6
%@
%@ % 181,676 inferences, 0.013 CPU in 0.013 seconds (97% CPU, 14279337 Lips)
%@ Num of sols for orth-matrix(4) = 48
%@
%@ % 1,475,875 inferences, 0.117 CPU in 0.119 seconds (98% CPU, 12665737 Lips)
%@ Num of sols for orth-matrix(5) = 720
%@
%@ % 25,199,511 inferences, 2.618 CPU in 2.667 seconds (98% CPU, 9625857 Lips)
%@ Num of sols for orth-matrix(6) = 23040
%@
%@ % 844,701,675 inferences, 108.134 CPU in 110.328 seconds (98% CPU, 7811610 Lips)
%@ Num of sols for orth-matrix(7) = 1451520
%@
%@ true.

Yes, surely memo is useless for this case. Do you think memo is useless
in general for counting solutions in GF(2)? I am not sure, but it may be. It is my custom to use it like SWI-Prolog’s table/1, without considering polynomials over GF(2) specifically.

Without memo.

%@ % 25,196,259 inferences, 1.856 CPU in 1.902 seconds (98% CPU, 13574166 Lips)
%@ Num of sols for orth-matrix(6) = 23040

With memo.

%@ % 25,199,511 inferences, 2.618 CPU in 2.667 seconds (98% CPU, 9625857 Lips)
%@ Num of sols for orth-matrix(6) = 23040

I have tested a case like counting the number of elements of a power set, in which memo is necessary. With memo, counting the solutions of a polynomial with 1000 variables is done in a few seconds. Otherwise (without memo), even for only 20 variables, the counting seems to go into a loop. So I have to conclude that memo is necessary also for counting solutions over GF(2).

Let me add comments on the test which made me conclude that memo (like table) is necessary.

I used the “power polynomial” over variables x1, x2, …, xn defined by:

 P(1) = 1 + x1
 P(n+1) = P(n) + x(n+1)*P(n)

One can produce it in Prolog with power_poly below, using 1, 2, 3, …
as variables in the test below for simplicity.

% ?- power_poly(1, X).
% ?- power_poly(2, X).
%@ X = 1+x(1)+x(2)*(1+x(1)).
% ?- power_poly(3, X).
%@ X = 1+x(1)+x(2)*(1+x(1))+x(3)*(1+x(1)+x(2)*(1+x(1))).

power_poly(1, 1 + x(1)):-!.
power_poly(N, P+x(N)*P):- N0 is N-1, power_poly(N0, P).

The polynomial P(1000) is too large as a usual Prolog term, so I created it
in ZDD using ZDD commands as below.

BTW, according to the result of gf2_solve, the solution seems to be unique, right? I tested only n = 1, 2 by hand. If true, I have rediscovered a “theorem”, though
I do not have a proof yet.

EDIT Aha, it is obvious by induction, almost like a joke, though I checked by hand.
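The uniqueness claim is indeed easy to see: over GF(2), P(n+1) = P(n) + x(n+1)*P(n) = P(n)*(1 + x(n+1)), so P(n+1) = 1 forces P(n) = 1 and x(n+1) = 0, and by induction the all-zero assignment is the only solution. It can also be confirmed mechanically for small n (a brute-force sketch, independent of the ZDD code):

```python
from itertools import product

def p_value(bits):
    """Evaluate the power polynomial over GF(2):
    P(1) = 1 ^ x1,  P(k+1) = P(k) ^ (x(k+1) & P(k))."""
    v = 1 ^ bits[0]
    for b in bits[1:]:
        v ^= b & v
    return v

def solution_count(n):
    # number of assignments with P(n) = 1
    return sum(p_value(bits) for bits in product((0, 1), repeat=n))

for n in range(1, 9):
    print(n, solution_count(n))  # every count is 1: the all-zero assignment
```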

% ?- numlist(1, 1000, Ns),
%	time(((zdd zdd_power(Ns, P), gf2_solve(Ns, P, Sols),
%	card(P, D), card(Sols, C)))).
%@ % 256,597 inferences, 0.019 CPU in 0.021 seconds (94% CPU, 13240983 Lips)
%@ Ns = [1, 2, 3, 4, 5, 6, 7, 8, 9|...],
%@ P = 1001,
%@ Sols = C, C = 1,
%@ D = 10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376.

%@ % 222,001 inferences, 0.018 CPU in 0.020 seconds (90% CPU, 12586518 Lips)
%@ Ns = [1, 2, 3, 4, 5, 6, 7, 8, 9|...],
%@ P = 1001,
%@ Sols = C, C = 1.

For now, portability of the driver and the tests themselves is a bigger issue. The most serious issue is that most benchmarks are small programs, many of which stress arithmetic. That is not very representative. SWI-Prolog mostly likes big programs with large data sets. On various real-world programs its performance has been reported to be pretty similar to e.g. YAP and SICStus; sometimes much better (notably with large data sets or lots of dynamic predicates), sometimes much worse (notably when constraints or tabling are important).

Avoiding pollution of the directory with intermediate files is easy. More annoying is that several Prolog systems insist on adding compiled versions to the directory.

Possibly I’ll rewrite the scripts that connect the whole thing and produce the final output in (SWI-)Prolog :slight_smile:

No, I can’t do that. For dimension n = 7, 3.73 GB was needed (Activity Monitor). I remember that for n = 8, 80 GB was needed several years ago. Anyway, there must be much room for revision in my codes, but for the time being I have no plan for revisions, and no idea for one either.

% ?- orth_matrix(7, _M),
%	profile(time((zdd gf2_poly_solve_list(_M, Sols), card(Sols, C)))).
%@ % 955,921,898 inferences, 119.263 CPU in 124.051 seconds (96% CPU, 8015238 Lips)
%@ Sols = 5138659,
%@ C = 1451520.

I have revised things around GF(2) polynomials, which is not uploaded yet; the essential parts of the revision were posted as the predicates gf2_poly and gf2_solve.

I don’t know about FBDD, nor about things at the edges of related research. ZDD is a system for handling families of sets; to be honest, this is all I know about ZDD. (Plus rehash, functor-as-array, and setarg for the implementation, which were due to advice by Jan W.) Sorry for my poor knowledge compared with yours, and for being reluctant to learn the state of the art.

The log below says that the family X has 2^100 elements. X = 101 means that this big family is represented with at most 101 “nodes”, which might be well known as the memory efficiency of ZDD/BDD. A family of sets is represented as a binary DAG. Thus, I’m afraid that substructure sharing in ZDD/BDD cannot be measured with term_factorized/3.

% ?- zdd X<<pow(:numlist(1, 100)).
%@ X = 101.
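The 2^100-elements-in-101-nodes effect can be reproduced with a toy hash-consed ZDD (a minimal sketch under the standard reduction rules, not the library’s Prolog implementation): the power set is a chain in which both branches of every node share the same child, and a memoised cardinality walk visits each node once.

```python
_unique = {}  # unique table for hash-consing

def mk(var, lo, hi):
    """Reduced ZDD node; terminals are 0 (empty family) and 1 ({{}})."""
    if hi == 0:                      # zero-suppression rule
        return lo
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

def powerset(n):
    """Family of all subsets of {1..n}: a chain of n shared nodes."""
    node = 1
    for v in range(n, 0, -1):
        node = mk(v, node, node)     # variable v in or out: same child
    return node

def card(node, memo=None):
    """Memoised cardinality: each shared node is counted once."""
    memo = {} if memo is None else memo
    if node in (0, 1):
        return node
    if node not in memo:
        _, lo, hi = node
        memo[node] = card(lo, memo) + card(hi, memo)
    return memo[node]

def node_count(node, seen=None):
    seen = set() if seen is None else seen
    if node in (0, 1) or node in seen:
        return 0
    seen.add(node)
    return 1 + node_count(node[1], seen) + node_count(node[2], seen)

print(card(powerset(100)) == 2**100)   # True: 2^100 sets in the family
print(node_count(powerset(100)) + 1)   # 101: 100 inner nodes plus a terminal
```

Counting inner nodes plus one terminal gives exactly the 101 from the log above, which is the sharing that term_factorized/3 on the flat term cannot see.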

Assuming XOR as the GF(2) algebra:
if the big Prolog term (the “power polynomial”) P(n) could be read in as a GF(2) polynomial internally, the binary DAG of P(n) is clearly isomorphic to that of the power set of a set of cardinality n. In fact, my GF(2) solver can read P(20), but not P(30). However, it can solve a somehow-internalized P(1000) in about 1/100 of a second. As a Prolog term, P(1000) is surely far too big, so it is not clear to me how your system solves P(1000). Although I am surely missing an idea about XOR, it is curious. My question is: as for P(1000), can it solve it as fast as my gf2_solve, which solves it in about a second?

+ is for XOR.

(x*x=x, x+x=0)

My question is about polynomial P(n).

P(1) = 1 + x
P(2) = 1 + x + y + x*y
P(3) = 1 + x + y + z + x*y + y*z + z*x + x*y*z
...

My GF(2) solver solves P(20)=1, but not P(30), which is too big to be read. But as the internal form is clearly known,
it can count the number of solutions in GF(2) even for P(n) with n = 1000, and moreover very quickly. How about your system? I would like to see the
power of your XOR representation by solving P(n)=1, for P(30) for example.

Looks very fast. I see P(n) for n = 17.
How about P(30), and the number of solutions of P(30)=1? Finally, for P(1000):
is it also possible in a similar way to my gf2_solve? This is just out of curiosity; if it takes time for you, I will test it later.

You seem to do in a day what I would need a month to do.
However, I would like to say that ZDD is so simple to program with
that I could do similar work in ZDD in a couple of days.

Anyway, congratulations!