Partition number in ZDD reply

You can do it in one single predicate:

:- table p/2.
p(0, 1) :- !.
p(N, X) :-
    aggregate_all(sum(Z), (between(1,inf,K), M is K*(3*K-1)//2,
           (M>N, !, fail; L is N-M, p(L,Y), Z is (-1)^K*Y)), A),
    aggregate_all(sum(Z), (between(1,inf,K), M is K*(3*K+1)//2,
           (M>N, !, fail; L is N-M, p(L,Y), Z is (-1)^K*Y)), B),
    X is -A-B.

Here is a results:

/* SWI-Prolog 8.3.21 */
?- time(p(6666,X)).
% 13,962,294 inferences, 2.610 CPU in 2.743 seconds (95% CPU, 5350059 Lips)
X = 1936553061617076610800050733944860919984809503384
05932486880600467114423441282418165863.

Thank you for elegant more prolog-like codes. Fortunately my codes returns the same partition number for 6666 with your codes:

% ?- time(zdd((pn(6666, P)))).
%@ % 10,873,956 inferences, 1.154 CPU in 1.179 seconds (98% CPU, 9422855 Lips)
%@ P = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863.

Kuniaki Mukai

It really souds like an algorithm of demon. Thanks for the side note.

K. Mukai

@j4n_bur53

Sorry, I missed your StackOverflow question. It’s answered now: loops - Partition function P in Picat - Stack Overflow .

Here’s my Picat solution, inspired by the Maple approach, using a foreach while loop (and table):

table
partition1(0) = 1.
partition1(N) = P =>
  S = 0,
  K = 1,
  M = (K*(3*K-1)) // 2,  
  while (M <= N)
     M := (K*(3*K-1)) // 2,  
     S := S - ((-1)**K)*partition1(N-M),
     K := K + 1
  end,
  K := 1,
  M := (K*(3*K+1)) // 2,
  while (M <= N)
     M := (K*(3*K+1)) // 2,  
     S := S - ((-1)**K)*partition1(N-M),
     K := K + 1
  end,
  P = S.

Your (neat) SWI-Prolog version takes about 1,9s for p(6666) on my machines

?- time(p(6666,X)), write(X), nl.
% 13,959,720 inferences, 1.916 CPU in 1.916 seconds (100% CPU, 7285567 Lips)
193655306161707661080005073394486091998480950338405932486880600467114423441282418165863
X = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863.

My (not as neat) Picat version takes about 0.2s

p(6666) = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863

CPU time 0.214 seconds.

Do, i see this correctly, your java based solution is almost as fast as swi prolog’s c-based solution, when the code is written well optimized.

@j4n_bur53 When I wrote “mimicked” the SWI-Prolog approach in Picat, I just meant a solution that didn’t use the explicit while loop. The tests I did was using the latest SWI-version (I always have the latest SWI-Prolog version. :-)). Sorry about the confusion.

Regarding you comment about the positions of the M := line (first or last in the while loops), I tested both versions and they don’t seems to differ at all, neither in result or in speed. However, I checked the number of steps in partition1(10) for the while loops of each version: my version (the M line first) takes 50 steps while your version (M line last) takes 29 steps, so I change to your version.

It’s interesting that there are no changes in times between these versions for partition1(6666) so I tested with p(66666). But there’s no discernible difference in times, both versions ranges between 16.9s and 17.6s.

But I keep your version. Thanks!

Interesting.

Is this feature a non-stack-passing technique tightly integrated into your other prolog implementation features (or even with the java vm) – or could such an optimization be introduced in swi-prolog as well.

I imagine that since parameter passing is so ubiquitous – essentially happens all the time, that this would make a significant difference.

Dan

@j4n_bur53 In Picat there’s an option to convert Picat code to B-Prolog (compile_bp/1). I can do this later today, but now I have to go to the dentist…

That’s too bad – archane isn’t an engineering tradeoff, If it works well – is there an engineering tradeoff one makes to gain the speed.

Dan

Thanks.

In this text you can see the engineering tradeoff: speed of parameter passing (significantly increased) vs. speed of data access (“somwhat slower”) so this might make a lot of sense to do.

Regarding the CPU architecture – since, say, SWI Prolog, runs as a VM, an engineering decision that aims to align with the CPU design for function calling, doesn’t make sense – since this is all happening in the VM and hence a “virtual” CPU …

What you refer to reminds me of structure sharing – is this also what is going on?

Is SWI Prolog structure sharing as well?

Dan

Thank you for the reference.

It is the first time I started reading about a Prolog implementation that i can actually readily understand …

Interestingly, the approach they took – to focus on creating (CPU machine) instructions for unification in the head – would not readily work for implementing SSU – where you want to defer unification to the body – to ensure steadfastness.

… if i gathered this correctly …

Dan

Yes. – will edit

I think what is remarkable here as well is that your Prolog is implemented in Java – not a language known for high (or system level) performance. Yet, it performs very well.

Does it mean that quite a bit more performance can be obtained in swi-prolog if it implements the 'archaic" approach to handing argument passing.

Or, does it mean that Java along with an efficient-oriented way to program it, can get performer close to c.

@j4n_bur53 I’ve compiled the program to B-Prolog code now. It the definition of 'partition1&1 predicate + a call to partition1(6666). Here’s the converted file: http://hakank.org/picat/partition_function_bp.pl

Comments:

  • Picat’s table directive was not ported so one have to add this manually table 'e$$glb$$f$$partition1'/2. (I have not done this in the public file, though).
  • to run it load the file in B-Prolog and run
$ bp
B-Prolog Version 8.1, All rights reserved, (C) Afany Software 1994-2014.
| ?- [partition_function_bp].
consulting::partition_function_bp.pl

yes
| ?- time('e$$glb$$f$$partition1'(A,6666)).

CPU time 1.139 seconds.

A = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863
yes

Notice that this (i.e. running in B-Prolog v.8.1, from 2014) it’s much slower than running in Picat (1.14s vs 0.2s) . I don’t know what optimization Neng-Fa has done in Picat during these 6 years…

Hope this helps.

This is way above my head …

Is this a local optimization to the is operator – to make its binding more efficient?

Or is this a way the VM could do argument binding before goal calls.