:- table p/2.
p(0, 1) :- !.
p(N, X) :-
   aggregate_all(sum(Z), (between(1,inf,K), M is K*(3*K-1)//2,
      (M>N, !, fail; L is N-M, p(L,Y), Z is (-1)^K*Y)), A),
   aggregate_all(sum(Z), (between(1,inf,K), M is K*(3*K+1)//2,
      (M>N, !, fail; L is N-M, p(L,Y), Z is (-1)^K*Y)), B),
   X is -A-B.
Here are the results:
/* SWI-Prolog 8.3.21 */
?- time(p(6666,X)).
% 13,962,294 inferences, 2.610 CPU in 2.743 seconds (95% CPU, 5350059 Lips)
X = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863.
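For readers who don't run Prolog, the same pentagonal-number recurrence can be sketched in Python. This is a hypothetical transcription, not part of the original post: `lru_cache` stands in for the `table` directive, and the two loops correspond to the two `aggregate_all/3` sums (A and B) above.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n):
    # Mirrors the Prolog clauses: p(0) = 1, then two sums over the
    # generalized pentagonal numbers k(3k-1)/2 and k(3k+1)/2.
    if n == 0:
        return 1
    a = b = 0
    k = 1
    while k * (3 * k - 1) // 2 <= n:        # first sum (A in the Prolog code)
        a += (-1) ** k * p(n - k * (3 * k - 1) // 2)
        k += 1
    k = 1
    while k * (3 * k + 1) // 2 <= n:        # second sum (B in the Prolog code)
        b += (-1) ** k * p(n - k * (3 * k + 1) // 2)
        k += 1
    return -a - b                           # X is -A-B

print(p(10))   # the well-known p(10) = 42
```

Being recursive, this version is limited by Python's recursion depth for large n, so it is only meant to illustrate the structure, not to reproduce the p(6666) timing.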
Here’s my Picat solution, inspired by the Maple approach, using a while loop (and table):
table
partition1(0) = 1.
partition1(N) = P =>
  S = 0,
  K = 1,
  M = (K*(3*K-1)) // 2,
  while (M <= N)
    M := (K*(3*K-1)) // 2,
    S := S - ((-1)**K)*partition1(N-M),
    K := K + 1
  end,
  K := 1,
  M := (K*(3*K+1)) // 2,
  while (M <= N)
    M := (K*(3*K+1)) // 2,
    S := S - ((-1)**K)*partition1(N-M),
    K := K + 1
  end,
  P = S.
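As a cross-check on the algorithm (not a benchmark), here is a hypothetical bottom-up Python tabulation of the same recurrence; an explicit array plays the role of Picat's `table`, which also avoids any recursion-depth limit and reproduces the p(6666) value reported in this thread.

```python
def partition_upto(nmax):
    """Tabulate p(0..nmax) with Euler's pentagonal-number recurrence."""
    p = [0] * (nmax + 1)
    p[0] = 1
    for n in range(1, nmax + 1):
        s, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2      # first pentagonal-number family
            if g1 > n:
                break
            s -= (-1) ** k * p[n - g1]
            g2 = k * (3 * k + 1) // 2      # second pentagonal-number family
            if g2 <= n:
                s -= (-1) ** k * p[n - g2]
            k += 1
        p[n] = s
    return p

tbl = partition_upto(6666)
print(tbl[6666])   # should match the result shown in the transcripts
```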
Your (neat) SWI-Prolog version takes about 1.9s for p(6666) on my machine:
?- time(p(6666,X)), write(X), nl.
% 13,959,720 inferences, 1.916 CPU in 1.916 seconds (100% CPU, 7285567 Lips)
193655306161707661080005073394486091998480950338405932486880600467114423441282418165863
X = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863.
My (not as neat) Picat version takes about 0.2s:
p(6666) = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863
CPU time 0.214 seconds.
@j4n_bur53 When I wrote that I “mimicked” the SWI-Prolog approach in Picat, I just meant a solution that didn’t use the explicit while loop. The tests were done with the latest SWI-Prolog version (I always have the latest SWI-Prolog version. :-)). Sorry about the confusion.
Regarding your comment about the position of the M := line (first or last in the while loops): I tested both versions and they don’t seem to differ at all, neither in result nor in speed. However, I checked the number of steps in partition1(10) for the while loops of each version: my version (the M line first) takes 50 steps while your version (the M line last) takes 29 steps, so I changed to your version.
It’s interesting that there is no change in time between these versions for partition1(6666), so I also tested p(66666). But there’s no discernible difference there either: both versions range between 16.9s and 17.6s.
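The difference between the two orderings can be modeled outside Picat. This hypothetical Python sketch counts only top-level passes through a single loop (so the numbers are smaller than Picat's 50-vs-29 step counts): with the M update first, the while condition always tests a stale M from the previous K, costing an extra pass whose contribution, if I read the code right, is a partition1 call on a negative index that evaluates to 0, which would explain why the results still agree.

```python
def loop_iterations(n, update_m_first):
    """Count passes through one pentagonal-number loop, M = k(3k-1)//2."""
    count, k = 0, 1
    m = k * (3 * k - 1) // 2
    while m <= n:
        count += 1
        if update_m_first:
            m = k * (3 * k - 1) // 2   # recomputed before K changes, so the
            k += 1                     # next test still sees the old K's M
        else:
            k += 1
            m = k * (3 * k - 1) // 2   # fresh M, tested immediately
    return count

print(loop_iterations(10, True), loop_iterations(10, False))
```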
Is this feature a non-stack-passing technique tightly integrated with your other Prolog implementation features (or even with the Java VM) – or could such an optimization be introduced in SWI-Prolog as well?
I imagine that since parameter passing is so ubiquitous – it essentially happens all the time – this could make a significant difference.
@j4n_bur53 In Picat there’s an option to convert Picat code to B-Prolog (compile_bp/1). I can do this later today, but now I have to go to the dentist…
In this text you can see the engineering tradeoff: speed of parameter passing (significantly increased) vs. speed of data access (“somewhat slower”), so this might make a lot of sense to do.
Regarding the CPU architecture: since, say, SWI-Prolog runs as a VM, an engineering decision that aims to align with the CPU design for function calling doesn’t make sense – this is all happening in the VM, and hence on a “virtual” CPU…
What you refer to reminds me of structure sharing – is this also what is going on?
It is the first time I have started reading about a Prolog implementation that I can actually readily understand…
Interestingly, the approach they took – to focus on creating (CPU machine) instructions for unification in the head – would not readily work for implementing SSU – where you want to defer unification to the body – to ensure steadfastness.
I think what is remarkable here as well is that your Prolog is implemented in Java – not a language known for high (or system-level) performance. Yet it performs very well.
Does this mean that quite a bit more performance could be obtained in SWI-Prolog if it implemented the “archaic” approach to handling argument passing?
Or does it mean that Java, along with an efficiency-oriented way of programming it, can get performance close to C?
Picat’s table directive was not ported, so one has to add it manually: table 'e$$glb$$f$$partition1'/2. (I have not done this in the public file, though.)
To run it, load the file in B-Prolog:
$ bp
B-Prolog Version 8.1, All rights reserved, (C) Afany Software 1994-2014.
| ?- [partition_function_bp].
consulting::partition_function_bp.pl
yes
| ?- time('e$$glb$$f$$partition1'(A,6666)).
CPU time 1.139 seconds.
A = 193655306161707661080005073394486091998480950338405932486880600467114423441282418165863
yes
Notice that this (i.e. running in B-Prolog v8.1, from 2014) is much slower than running in Picat (1.14s vs 0.2s). I don’t know what optimizations Neng-Fa has made to Picat during these 6 years…