Fold over answers

I think I missed the real question by @emiruz. The foldl at the end of their question: in library(aggregate) there are templates, but you cannot provide your own aggregation operation. I think @j4n_bur53 showed something in another thread but I can no longer find it. What would be a possible implementation for a fold for a backtrackable goal?

The idea is just to provide the aggregation loop though a meta-predicate. The first argument is an aggregation operation that is applied in turn to the solutions of the backtrackable goal. Here is a naive attempt:

foldl_bt(Op, Goal, V0, V) :-
    State = state(V0),
    (  call(Goal),
           arg(1, State, S0),
           call(Op, S0, S),
           nb_setarg(1, State, S),
           fail
    ;  arg(1, State, V)
    ).

This seems to work. If I define:

sum(A, B, S) :- S is A + B.

then:

?- aggregate_all(sum(X), between(1, 3, X), Sum).
Sum = 6.

?- foldl_bt(sum(X), between(1, 3, X), 0, Sum).
Sum = 6.

?- foldl_bt(sum(1), between(1, 3, X), 0, Count).
Count = 3.

It also nicely reverses concatenation like the original foldl.

?- foldl(atom_concat, [a,b,c], '', L).
L = cba.

?- foldl_bt(atom_concat(X), (X=a ; X=b ; X=c), '', L).
L = cba.

What am I missing?

PS: Back to the original question

Depending on the processing that is involved, this could look roughly like this:

foldl_bt(aggregation_op(X),
    (   foo(A, B),
        bar(B, C),
        baz(C, X)
    ), 0, Result).
3 Likes

Nothing in particular. As is, it is even quite competitive in speed:

?- time(aggregate_all(sum(X), between(1, 1 000 000, X), Sum)).
% 3,000,001 inferences, 0.145 CPU in 0.146 seconds (100% CPU, 20656958 Lips)
Sum = 500_000_500_000.

?- time(foldl_bt(sum(X), between(1, 1 000 000, X), 0, Sum)).
% 4,000,000 inferences, 0.196 CPU in 0.197 seconds (100% CPU, 20414091 Lips)
Sum = 500_000_500_000.

This though, is due to the fact that the state is a small integer, which makes nb_setarg/3 fast … Well, it is not so bad. Consider

sc(X, C0-S0, C-S) :-
    C is C0+1,
    S is S0+X.

And now :slight_smile: We should recall that aggregate_all/3 using sum(X) or count uses a state-based implementation, but the count-sum(X) (currently) uses findall/3 and aggregation. But, still this is not bad at all.

?- time(foldl_bt(sc(X), between(1, 1 000 000, X), 0-0, SumCount)).
% 3,000,002 inferences, 0.260 CPU in 0.261 seconds (100% CPU, 11541819 Lips)
SumCount = 1_000_000-500_000_500_000.

?- ?- time(aggregate_all(count-sum(X), between(1, 1 000 000, X), SumCount)).
% 14,004,607 inferences, 0.575 CPU in 0.578 seconds (99% CPU, 24354554 Lips)
SumCount = 1_000_000-500_000_500_000.

Profiling shows the picture below.

I am in favour of adding this to the library. Now the name. foldl_bt/4 is surely not good. “fold” must be part of it. The “l” is for left, which is not relevant here. Maybe just “foldall/4” or “fold_all/4”?

3 Likes

I am happy that this is useful! I have been thinking about it for a while but never needed it so some credit goes to @emiruz for asking.

I don’t have any way to make a judgement. After reading the Wikipedia article on the topic, if this is a fold at all, it is still a left fold:

On lists, there are two obvious ways to carry this out: either by combining the first element with the result of recursively combining the rest (called a right fold), or by combining the result of recursively combining all elements but the last one, with the last element (called a left fold). This corresponds to a binary operator being either right-associative or left-associative, in Haskell’s or Prolog’s terminology. With a right fold, the sum would be parenthesized as 1 + (2 + (3 + (4 + 5))), whereas with a left fold it would be parenthesized as (((1 + 2) + 3) + 4) + 5.

But I am certain that others here have deeper understanding of the field.

If it is part of library(aggregate), yet another option would be aggregate_fold, similar to re_fold? But really, both foldall and fold_all are good enough, and more importantly, shorter names.

1 Like

Right.

I’m not a fan of that. Folding is a form of aggregation, I’d say. Other candidates might be fold_answers/4?

Where is yet another issue. library(aggregate) comes to mind. It is a de-facto standard library though, so adding things is less obvious. On the other hand it already contains some only vaguely related predicates such as foreach/2. If this is the target library, fold_all/4 is the name that fits best (IMO).

1 Like

Maybe library(solution_sequences) could be an appropriate place

This is quite useful; thank you. I can see now how backtracking obviates lazy lists for my use cases. I also think that a fold over a Goal is badly needed in the standard library. It has occurred many times that I have to do findall first and then feed it to foldl, or in some of my AoC code, I’ve use lazy_findall + foldl to do roughly what you’ve done above.

1 Like

Some more good news. In my internal version I enhanced nb_setarg/3 to avoid re-copying the old value. That allows for implementing findall/3 as

join(X, L, [X|L]).

my_findall(T, G, L) :-
    foldl_bt(join(T), G, [], L0),
    reverse(L0, L).

Tested over between(1,1_000_000,X), the performance of this is roughly less than twice as slow as native findall/3. I expect it already outperforms findall/3 on a low number of answers (not tested). Timing is less reproducible as GC is involved.

These results are so good that one starts wondering adding this as foldall/4 to the core?

2 Likes

The problem is that aggregate_all/3 doesn’t run always where possible
in constant memory, making it slow. You can try this test case:

/* SWI-Prolog 9.3.16 */
?- aggregate_all((count,sum(X)), between(1, 10000000, X), CountSum).
ERROR: Stack limit (1.0Gb) exceeded

But its relatively easy to re-implement aggregate_all/3 to allow
constant memory also for composite aggregates:

/* Dogelog Player 1.2.5 */
?- aggregate_all((count,sum(X)), between(1, 10000000, X), CountSum).
CountSum = (10000000, 50000005000000).

I still do not give up the hope that there are competent people around
who can make a change and then a pull request.

Edit 29.12.2024
fold_bt/4 with sc/3 has the disadvantage that you do
re-create a result cell for the pair C-S on each iteration.

But why not make a pair value holder so that sys_aggr_next/2
doesn’t create a new result cell each time? Like here:

sys_aggr_init(count, X) :- X = v(_), change_arg(1, X, 0).
sys_aggr_init(sum(_), X) :- X = v(_), change_arg(1, X, 0).
sys_aggr_init((F,G), Z) :- Z = (_,_),   %%% make pair value holder
   sys_aggr_init(F, X), change_arg(1, Z, X),
   sys_aggr_init(G, Y), change_arg(2, Z, Y).

sys_aggr_next(count, Y) :- arg(1, Y, H), J is H+1, change_arg(1, Y, J).
sys_aggr_next(sum(X), Y) :- arg(1, Y, H), J is H+X, change_arg(1, Y, J).
sys_aggr_next((F,G), (X,Y)) :-
   sys_aggr_next(F, X),
   sys_aggr_next(G, Y).

sys_aggr_fini(count, X, A) :- arg(1, X, A).
sys_aggr_fini(sum(_), X, A) :- arg(1, X, A).
sys_aggr_fini((F,G), (X,Y), (C,D)) :-
   sys_aggr_fini(F, X, C),
   sys_aggr_fini(G, Y, D).

And then simply:

aggregate_all(A, G, S) :-
   sys_aggr_init(A, T),
   (G, sys_aggr_next(A, T), fail; true),
   sys_aggr_fini(A, T, S).

For SWI-Prolog replace change_arg/3 by nb_linkarg/3.

I know. I feared performance would be a show-stopper. It turns out that the slowdown from this is quite bearable though. Together with the simple definition and quite clear semantics I think this turns this into a pretty nice building block. How often do we not see findall/3 with some aggregation over the list? Some of this is covered by aggregate_all/3, but “folding” the answers seems way more flexible.

what stops you?

Yes, this was the post that I could no longer find, thank you for re-posting it. Is this as general as allowing the client code to provide an aggregation operation, or is it about allowing composite aggregates for existing templates?