Lazy processing

Say I’m doing some list processing in the usual way:

process([X|Xs], Acc, Final) :-
   do_work(X,Y), !, process(Xs, [Y|Acc], Final).
process([], Final, Final).

I want to put together a chain like this:

process(In, [], Out1),
process(Out1, [], Out2),
process(Out2, [], Final),
foldl(do_more_work, Final, 0, Done).

However, I want it all to be lazy. That is, nothing should happen until foldl starts pulling items, and then the work should happen just in time. None of the intermediary lists nor the final list should ever be materialised.

How would you change the process predicate to make this sort of lazy processing possible? I am aware of lazy lists but it is not obvious to me how to convert process to use them.

Please assume process outputs items in the same order as the input (although that is not shown).

The problem there is that the source (random) is not a list, and the outputs do not chain lists. Both of these things are an issue for my use case.

Here is something that seems to work for me:

% This clause never succeeds: lazy_list/3 ends the list when the goal
% fails on the exhausted state.
do_work([], _, _) :- fail.
do_work([X|Xs], Xs, Final) :- Final is X + 1.

test(Total) :-
    lazy_list(do_work, [1,2,3,4], L1),
    lazy_list(do_work, L1, L2),
    lazy_list(do_work, L2, L3),
    foldl([X,V0,V]>>(V is V0+X), L3, 0, Total).

It is an example of piping do_work into itself via lazy lists three times, and then using foldl to sum up the final output.

The last thing I need is to be able to chain things that do not produce 1-for-1 lists. For example, say at the top I insert [1,2,3] and the next process creates [4,5,6,7,8,9] from that input. I don’t see an obvious way to do that with lazy lists or otherwise.
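To illustrate the shape of what I’m after, here is a hedged sketch (the names expand/2 and one_to_many/3 are made up): the extra outputs are buffered in the lazy_list/3 state, so each input item can produce several list elements.

```prolog
:- use_module(library(lazy_lists)).

% Hypothetical 1-to-many step: each input X expands to two outputs.
expand(X, [X, Y]) :- Y is X * 2.

% State is Pending-Inputs: a buffer of already-expanded outputs paired
% with the remaining inputs.  Emit from the buffer first; when it is
% empty, expand the next input.  lazy_list/3 ends the list when this
% goal fails (buffer and inputs both exhausted).
one_to_many([H|T]-Ins, T-Ins, H) :- !.
one_to_many([]-[X|Xs], State, Head) :-
    expand(X, Out),
    one_to_many(Out-Xs, State, Head).

test_many(Total) :-
    lazy_list(one_to_many, []-[1,2,3], L),
    foldl([X,V0,V]>>(V is V0+X), L, 0, Total).
% ?- test_many(T).
% T = 18.        % items produced lazily: 1, 2, 2, 4, 3, 6
```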

I’ve got what I need working with lazy_list, except it’s useless for the purpose, since it’s about 20x slower than a fully materialised version. I had the same experience when using lazy_findall/3.

Sorry about the long reply :smiley:

The way you show it is slightly unusual. It is impossible to tell without a concrete example, of course, so I am only guessing; still, the usual way would be more like:

process([], []).
process([X|Xs], [Y|Ys]) :-
    do_work(X,Y),
    process(Xs, Ys).

… which of course is just a maplist:

maplist(do_work, Xs, Ys).

(The two are semantically equivalent, and if you have library(apply_macros) loaded, the maplist call is compiled into essentially the same recursive code.)

There could be reasons why this is not good enough, for example

  • your do_work/2 leaves choice points, but you know for certain that you don’t need them;
  • you need to reverse the order;
  • something else?

Here is a silly example to show what I mean by reversing the order:

% your original definition
process([X|Xs], Acc, Final) :-
   do_work(X,Y), !, process(Xs, [Y|Acc], Final).
process([], Final, Final).

do_work(X, Y) :- char_code(X, Y).

and now:

?- process([a,b,c], [], Ys).
Ys = [99, 98, 97].

In comparison:

?- maplist(char_code, [a,b,c], Ys).
Ys = [97, 98, 99].

You should play around and see that reversing the order of the clauses makes no difference, and that in the specific case of char_code/2, removing the cut also makes no difference. For a non-deterministic do_work/2 it will of course make a difference: you will get all possible combinations. Here is a silly example just to demonstrate:

?- maplist(between(0), [1,2], Ys).
Ys = [0, 0] ;
Ys = [0, 1] ;
Ys = [0, 2] ;
Ys = [1, 0] ;
Ys = [1, 1] ;
Ys = [1, 2].

About lazy evaluation. The Prolog alternative to lazy evaluation is backtracking. You can get very far with it. The huge advantage is that it is very fast and needs far less memory; you never need to materialize any lists. The huge disadvantage is that you lose solutions on backtracking. If you can provide a concrete example, it might be easier to judge if you need lists or lazy lists.
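To make the backtracking alternative concrete, here is a small sketch (the predicate names are invented): each stage is a goal in a conjunction, and aggregate_all/3 plays the role of foldl, pulling solutions one at a time without ever building a list.

```prolog
% Each stage is a generator/transformer; no intermediate list exists.
source(X)    :- between(1, 100, X).   % stands in for the input list
stage1(X, Y) :- Y is X + 1.           % stands in for do_work/2
stage2(Y, Z) :- Z is Y * 2.

total(Total) :-
    aggregate_all(sum(Z),
                  ( source(X), stage1(X, Y), stage2(Y, Z) ),
                  Total).
% ?- total(T).
% T = 10300.
```

Note this is exactly where the disadvantage mentioned above bites: once aggregate_all/3 has consumed a solution, it is gone; you cannot share the stream between two consumers.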

I assume you know that SWI-Prolog lazy lists are implemented as constraints using attributed variables, and that unification of attributed variables is going to be orders of magnitude slower than unification that doesn’t involve attributed variables.

To make it concrete, here is the attr_unify_hook/2 that gets invoked whenever a variable with a lazy_list attribute is unified:

attr_unify_hook(State, Value) :-
    State = lazy_list(Next, Read),
    (   var(Read)
    ->  call(Next, NewList, Tail),
        (   Tail == []
        ->  nb_setarg(2, State, NewList)
        ;   put_attr(Tail, lazy_lists, lazy_list(dummy, _)),  % See (*)
            nb_setarg(2, State, NewList),
            arg(2, State, NewListCP),
            '$skip_list'(_, NewListCP, TailCP),
            get_attr(TailCP, lazy_lists, LazyList),
            nb_linkarg(1, LazyList, Next)
        ),
        arg(2, State, Value)
    ;   Value = Read
    ).

Bottom line: if your focus includes performance, you should understand the tradeoffs involved in using lazy evaluation (i.e., constraints).


Note that several of the predicates involving lazy lists have a chunk size argument or option. That reduces the number of wakeups. Still, there is quite a bit of work involved in extending the lazy list.
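For example (an illustrative sketch, not from the thread), lazy_findall/4 takes the chunk size as its first argument, so each wakeup materialises a batch of solutions instead of a single one:

```prolog
:- use_module(library(lazy_lists)).

% 1000 solutions are materialised per wakeup instead of one at a time.
sum_lazy(Total) :-
    lazy_findall(1000, Y, ( between(1, 10000, X), Y is X + 1 ), Ys),
    foldl([E,A0,A]>>(A is A0 + E), Ys, 0, Total).
% ?- sum_lazy(T).
% T = 50015000.
```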

Thanks, no I didn’t know any of those things, but it all makes sense now.

Here is an example of me trying to use lazy lists to solve an AoC2024 problem. Ironically, the memory blows up at the default limit; I’ve no idea why.

% 353,135,695 inferences, 32.161 CPU in 32.171 seconds (100% CPU, 10980304 Lips)
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 2Kb, global: 1.0Gb, trail: 78.1Mb
ERROR:   Stack depth: 1,967, last-call: 99%, Choice points: 7
ERROR:   In:
ERROR:     [1,967] '$attvar':uhook(lazy_lists, <compound lazy_list/2>, [length:1|_2157958])
ERROR:     [1,966] '$attvar':call_all_attr_uhooks(<compound att/3>, [length:1|_2157996])
ERROR:     [1,965] '$attvar':'$wakeup'(<compound wakeup/3>)
ERROR:     [1,964] lists:append([length:1,546], _2158052, [length:1|_2158066])
ERROR:     [1,513] user:tree([length:451|_2158100], 1996, _2158094)
ERROR: 
ERROR: Use the --stack_limit=size[KMG] command line option or
ERROR: ?- set_prolog_flag(stack_limit, 2_147_483_648). to double the limit.

The lazy_list abstraction is practical. It’d be great if you could write something using a similar contract which can be used to pipe lists into processes and processes into each other via lists.

I’m not really sure how memoisation fits into all this, but I think it’s strange that a lazy list ends up exploding the memory due to overhead.
