Lazy list issue

I’ve found a weird issue adapting the code in library/pure_input.pl to add some generality:

:- meta_predicate(lazy_list(2, -)).
lazy_list(Generator, List) :-
    put_attr(List, lazy, lazy(Generator, _)).

lazy:attr_unify_hook(State, Value) :-
    State = lazy(Generator, Next),
    (var(Next) ->
        call(Generator, List, Tail),
        (var(Tail) ->
            lazy_list(Generator, Tail),
            nb_linkarg(2, State, List)
        ;
            nb_setarg(2, State, List)
        ),
        Value = List
    ;
        Value = Next
    ).


n(Count, End, N, Next, List, Tail) :-
    (Count = 0 ->
        List = Tail,
        Next = N
    ;
        (N = End ->
            Tail = [],
            List = [],
            Next = N
        ;
            List = [N|List1],
            N1 is N + 1,
            Count1 is Count - 1,
            n(Count1, End, N1, Next, List1, Tail)
        )
    ).

n(Count, End, List, Tail) :-
    nb_getval(next, Next),
    n(Count, End, Next, Next1, List, Tail),
    nb_setval(next, Next1).

w([]) :-
    !.
w([H|T]) :-
    write(H),
    write(' '),
    w(T).

test :-
    nb_setval(next, 0),
    lazy_list(n(10, 100), List),
    (
      List = [0,1,2,3,4,5,6,7,8,9|List1],
      writeln(List),
      (List1 = [a] ->
          true
      ;
          true
      ),
      writeln(List),
      (
        List1 = [10,11,12,13,14,15,16,17,18,19,20|_],
        writeln(List),
        false
      ->
        true
      ;
        true
      ),
      writeln(List),
      List1 = [10|_],
      writeln(List)
    ),
    writeln(List),
    w(List).

Some basic tests seems to work as expected also for backtracking case (where the strategy to avoid copying is a bit of black magic under the assumption that pure_input does things in the right way):

Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- [q].
true.

?- test.
[0,1,2,3,4,5,6,7,8,9|_15590]
[0,1,2,3,4,5,6,7,8,9|_15590]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29|_15920]
[0,1,2,3,4,5,6,7,8,9|_15590]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19|_15712]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19|_15712]
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
true.

The weird thing is that I get different (wrong) results in debug mode:

Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- [q].
true.

?- debug.
true.

[debug]  ?- test.
[0,1,2,3,4,5,6,7,8,9|_19026]
[0,1,2,3,4,5,6,7,8,9|_19026]
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20|_19384]
[0,1,2,3,4,5,6,7,8,9|_19026]
[0,1,2,3,4,5,6,7,8,9,10|_19082]
[0,1,2,3,4,5,6,7,8,9,10|_19082]
0 1 2 3 4 5 6 7 8 9 10 
true.

I believe this is definitely unexpected, am I missing something?

The following shows that the same bug is triggered also using stream_to_lazy_list:

:- use_module(library(pure_input)).

slurp([]).
slurp([_|T]) :-
    slurp(T).

t(List) :-
    \+ \+ slurp(List),
    slurp(List).

test(File) :-
    open(File, read, S),
    stream_to_lazy_list(S, List),
    t(List),
    close(S).
$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- [bug].
true.

?- test(big_file).
true ;
false.

?- debug.
true.

[debug]  ?- test(big_file).
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 0.2Gb, global: 0.5Gb, trail: 20Kb
ERROR:   Stack depth: 1,309,503, last-call: 0%, Choice points: 1,309,502
ERROR:   Possible non-terminating recursion:
ERROR:     [1,309,503] user:slurp([length:1,229|_139811828])
ERROR:     [1,309,502] user:slurp([length:1,230|_139811852])
   Exception: (1,309,502) slurp([45, 32, 51, 50, 41, 10, 10, 47|...]) ? 

What is the bug? As t/1 keeps access to the list as a whole, GC cannot discard it and it will run out of stack. At least, that is my quick assessment.

You are perfectly right, I apologize.

About original problem in my lazy_list implementation on top of this post I’ve found the part I’ve missed from pure_input.pl black magic: if I surround the execution of unify hook with notrace/1 (as in pure_input.pl) the behavior in debug mode and in ordinary mode is the same.

Perhaps someone will find useful my experience.

Unfortunately pure_input code does not clarify why notrace is a fundamental part.

I’ve noted also that lazy_lists.pl works in a different way and seems to still need some term copying.

1 Like

I’m afraid I don’t know either. It was quite a challenge to come up with the implementation. I needed something to replace Ulrich’s implementation of pure input as he did not consent the license change. As I also wanted to get rid of the requirements for something external that allows going back I had a nice challenge :slight_smile: Eventually I managed. If I recall well I didn’t understand the debugger interaction as well and solved it this way. I guess it must be considered a bug in the debugger. IMHO there are more pressing things to improve.

I’m not sure that they are related, but I’ve just discovered that memberchk does not work as expected on lazy lists (at least not how I expect):

$ cat bug.pl
:- use_module(library(pure_input)).

test(File) :-
    open(File, read, S),
    stream_to_lazy_list(S, List),
    memberchk(10, List),
    writeln(List),
    close(S).

test1(File) :-
    open(File, read, S),
    stream_to_lazy_list(S, List),
    once(member(10, List)),
    writeln(List),
    close(S).
abramo@igor:~$ swipl
Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.1)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- [bug].
true.

?- test('bug.pl').
_20172
true.

?- test1('bug.pl').
[58,45,32,117,115,101,95,109,111,100,117,108,101,40,108,105,98,114,97,114,121,40,112,117,114,101,95,105,110,112,117,116,41,41,46,10,10,116,101,115,116,40,70,105,108,101,41,32,58,45,10,32,32,32,32,111,112,101,110,40,70,105,108,101,44,32,114,101,97,100,44,32,83,41,44,10,32,32,32,32,115,116,114,101,97,109,95,116,111,95,108,97,122,121,95,108,105,115,116,40,83,44,32,76,105,115,116,41,44,10,32,32,32,32,109,101,109,98,101,114,99,104,107,40,49,48,44,32,76,105,115,116,41,44,10,32,32,32,32,119,114,105,116,101,108,110,40,76,105,115,116,41,44,10,32,32,32,32,99,108,111,115,101,40,83,41,46,10,10,116,101,115,116,49,40,70,105,108,101,41,32,58,45,10,32,32,32,32,111,112,101,110,40,70,105,108,101,44,32,114,101,97,100,44,32,83,41,44,10,32,32,32,32,115,116,114,101,97,109,95,116,111,95,108,97,122,121,95,108,105,115,116,40,83,44,32,76,105,115,116,41,44,10,32,32,32,32,111,110,99,101,40,109,101,109,98,101,114,40,49,48,44,32,76,105,115,116,41,41,44,10,32,32,32,32,119,114,105,116,101,108,110,40,76,105,115,116,41,44,10,32,32,32,32,99,108,111,115,101,40,83,41,46,10|_26334]
true.

The same effect is visible also with freeze/2:

test2 :-
    freeze(List, List = [12,10]),
    memberchk(10, List),
    writeln(List).

Note that, once documented, it might also be said that memberchk/2 should not be used on arguments triggering unify hooks, but I think it should be clarified which builtins have the same limit.

I see. No, this is not related to the trace mode and tricky stuff in lazy lists. It is related to constraint wakeup in relation to foreign code. This works fine for list elements, e.g.,

102 ?- freeze(X, X = 2), memberchk(E, [X]).
X = E,
freeze(E, E=2).

103 ?- freeze(X, X = 2), memberchk(2, [X]).
X = 2.

104 ?- freeze(X, X = 2), memberchk(3, [X]).
false.

Wakeups inside foreign code is possible, but is always semidet. That is fine for list elements and memberchk/2, but not for the list as a whole as the constraint must be able to produce alternative lists.

Not sure how to fix that. One option might be for the C part of memberchk/2 to have arity 3 to make the unbound tail available. You’d get something like

memberchk(E, List) :-
    '$memberchk'(E, List, Tail),
    (    var(Tail)
    ->   member(E, Tail), !
    ;    true
    ).

That would fix the semantics at the price of some constant overhead.

IMHO it is not a good idea to slow down things for a marginal case, but it might be a good idea to introduce foreign memberchk/3 nevertheless for the user needing such marginal case.
The important thing is that the limits of wakeup in foreign code are accurately documented to avoid unexpected behaviors in code relying on the transparency of such wakeups.

BTW: in your code above can’t memberchk/2 be called recursively instead:

memberchk(E, List) :-
    '$memberchk'(E, List, Tail),
    (    var(Tail)
    ->   Tail = [_|_], memberchk(E, Tail)
    ;    true
    ).

Pushed 90b506900bda7ee6093bb8f308d61215d9e5be72 to address this. Your last remark is correct. Pushed as a second patch.

The performance impact using 1M memberchk/2 calls on a list of length 1. Note that the failure branch is less affected, so this is the worst case possible. Time for the 1M lookups goes from 0.075 to 0.116 sec. For a list of 32 barely measurable (about 1%).

I think correctness comes first, especially with such as de-facto standard predicate. If necessary we could provide a variant under a different name with limited functionality. Note that dropping constraint wakeup for each element would make the code faster as well. I do not know by how much. We could really gain a lot by using ==/2 equality test rather than unification. That means no data is created during the scan and thus there are no garbage collections and stack shifts so we can use the more low-level data representation. That also means we simply fail on a partial list (or raise an exception). Various libraries have code like this under names such as memberchk_eq/2, var_memberchk/2, etc.