Foreach bug?

Thanks for comparing the versions, seems a bug to me.

I see. Yes, the current version destructively resets the goal template, so in fact you just use member/2 wit the same physical r(N) term which is in the end reset to a variable.

If you define mem(E,L) :- duplicate_term(E,E2), member(E2,L). it works fine.

Now it is not clear what to do. The current template based approach is not only a lot faster, but it also solved problems resulting from undesirable copying of the goal arguments. Possibly merely documenting this as a limitation is the best option?

2 Likes

thanks for explaining the reason :slightly_smiling_face:

A simpler and faster workaround:

mem(E,L) :-
	member(r(E),L).

t3(L) :-
	foreach(  between(1,5,N),
	          mem(N,L)).

I think the best would be to put a note in the documentation saying:

Note: The implementation of foreach/2 does not properly handle compound terms (in Goal’s arguments) that share variables with the Generator. As a workaround you can define a goal that does not use compound terms, like in this example: …

I think the example is very important because it is a very tricky issue that is hard to understand, the above workaround can be used for the example.

3 Likes

I’ve ran into a similar problem with cplfd.

pred(S,X) :- S #< X.
works(S) :-
    S in 1..100,
    foreach(member(X,[10,3,10]),pred(S,X)).

doestNotWork(S) :-
    S in 1..100,
    foreach(member(X,[10,3,10]), (S #< X)).

What is different between pred(S,X) and #<(S,X) that the former works but the later not?

In general I don’t think it is a good idea to mix clpfd with foreach or other similar predicates. In your
case I think using something like this is much better:


p(S) :-
	S in 1..100,
	maplist(#<(S), [10,3,10]).

clpfd uses attributed variables, and I suspect the new way of copying terms used in foreach/2 has problems with this. When you use pred/2, there is an indirection between clpfd and foreach and it isolates the variable attributes used by #< inside pred/2, so it is not affected by the new way of copying terms. This is my best guess to explain it at the moment.

1 Like

First of all, I cannot reproduce @jstranik’s problem. Both work for me, giving the same result. Which version are you using?

@swi’s maplist/2 solution is a lot simpler and more robust. I generally have my doubt about foreach/2. It is unclear what needs to be copied. The original implementation had both semantic and performance issues. The current one has different semantic issues as pointed out in this thread :frowning:

Possibly we should deprecate it? Or, could we prove it is sound and safe using an intermediate predicate that we can generate on the fly or using goal_expansion/2?

I (now) tried to reproduce it, and it doesn’t reproduce for me either. Both work for me too.

Doesn’t work for me:

?- version.
Welcome to SWI-Prolog (threaded, 64 bits, version 8.4.0)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

true.

?- works(S).
S in 1..2.

?- doestNotWork(S).
false.

I thought it might be a version issue but I get the same results on SWISH.

Looking at a trace, it appears to me to be some interaction (perhaps the same already noted) between foreach/2 and clpfd’s goal expansion.

@ridgeworks thanks. It is indeed not related to the version but to whether or not the code is loaded that triggers clpfd expansion. If library(aggregate) is loaded, the compiler knows foreach/2 is a meta-predicate and performs clpfd expansion on it, yielding:

?- listing(doestNotWork).
doestNotWork(S) :-
    (   integer(S)
    ->  between(1, 100, S)
    ;   clpfd:clpfd_in(S, 1..100)
    ),
    foreach(member(X, [10, 3, 10]),
            (   integer(X)
            ->  (   integer(S)
                ->  X>=S+1
                ;   A=X,
                    clpfd:clpfd_geq(A, S+1)
                )
            ;   integer(S)
            ->  A is S+1,
                clpfd:clpfd_geq(X, A)
            ;   clpfd:clpfd_geq(X, S+1)
            )).

Now it is a bit unclear what goes wrong. It looks like this is not caused by the copying issues of foreach/2, but by a bug.

First of all, I cannot reproduce @jstranik’s problem. Both work for
me, giving the same result. Which version are you using?

Actually, I cannot reproduce it now either:

Welcome to SWI-Prolog (threaded, 64 bits, version 8.3.29)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- [user].
>: :- use_module(library(clpfd)).

%   library(apply) compiled into apply 0.01 sec, 75 clauses
%   library(apply_macros) compiled into apply_macros 0.01 sec, 53 clauses
%   library(assoc) compiled into assoc 0.01 sec, 98 clauses
%   library(pairs) compiled into pairs 0.00 sec, 21 clauses
%  library(clpfd) compiled into clpfd 0.17 sec, 1,405 clauses
>: pred(S,X) :- S #< X.
>: works(S) :-
>:     S in 1..100,
>:     foreach(member(X,[10,3,10]),pred(S,X)).
>:
>: doestNotWork(S) :-
>:     S in 1..100,
>:     foreach(member(X,[10,3,10]), (S #< X)).
>: ^D% user://1 compiled 0.22 sec, 4 clauses
true.

?- works(S).
S in 1..2.

?- doestNotWork(S).
S in 1..2.

I’m not how I got it not working before. Previously I observed that the
second query failed.

@swi’s maplist/2 solution is a lot simpler and more robust. I
generally have my doubt about foreach/2. It is unclear what needs to
be copied. The original implementation had both semantic and
performance issues. The current one has different semantic issues as
pointed out in this thread :fr

Possibly we should deprecate it? Or, could we prove it is sound and
safe using an intermediate predicate that we can generate on the fly
or using goal_expansion/2?

The goal expansion approach seems reasonable. maplist/2 works in cases
where I spatial representation of data. In case of temporal
representation, foreach/2 seems more appropriate. Of course, I can
always always convert to list using findall.

Anyway, thanks for looking into the issue.

I guess this is due to the way you represent these?

Anyway, I got to the cause. Working on the expanded version like this:

bad(S) :-
    (   integer(S)
    ->  between(1, 100, S)
    ;   clpfd:clpfd_in(S, 1..100)
    ),
    foreach(member(X, [10, 3, 10]),
            g((   integer(X)
              ->  (   integer(S)
                  ->  X>=S+1
                  ;   A=X,
                      clpfd:clpfd_geq(A, S+1)
                  )
              ;   integer(S)
              ->  A is S+1,
                  clpfd:clpfd_geq(X, A)
              ;   clpfd:clpfd_geq(X, S+1)
              ))).

g(X) :- portray_clause(<> :- X), X.

If we run this, we get the output below. What we see is the the execution of the first call bound the variable B to 10. The next call however, only the variables that are shared between the foreach/2 generator and goal are reset (and bound to their new value) and thus we fail (see instantiated second goal).

?- bad(X).
<> :-
    (   integer(10)
    ->  (   integer(A)
        ->  10>=A+1
        ;   B=10,
            clpfd:clpfd_geq(B, A+1)
        )
    ;   integer(A)
    ->  B is A+1,
        clpfd:clpfd_geq(10, B)
    ;   clpfd:clpfd_geq(10, A+1)
    ).
<> :-
    (   integer(3)
    ->  (   integer(A)
        ->  3>=A+1
        ;   10=3,
            clpfd:clpfd_geq(10, A+1)
        )
    ;   integer(A)
    ->  10 is A+1,
        clpfd:clpfd_geq(3, 10)
    ;   clpfd:clpfd_geq(3, A+1)
    ).
false.

Now, this isn’t wrong as it is exactly the same behavior that makes this work as expected (if I’m not mistaken, please correct when I’m wrong):

?- foreach(between(1,5,X), member(X,L)).
L = [1, 2, 3, 4, 5|_] .

I’m afraid I do not see an obvious (or even non-obvious) solution out of this. In this particular case you’d want to handle the variables introduced by the goal expansion special, i.e., you would like to include them into the template that is reset between the iterations. I’m not sure you’d like that to happen in general though. Such a variable could act as an internal accumulator (or maybe not?)

Another approach could be to have the user specifying which variables should not be reset, so we get foreach/3 as e.g.

foreach(member(X,[10,3,10]), [S], (S #< X)).

Would be great if we can find a specification that also fixes the original problem (repeated here):

?- foreach(between(1,5,N),  member(r(N),L)).
L = [r(N)|_].    % rather than the expected L = [r(1),r(2), ...].

Any ideas?

1 Like

Anyway, I got to the cause. Working on the expanded version like
this:

bad(S) :-
    (   integer(S)
    ->  between(1, 100, S)
    ;   clpfd:clpfd_in(S, 1..100)
    ),
    foreach(member(X, [10, 3, 10]),
            g((   integer(X)
              ->  (   integer(S)
                  ->  X>=S+1
                  ;   A=X,
                      clpfd:clpfd_geq(A, S+1)
                  )
              ;   integer(S)
              ->  A is S+1,
                  clpfd:clpfd_geq(X, A)
              ;   clpfd:clpfd_geq(X, S+1)
              ))).

g(X) :- portray_clause(<> :- X), X.

Nice. I now understand how expansion causes problem here. Thanks.

What I found interesting is that I was able to reproduce the problem
when I submitted it in a running session where I reloaded the problem
file repeatedly using make/0. Later I had trouble reproducing the issue
in clean swipl instance.

Another approach could be to have the user specifying which variables
should not be reset, so we get foreach/3 as e.g.

foreach(member(X,[10,3,10]), [S], (S #< X)).

Would be great if we can find a specification that also fixes the
original problem (repeated here):

?- foreach(between(1,5,N),  member(r(N),L)).
L = [r(N)|_].    % rather than the expected L = [r(1),r(2), ...].

Any ideas?

Agree that likely path is to be able to specify in some way what
variables should be reset and which not. The approach

foreach(member(X,[10,3,10]), [S], (S #< X)):

to specify sharing variables feels like introducing yet another pattern.
We already have Variable^(Goal) for bag. and {Variable}/[X]>>() for
yall lambda functions.

Current solutions all seem to have a common problem. E.g. when I define
lambda, I usually think of parameter as new identifiers.

If we want to produce L & A, everything works:

?- maplist([A,p{key:A}]>>true, [1,2,3],L), A=6.
A = 6,
L = [p{key:1}, p{key:2}, p{key:3}].

but stuff breaks, when we reorder the predicates:

?- A=6, maplist([A,p{key:A}]>>true, [1,2,3],L).
false.

The lambdas are not first class citizen and it shows. In lambda
parameter the variable should be unbound variable that is scoped to the
lambda itself. When the parameter is bound in lambda, the variable is
unbound outside, as expected. But when the variable is bound, lambdas
breaks.

I feel I cannot express lambda cleanly in prolog other than carefully
checking that the lambda parameters are unique. That is unfortunate
because lambda are useful in higher level functions.

I would like to have some construct where I can say that the variable name should
be scoped to the subexpression and should not bind anywhere else.

Not sure about the best syntax of that. I could think of something like:

?- A=6, maplist([~A,p{key:A}]>>true, [1,2,3],L).

where ~ would signify fresh variable name for the entire lambda.

I don’t think that this feature can be implemented as a library.

The idea of scoped variables is not easy in Prolog. There is no syntax for that, so variables are scoped over a term. Of course, macro expansion can rename variables in a scope. AFAIK, yall does scoping. This however is effective if the lambda is compiled and not of the lambda expression is dynamically called. This gets us into similar problems as the original foreach/2 issue were semantics are affected in subtle and hard to track ways :frowning:

Good point! Than, it might be something completely different and using one of the existing notations makes it even more confusing :frowning:

Back to the original problem

?- foreach(between(1,5,N),  member(r(N),L)).

Here we must specify r(N) is the term we want to create copies of. A working (as in doing what the naive user would expect to happen) implementation translates the above in:

?- foreach(between(1,5,N),  (duplicate_term(r(N),R), member(R,L))).

So, we need to

  • Specify sub-terms of the 2nd argument that must be duplicated on each iteration
  • Specify which variables in the 2nd argument must (not) be reset on each iteration

The quest is a good syntax and good defaults. I’m not a yall/lambda fan (keeps leaving me confused). The spec might be related though. Anyone?

Your comment reminded me of my old question to myself that why foldl/4 like meta predicate does not exist for integer intervals, as foreach/2 looks like a generalized form foldl/4.

EDIT: the semantics of forall/2 is clear , but foreach/2 is not so clear for me in spite of its familiar name. IMHO, the above two conditions are not clear for the coder.

My package pac has experimental several such mete predicates like foldl/4. For example foldnum. Although they are experimental and for fun, but sometime they are handy in case in a hurry for preparing debugging queries.

EDIT: A simple minded version of foreach. I am afraid this is
too much simplified missing some important issue discussed here.

foreach(A, P, Q, X, Y):- findall(A, P, U), foldl(Q, U, X, Y).

% ?- foreach([I], between(1,10, I), pred([[J], A, B]:- B is A+J), 0, S).
%@ S = 55.

end of EDIT

% ?- show(foldnum(writeln, 1-10)).
%@ pac:pac#16(1), where
%@ pac:pac#16(A):-pac:(A>10),!
%@ pac:pac#16(A):-pac:writeln(A),pac:(B is A+1),pac:pac#16(B)
%@ true.
% ?- foldnum(writeln, 1-10).
%@ 1
%@ 2
%@ 3
%@ 4
%@ 5
%@ 6
%@ 7
%@ 8
%@ 9
%@ 10
%@ true.
% ?- F = plus, foldnum(F, 1-100, 0, R).
%@ F = plus,
%@ R = 5050.
% ?- N=100,  functor(A, #, N),
%	forall(between(1, N, I), nb_setarg(I, A, I)),
%	foldnum(pred(A, ([J, C, D]:- arg(J, A, Aj), D is C * Aj) ),
%			1 - N, 1, S).
%@ N = 100,
%@ A = #(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100),
%@ S = 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000.

For curiosity and testing, I added my version foreach to my pac library. I think discussions in this thread are much more deep than I could understand currently. So I do not insist at all my first intuition.

EDIT:

% ?- alaforeach(I, between(1, 10, I), pred([J, A, B]:- B is A+J), 0, S).
%@ S = 55.

etc(alaforeach, [A, F, G|Args], M, meta:H, P, Q):- (var(F); var(G)), !,
	expand_arg(F, M, F0, P, P0),
	expand_arg(G, M, G0, P0, Q),
	complete_args(foreach(A, F0, G0), Args, H).
etc(alaforeach, [A, F, G|Args], M, H, P, Q):-!,
	expand_arg(F, M, F0, P, P0),
	expand_etc(foldl(G, U), M, G0, P0, Q),
	complete_args(G0, Args, G1),
	H =  ( findall(A, F0, U),  G1 ).

It is neither to me. It calls its action with all instantiation of the generator without backtracing. Action gets new values for the shared variables between generator and action while the other arguments remain the same. There is no foldl/4 like computation where some state S0 is combined with an answer to arrive at S1, etc. There may be no state accumulation, in which case this is the same as forall/2. State may be accumulated by continuously further instantiating a structure (see original member/2 example) or some form of (backtrackable) destructive assignment such as setarg/3 or put_attr/3 (used by constraints as in the last example).

This behavior makes foreach/2 interesting for posting constraints. The thing came to SWI-Prolog by reimplementing library(aggregates) from SICStus. Possibly foreach/2 plays a role internally in the implementation of this library? For example, we can write:

aggregate_all(count, Goal, Count) :-
    State = count(0),
    foreach(Goal, incr(State)),
    arg(1, State, Count).

incr(State) :-
    arg(1, State, C0),
    C is C0+1,
    setarg(1, State, C).

SICStus has mutable terms rather than setarg/3, but the idea stays the same. If they, like the current version of SWI-Prolog’s foreach/2, do something smart that avoids generating the findall/3 intermediate list it all makes sense. If not, it is a bit of a mystery what foreach/2 does in library(aggregates) when practically the only value is in posting constraints. It is even poor for that as a findall/3 + copy_term/2 implementation is really slow while term copying and constraints are a problematic match. As we have seen here, a non-copying implementation also comes with its nasty corners.

2 Likes

If I understand correctly the generator and cosumer model
of SICStus’s foreach, the generator should throws ‘end_of_solutions’ ball to the consumer to exit the call. In the case of the standard foldl, the empty list [] is used as the stop flag for the generator, that is, the exit ball is essential. Othewise, foreach has to ask the coder for the stop condition explicitely, which might not be elegant as foreach construct, I think.

I have no idea how much the findall-fold model of foreach is inefficient compared with the generator-consumer one, but I hope the former will marginally fast, though I am not sure on this.

Aren’t you mixing up the foreach/2 predicate (below) with the foreach construct of logical (do) loops?

From https://sicstus.sics.se/sicstus/docs/latest4/pdf/sicstus.pdf, page 387:

foreach(:Generator, :Goal)
for each proof of Generator in turn, we make a copy of Goal with the appropriate substitution, then we execute these copies in sequence. For example, foreach(between(1,3,I), p(I)) is equivalent to p(1), p(2), p(3). Note that this is not the same as forall/2. For example, forall(between(1,3,I), p(I)) is equivalent to \+ \+ p(1), \+ \+ p(2), \+ \+ p(3). The trick in foreach/2 is to ensure that the variables of Goal which do not occur in Generator are restored properly. (If there are no such variables, you might as well use forall/2.) Like forall/2, this predicate does a failure-driven loop over the Generator. Unlike forall/2, the Goals are executed as an ordinary conjunction, and may succeed in more than one way.

foreach/2 is from library aggregate, implementing notably aggregate_all/3. A findall/3 based approach has to materialize all solutions in a list while this is not needed for many of the aggregate functions. It is rather strange to make a list of all numeric solutions and than sum them.

Perhaps I am mixing it up. In fact, it is my first time to know and use aggregate_all consulting manual after reading your comment.

?- aggregate_all(sum(X), between(1, 3, X), Sum).
Sum = 6.

I took foreach wrongly as a limited coroutine predicate between generator and consumer. That is, if so, as a limited handling local stack is needed, it is beyond the user like me to implement it. Right ? Or it is implemetable by, for example, shift/reset ? Sorry for foolish question.

The original implementation computes the shared variables between the two goals. Then it runs findall/3 to get all values for the generator. Then, for each iteration it copies the shared variable template + the action and unifies the copy of the shared variable template with the next element from the forall list and calls the action. The code as roughly like this:

foreach(Gen, Act) :-
    shared_variables(Gen, Act, Templ),
    findall(Templ, Gen, Sols),
    actions(Sols, Templ, Act).

actions([], _, _).
actions([H|T], Templ, Act) :-
    copy_term(Templ+Act, H+Copy),
    call(Copy),
    actions(T, Templ, Act).

The current implementation relies on an undocumented builtin ‘$unbind_template’/1. This behaves as follows:

  • Given a term T. Identify a subset of its variables, represented as the template v(Var1, Var2, …)
  • Now use T, possibly binding some of the above variables.
  • Call ‘$unbind_template’(Templ). This will make the variables of Templ that have been bound variables again.

In other words, it allows backtracking over a set of bindings without doing Prolog backtracking :slight_smile:

2 Likes

I am confused that your original version foreach is no more than that foreach = findall + foldl.

Sounds nice and great, and what I would like to know is whether the “standard” prolog coder could wrote something equivalent.

EDIT:

I think what I would be trying now might be to make the naive version of foreach much more elegant. Unfortunately I am not familiar to handle Prolog local stack, and without that there is no way for me. Aside possible performance issue, I should be satisfied with “foreach = findall + foldl” for its clean simplicity.

% ?- naive_foreach(V, between(1, 10, V), plus, R).
%@ R = 55.

naive_foreach(V, G, A, R):- retractall(acc(_)),
	assert(acc(0)),
	( 	call(G),
		retract(acc(X)),
		call(A, V, X, Y),
		assert(acc(Y)),
		fail
	;	retract(acc(R))
	).

EDIT: nb_setarg version instead of assert/retract

% ?- foreach1(V, between(1, 10, V), plus, R).
%@ R = 55.

foreach1(V, G, A, R):-
	Acc = acc(0),
	foreach1_(V, G, A, Acc),
	arg(1, Acc, R).

%
foreach1_(V, G, A, Acc):-
		call(G),
		arg(1, Acc, X),
		call(A, V, X, Y),
		nb_setarg(1, Acc, Y),
		fail.
foreach1_(_, _, _, _).

EDIT: Sample query with two dimensional array indexing.

% ?- N = 10000, time((foreach1([V,U], (between(1, N, U), between(1, N, V)),
%	test_action, R))).
%@ % 600,010,004 inferences, 33.882 CPU in 33.906 seconds (100% CPU, 17708672 Lips)
%@ N = 10000,
%@ R = 2500500025000000.

test_action([I,J], X, Y):- Y is I*J + X.

EDIT:
Being inspired by Jan’s explaining foreach, I have introduced a generic predicates fold/[5, 4, 3], which integrates foreach and foldl/4. The fold is what I wanted to have for a long time , and it is similar to for construct in C. The source expanding for fold follows.

As Jan gave me warning, I am afraid I am missing something.
For example, Jan’s codes of foreach uses term_variables but mine does not. I only hope no serious problems exists in my simple minded model. So far so good. Performance and usability is what I wanted.

EDIT: Source codes on fold expansion deleted because of revision.

EDIT: Speed of fold/3 has been improved due to new explicit syntax of port of consumer side. (Port^Consumer).
Need to check closures and global variables.

% ?- N is 10^8, time(fold(J, between(1, N, J), X^(X=X))).
%@ % 200,000,001 inferences, 4.925 CPU in 4.941 seconds (100% CPU, 40607323 Lips)
%@ N = 100000000,
%@ J = X.

% ?- N is 10^9, time(fold(J, between(1, N, J), X^(X=X))).
%@ % 2,000,000,001 inferences, 49.470 CPU in 49.519 seconds (100% CPU, 40428767 Lips)
%@ N = 1000000000,
%@ J = X.
1 Like