Difference in success between maplist and concurrent_maplist

friguzzi · August 1, 2024, 7:11am

I’m using: SWI-Prolog version 9.3.6-22-g94ee9ff79

I’m finding differences in maplist and concurrent_maplist: the same goal using the first succeeds while using the latter fails.

The code to reproduce it is at lp4kg/fbk237 at main · friguzzi/lp4kg · GitHub

The predicates main and mainc differ only because the first uses maplist while the latter concurrent_maplist.
This is what I get:

[rzf@hnode01 fbk237]$ swipl computeinst_oi_100_order.pl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.3.6-22-g94ee9ff79)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- main.
tt(1,/m/040b5k,/film/film/other_crew./film/film_crew_gig/film_crew_role,/m/01pvkk) n(42)
tt(1,/m/085h1,/user/ktrueman/default_domain/international_organization/member_states,/m/06t8v) n(100)
tt(1,/m/050r1z,/film/film/other_crew./film/film_crew_gig/film_crew_role,/m/0ch6mp2) n(36)
qui
true.

?-

% halt
[rzf@hnode01 fbk237]$ swipl computeinst_oi_100_order.pl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.3.6-22-g94ee9ff79)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- mainc.
tt(1,/m/040b5k,/film/film/other_crew./film/film_crew_gig/film_crew_role,/m/01pvkk) tt(1,/m/085h1,/user/ktrueman/default_domain/international_organization/member_states,/m/06t8v) tt(1,/m/050r1z,/film/film/other_crew./film/film_crew_gig/film_crew_role,/m/0ch6mp2) n(39)
n(100)
n(42)
false.

?-

The code for the two predicates is

mainc:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  %maplist(write_ins(Prog),TestAtoms).
  concurrent_maplist(write_ins(Prog),TestAtoms),
  writeln(qui).

main:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  %maplist(write_ins(Prog),TestAtoms).
  maplist(write_ins(Prog),TestAtoms),
  writeln(qui).

friguzzi · August 1, 2024, 10:55am

Another program where the difference appears that is simpler (no asserts and retracts):

github.com

friguzzi/lp4kg/blob/6ad94cfcd85668304be6889d94e66598e2b9afe7/fbk237/computeinst_oi.pl


:-ensure_loaded(train).

:- multifile t/4.
:-ensure_loaded(valid3).
:-ensure_loaded(test).

:-ensure_loaded(out005_noinv).
mainc:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  concurrent_maplist(write_ins(Prog),TestAtoms).

main:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  %concurrent_maplist(write_ins(Prog),TestAtoms).
  maplist(write_ins(Prog),TestAtoms).

This file has been truncated. show original

[rzf@hnode01 fbk237]$ swipl computeinst_oi.pl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.3.6-22-g94ee9ff79)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- main.
i(tt(1,'/m/040b5k','/film/film/other_crew./film/film_crew_gig/film_crew_role','/m/01pvkk'),'/m/01pvkk',['/m/01pvkk']).
i(tt(1,'/m/085h1','/user/ktrueman/default_domain/international_organization/member_states','/m/06t8v'),'/m/06t8v',[]).
i(tt(1,'/m/050r1z','/film/film/other_crew./film/film_crew_gig/film_crew_role','/m/0ch6mp2'),'/m/0ch6mp2',[]).
true.

?- mainc.
i(tt(1,'/m/040b5k','/film/film/other_crew./film/film_crew_gig/film_crew_role','/m/01pvkk'),'/m/01pvkk',['/m/01pvkk']).
i(tt(1,'/m/085h1','/user/ktrueman/default_domain/international_organization/member_states','/m/06t8v'),'/m/06t8v',['/m/06t8v']).
false.

?- i(tt(1,'/m/050r1z','/film/film/other_crew./film/film_crew_gig/film_crew_role','/m/0ch6mp2'),'/m/0ch6mp2',['/m/0ch6mp2']).

peter.ludemann · August 1, 2024, 11:23am

Your code has a “cut” in write_ins/2:

  (find_inst(Prog,Ex)->
    findall(Ex1,find_inst(Prog,Ex1),Atoms0),
    sort(Atoms0,Atoms),
    filter_atoms(Atoms,CorrAns,Inst)
  ;
    Inst=[]
  ),

When I changed the -> to *->, it worked. I’m not quite sure why (I’ll have to think about it a bit), but my intuition is that concurrent_maplist/2 and cuts don’t work well together – there’s this comment in the documentation: “Note that all goals are executed as if wrapped in once/1 and therefore these predicates are semidet .”

jan · August 1, 2024, 11:59am

Cuts should be fine. I ran the code. It is clear something weird happens. As I think of it, could it be that there are variables sharing between elements of the list? Note that in sequential execution such sharing is propagated, but in concurrent it is not.

To work, the goals need to be fully independent goals and the execution of one should not affect the other in some indirect way, e.g., through the dynamic DB. Typically you should also avoid side effects (write) as the ordering will not be defined. If some goal fails, the others are killed. This may take some time though.

Be aware that the overhead is significant. For good results the input and output terms of the individual goals should be relatively small (they are copied) and the amount of work in each goal should be significant.

friguzzi · August 1, 2024, 12:11pm

Apparently the problem was in variable sharing: when I change the predicate write_ins as

write_ins(Prog,Ex):-
  copy_term(Prog,Prog1),
  Ex=..[Pred,_,Ent,Rel,CorrAns],
  Ex1=..[Pred,1,Ent,Rel,_Arg1],
  (find_inst(Prog1,Ex)->
    findall(Ex1,find_inst(Prog,Ex1),Atoms0),
    sort(Atoms0,Atoms),
    filter_atoms(Atoms,CorrAns,Inst)
  ;
    Inst=[]
  ),
  with_mutex(write_i,(write_canonical(i(Ex,CorrAns,Inst)),writeln('.'))).

it works. In fact, Prog has variables that are instantiatied by find_inst/2.
Thanks

friguzzi · August 1, 2024, 4:59pm

I have found a different problem: the sequential and parallel version give different results.
I’m trying to find the first 100 unique instantiations for a KGC query using asserts for storing already found answers. Now assert (and retracts) introduce side effects but each query involves different, non-unifying atoms, so, if asserts and retracts are thread-safe, there should be no problem.
However, the serial and parallel version of the code return different instantiations (and each run the result is different). With

github.com

friguzzi/lp4kg/blob/17eaed1e4ea3d9e0fe5b97eeee5dd2a26b74f554/nations/bug.pl


:-ensure_loaded(train).

:- multifile t/4.
:- dynamic a/2.
:-ensure_loaded(valid3).
:-ensure_loaded(test).
:-ensure_loaded(out_no_inv).

main(NL,LL):-
  mains,
  mainc,
  findall((G,L,L1),(i(G,_,L),ii(G,_,L1),L\=L1),LL),length(LL,NL).

mains:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  maplist(write_ins(Prog),TestAtoms).
  %concurrent_maplist(write_ins(Prog),TestAtoms).

This file has been truncated. show original

I get

[rzf@hnode01 nations]$ swipl bug.pl
Welcome to SWI-Prolog (threaded, 64 bits, version 9.3.6-22-g94ee9ff79)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.

For online help and background, visit https://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).

?- main(N,L).
N = 163,
L = [(tt(1, india, intergovorgs, israel), [israel, ussr, cuba, jordan], [ussr, jordan]), (tt(1, israel, commonbloc1, cuba), [cuba], []), (tt(1, usa, independence, china), [china, india, israel], [israel, india]), (tt(1, burma, reldiplomacy, ussr), [ussr, brazil, netherlands|...], [brazil, netherlands, cuba|...]), (tt(1, uk, pprotests, ussr), [ussr, netherlands|...], [israel, brazil|...]), (tt(1, china, intergovorgs3, india), [india|...], []), (tt(1, burma, relintergovorgs, usa), [...|...], [...]), (tt(..., ..., ..., ...), ..., ...), (..., ...)|...].

where N is the number of queries for which the answer differ (out of 201).

friguzzi · August 1, 2024, 5:14pm

Sorry, consider this file instead of the previous one

github.com

friguzzi/lp4kg/blob/a8ab413d0cf9357c88a47b144ea9745f0936ed57/wn18rr/rank/bug.pl


:-ensure_loaded(train0).

:- multifile t/4.
:- dynamic a/2.
:-ensure_loaded(valid3).
:-ensure_loaded(test2).
:-ensure_loaded(out_conf_oi).

main(NL,LL):-
  mains,
  mainc,
  findall((G,L,L1),(i(G,_,L),ii(G,_,L1),L\=L1),LL),length(LL,NL).

mains:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  maplist(write_ins(Prog),TestAtoms).
  %concurrent_maplist(write_ins(Prog),TestAtoms).

This file has been truncated. show original

?- main(N,L).
N = 1,
L = [(tt(1, '00085626', '_derivationally_related_form', '03673594'), ['03673594', '00866505', '03845190', '03845360', '00712225', '14938907'|...], ['03673594', '00866505', '03845190', '03845360', '00712225', '14938907'|...])].

friguzzi · August 2, 2024, 7:25am

Apparently using tries instead of asserts makes the problem go away.

github.com

friguzzi/lp4kg/blob/9701597d91de35a19f1cb5b50379eeb2dfd598a9/wn18rr/rank/trie.pl


:-ensure_loaded(train0).

:- multifile t/4.
:- dynamic a/2.
:-ensure_loaded(valid3).
:-ensure_loaded(test2).
:-ensure_loaded(out_conf_oi).

main(NL,LL):-
  mains,
  mainc,
  findall((G,L,L1),(i(G,_,L),ii(G,_,L1),L\=L1),LL),length(LL,NL).

mains:-
  findall(tt(1,S,R,T),t(2,S,R,T),TestAtoms),
  out(R00),
  maplist(generate_cl,R00,Prog),
  maplist(write_ins(Prog),TestAtoms).
  %concurrent_maplist(write_ins(Prog),TestAtoms).

This file has been truncated. show original

jan · August 2, 2024, 2:20pm

This suggests there are ordering or atomicy problems. assert/retract and friends are thread safe. But, of course something like

 (    p(X)
 ->   true
 ;    assert(p(X))
 )

is not thread safe. You can end up with two (or more) p(X). That can be resolved using transaction/1 or a mutex. As opposed, adding to a trie is a single atomic operation that has no effect if the term is already in the trie.

friguzzi · August 2, 2024, 2:39pm

Hi @jan
thanks for your answer. There shouldn’t be problems of order or atomicity as the queries are processed in parallel but, for each query, the answer (and the asserts to check for unicity) are processed serially. In particular, given an answer a for a query q, I store the answer with

assert(a(q,a)),

where q is a ground term. Then, given a new answer b for q, I first check whether b is not present and, if so, I store it:

\+ a(q,b),
assert(a(q,b)),

Since each query is unique, the different asserts should not interfere…

jan · August 2, 2024, 2:52pm

Thanks. So far I suspect some issue with this rather than with concurrent_maplist/2,3. It has been in use for quite a while and never let me down Concurrency gets hard if there is global state involved. So, unless get an evident proof of the contrary I assume there is no bug in SWI-Prolog.

friguzzi · August 2, 2024, 3:15pm

Another question: should I explicitly destroy tries when they are no longer needed or are they automatically garbage collected?

jan · August 2, 2024, 3:19pm

Tries use a blob as reference and (thus) are subject to atom garbage collection. In most cases that should be enough. Tries can be big though and atom-gc may take a while.

friguzzi · August 4, 2024, 9:40am

Note that I use concurrency for handling different queries in parallel, while each query is answered by trying all rules serially because order matters, as the rules are weighted and ordered according to decreasing weight, so that the first answers are given by the rules with largest weight.

For the problem of variable sharing, I copy the terms before each query:

find_inst_100(Prog,Ex,CorrAns,I):-
  Ex=..[Pred,_,Ent,Rel,CorrAns],
  Ex1=..[Pred,1,Ent,Rel,Arg1],
  copy_term(Prog,Prog1),
  find_inst(Prog1,Ex,CorrAns),!,
  trie_new(Trie),
  trie_insert(Trie,CorrAns),
  copy_term(Prog,Prog2),
  find_inst(Prog2,Trie,count(1),Ex1,Arg1),
  findall(Ans,trie_gen_compiled(Trie,Ans),I).

find_inst_100(_Prog,Ex,CorrAns,[]):-
  Ex=..[_Pred,_,_Ent,_Rel,CorrAns].

find_inst(Prog,Trie,Count,Ex1,Arg1):-
  find_inst(Prog,Ex1,Arg1),
  \+ trie_lookup(Trie,Arg1,_),
  Ex1=tt(1,So,R,A),
  HTrain=..[t,So,R,A],
  HTest=..[t,2,So,R,A],
  HValid=..[t,3,So,R,A],
  \+ HTest,
  \+ HValid,
  \+ HTrain,
  trie_insert(Trie,Arg1),
  check_n(Count).
 
find_inst(_Prog,_Trie,_Count,_Ex1,_Arg1).

check_n(count(N)):-
  N=99,!.

check_n(Count):-
    Count=count(N),
    N1 is N+1,
    nb_setarg(1,Count,N1),
  fail.

find_inst(Prog,Ex,_Arg):-
  member((H,B),Prog),
  term_variables(B,Vars),
  H=Ex,B,
  oi(Vars).

Topic		Replies	Views
How to build a table in parallel General tabling , threads	19	461	March 21, 2024
Assertion Creep Help!	17	1574	December 16, 2020
Regression with concurrent_maplist/3 in 8.3.27? Predicate	3	604	August 1, 2021
Memory usage Help!	26	157	August 9, 2024
Help getting up and running with multithreading General	19	385	May 5, 2024

Difference in success between maplist and concurrent_maplist

Related topics