Six ways to iterate in Prolog

As a learning exercise and to create simple examples for myself, I worked through different ways to translate an input list into an output list, and to look at their efficiency using the time clause.

Here’s what I came up with for anyone interested.

:- module(translate, [translate_td/2,
                      translate_maplist/2,
                      translate_dcg/2,
                      translate_acc/2,
                      translate_q/2,
                      translate_dl/2
                     ]).

english_spanish("One", "Uno").
english_spanish("Two", "Dos").
english_spanish("Three", "Tres").
english_spanish("Four", "Cuatro").
english_spanish("Five", "Cinco").
english_spanish("Six", "Seis").
english_spanish("Seven", "Siete").
english_spanish("Eight", "Ocho").
english_spanish("Nine", "Nueve").
english_spanish("Ten", "Diez").

%% translate_td(?EnglishList, ?SpanishList) is det
% The traditional, top-down method of iteration in Prolog

translate_td([], []).

translate_td([English|EnglishList], [Spanish|SpanishList]) :-
  english_spanish(English, Spanish),
  translate_td(EnglishList, SpanishList).

/*
?- time(translate_td(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 21 inferences, 0.000 CPU in 0.000 seconds (92% CPU, 1561803 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diaz"].
*/

%% translate_acc(+EnglishList, -SpanishList) is det
% Traditional bottom-up method of iteration, reverses output and is not bidirectional

translate_acc(EnglishList, SpanishList) :-
  translate_acc(EnglishList, [], SpanishList).

translate_acc([English|EnglishList], Acc, SpanishList) :-
  english_spanish(English, Spanish),
  translate_acc(EnglishList, [Spanish|Acc], SpanishList).

translate_acc([], SpanishList, SpanishList).

/*
?- time(translate_acc(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 22 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 1597792 Lips)
S = ["Diez", "Nueve", "Ocho", "Siete", "Seis", "Cinco", "Cuatro", "Tres", "Dos", "Uno"].
*/

%% translate_dl(+EnglishList, -SpanishList) is det
% Using a difference list.

translate_dl(EnglishList, SpanishList) :-
  translate_dl(EnglishList, Q-Q, SpanishList1),
  SpanishList-[] = SpanishList1.

translate_dl([], SpanishList, SpanishList).

translate_dl([English|EnglishList], Acc, SpanishList) :- 
  english_spanish(English, Spanish),
  enqueue(Spanish, Acc, Acc1),
  translate_dl(EnglishList, Acc1, SpanishList).

enqueue(X, Qh-[X|Qt], Qh-Qt).

/*
?- time(translate_dl(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 32 inferences, 0.000 CPU in 0.000 seconds (95% CPU, 2000125 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diez"].
*/

%% translate_dcg(+EnglishList, -SpanishList) is det (thanks to cut)
% A simple definite clause grammar example

spanish([Spanish|SpanishList]) --> [English], spanish(SpanishList), { english_spanish(English, Spanish) }.
spanish([])                    --> [].

translate_dcg(EnglishList, SpanishList) :-
  phrase(spanish(SpanishList), EnglishList), !.

/*
?- time(translate_dcg(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 31 inferences, 0.000 CPU in 0.000 seconds (95% CPU, 1707519 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diez"].
*/

%% translate_q(+EnglishList, -SpanishList) is det
% using append instead of a difference list is included to show why it's a bad idea

translate_q(EnglishList, SpanishList) :-
  translate_q(EnglishList, [], SpanishList).

translate_q([], SpanishList, SpanishList).

translate_q([English|EnglishList], Acc, SpanishList) :- 
  english_spanish(English, Spanish),
  append(Acc, [Spanish], Acc1),
  translate_q(EnglishList, Acc1, SpanishList).

/*
?- time(translate_q(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 77 inferences, 0.000 CPU in 0.000 seconds (95% CPU, 3830465 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diez"].
*/

%% translate_maplist(?EnglishList, ?SpanishList)
% maplist falls under second order programming

translate_maplist(EnglishList, SpanishList) :-
  maplist(english_spanish, EnglishList, SpanishList).

/*
?- time(translate_maplist(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 23 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 1453948 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diez"].
*/

Also a SWISH notebook.

3 Likes

Thanks, that looks useful.
I see that all the results show 0.000 seconds. Adding more iterations, or more data and higher resolution timing would give more conclusive results.

The classic trick in LISP is to build up the result in reverse order (LISP uses cons; Prolog would use Acc1 = [Spanish|Acc]) and then call reverse to get the result in the desired order. (This shows 34 inferences vs 77.) (Also, LISP doesn’t do TRO [Tail-Recursion Optimization] in this situation that Prolog does because LISP doesn’t have logical variables.)

translate_q2(EnglishList, SpanishList) :-
    translate_q2(EnglishList, [], SpanishList).

translate_q2([], SpanishListRev, SpanishList) :-
    reverse(SpanishListRev, SpanishList).
translate_q2([English|EnglishList], Acc, SpanishList) :-
    english_spanish(English, Spanish),
    translate_q2(EnglishList, [Spanish|Acc], SpanishList).

Make that seven. I forgot findall, and the list could be further expanded to include bagof and setof.

%% translate_findall(+EnglishList, -SpanishList) 

translate_findall(EnglishList, SpanishList) :-
  findall(Spanish, (member(English, EnglishList), english_spanish(English, Spanish)), SpanishList).

/*
?- time(translate_findall(["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten"], S)).
% 42 inferences, 0.000 CPU in 0.000 seconds (96% CPU, 1323001 Lips)
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve"|...].
*/

Thanks Peter.

My second example, translate_acc, uses tail recursion.

My understanding of the advantage of tail recursion is for things like graph traversal where it’s important to filter out duplicates to avoid cycles.

translate_nd(EnglishList, SpanishList) :-
  translate_nd(EnglishList, [], SpanishListR),
  reverse(SpanishListR, SpanishList).

translate_nd([English|EnglishList], Acc, SpanishList) :-
  english_spanish(English, Spanish),
  \+memberchk(Spanish, Acc), !,
  translate_nd(EnglishList, [Spanish|Acc], SpanishList).

translate_nd([_English|EnglishList], Acc, SpanishList) :-
  % Skip adding Spanish word because it's already in Acc list
  translate_nd(EnglishList, Acc, SpanishList).

translate_nd([], SpanishList, SpanishList).

?- time(translate_nd(["One", "Two", "Two", "Three", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten", "Two"], S)), writeln(S).
% 129 inferences, 0.000 CPU in 0.000 seconds (99% CPU, 1803641 Lips)
[Uno,Dos,Tres,Cuatro,Cinco,Seis,Siete,Ocho,Nueve,Diez]
S = ["Uno", "Dos", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve"|...].

I was somehow under the impression this wouldn’t work for traditional, top-down iteration because the values are only filled in later on backtracking, but see I was wrong. The hitch is the last occurrence of an element is kept instead of the first:

%% translate_nd(+EnglishList, -SpanishList) is det
% Filter out duplicates from the output list

translate_nd([], []).

translate_nd([English|EnglishList], [Spanish|SpanishList]) :-
  english_spanish(English, Spanish),
  translate_nd(EnglishList, SpanishList),
  \+memberchk(Spanish, SpanishList), !.
  
translate_nd([_English|EnglishList], SpanishList) :-
  translate_nd(EnglishList, SpanishList).

?- time(translate_nd(["One", "Two", "Two", "Three", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten", "Two"], S)), writeln(S).
% 321 inferences, 0.000 CPU in 0.000 seconds (97% CPU, 4392507 Lips)
[Uno,Tres,Cuatro,Cinco,Seis,Siete,Ocho,Nueve,Diez,Dos]
S = ["Uno", "Tres", "Cuatro", "Cinco", "Seis", "Siete", "Ocho", "Nueve", "Diez"|...].

It’s a great list for reference of all the possible iteration methods, and I noticed now that the Lips info might have sufficient resolution to be a useful measure. Still, I wonder how conclusive the results are, given the small data sets used to benchmark. With that size, setup costs may overshadow actual costs of iteration.

2 Likes

LIPS is for pure Prolog typically a reasonable measure, but buildins like findall/3 and sort/2 as well as differences in garbage and thus GC may cause significant differences to measuring time. You can create a nice notebook on https://swish.swi-prolog.org, so we can all play :slight_smile:

2 Likes

Thanks Jan, I’ll try learn how to create a notebook to help others.

I think Prolog would be far more widely used if it wasn’t so intimidating for beginners, and more novice friendly documentation is something I’d be keen to contribute to.

I’ve found creating notes of the basics very helpful for myself, and am gradually getting to grips with the pros and cons of the different approaches.

Ideally I’d like to expand the examples to simple cases of when to use which technique.

1 Like

Great! I think notebooks are great for the stuff you showed. Announce them here. We can bundle them on swish, link them from the docs, etc.

More examples to the docs in general are very welcome. You can typically use git pull requests for those. If you download the source from git (swipl-devel.git), the documentation is in part in PlDoc in the sources and in part in man and the various package directories. The files are named (historically) .doc and first pre-processed by Prolog to create LaTeX. Just ask if things are not clear.

I’ve put up a notebook version here. Seems simple enough.

https://swish.swi-prolog.org/p/yeQhnQSk.swinb

2 Likes

Oops, I missed that (I must learn to read more carefully …)

Your timings weren’t comparing apples to apples. If you add in the call to reverse/2, the LIPS goes to 34. And, just to show how LIPS can be misleading, if the 2nd clause is changed to:

translate_acc([English|EnglishList], Acc, SpanishList) :-
    english_spanish(English, Spanish),
    Acc1 = [Spanish|Acc]
    translate_acc(EnglishList, Acc1, SpanishList).

then the LIPS jumps to 44.

The big advantage of tail recursion optimiation is that the stack doesn’t grow. Without tail recursion optimization, the stack grows with each “iteration”.

I think you mean that one advantage of passing a complete accumulator (as opposed to a difference list) is that you can use it to filter out duplicates. If performance is an issue, you can use something faster than a simple list, for example a red-black tree (library(rbtrees)). And if you use extended DCGs, you can make the form of the accumulator transparent. (BTW, don’t use a dict for an accumulator, for inserting 10,000 items, I measured it as 20x slower than a red-black tree.)

But for situations where you don’t need to check for duplicates, the difference-list approach (which requires logical variables to work) is a big win.

By comparison, in LISP, translate_q becomes something like the following:

(defun translate_q (english)
  (if (null english) ()
    (append (translate_q (cdr english))
            (list (english_spanish (car english))))))

As you can see, the final call is append, not translate_q, so this isn’t tail recursive. That’s why LISPers uses the “cons on the front of the list, then reverse the result” trick, which makes translate_acc2 tail recursive:

(defun translate_acc (english)
  (reverse (translate_acc2 english ())))

(defun translate_acc2 (english acc)
  (if (null english) acc
    (translate_acc2 (cdr english)
                    (cons (english_spanish english) acc))))

I really like this kind of beginner notebooks, they are very useful for starting in prolog.

They should probably be under an entry in the Help menu, something like Beginners' corner with all notebooks like this. Joeblog gives nice explanations too, so he can be a great contributor for these.

3 Likes

Agreed. I should look at the memory advantages of tail recursion more. For this small toy example, it’s hard to see this advantage.

It’s not just a memory advantage — there’s also a performance advantage in not having to create and tear down stack frames; the call/return sequence in effect is turned into a goto.
More on this here: https://sicstus.sics.se/sicstus/docs/4.0.4/html/sicstus/Last-Call-Optimization.html (which also has links about accumulating parameters and lists).

1 Like

My six ways of list processing in Prolog list has now reached nine with the addition of using SWI Prolog’s indexing predicates. (This seems obvious in retrospect, but it only occurred to me while working on Unit 5 of my web app tutorial in which I needed to keep count of where I was in a list to work out dates).

%% translate_idx(+EnglishList, -SpanishList) is det

translate_idx(EnglishList, SpanishList) :-
    length(EnglishList, Length),
    translate_idx_(1, Length, EnglishList, SpanishList).

% Base case, final list element
translate_idx_(Length, Length, EnglishList, [Spanish]) :-
    nth1(Length, EnglishList, English),
    english_spanish(English, Spanish).

translate_idx_(Idx, Length, EnglishList, [Spanish|SpanishList]) :-
    nth1(Idx, EnglishList, English),
    english_spanish(English, Spanish),
    IdxInc is Idx + 1,    
    translate_idx_(IdxInc, Length, EnglishList, SpanishList).

The full tutorial is at https://swish.swi-prolog.org/p/yeQhnQSk.swinb

I’ve also rewritten my web app tutorial at https://github.com/roblaing/swipl-webapp-howto quite substantially, splitting it into chapters, of which #5 (covering API in which I’m using getting a weather forecast from openweather.org in Json as the example, leading to the indexing. example above) will hopefully be completed soon.

bagof isn’t entirely obvious:

translate_bag([], []) :- !.
translate_bag(EnglishList, SpanishList) :-
    bagof(Spanish, English^(member(English, EnglishList),
                            english_spanish(English, Spanish)),
          SpanishList).

And, of course, setof removes duplicates and changes the order of the result (you can also use sort to remove duplicates).

Anyway, for a situation like this, maplist is probably the best approach; using bagof with member seems unnatural.

Richard O’Keefe’s The Craft of Prolog has a chapter on findall, bagof, setof.

1 Like

Thanks, I’ve added your translate_bag example, bringing the list to a round 10 (I could add a setof example, but it’s been mentioned already, so will leave for now).

I must admit I find the ^ sorting notation confusing, so adding some examples on how to use it with more complicated compound clauses than in these examples would probably be a good idea at some stage.

You can read English^some_pred(English, Spanish) as "there exists an English such that some_pred(English, Spanish)`.

If the English^ were missing, then there would be an implicit “there exists” outside the bagof and you’d generate only a single result. If you try translate_bag(["One", "Two"]) with English^ removed, you’ll see that it backtracks, giving one result at a time:

?- translate_bag(["One", "Two"], Z).
Z = ["Uno"] ;
Z = ["Dos"].

In general, I avoid findall (based on Richard O’Keefe’s advise), and instead use this idiom (which I did not use in my translate_bag; I leave the explanation as an “exercise for the reader”):

( bagof(X, some_pred(X), Xs) -> true ; Xs = [] )

because findall results in [] if some_pred has no solution but bagof fails. (The reason for these different behaviors have to do with soundness and Richard O’Keefe explains the details.)

1 Like

I’ve added setof to the list, bringing it to 11 (which should do for another two months) and noticed my initial attempt without English^(member…) resulted in just “Ocho” getting returned. With English^… it returns the list of Spanish numbers sorted alphabetically. But anyway, nice simple examples to learn from.

I’m busy working through Chapter 2 of the The Craft of Prolog while writing a tutorial on graph traversal on the swish site (stuff I’ve done before, but finding a lot tougher than I remember) and had to chuckle: Richard O’Keefe constantly uses findall in the early chapters, before telling readers in the final chapter to avoid using it.

His description of diference lists tops the list of confusing explanations. He first uses + to split the lists (another notation like \ in The Art of Prolog which doesn’t work at all in SWI Prolog) before switching to -, creating a complete mish-mash of notation.

I’m somewhat at a loss why people thump The Craft of Prolog like it’s the bible.

1 Like