Libraries: queues and heaps

While still working with O’Keefe’s book (and trying to make some progress), I came to the point of informed search algorithms like Dijkstra, A* and related subjects. In those cases the data structure to use are heaps/priority queues and I found that there is a dedicated Swi-Prolog library for this…so now I’m wondering: why doesn’t Swi-Prolog also offer a queues module/library to implement queues? Is there a specific reason or is it just “chance”?
Second question: I noticed that the heaps library makes use of a dedicated argument position with a standard integer to count the number of items in the heap as in:

heap(t(0, a, [t(4, d, []), t(2, s, [])]), 3).

why then O’Keefe in his book makes use of the more awkward unary s-notation (0 is 0, s(0) is 1, s(s(0)) is 2, and so on) instead of using plain integers? The reason is that with integers one has to do evaluation and that costs time and creation of additional variables? The unary notation doesn’t perform any maths and is therefore more efficient?
Thanks

1 Like

I’d like to know that, too. Maybe the implementation is so trivial and the queue data structure is so rarely needed in real life?

Normal integers are not discernible under syntactic unification. So while [] and [|] are different terms, 0 and 1 are just opaque atomics. You cannot use any syntax to tell any two integers apart; but in unary notation, 0 and s(0) are different terms, and if Y = s(X) you know that X is one less than Y.

So I strongly suspect the real answer is a combination of “easy to read, easy to write, easy to understand”. Note that anything you can do with 0, s(0), s(s(0)), ... you can also do with [], [_], [_,_], … but with the list you use up more space.

1 Like

Thanks.

They are trivial…sure, nonetheless in other languages one finds libraries also for those…so these other people possibly don’t find them so trivial as the prolog people do.
Referring to “real life” when talking about search algorithms might also sound a little bit like an overkill…thanks for the infos

Peano axioms - Wikipedia are nicely relational.

E.g. can express that X is Y + 5 without having to give them ground values.

1 Like

I don’t know. Maybe other Prolog’s have a library? It has never been requested AFAIK. A simple queue is of course too trivial. It is just a list. If you want to do fancy, you use a difference list, e.g., a list with a variable tail. Using a term q(Head,Tail) it gets as easy as

queue_add(El, q(H,[El|T]), q(H,T)).

queue_take(q(H0,T), q(H,T]), El) :-
    nonvar(H0),
    H0 = [El|H].

Not tested …

So it was more a relational view of natural numbers than a matter of efficiency in that case

On pages 50-51 of “The Craft of Prolog” (which I guess @Marco is now reading) there is a complete implementation of a queue. The base is three lines of code:

empty_queue(q(0, B, B)).

queue_head(X, q(N, F, B), q(s(N), [X|F], B)).

queue_last(X, q(N, F, [X|B]), q(s(N), F, B)).

There are additional helper predicates like queue_length/2 and list_queue/2 and so on.

I have just dumped those three lines in a program when I’ve needed a queue :man_shrugging:t2:

1 Like

Yeah - the situation is a little frustrating with integers:

I’ve occasionally yearned for a method in the middle of those 2, for relations between integers. But I’ve not fully-formed these thoughts.

At least today I think I understood something: the infamous estimated_distance_to_goal(Node, Estimate) O’Keefe talks about at pages 56-58 is not so mysterious as I thought: it is - I dare say - a trivial function mapping each node of the graph to a certain value, it has nothing to do with the movement costs from node to node, the weighted edges connecting the nodes. Of course numbers should be mapped to nodes in a way to guide the search. But movement costs and heuristic functions are two distinct things and one can mix the two values to obtain slightly different search behaviors. Is it correct? I hope I’m right

Things are rarely clear or full-formed, you’re right. And Peano was surely simplistic or maybe he was confused about his priorities, he should have possibly relied on a priority queue… :man_shrugging:

Specifically in the context of A* search, your heuristics must exhibit certain properties: your estimated “cost remaining” must be an underestimate or the exact cost. Only then you can prove that you found the shortest path, given that you expand the search from the node with smallest sum of cost so far and estimated cost remaining.

1 Like

It does, like here, this is a library by Lars Buitinck:

heaps/priority queues
https://www.swi-prolog.org/pldoc/doc/SWI/library/heaps.pl

There was also a long thread, which has something by @kwon-young:

Advice on implementing Dijkstra’s algorithm
https://swi-prolog.discourse.group/t/advice-on-implementing-dijkstras-algorithm/7112

1 Like

I’m making some progress using the Hamming distance as an heuristic to solve the 8-puzzle that O’Keefe proposes. Yesterday I looked for solutions with a simple breadth-first search and it took Prolog minutes to reach a goal-state. Today using the heuristic function it’s almost immediate. But… using a heuristic forces one to choose one and only one goal so that the estimated_distance_to_goal might be calculated…would the presence of several possible goal states allow in any way the search for other solutions? With backtracking for example? Or does Prolog always choose the topmost of the three? One could do the goal/1 predicate dynamic so that a goal is removed when reached and on backtracking the other goal-states can come into play?

Possibly the difference in performance was not due to the heuristic function…but from the “nearness” of the tiles’ combination to the goal-state in the book in comparison to other tiles’ start configurations. By the way: is the use of the set_flag/2, get_flag/2, flag/3 predicates the right choice, if one wants to count for example the number of moves made to reach a certain goal-state? Thanks

Still some difference of performance is measurable…better take some time to reflect on the results :smiley:
Reminder to myself

The mechanism of choice for non-backtrackable counting is a compound term and using nb_setarg/3. See for an example the implementation of call_nth/2. The big advantage of using terms is that we do not pollute the global name space and therefore the code is thread-safe, can be nested and does not require any cleanup. Finally, nb_setarg/3 with atoms or small integers is fast as no lookup and no thread synchronization is required.

1 Like

This is the code. It still makes use of set_flag/2 and related predicates, because I still have to understand exactly how to change it to use cleaner predicates like nb_setarg/3. You’ll see that while breadth-first needs 11779 moves to reach the goal-state, best-first using the hamming estimate makes it in 110. The problem is, I read, that not all start tile configurations necessarily reach a given goal-state. In that case you get red messages of stack limits exceeded

eight_puzzle.pl (7,3 KB)

1 Like

You are searching between two arbitrary positions?

start(board(1, 2, 3, 4, 8, 6, 7, 5, *)).
goal(board(1, 2, 3, 8, *, 4, 7, 6, 5)).

I thought the game definition is to reach board(*, 1, 2, 3, 4, 5, 6, 7, 8).

Also there is a problem with the Hamming distance. It is not admissible,
means it doesn’t give a lower bound, for example the Hamming distance
between these two positions has the value 2:

board(1, 2, 3, 8, *, 4, 7, 6, 5)
board(1, 2, 3, *, 8, 4, 7, 6, 5)

But it require only 1 move between the two. So I am not sure whether
you find shortest paths with the Hamming distance. This is called
admissibility of the heuristic function:

If the heuristic function is admissible – meaning that it
never overestimates the actual cost to get to the goal – A* is
guaranteed to return a least-cost path from start to goal.
A* search algorithm - Wikipedia

Maybe there are more drastic 8 puzzle start goal pairs where
the Hamming distance overshoots? But its also possible that it doesn’t
matter in this particular case the above result is an “if” and not an “iff”?

Edit 04.05.2024
Why do I mention the standard goal position. Well since I saw other heuristic
functions than Hamming distance on the internet. Not sure whether related
to this particular goal position? But similar like rubiks cube, one can

apply some gradual strategy like first put 8 into the right position, then 7, etc…
Which could give rise to a different heuristic function. Such gradual strategies
can often solve all solvable problems quickly. Disclaimer: Not 100% sure

start(board(1, 2, 3, 4, 8, 6, 7, 5, *)).
goal(board(1, 2, 3, 8, *, 4, 7, 6, 5)).

This is the goal position O’Keefe uses in the book

Didn’t know about this condition on the heuristic function…I must gather more information on this topic

Here is a proposal for a queue implementation that no one will ever need :smiley: It has been inspired by the discussed implementation in “The Art of Prolog” but is not identical.

:- module(queue,
    [ empty_queue/1,
      list_queue/2,
      queue_head/3,
      list_queue_head/3,
      queue_last/3,
      list_queue_last/3,
      queue_length/2 ]).

/** <module> A queue implementation

This provides a queue data structure that allows constant-time
insertion from either the front or the back of the queue, and
constant-time removal of elements from the front.

The implementation is based on the code presented on pages 50-51
of "The Craft of Prolog" by O'Keefe (1990).
*/

:- begin_tests(queue).

test(empty,
        [ true(L == []),
          true(N == 0) ]) :-
    empty_queue(Q),
    list_queue(L, Q),
    queue_length(Q, N).

test(empty_pop, [ fail ]) :-
    empty_queue(Q),
    queue_head(_, _, Q).

test(empty_and_pop, [ fail ]) :-
    list_queue([a,b], Q0),
    queue_head(_, Q1, Q0),
    queue_head(_, Q2, Q1),
    queue_head(_, _, Q2).

test(from_list,
        [ true(X == a),
          true(N == 2) ]) :-
    list_queue([a,b,c], Q0),
    queue_head(X, Q1, Q0),
    queue_length(Q1, N).

test(to_list, [ true(L = [a,b,c]) ]) :-
    empty_queue(Q0),
    foldl(queue_last, [a,b,c], Q0, Q1),
    list_queue(L, Q1).

test(stack, [ true(X == c) ]) :-
    empty_queue(Q0),
    foldl(queue_head, [a,b,c], Q0, Q1),
    queue_head(X, _, Q1).

test(queue, [ true(X == a) ]) :-
    empty_queue(Q0),
    foldl(queue_last, [a,b,c], Q0, Q1),
    queue_head(X, _, Q1).

test(pop_list_head, [ true(H == [x,y,a]) ]) :-
    list_queue([x,y], Q0),
    foldl(queue_last, [a,b,c], Q0, Q1),
    length(H, 3),
    list_queue_head(H, _, Q1).

test(push_list_head, [ true(X == x) ]) :-
    list_queue([a,b,c], Q0),
    list_queue_head([x,y,z], Q0, Q1),
    queue_head(X, _, Q1).

test(backtrack_list_head,
        [ all(H == [[], [a], [a,b]]) ]) :-
    list_queue([a,b], Q),
    list_queue_head(H, _, Q).

test(push_list_last,
        [ true(X == x),
          true(Y == y),
          true(Z == a) ]) :-
    list_queue([x,y], Q0),
    list_queue_last([a,b,c], Q0, Q1),
    queue_head(X, Q2, Q1),
    queue_head(Y, Q3, Q2),
    queue_head(Z, _, Q3).

:- end_tests(queue).

%! empty_queue(-Queue) is semidet.
%
%  Queue is an empty queue.
empty_queue(q(0, B, B)).

%! queue_length(+Queue, -Length) is semidet.
%
%  Length is the number of elements in Queue.
queue_length(q(S, F, B), N) :-
    ql(S, F, B, Len_expr),
    N is Len_expr.

ql(0, B, B, 0).
ql(s(S), [_|F], B, N+1) :-
    ql(S, F, B, N).

%! list_queue(?List, ?Queue) is nondet.
%
%  Two-way conversion between a list and a queue.
list_queue(L, q(S, F, B)) :-
    lq(L, S, F, B).

lq([], 0, B, B).
lq([X|Xs], s(S), [X|F], B) :-
    lq(Xs, S, F, B).

%! queue_head(?Head, ?Queue0, ?Queue1) is semidet.
%
%  Queue0 with Head at the front is Queue1.
%  Can be used to pop or push to the head of a queue.
queue_head(X, q(S, F, B), q(s(S), [X|F], B)).

%! list_queue_head(?List, Queue0?, Queue1) is semidet.
%
%  Queue0 with List at the front is Queue1.
%  Can be used to pop or push a list at the front of
%  the queue.
list_queue_head(L, q(S0, F0, B), q(S, F, B)) :-
    lqh(L, S0, F0, S, F).

lqh([], S, F, S, F).
lqh([X|Xs], S0, F0, s(S), [X|F]) :-
    lqh(Xs, S0, F0, S, F).

%! queue_last(+Last, -Queue0, +Queue1) is semidet.
%
%  Queue1 is Queue0 with Last inserted at its back.
queue_last(X, q(S, F, [X|B]), q(s(S), F, B)).

%! list_queue_last(+List, +Queue0, -Queue1) is semidet.
%
%  Queue1 is Queue0 with List at its back.
list_queue_last(L, q(S0, F, B0), q(S, F, B)) :-
    lql(L, S0, B0, S, B).

lql([], S, B, S, B).
lql([X|Xs], S0, [X|B0], s(S), B) :-
    lql(Xs, S0, B0, S, B).
1 Like

Personally I don’t think that this is useless. 1) For the sake of completeness and 2) in case someone needs a queue data structure and doesn’t have necessarily to refer back to O’Keefe’s book. A library would be worth adding with due credits to O’Keefe or to you