Pattern Matching Order

I’m using: SWI-Prolog versionSWI-Prolog version 9.2.9 for arm64-darwin.

I want the code to:
Solve problem number 8 from P-99: Ninety-Nine Prolog Problems: Eliminate consecutive duplicates of list elements.

Questions:

1 - Isn’t the pattern matching depending on the order of clauses of a multi-clause predicate? (pre-trained brain in Erlang/Elixir/F# style of pattern matching, hence the assumption)
2 - Why is H1 \= H2, needed in the second solution?
3 - Why the tests for the second solution succeed with a warning Test succeeded with choicepoint? How to fix it?

My code looks like this:

:- module(workbench, []).

% P08

% 1

% why this one fails with:
% ERROR:     Expected: [1,2]
% ERROR:     Found: [[1,2],[1,2],[1,2,2],[1,2,2],[1,2,2],[1,2,2],[1,2,2,2],[1,2,2,2],[1,1,2],[1,1,2],[1,1,2,2],[1,1,2,2],[1,1,2,2],[1,1,2,2],[1,1,2,2,2],[1,1,2,2,2],[1,1,2],[1,1,2],[1,1,2,2],[1,1,2,2],[1,1,2,2],[1,1,2,2],[1,1,2,2,2],[1,1,2,2,2],[1,1,1,2],[1,1,1,2],[1,1,1,2,2],[1,1,1,2,2],[1,1,1,2,2],[1,1,1,2,2],[1,1,1,2,2,2],[1,1,1,2,2,2]]
% does the back tracking breaks the assumption about the order of matches (which I inherited from Erlang and similar)?
% what is going on here? why do we need to put the condition (H1 \= H2)?
% no_duplicates([], []).
% no_duplicates([E], [E]).
% no_duplicates([H,H|T], Out) :-
%     no_duplicates([H|T], Out).
% no_duplicates([H|T], [H|L1]) :-
%     no_duplicates(T, L1).

% 2

% works (with: Test succeeded with choicepoint - how to fix it? why is there more than one answer?)
no_duplicates([], []).
no_duplicates([E], [E]).
no_duplicates([H,H|T], Out) :-
    no_duplicates([H|T], Out).
no_duplicates([H1,H2|T], [H1|L1]) :-
    H1 \= H2,
    no_duplicates([H2|T], L1).


:- begin_tests(test_suite).

test(no_duplicates) :-
    no_duplicates([], []),
    no_duplicates([1], [1]),
    no_duplicates([1,2], [1,2]),
    no_duplicates([1,1,1,2,2,2], [1,2]).

:- end_tests(test_suite).

Thanks for your help!

The task is to produce clumped/2 without the count - and (preferably) able to handle non-ground variables - which my code does at Logically pure/impure predicate implementations in the std lib - #21 by brebs

Example:

?- clumped2([a,b,X,d], L).
L = [a-1, b-1, X-1, d-1],
dif(X, b),
dif(X, d) ;
X = d,
L = [a-1, b-1, d-2] ;
X = b,
L = [a-1, b-2, d-1].

I don’t understand your first question, the heads of clauses of a predicate (like no_duplicates/2) are matched in sequence. Is this that worries you ?

About the second and third questions, the cut will prevent the choicepoint required by the following clause, since inequality cannot be expressed in the head:

no_duplicates([], []).
no_duplicates([E], [E]).
no_duplicates([H,H|T], Out) :-
    !, no_duplicates([H|T], Out).
no_duplicates([H1,H2|T], [H1|L1]) :-
    H1 \= H2,
    no_duplicates([H2|T], L1).


:- begin_tests(test_suite).

test(1) :-
    no_duplicates([], []).
test(2) :-
    no_duplicates([1], [1]).
test(3) :-
    no_duplicates([1,2], [1,2]).
test(4) :-
    no_duplicates([1,1,1,2,2,2], [1,2]).

:- end_tests(test_suite).

BTW, separating the test cases can help to locate some problems.

?- run_tests(test_suite).
[4/4] test_suite:4 ..
Warning: c:/users/cccar/documents/prolog/no_duplicates.pl:23:
Warning:     test test_suite:4: Test succeeded with choicepoint
% All 4 tests passed in 0.011 seconds (0.000 cpu)
true.

Also the interactive debugger (?- gtrace, no_duplicates([1,1,2], _).) would help, since it shows the choicepoints left in the execution trail.

1 Like

Awesome! Thanks!

I don’t understand your first question, the heads of clauses of a predicate (like no_duplicates/2) are matched in sequence. Is this that worries you ?

Yep! Because if that (being orderly) was the case the 4th clause no_duplicates([H1,H2|T], [H1|L1]) :- could simply be no_duplicates([H|T], [H|L1]) :- no_duplicates(T, L1).. And there would be no need for H1 \= H2, check because the H1 = H2 case would be already covered by the previous clause no_duplicates([H,H|T], Out) :- ....

The cut is still a very new concept for me - need yet to understand it.

If I remember correctly, Erlang is a committed choice language. If so there is an implicit cut/commit in the neck of every alternative.

There is not an implicit cut in Prolog so the concept of “already covered” must be done by an explicit cut to remove the choicepoint when the first two elements unify, or by an explicit condition, i.e., H1 \= H2 to ensure the condition for the last clause to succeed. You don’t need both, but if you don’t have the cut you will get a choicepoint, although it will fail immediately on backtracking. (It’s called a green cut because it doesn’t affect the semantics. It’s a red cut if you don’t specify the condition in the last clause because the semantics are affected.)

Good advice.

1 Like

Example: Programming cooperation - #20 by brebs

Agree. Note that on the toplevel you can also use the * command after a query succeeds with a choicepoint:

?- no_duplicates([1,1,1,2,2,2], X).
X = [1, 2] <press * here> 
% no_duplicates([2],[2]) left a choice point in alternate clause (after success)
%   /home/jan/src/swipl-exp/build.vmif-debug/nd.pl:2: clause succeeded
%   /home/jan/src/swipl-exp/build.vmif-debug/nd.pl:3: next candidate clause
% Called from
%   [16] no_duplicates([2,2],[2]) at /home/jan/src/swipl-exp/build.vmif-debug/nd.pl:4
X = [1, 2]

This is probably a bit harder to interpret than the graphical debugger …

3 Likes

(… in Erlang …) there is an implicit cut/commit in the neck of every alternative

That was so to the point (in the world of my incomplete understanding of how to think in “relations”)!

Managed to rewrite it with two cuts (and no conditions - and I like the fourth clause now):

:- module(workbench, []).

no_duplicates([], []).
no_duplicates([E], Out) :- !, Out=[E].
no_duplicates([H,H|T], Out) :- !, no_duplicates([H|T], Out).
no_duplicates([H|T], [H|L1]) :- no_duplicates(T, L1).

:- begin_tests(test_suite).

test(1) :-
    no_duplicates([], []).

test(2) :-
    no_duplicates([1], [1]).

test(3) :-
    no_duplicates([1,2], [1,2]).

test(4) :-
    no_duplicates([1,1,1,2,2,2], [1,2]).

test(5) :-
    no_duplicates([a,a,a,a,b,c,c,a,a,d,e,e,e,e], [a,b,c,a,d,e]).

:- end_tests(test_suite).

A couple of additional points:

  1. You can unify in the head of a clause so the second clause can just be:
no_duplicates([E], [E]) :- !.
  1. If you look at the doc “if-then-else” control predicate (written as Cond -> Goal1 ; Goal2), you’ll see that it does have an implicit cut. So you can replace the last two clauses with a single clause using this predicate. I’ll leave it to you to figure out how and decide which implementation you prefer. (I think both have their place.)
1 Like

Note that the 2nd clause is unneeded as the list of one element is covered by the last and first clauses. SWI-Prolog offers also this possibility:

no_duplicates([], Out) => Out = [].
no_duplicates([H,H|T], Out) => no_duplicates([H|T], Out).
no_duplicates([H|T], Out) => Out = [H|L1], no_duplicates(T, L1).

This is slightly different as the [H,H|T] only succeeds if the first two elements are equivalent rather than that they can be (and are) unified. I that results in cleaner semantics anyway. In general, => rules are great for defining deterministic “functions”, leading to much better error reporting and fewer logical errors. It is not really logic programming though. I always use the idea of “logic islands” for a real program: you use proper logic programming (pure Prolog, constraints, tabling, s(CASP), …) on islands that you connect to each other and the outside world (typically) using deterministic “functions” that provide the necessary data transformations and I/O operations.

2 Likes

Such code is flawed - it will only work as intended if its input variables are ground/1

Should be using dif/2

E.g. if given:

no_duplicates([X, Y], L)

… the output should be very similar to:

L = [X, Y],
dif(X, Y) ;
X = Y,
L = [Y].

Please consider that dif/2 was not deemed necessary in ‘Programming in Prolog’ by Clocksin-Mellish.

Well, ultimately, I don’t see how implying to beginners that fundamentally flawed code is OK, is a good thing :wink:

I view the handling of unknowns (i.e. var/1 and nonvar/1) as a major attraction of Prolog (example: binary addition) - otherwise, might as well program in Python :grin:

Beginners will be dismayed when they see their previously-believed-correct code produce wrong results.

Flawed code should at least throw/1 an exception instead of producing wrong results.

I think this is a little too pedantic. Although he didn’t explicitly state it, the O.P. implied from his examples that he was only interested in (and tested for) lists of ground terms. If so, the code is perfectly acceptable IMO. He’s not trying to publish a library for general use; in that case your criticism would be entirely valid.

This just emphasizes the need to be precise in defining the requirements/specifications up front.

Here’s my solution, which creates no choice-points. It uses an auxiliary predicate which passes around the previous element:

no_duplicates([], []).
no_duplicates([X1|Xs], Out) :-
    no_duplicates_(Xs, X1, Out).

no_duplicates_([], Prev, [Prev]).
no_duplicates_([X|Xs], Prev, Out) :-
    (   Prev = X
    ->  no_duplicates_(Xs, Prev, Out)
    ;   Out = [Prev|Out2],
        no_duplicates_(Xs, X, Out2)
    ).

This is probably a little faster as it avoids examining the list twice.

That suffers from the same issue as the OP, it may unify input elements. I think one should use ==/2 here and accept this not-pure predicate. Alternatively, use ?=/2 to detect cases where future instantiation may introduce duplicates again or do something complicated with dif/2 to cause future instantiation to backtrack all the way back to no_duplicates/2 and come with a new answer.

In most cases I’d go for the first :slight_smile:

:exploding_head:
Wow. Amazing feature I wasn’t aware of.

(?=)/2 fails if it would cause instantiation, so that doesn’t do the right thing (I would be OK with a variant of (?=)/2 that raises an error). (==)/2 has similar problems.

Instead, we can use freeze/2 and wrap_predicate/4 to ensure that execution is delayed until the input arguments are sufficiently instantiated(*):

:- wrap_predicate(no_duplicates(In, Out), no_duplicates_wrap, W, freeze_wrap(W, In, Out)).
:- wrap_predicate(no_duplicates_(In, Prev, Out), no_duplicates__wrap, W, freeze_wrap(W, In, Prev, Out)).

no_duplicates([], []).
no_duplicates([X1|Xs], Out) :-
    no_duplicates_(Xs, X1, Out).

no_duplicates_([], Prev, [Prev]).
no_duplicates_([X|Xs], Prev, Out) :-
    freeze(X,
           (   Prev = X
           ->  no_duplicates_(Xs, Prev, Out)
           ;   Out = [Prev|Out2],
               no_duplicates_(Xs, X, Out2)
           )).

freeze_wrap(Closure, In, _Out) :-
    freeze(In, Closure).

freeze_wrap(Closure, In, _Prev, _Out) :-
    freeze(In, Closure).

(*) I think I could have used block/1 and mode declarations instead of wrap_predicate/4 and freeze_wrap.

EDIT: simplified freeze_wrap.

What about
no choice-points, one predicate and one recursive call? (The first argument assumed ground.)

no_duplicates([], []).
no_duplicates([H|T], Out) :-
    no_duplicates(T, OutT),
    ( T=[H|_]
    -> Out = OutT
    ;  Out = [H|OutT]
    ).

Moving the recursive call to the end would make it tail-recursive.
(The order I used is maybe easier to understand.)

1 Like

Unfortunately, this isn’t tail-recursive, so it’ll use stack proportionate to the length of the list. The Prev parameter in my auxiliary predicate is a classic way of refactoring code to change its memory use from O(N) to O(1).

(Your code also evaluates [H|T] twice for each element, although that’s a minor efficiency point)