Fully understanding the "intricacies" of the Prolog code design process

Greetings,

Trying to delve into de depths of declarative code design and elaborating some notes for teaching my students I stumbled with the following: Solving a very basic excercise “Program a predicate that filters even numeric elements from a list (filter_evens/2)” I started by stating the declarative reasoning that associates the parameters:
(a) An empty list is the even numbers filter of another empty list…
(b) The even numbers filter of a list [X | Rest] is:
(b1) The even numbers filtering of Rest when X is an even number,
(b2) A list [X | R] where R is the even numbers filtering of Rest when X is not an even number.

That reasoning can be translated into the following code:

%  filter_evens(<List>, <FilteredList>).
filter_evens([ ], [ ]).
filter_evens([X|Rest], Answer):-
              filter_evens(Rest, FilteredRest),
              ( (number(X), (X mod 2) =:= 0)
                *->
                     Answer = FilteredRest
                  ;
                     Answer = [X|FilteredRest]
                ).

Then, testing this code I identified the four obvious “use cases” for the two parameters:
(1) constant-constant, (2)constant-variable, (3) variable-constant, and (4) variable-variable
(I don’t know if that is the correct name for this “phenomenon” but that is as I have called it).

Testing at the console reveals use-cases (1),(2) and (4) work as expected:

 ?-  filter_evens([1,2,3,4,5], [1,3,5]).   % Use-case (1)constant-constant...
       true.

 ?-  filter_evens([1,2,3,4,5], A).           % Use-case (2)constant-variable...
      A = [1, 3, 5].

 ?- filter_evens(List,Answer).             % Use-case (4)variable-variable...
      List = Answer, Answer = [] ;
      List = Answer, Answer = [ _ ] ;
      List = Answer, Answer = [ _ , _ ] ;
      List = Answer, Answer = [ _ , _ , _ ] .

But that is not the case for use-case (3) that gets stucked:

 ?- filter_evens(List, [1, 3, 5]).
      List = [1, 3, 5] ;
      ...  % here it gets stucked for ever

I first (cinically) assumed that use-case (3) didn’t work just because there was no way, in the code, to logically infere the first argument from the second. But the reality is that I do not fully understand why that happens, nor how to correct it, if it can be corrected at all. Maybe the intitial rationale is not sufficiently declarative. Even use-case (4) is not completely clear to me.

Can anyone help in understanding this, and maybe suggest other course of actions so that, if a use-case doesn’t work, at least it does not gets stuck?

Thanks in advance…

Just a very quick observation - that dot should be a comma. And loading the file should cause several warnings, due to the syntax violation.

Hi @brebs,

You are right. My mistake.
I have corrected the example…

Thank you!

Just out of curiosity, what kind of course is this? What is the topic and the aim of the course? Is it a programming course or a theoretical course with programming used as one of the teaching tools?

“filter” is ambiguous - include/3 and exclude/3 are specific on whether filtering in or out.

No need for the subtlety of *->/2 - it’s clearer with ->/2

The problem is that a list can have length of anywhere between zero and infinity, and can have infinite content variety like [horse, shoe apricot, ...]

Having infinite solutions is rather inconvenient. It is much easier to reason about a small number of possibilities - example of a generator (produces a sorted order):

include_evens(L) :-
    numlist(1, 4, NL),
    maplist(times2, NL, NL2),
    include_evens_(NL2, L).
    
times2(N, M) :-
    M is N * 2.
    
include_evens_(_, []).
include_evens_([H|T], [E|L]) :-
    select_forward(E, [H|T], NL0),
    include_evens_(NL0, L).

% A variation on select/3 which only goes forward through the list
select_forward(E, [H|T], F) :-
    select_forward_(T, H, E, F).

select_forward_(T, H, H, T).
select_forward_([H|T], _, E, F) :-
    select_forward_(T, H, E, F).

Result:

?- include_evens(L).
L = [] ;
L = [2] ;
L = [2, 4] ;
L = [2, 4, 6] ;
L = [2, 4, 6, 8] ;
L = [2, 4, 8] ;
L = [2, 6] ;
L = [2, 6, 8] ;
L = [2, 8] ;
L = [4] ;
L = [4, 6] ;
L = [4, 6, 8] ;
L = [4, 8] ;
L = [6] ;
L = [6, 8] ;
L = [8].

Can then do some reasoning, e.g.:

?- include_evens([2, I, 6]).
I = 4 ;
false.

Hi @Boris,

It is more like a multiple paradigm course, where students are confronted with the concepts of logic programming, declarative code, constraint solutions and inductive logic programming.

Programming is not the sole objective, but to conceptually identify those paradigms and the multiple ways in which they can be used together for problem solving AND growth of personal potential.

Hi @brebs,

Your answer looks interesting, but slightly deviated from my original question.
Maybe I didn’t make myself clear, I apologize for that.

What I am trying to understand is how to rationalize the design of a Prolog predicate, to maximize its “declarativeness” and ensure non-determinism of the code.

That is the reason I chose *-> instead of → to get rid of the extra-logic cut involved and be able to check every posible use-case…

The filter_evens/2 predicate was just a simple example to start with…

Is it correct to bind the degree of non-determinism of a Prolog predicate with the analysis of its use-cases?

How is that influenced by the mark of “input arguments” and “output arguments” or any other argument mode indicator?

But (number(X), (X mod 2) =:= 0) will never produce a choicepoint, so having *->/2 rather than ->/2 after it is just a distraction.

I was trying to explain that this is going to be fruitless:

?- filter_evens(List, [1, 3, 5]).

… because List can be an infinite variety of infinite length. So, you have to formulate it a different way - and I provided an example of such a way.

In case @brebs answer wasn’t completely clear, permit me to start with what I think is a fairly literal (and declarative) translation of your problem statement:

filter_evens([ ], [ ]).                   % Case (a)
filter_evens([X|Rest], Answer):-          % Case (b1)
	number(X), X mod 2 =:= 0,
	filter_evens(Rest,Answer).
filter_evens([X|Rest], [X|Answer]):-      % Case (b2)
	not((number(X), X mod 2 =:= 0)),
	filter_evens(Rest,Answer).

So

?- filter_evens([1,2,3,4,5], [1,3,5]). 
true ;
false.

?- filter_evens([1,2,3,4,5], A).
A = [1, 3, 5] ;
false.

?- filter_evens(List, [1, 3, 5]).
List = [1, 3, 5] ;
false.

?- filter_evens(List,Answer).
List = Answer, Answer = [] ;
List = Answer, Answer = [_] ;
List = Answer, Answer = [_, _] ;

Now this leaves a perfectly safe but unnecessary choicepoint since most Prolog implementations cannot not determine until backtracking that the second and third clauses are mutually exclusive. It also means that the “even” test is repeated for Case (b2). This can be remedied if you so choose with a green cut in the first clause which also means the test can be removed:

filter_evens0([ ], [ ]).
filter_evens0([X|Rest], Answer):-
	number(X), X mod 2 =:= 0, !,
	filter_evens0(Rest,Answer).
filter_evens0([X|Rest], [X|Answer]):-
	filter_evens0(Rest,Answer).

Testing:

?- filter_evens0([1,2,3,4,5], [1,3,5]).
true.

?- filter_evens0([1,2,3,4,5], A).
A = [1, 3, 5].

?- filter_evens0(List, [1, 3, 5]).
List = [1, 3, 5] ;
false.

?- filter_evens0(List,Answer).
List = Answer, Answer = [] ;
List = Answer, Answer = [_] ;
List = Answer, Answer = [_, _] .

If for some reason you want to avoid using a green cut, the equivalent using the standard conditional:

filter_evens1([ ], [ ]).                 % Case (a)
filter_evens1([X|Rest], Answer):-
	(number(X), X mod 2 =:= 0
	 -> filter_evens1(Rest,Answer)       % Case (b1)
	 ;  Answer = [X|NotEvens],           % Case (b2)
	    filter_evens1(Rest,NotEvens)
	).

resulting in

?- filter_evens1([1,2,3,4,5], [1,3,5]).
true.

?- filter_evens1([1,2,3,4,5], A).
A = [1, 3, 5].

?- filter_evens1(List, [1, 3, 5]).
List = [1, 3, 5] ;
false.

?- filter_evens1(List,Answer).
List = Answer, Answer = [] ;
List = Answer, Answer = [_] ;
List = Answer, Answer = [_, _] ;
List = Answer, Answer = [_, _, _] .

Note that when the first argument is a variable, all these implementations leave a choicepoint between the empty and non-empty list case, resulting in ever increasing List length on backtracking. So using the original, ?- filter_evens(List,[1,3,5]) fails on lists of length 0,1,and 2, succeeds when the List is of length 3, but if you then backtrack, it’ll try lists of length 4, 5, … which all fail with no termination in sight (same as test (4)).

But note that when the length of List exceeds the length of Answer you know there are no more solutions. Because this test (Answer = [X|FilteredRest] is after the List generation and number test, it never gets checked. By moving the recursive call to filter_evens to after this check (as in filter_evens1), non-termination (for this case) is avoided. This also permits last call optimization (usually a good idea) since the recursive call is now the last call.

Hi @ridgeworks,

Indeed, that is more clear than what @brebs showed.
I can see that in order to simplify things, the extra choicepoint can justify the use of a “standard conditional”.
But, what really caught my attention was your code on filter_evens1 , because while it is not so diferent than my original code, it avoids the infinite loop in use-case (3). Seems to me it is a very subtle modification, however as you said changing the order of the recursive call makes a great difference. Also the analysis of lenghts in both arguments clarifies for this case.

So, it seems that the initial rationale was not wrong, but the translation of that reasoning into code was … clumsy at least.

What do you think of non-determinism vs analysis of use-cases in general?

That is incomplete and therefore wrong, though, which is the point I was trying to make :grinning:

This should also be a valid value for List:

[-666, 1, 2, 2, 3, 4, 4, 4, 4, 4, 5, 6, 6, 6, 8, 8, 8, 1000, 2000, 3000]

etc.

This is what I meant by infinite length and infinite variety of things being filtered out.

Not so much wrong as incomplete as @brebs points out. He correctly argues that there are an infinite number of List’s that satisfy the query ?- filter_evens(List, [1, 3, 5]). But the problem statement says nothing about what the allowable values of List are. It does imply that Answer is List with all even valued integers removed.

Another example of incompleteness (or translation “clumsiness”): suppose the List contains a floating point or rational number (try it)? Or suppose List isn’t even a list, e.g., set(1,2,3,4,5)?

I think this all boils down to having a complete specification (problem statement) for your relation filter_evens/2 including the conditions for success, failure and generating an error. For example, you could say that List must be a list of integers and Answer is List with all even values removed and all other cases fail (or generate an error, or whatever).

I see…

But it is not yet clear to me if use-case analysis is adequate for designing a predicate that is desired to be non-deterministic…

Programming is an art form, requiring imagination.

So I think you’re right that use case analysis (if I understand your meaning) doesn’t cover non-deterministic behaviour since that is governed by the program semantics post call, i.e., what happens when a subsequent call fails (not something you normally encounter in other programming languages).

A non-deterministic predicate just succeeds multiple times (on backtracking) but the conditions for success are the same. In addition, failure occurs when there are no more solutions. I think it’s just a small step from the deterministic case but it should be part of the problem statement/specification. I think marking arguments as input/output is part of use case analysis (e.g., the marking may be different for different use cases) and not part of non-determinism specification.

Hey @sgodoyc, if that’s the case I would suggest writing the predicate as:

:- use_module(library(clpfd)).

filter_evens([], []).
filter_evens([N|Ns], Fs)     :- N mod 2 #= 0, filter_evens(Ns, Fs).
filter_evens([N|Ns], [N|Fs]) :- N mod 2 #= 1, filter_evens(Ns, Fs).

This has the upside of correctly modeling the inverse query:

length(L, _), filter_evens(L, [1,3,5]).
L = [1, 3, 5] ;
L = [_A, 1, 3, 5], _A mod 2#=0 ;
L = [1, _A, 3, 5], _A mod 2#=0 ;
L = [1, 3, _A, 5], _A mod 2#=0 ;
L = [1, 3, 5, _A], _A mod 2#=0 ;
L = [_A, _B, 1, 3, 5], _A mod 2#=0, _B mod 2#=0 ;
L = [_A, 1, _B, 3, 5], _A mod 2#=0, _B mod 2#=0 ;
...

if that appeals to you I would suggest checking Markus Triska’s youtube channel or prolog notes.

Agreed - using CLP (Constraint Logic Programming as in clpfd) restores the logical properties lost in ISO Prolog to arithmetic.

1 Like

You can also use freeze/2 or wait/2 to make things logical:

filter_evens([], []).
filter_evens([N|Ns], Fs)     :- even(N), filter_evens(Ns, Fs).
filter_evens([N|Ns], [N|Fs]) :- odd(N),  filter_evens(Ns, Fs).

even(N) :- freeze(N, N mod 2 =:= 0).
odd(N) :-  freeze(N, N mod 2 =\= 0).
?- length(L, _), filter_evens(L, [1,3,5]).
L = [1,3,5] ;
L = [_A,1,3,5],
freeze(_A,_A mod 2=:=0) ;
L = [1,_A,3,5],
freeze(_A,_A mod 2=:=0) ;
L = [1,3,_A,5],
freeze(_A,_A mod 2=:=0) ;
L = [1,3,5,_A],
freeze(_A,_A mod 2=:=0) ;
L = [_A,_B,1,3,5],
freeze(_A,_A mod 2=:=0),
freeze(_B,_B mod 2=:=0) 

However, length(L,_),include(odd,L,[1,3,5]) goes into an infinite loop – I haven’t investigated why.

I thank you all for your insights…

Undoubtedly there are many accomplished programmers in this forums, and that is an important part of the equation. Still, I would like some experienced tips to teach unexperienced students to help them conceptually understand the general laws, even when the code to do that is not optimum in performance or of common use in application developement…