Two-way deterministic zero-padding of a list

Hi there, I’ve been learning prolog for a few months now, on and off, and I thought I was getting it, but this one looked simple but has me stumped.

I should note I’m using CLP(FD) for this, which I have only just started using on the basis of the voices who say it is more relational and easier to understand its behaviour than the usual prolog arithmetic. I am quite new to it so maybe there is something I don’t understand.

I want to have a predicate that relates a list to its padded version and a number representing the length of a padded version, and I would like this to be two-way, and it should be deterministic as there is only one solution. And of course being pure would be nice (avoiding cuts etc).

E.g. lzp([1,2], L, 4) should give L=[0,0,1,2], and lzp(L, [0,0,1,2], 4) should give L=[1,2].

‘Great’, I thought. ‘An opportunity to try out my relational thinking!’

This seems straightforward enough. A list L is related to it’s padded version P and the padded length N by:

  1. P is the concatenation of a list of zeroes Z and L.
  2. the length of Z + the length of L = N
  3. the first element of L > 0

(condition 3 is true in my use case, and means stripping the padding is unambiguous — without a condition of this sort you can’t tell where the padding ends and the list begins, of course.)

(I should also note in passing N could be treated better in the below, at the moment in my implementations it always needs to be specified, even though it’s redundant in the stripping case and doesn’t allow for the most general case of lzp(X, Y, Z). This isn’t the biggest problem though…)

This seemed like a reasonable implementation:

lzp_rel(L, L1, N) :-
    length(L, L_len),
    #(L_len) #=< #(N), 
    #(N_es) #= N - L_len,
    #(N_es) #>= 0, 
    length(Es, N_es),
    L = [H |_ ],
    #(H) #> 0,
    maplist(=(0), Es),
    append(Es, L, L1).

and it works:

?- lzp_rel([1,2], L, 4). 
L = [0, 0, 1, 2].

?- lzp_rel(L, [0,0,1,2], 4). 
L = [1, 2];

except that last call isn’t deterministic, and if you hit that ; it goes off into infinity.

The reason is that length keeps trying larger and larger lists.

So I thought “maybe I need a CLP(FD) aware length”, leading to:

len([], 0).

len([H | T], N) :-
    #(N1) #= #(N) - 1,
    len(T, N1). 

len_to_n(L, N) :-
    Len in 1..N,
    length(L, Len). 

But this still winds up in infinity after the last ; :

?- len_to_n(L, 3). 
L = [_] ;
L = [_, _] ;
L = [_, _, _] ;

I don’t understand the behaviour here. doing a trace only shows the following after the last answer:

L = [_, _, _] ;
   Redo: (11) length([_5716, _8774, _11842|_11844], _4408{clpfd = ...}) ? 

and then no further output. It looks like it’s starting to try larger lists again?

(I think I can probably figure out how to get lists of length ≤ N on backtracking, so that’s my next avenue of attack.)

Anyway, so I tried a recursive approach, which seems, well, more like you’d do it in Lisp:

lzp_rec_(L, L, N) :-
    length(L, Length),
    #(Length) #= #(N).  

lzp_rec_(L, L2, N) :-
      #(N) #>= 0, %needed otherwise backtracking will take this to -inf. 
      L2 = [0 | L3],   
      #(N1) #= #(N) - 1,
      lzp_rec_(L, L3, N1). 

lzp_rec(L, L2, N) :-
    lzp_rec_(L, L2, N),
    L = [H | _],
    #(H) #> #(0). 

This behaves better but isn’t deterministic in either case, even though there’s only one
result.

?- lzp_rec(L, [0,0,1,2], 4). 
L = [1, 2] ;
false.

?- lzp_rec([1, 2], L, 4). 
L = [0, 0, 1, 2] ;
false.

The reason of course is that the second lzp_rec_ definition is there to try after the base case. Many prolog list recursion definitions have the base case and recursive case distinguished by pattern matching, e.g. [] in the first and [H | T] in the second, so there isn’t another definition that matches the pattern to try. This must be the first time I’ve ended up writing something different to that, so I haven’t had to think about that before…

I can of course ‘fix’ my definitions by wrapping them in once but that isn’t very satisfactory.

Anyway, I’ve got an avenue to try to writing something that’ll generate lists up to a certain length which might fix the first definition. And perhaps pattern matching (or (shiver) a cut) could fix the second.

But I feel like I must have missed something, because this seemingly simple problem has been hard to find a satisfactory solution to!

Hello there,

Instead of trying to reexplain badly this topic, let me link to a very well informed source about constraint programming: https://ridgeworks.github.io/clpBNR/CLP_BNR_Guide/CLP_BNR_Guide.html#toc4Example_1,_List_Length

And here is a general piece of advice: clpfd is constraint programming on integers, not lists ! The moment you mix clpfd with the predicate length/2, you already lost.

Hum, recursion is the most fundamental control flow we have in prolog, what does lisp have to do with anything here ?

So, here is my take on your original problem:

A list L is related to it’s padded version P (let’s ignore N for now) if:

  1. the first element of P is 0 and the rest of P is padded L.
  2. the first element of P is not 0 and the rest of P is L.

I notice that the condition distinguishing 1 and 2 is an arithmetic condition, so I will use clpfd to reify the condition to a boolean and pattern match on the boolean:

lzp([H | T], L) :-
  % if H = 0, then order is 1
  % if H \= 0, then order is 0 and the reverse
  % if Order = 1, then H = 0
  % if Order = 0, then H \= 0 
  (H #= 0) #<==> Order,
  lzp(Order, [H | T], L).

lzp(0, L, L).
lzp(1, [0 | T], L) :-
  lzp(T, L).

Now, let’s count the number of time we found a 0 prefix in P. clpfd here can be useful to avoid having an accumulator:

lzp([H | T], L, N) :-
  (H #= 0) #<==> Order,
  (N #> 0) #<==> Order,
  lzp(Order, [H | T], L, N).
lzp(0, L, L, 0).
lzp(1, [0 | T], L, N) :-
  N1 #= N - 1,
  lzp(T, L, N1).

Just keep adding elements of 0 to the front, until the length is reached (or to infinity).

I don’t see clpfd actually helping here.

First I assume from specification that if N=0 the padded list is the same as the unpadded one, i.e., L=P. Otherwise, N must be a positive integer or padding will fail. It follows that an undefined N is the same as the no-padding case. Following up on @brebs suggestion, how about:

lzp_rel(L,L,0).           % if N=0, padded = unpadded
lzp_rel(L,P,N) :- 
	integer(N), N>0,      % imposed conditions on N
	lzp_rel_(L,P,N).
	
lzp_rel_(L,P,N) :-
	P=[0|L],              % condition 1 - P starts with 0 
	length(P,N),          % condition 1 & 2 
	L \= [0|_].           % condition 3
lzp_rel_(L,[0|P1],N) :-   % first clause not true, add another 0 to the padding
	N > 0, N1 is N-1,
	lzp_rel_(L,P1,N1).    % true if remaining padding has adjusted length

Now:

?- lzp_rel([1,2],L,4).
L = [0, 0, 1, 2] ;
false.

?- lzp_rel(L,[0,0,1,2],4).
L = [1, 2] ;
false.

?- lzp_rel(L,P,4).
L = [],
P = [0, 0, 0, 0] ;
false.

?- lzp_rel(L,P,0).
L = P ;
false.

?- lzp_rel(L,P,N).
L = P,
N = 0 ;
false.

If the superfluous choicepoint bothers you, you can add a green cut to the first clause or manage it in the call, e.g.,

?- once(lzp_rel(L,[0,0,1,2],4)).
L = [1, 2].

?- lzp_rel([1,2],L,4) -> true.
L = [0, 0, 1, 2].

Note that without a green cut, the clause ordering doesn’t semantically matter, but if you add the cut it must appear in the first clause (whichever it may be).

Constraints (e.g., clpfd) can be helpful in restoring logical behaviour to arithmetic, but in this case, given you accept the conditions on N, there doesn’t seem to be any illogical behaviour to restore - indeed there’s very little arithmetic.

1 Like

Can tame a list using e.g.:

len_between(Min, Max, Lst) :-
    Len is Max - Min,
    len_list_tail(Min, Lst, Tail),
    max_len_list(Len, Tail).

len_list_tail(Len, Lst, Tail) :-
    len_list_tail_(Lst, Len, Tail).

len_list_tail_(Tail, 0, Tail) :- !.
len_list_tail_(Lst, Len, Tail) :-
    ( var(Len) -> true ; Len > 0 ),
    Len0 is Len - 1,
    len_list_tail_(Lst, Len0, [_|Tail]).

max_len_list(MaxLen, Lst) :-
    MaxLen >= 0,
    max_len_list_([], Lst, MaxLen).
    
max_len_list_(Lst, Lst, MaxLen) :-
    ( MaxLen == 0 -> ! ; true ).
max_len_list_(Upto, Lst, MaxLen) :-
    MaxLen >= 1,
    MaxLen0 is MaxLen - 1,
    max_len_list_([_|Upto], Lst, MaxLen0).

Can then use as:

?- len_list_tail(3, L, T).
L = [_, _, _|T].

?- max_len_list(2, L).
L = [] ;
L = [_] ;
L = [_, _].

Putting the two together:

?- len_between(3, 5, L).
L = [_, _, _] ;
L = [_, _, _, _] ;
L = [_, _, _, _, _].

Why couldn’t one use between/3 and length/2 for this? Like,

?- between(3, 5, N), length(L, N).

?

Can do. But where’s the fun in that? :grinning:

I don’t really understand how you reached this conclusion ?
Isn’t a more natural/classical interpretation that an undefined N describe all possible padded lists ?

Note that I am not saying that you are wrong or anything. Just asking if you could explain a bit more the reason why you took that decision ?

By the way, I forgot to add some examples of use for my implementation.

Here is the most general query:

?- lzp(P, L, N).
P = L, L = [_A|_],
N = 0,
_A in inf.. -1\/1..sup ;
P = [0, _A|_B],
L = [_A|_B],
N = 1,
_A in inf.. -1\/1..sup ;
P = [0, 0, _A|_B],
L = [_A|_B],
N = 2,
_A in inf.. -1\/1..sup ;
...

So by default, we generate longer and longer padding on a partial list with at least one element which should be different to 0.

Here, is a query with a fixed padding length:

?- lzp(P, L, 3).
P = [0, 0, 0, _A|_B],
L = [_A|_B],
_A in inf.. -1\/1..sup.

Notice the determinism of the solution here, given by reification of the condition on N: (N #> 0) #<==> Order.

Here is a query with a fixed padded list:

?- lzp([0, 0, 0, 1, 2, 3], L, N).
L = [1, 2, 3],
N = 3.

Again, note the determinism, given by the reification of the condition on P: (H #= 0) #<==> Order.

Finally, a query with a constrained N between 3 and 6:

?- N in 3..6, lzp(P, [1, 2, 3], N).
N = 3,
P = [0, 0, 0, 1, 2, 3] ;
N = 4,
P = [0, 0, 0, 0, 1, 2, 3] ;
N = 5,
P = [0, 0, 0, 0, 0, 1, 2, 3] ;
N = 6,
P = [0, 0, 0, 0, 0, 0, 1, 2, 3].

Again, without a superfluous choice point at the end.

I believe this implementation has multiple advantages:

  • fully pure, meaning it is very easy to debug declaratively
  • fully deterministic under all possible conditions (which is important for performance)
  • very compact definition by using constraints library like clpfd

This implementation has also disadvantages:

  • reification like #<==> is very slow, so bad for performance. one solution would be to use reified if_ library, but I don’t know it will be faster…
  • this implementation chains clpfd variables to compute N, which is going to be extremely slow for large N. I think it should be possible to change this to a grounded accumulator with basic prolog arithmetic.

Another aspect of the implementation, which is either bad or good depending on usage, is that the implementation is fully eager. This means that a non grounded N will lead to the eager generation of all possible values of N and instantiation of the padding in P.
I suppose a lazy version could be done if anyone is interested ?

Totally arbitrary decision on my part. The OP didn’t specify what undefined N meant so I gave it one based on:

but your suggestion/implementation, or e.g., failure/error is equally valid as long as it’s included in the problem specification.

Thanks to everyone who replied.

Sorry I haven’t been paying more attention to the answers — my posting wasn’t well timed from my own perspective as I was both recovering from an illness (Covid) and wrapping up the year at work - and now I’m finding myself busy with family for the holiday season. My own silly fault, I should have been a bit more restrained and held off for a couple of weeks!

Just a clarification that I am using CLPFD because I’m taking it to be recommended generally - not because I thought it would magically help this particular circumstance. I did tacitly suppose it might constrain the N in length, but I hadn’t really thought it through and the documentation for length made it clear it won’t do that.

(In my defence, other people make this assumption too :slight_smile: )

At the moment I’m leaning towards kwon-young’s solution, partly because I like the clear distinction between the two cases. I will need to think about it a little more in a couple of day’s time…

I merely meant my other implementation (lzp_rel) seemed (to me) more in keeping with the ethos of prolog: declaring the necessary relationships and expecting prolog to take care of the execution details for you. It also could not be written that way in Lisp, so it seemed more ‘prology’ in that respect also - a form of expression that prolog supports that most other languages don’t. Whereas recursion is something both languages utilize. I didn’t mean that recursion is somehow not prolog and I also understand that recursive definitions also express relationships.

(shades of this discussion I guess. I wouldn’t try to say recursion isn’t declarative, and I think z5h’s remark on the matter is a good one, but I can kind of see where emiruz is coming from)

Thanks for the link to the clpBNR documentation. This looks very useful - will read!

It is a constraint solver. If you are not solving constraints, it just gets in the way.

I wouldn’t know how it is possible to recommend this generally. Maybe the person or people who recommend it have vested interest. We will never know though…

This was also my first impression, but now I’m not as rigid. clpfd goes to some trouble to ensure that its performance is comparable to standard modal arithmetic when used that way. But the price is considerable code bloat and complicated debugging due to source modification (try listing/1 on a predicate with a clpfd constraint sub-goal).

But it only deals with the integer subset of standard prolog (no floats or rationals, no transcendental functions, …) so you still have to understand modal arithmetic if you want to use that functionality. Also I don’t much like the extended operator set (#=, #<, #*, …) visually but that’s a personal preference.

So I’ve pretty much come to the same conclusion - use it when fits the problem.

See CLP(FD) and CLP(Z): Prolog Integer Arithmetic ; a not unreasonable argument IMO, but not one I would argue for in a standard Prolog context at this point.

2 Likes

I read the whole textbook, The Power of Prolog, some years ago. I have not seen or forgotten the quote that the author had modestly placed at the start of the chapter you linked:

It is rather as if the professional community had been suddenly transported to another planet where familiar objects are seen in a different light and are joined by unfamiliar ones as well. (Thomas S. Kuhn, The Structure of Scientific Revolutions)

I will have to now read that book, too, before I can argue any further, on the philosophical level.

From a purely pragmatic day-to-day programmer point of view, The Power of Prolog falls short since it shows the shiny bits and hides the naughty bits.


PS: At the bottom of the page you linked, “Legacy predicates”… :rofl:

Here is a reactionary idea: what if some things stay around for a long time partly because they are good?

1 Like

Hard to say anything concrete based on longevity; some things that shouldn’t hang around do so because it’s always been done that way :slightly_smiling_face:

IMO modal arithmetic is not a particularly good idea. Colmerauer kept trying to do something about it but seemed to get overwhelmed by the standardization process largely driven by the Prolog vendors at the time IIRC. And to be honest, modal arithmetic is good enough most of the time.

Who is to judge what should and should not be? Do we leave it to the “wise men”? And who is to judge who they are?

While longevity is annoyingly factual.