Why does this predicate change a list with a tail, to a list with two elements, one ground, one var?

I’m using: SWI-Prolog version 8.1.9.

I’m surprised how much trouble I am trying to create what I thought would be a simple predicate. I created the predicate below to take a list of elements, and generate a new list that contains only the elements in the first list that are ground:

The first problem I had was that the usual check for the empty set to terminate a list evaluation, was causing the Tail of the List part of a difference list to unify with the empty set (clause #2), since that is what I was passing to the predicate as the first argument…

To solve that problem I created clause #1 that looks for that condition specifically and ends processing immediately (clause #1). However now I have a new problem and one I can’t seem to escape. The effect of clause #1 is to change a list that has an unground Tail to a list whose last element is a variable.

So:
[cat | Tail]
Becomes
[cat, Tail]

This too kills the List part of a difference list that is passed through the function. Worse, I can’t seem to come up with a way to fix this. I thought I remembered reading about a “not” test that did its work without unifying, but I can’t find it and I don’t know if that would fix things anyways.

How can I shape this predicate so that it does its work without affecting the list passed in as the first argument?

% ----->>>>> select_only_ground_elements(+List, -OnlyGroundElementsList)
 
% Description:  Simple predicate that takes a list and only accepts those 
%	elements that are ground.

% If the last element is a variable, exit now so we don't 
%	unify that variable with the empty set.
select_only_ground_elements([LastElement], []) :-
	var(LastElement),
	!.
% Out of elements and the last element is not a variable.  We are done.
select_only_ground_elements([], []) :- !.
select_only_ground_elements([H | T], [H | T_ground]) :-
	ground(H),
	!,
	select_only_ground_elements(T, T_ground).
select_only_ground_elements([_ | T], GroundElements) :-
	!,
	select_only_ground_elements(T, GroundElements).

Why are you depending on var (or ground)? If you’re using an unbound variable as a kind of “other” term, then you’re probably better off using something like other(...) rather than using an unbound variable. (As you’ve found out, programs that depend on not instantiating a variable can be tricky.)

If your use is legitimate, then \+ \+ can be used to test without instantiating anything.

BTW, here is a “legitimate” use of var:

%! base64_term(+Base64:string, -Term) is det.
%! base64_term(-Base64:string, +Term) is det.
%% Unlike base64/2, Base64 is always a string - compare base64(Z, "Zm9v"), Z = foo.
base64_term(Base64, Term) :-
    (  var(Base64)
    -> term_to_atom(Term, Atom),
       base64(Atom, Base64String),
       atom_string(Base64String, Base64)
    ;  base64(Atom, Base64),
       term_to_atom(Term, Atom)
    ).

Because I’m trying to detect the presence of a list with an unground tail, a very common situation with the List part of a difference list. For now I modified my select_only_ground_elements to use copy_term/2 to make a fresh copy of the list that may have an unground tail so that nothing bad can happen to the list during processing. Normally the downside to this approach would be that any variables in the original list would now be decoupled. But since the point of that predicate is to ignore variables in a list, it works out.

Understood, but that isn’t the root problem with my approach because the culprit appears to happen as soon as any clause head is matched. As soon as the clause head matches, it unifies the unground tail of the SubList in my difference list with whatever, turning the unground tail of the List part of the difference list into a simple variable or term, as explained above. In other words, it doesn’t matter that \+ doesn’t unify because the very act of matching the head of the clause causes my problem.

But why are you trying to detect the presence of a list with an unground tail? If you have a difference list, just unify the tail with [] and you’re done. (And phrase/2 will do that for you.)

That’s in my posts above on this thread. I’m trying to grab all the ground elements from the List part of a difference list without unifying with the variable at the very end of that list, otherwise other code working with the difference list will fail because the tail is no longer a pure variable. The only way to get that job done was to make a fresh copy of the list before processing it because any clause head that matches will alter that tail variable.

I agree with @peter.ludemann: you shoudn’t need to do this.

It shouldn’t. Difference lists are a pure Prolog data structure. For example, [1, 2, 3|X]-X is a difference list representation of the list [1, 2, 3] regardless of whether X is instantiated and what it is instantiated to. So [1, 2, 3, 4, 5]-[4, 5], [1, 2, 3, foo]-[foo] and [1, 2, 3]-[] all represent the same list namely the difference formed by taking the second list away as a tail from the first.

If you can show the other code that fails, that would be really helpful.

1 Like

In an abstract sense your are right that these are all the same difference list. However, if something instantiates the tail you cannot bind the tail yourself anymore (to a list).

It is not very common to do anything with a difference list before it is turned into a normal list. It may be needed and such code should typically be written such that a variable (or non-list) fail is handled the same as [].

If you want to know where a variable is instantiated, simply add this temporary to your code

freeze(V, backtrace(25)).
2 Likes

I reached a bit of Prolog enlightenment recently on a similar issue. Firstly, the easy way to find if the tail (or any other element) of a list is an incomplete data structure is:

ground([a,b,c|_]).

I was a bit confused about the difference between nonvar and ground before this, which is that ground works on entire terms, including lists, whereas var, unsurprisingly, expects an individual variable.

My problem was I’ve gotten into the habit using lists as quick and dirty dictionaries looking like [key1(value1), key2(value2),…] and then extracting the value for a key with member(key(Value), List).

The snag I hit was trying to extract a key-value pair from a list that had not been defined. The result below came as a bit of a surpise to me:

?- member(key(Value), List), format('Value ~w, List ~w~n', [Value, List]).
Value _8086, List [key(_8086)|_8496]
List = [key(Value)|_8496] .

I’d jumped to wrong conclusion that tryng extract something from an unground list would return false, so learnt the hard way to check ground(List) first.

If you want a dictionary, use a dict or red-black trees or association lists, or even regular lists.

But don’t use a variable to stand for “not there”. And don’t use ground to check. You should make your predicates logical, not procedural.

In essence: there is an insert/update predicate that creates a new data structure with the additional item, and a lookup predicate that can fail if the item isn’t in the data structure. If you come from a world of imperative programming and destructive vector updates, this looks horribly inefficient; but you’ll often be pleasantly surprised that it’s a rather minor part of the overall performance characteristics of your code, and having immutable data structures can make the code a lot easier to write. (Also, if immutable arrays were good enough for DIjkstra, they’re good enough for me.)

I take it that you have done neural networks.

No, I haven’t. But I have done compilers and it turns out that updating complex graphs becomes a nightmare; whereas traversing a graph and generating a new one isn’t necessarily slower but it is a lot cleaner.

The problem is is not about whether or not there are multiple lists that all represent essentially the same difference list, it’s about having a difference list whose SubList (i.e. - the second list in a difference list pair) that can unify with the other predicates in my code properly, in this case a recursive call to the same predicate. As Jan said:

The main predicate in my old code, before I refactored to a findall approach, had two recursive calls to itself that passed the difference list around. One recursive call only executed if the current node in the tree I was iterating had child nodes. The other was used to iterate the main list of items I was iterating on element at a time and was the last statement in the predicate.

The head of the SubList was used to add a new element (H) generated by mainpred to the difference list as shown in the line below:

mainpred([Item | T_item_list ], DiffList/[H | T_diff_list]) :-

This pushed the the element into DiffList and worked fine as long as I was descending downwards and the SubList remained a pure variable. But as results propagated back up, T_diff_list unified to the list elements I accumulated into the difference list during the descent.

So yes [1,2,3 | T]-T and [1,2,3,4,5 | T]- [4,5 | T] are essentially the same list, but now the SubList [4,5 | T] will cause the recursive call to mainpred to fail since as a list with elements it will no longer unify with `[H | T_diff_list]'.

In case this is the next question, I did try (frantically) to come up with a way to delay unification of the SubList by rewriting [H | T_diff_list]' to be simply SubList`, and then trying to delay unifying SubList as long as I could to avoid this problem. But I found out soon I was just pushing the problem around and because (my opinion) it’s really an issue that is inherent to using a difference list and the linkage a difference has between the List and the SubList. The problem manifests itself when you are using the difference list during recursive descent and reusing it at the same nesting level.

That was my experience at least. I could be wrong of course, but at this stage I would need to see a code sample to believe it.

I’m afraid I’m still not convinced you’re using difference lists correctly, but so be it.

After unifying your initial difference list with DiffList/[H|T_diff_list], the difference list that represents it with H added at the end is DiffList/T_diff_list; that is the term you need to propagate, not DiffList/[H|T_diff_list]. Otherwise you will have the problem you describe.

I suggest looking at the examples in section 15.1 Difference-Lists of The Art of Prolog, Second Edition, now freely available.

Except, at some point you have to bind T_diff_list to something to add new items to the difference list and at that point it is no longer a single variable. As I said in my reply above, at that point, I was just pushing that eventual unification operation around in my code instead of solving the proble. This because the root problem is the antagonistic pair of forces of:

  • needing to unify the SubList with a list at some point to add new elements to the List
  • The need to keep the SubList a pure variable so that it can unify with the two recursive calls in the code.

I can’t explain it better than that. If you ever feel like experimenting, try writing a predicate that:

  • Has a difference list as an argument
  • *Always * adds new elements to the difference list during each iteration/nesting depth
  • Makes (at least) two recursive calls to itself in a single clause at the same iteration/nesting depth

Either you will find that the linkage problem I’ve been pointing out defeats your efforts to do this, or you will find the error in my thinking. If it’s the latter, I’d love to see the code of course.

Regarding point #2 above: Makes (at least) two recursive calls to itself in a single clause. I don’t believe moving one of the recursive calls to another clause will make a difference. The reason switching to a findall/backtracking approach works, is because of the unbinding of the difference list across findall iterations, thereby avoiding the problem created by the predicate trying to make a second recursive call to itself after a previous call bound T_diff_list when it added new elements to the list.

One way to handle this is to define your own difference-list append:

append_d(A-B, B-C, A-C).
?- append_d([1,2,3|D1]-D1, [4,5,6|D2]-D2, Result-D3), D3 = [].
D1 = [4, 5, 6],
D2 = [], 
D3 = [],
Result = [1, 2, 3, 4, 5, 6].

and use this append throughout your code. When you’re done debugging, you can “compile away” the calls to append_d (or use term_expansion to do that for you).

I’ve found the book-keeping for difference lists to be error-prone, so I prefer using extended DCGs to keep track of my accumulators (they also allow me to define my own accumulator predicates, so that I’m not restricted to just lists).

Another option is to do your initial development using regular lists and regular append/3, and use profiling to decide whether there’s any advantage in switching to difference lists (probably there is, because this style of programming often breaks tail-recursion optimization). If so, it’s fairly straightforward to substitute the difference-list append for regular append, especially if you defined and use your own append_d(A,B,C) :- append(A,B,C).

Or use the standard functional programming trick of building your list in reverse order, then reversing it for the final result. This allows more possibilities of TRO and reduces the repeated appends from O(N2) to O(N).