Match on structure, bind whole value

In Erlang I can do something like this:

f(L = [_|_]) -> % code

This will match on any list that has at least one element and bind the whole list to L.

Is there some similar construct available in prolog?

I tried the following on an online playground (I’m on mobile without proper prolog interpreter).

f(A = [_|_], A).

Hoping it would act like identity for any list of length greater than or equal to 1, but it just replied false independent of how I called the predicate. Except for when I use it as a “generator” then it produces results to what I’d expect:

?- f(B, A).
B = (A=[_1514|_1516])
?- f([a], A).
false
?- f([a,b], A).
false

Using a list just for the example, the data that I do in fact expect will be a lot more complex compound terms.


Edit:

As this might also be specific to the online playground I used, it’s this one:

https://swish.swi-prolog.org/

You can’t do both in one go. Either

f(L) :-
    L = [_|_],
    % code using `L`

or

f([X|Xs]) :-
    % code using `[X|Xs]`
1 Like

Thanks, that seems to work in the online playground and I’m pretty sure it will do for the actual usecase later on as well.

If I do it in the head or “later” in the body shouldn’t matter much as I understand it, as prolog would check other clauses due to backtracking anyway?

Correct. And if at some point in the clause you don’t want others to be checked, you can use cut (!), but that is non-logical.

1 Like

True, but there is an efficiency issue.

If you do the test in the head, then Prolog can do some clever things, e.g.:

pred([], ...) :- ...
pred([X|Xs], ...) :- ...

allows the Prolog compiler to generate “indexing” code that so that at runtime it can go directly to the appropriate clause without creating any choice points.

If instead you have

pred([], ...) :- ...
pred(L, ...) :- L = [X|Xs], ...

then the indexing code isn’t generated, so extra choice points are left around.

An unnecessary choice point not only slows things down (by allowing unnecessary backtracking), but also can cause some values to go onto the heap rather than the stack. Adding a “cut” (e.g., pred(X, ...) :- L = [X|Xs], !, ... only partially improves the situation – it’s best to have enough information in the head of the clause to avoid the creation of a choicepoint in the first place.

If you do something like this:

pred([X|Xs], ...) :- ... pred2([X|Xs]), ...

there is a slight inefficiency in creating the 2nd list from [X|Xs] but it’s much less overhead than the gain from the indexing code – [X|Xs] can be as little as a one or two byte code instructions, and can create a cell on the stack, which is automatically removed by tail recursion optimization.
[In theory, a Prolog compiler could detect that the creation of [X|Xs] uses a value from the head, but I’m not aware of any compiler that does this – if you want to know more about the kinds of local and global optimizations that are possible, see Peter van Roy’s thesis on the Aquarius compiler and SWI Prolog deep indexing.]

The bottom line is: put as much structure as you can into the head of a clause and don’t worry about “duplicate” creation of terms.

4 Likes

Yes. Some Prolog systems (AFAIK) do include unification to head arguments that precedes other predicate into their clause indexing and thus the two clauses below are the same from an indexing perspective.

f(X) :- X = [H|T], ...
f([H|T]) :- ...

Others do capture code like f([H|T]]) :- g([H|T]) and avoid a structure copy. Current SWI-Prolog does neither though. I’m not really sure how much there is to be gained by implementing this. As @peter.ludemann claims, duplicating the term and accepting the structure copy is probably the best choice from a performance and logical point of view.

1 Like

I have to say that many times I’ve missed the handling for the first case:

f(X) :- X = [H|T], ...
f([H|T]) :- ...

Not only for lists but also for other terms decomposition in head that don’t prevent indexing.

To be forced to write:

f(p(_,_,_), T) :-
  ...
%% 10 more clauses for other terms

m :-
   X = p(1,2,3),
   f(X, X).

is indeed a pity (at least IMHO).

The Logtalk compiler folds left unifications in a clause body when compiling code in optimized mode. Thus, for the first clause above, you get as head argument [H|T]. This is a useful optimization as left unifications are often generated when compiling grammar rules.

Indexing of this:

Is not extremly difficult. You were about to index the first argument, and you find a variable X. So usually you then have no functor or type to index. But you have the variable X, and you can start analyzing the body of the clause, look what happens with the variable. Such an algorithm is documented here:

Step 2.2.2 of algorithm I of:
http://user.it.uu.se/~kostis/Papers/iclp07.pdf

I have implemented it for Jekejeke Prolog here:

static Object indexValue(int at, Clause clause)
https://github.com/jburse/jekejeke-devel/blob/c3e654b74a60b9d0f2f5b891e8ff0c0c7e6dd021/jekrun/headless/jekpro/model/rope/Index.java#L70

You see that it works here, that choice point elimination, which needs indexing, works fine. Here is a usual append:

app([],X,X).
app([X|Y],Z,[X|T]) :- app(Y,Z,T).

And here is an append with the functor or type to index in the body:

app2(A,X,X) :- A = [].
app2(A,Z,[X|T]) :- A = [X|Y], app2(Y,Z,T).

Here you see that SWI-Prolog cannot eliminate the choice point, since it doesn’t have enough indexing information:

Bildschirmfoto 2020-01-02 um 03.18.16

On the other hand here choice point elimination works fine, even when the a term is moved out of the head:

Bildschirmfoto 2020-01-02 um 03.20.28

Whats not yet working is deep indexing of the term. This is no wonder, since deep indexing does also not yet work in the head of a clause in my Prolog system. But what I would like to do, as an interim benefit, I would like to extend it to nonvar/1 checks.

Currently var/1 checks are also indexed. But nonvar/1 checks have also a few interesting use cases.

Probably true. The implementation should probably be part of improving the weak spot in SWI-Prolog’s indexing code: building efficient indexing for predicates with a small number of clauses (say up to about 10). Above that we are typically dealing with fairly uniform clauses and the current JIT indexing does a pretty good job. Some special few clauses cases have been implemented, but there is quite an important class where indexing is not great. Interesting project, but there is too much more pressing work right now.

The SWI-Prolog DCG compiler does this :slight_smile:

1 Like

Note that the problem is not (only) indexing, but to have the possibility to avoid term rebuilding without to lose indexing. E.g.:

Here we lose indexing but we avoid to rebuild the huge term:

p(g(X), X).
p(A, A) :-
  A = f(Many_Arguments...).
% Many other similar clauses

Here we have indexing but we are forced to rebuild the huge term:

p(g(X), X).
p(f(Many_Arguments...), A) :-
  A = f(Many_Arguments...).
% Many other similar clauses

Here we have both but we are forced to add another argument just to circumvent the limitation:

p(g(X), _, X).
p(f(Many_Arguments...), A, A).
% Many other similar clauses

% To be called with p(In, In, Out).

I understand. But if those source clauses are encapsulated in a Logtalk object (or category) they would be compiled into clauses with g(X) and f(Many_Arguments...) as first-arguments before calling the backend Prolog compiler on those clauses. Thus, you get indexing without having to rebuild the huge term. Without Logtalk, it’s easy to define a term expansion that removes left unifications from a clause.

AFAIK, XSB does this. I am thinking about doing something like that, in particular I consider to compile body terms that are equivalent to a head term to use the head term. That shouldn’t be particularly hard. With that in place, the source transformation by Logtalk does the job. I first need to make up my mind though.

Doing that in the way you describe changes slightly current semantic:

p1(p(X), A) :-
    A = p(X).

q1 :-
    A = p(3),
    p1(A, B),
    same_term(A, B).

p2(A, A) :-
    A = p(_).

q2 :-
    A = p(3),
    p2(A, B),
    same_term(A, B).

If you do what you are writing above these two examples will behave in the same way (i.e. semantic of p1 will change).

If instead the thing has an effect only on indexing for p2 the current semantic of p1 would be preserved.

True. But, this only applies when using destructive assignment. In the normal Prolog world the difference is not observable. So, the issue is whether there are realistic scenarios where one writes

f(p(X)) :- g(p(X)).

with the intend to make a copy and avoid destructive assignment affecting both terms? If not, it is a mere matter of documenting. From my experience with destructive assignment I don’t think this is a real issue.

1 Like

Isn’t this only an issue when using Declarative Semantics Killer ™ seterg/2 and friends?

Isn’t expecting clause:

p1(p(X), A) :-
    A = p(X).

to have different semantics than the clause:

p1(p(X), p(X)).

terrible practice?

This is exactly my point: i.e. doing things as Jan was considering to do leads to your two clauses to have different semantics.

OTOH as Jan W. is saying the difference is very subtle and seldom make a difference in real code.

Jan B. is instead showing how is possible to have the same benefit without to change current semantic.

I see some issues:

  • If we do not allow for transformations to introduce more sharing that is non-observable in the normal Prolog world we seriously harm any possibility for optimization.
  • The Wonder World™ :slight_smile: of setarg/3 and friends should probably only rely on
    • Terms passed as a variable are never copied.
    • duplicate_term/2 always makes a copy.
  • The copy behavior of other predicates such as copy_term/2, findall/3, etc. are undefined. For example, copy_term/2 does not copy ground (sub) terms. Constructs such as findall(X, member(X, List), List2) may or may not copy the elements of List (currently does).
  • [new] whether or not equivalent terms (==) in a clause are shared terms is undefined (I think the ISO term is implementation dependent).

As well as

  • Optimizing p([H|T]) :- ..., q([H|T]) is a good thing.
  • Surely indexing on p(X) :- X = p(Y) is technically possible. SWI-Prolog is not a WAM though and its indexing works quite different. My intuition is that this seriously complicates and probably noticeably slows down JIT indexing.
1 Like

I had also my reservations when I saw the proposal from the ICLP07 paper. But experience so far is that it doesn’t affect JIT indexing that much. Its not only a compile time technique, also dynamic facts don’t have a body so, there is nothing to analyse.

And rules have a body, but I am using the heuristic that only an initial segment of the clause is analyzed. So the indexer sees a clause as follows and only gathers indexing keys from the indexable conditions at the front:

 Head :- ( IndexableCondition )* ( OtherwiseLiteral )*
         \-- only this part ---/
             is analyzed

The proposed indexing by the ICLP07 paper does not affect the runtime, i.e retrieval or execution of clauses. At least in my Prolog system I did not need to change anything in the retrieval or execution of clauses. Maybe this is different for other Prolog systems.

Only the indexer at compile time or for dynamic clauses at runtime, when a clause is added or removed, needs some little smarts to gather indexing keys. Disclaimer: For a JIT-ter replace compile time by index building time during retrieval or execution.

2020 wont be boring I guess…