Picat style matching

jan · February 4, 2021, 2:46pm

Writing code that does rewrites of terms that may contain variables, I recall the pain doing such. One always has to be careful not to instantiate the input data. Thus,

rule(Var, Out) :-
    var(Var), 
    !,
    Out = Var.
rule((A0,B0), (A,B)) :-
    rule(A0, A),
    rule(B0, B).
...

This is sort of bearable. As the term in the head gets more complicated though, you need more and more var/nonvar/compound/etc. tests or you can write something as below

rule(In, Out) :-
    subsumes_term(f(g(X),h(y)), In),
    In = f(g(X),h(y)),
    ...

And, as the patterns contain lots of variables you typically have to use a cut to commit. Picat has the => notation for the neck (:-), which specifies one sided unification. Given that the head term and the argument do not share sub terms, this is semantically that same as the subsumes_term/2 example above, followed by a cut.

Although I do not like many aspects of Picat, I do like this (I notably dislike giving up the program-is-data aspect of Prolog). I once wrote a small prototype that implements => and ?=> in the VM. It keeps clause indexing mostly in place (the indexing could be more selective).

The general form of a Picat rule is Head, Condition => Body, where the Head is matched using one-sided unification and the => acts as :-, ! (there is also a variant ?=> that does the same unification but without implicit cut. I’m not sure we need that).

Note that this Picat style forces steadfast code as we cannot unify before the !, so broken rules like this do not work at all, which is better than Prolog where it works “somewhat”:

max(X,Y,X) :- X >= Y, !.
max(_,Y,Y).

You need to write the code instead, which is exactly what we want users to write.

max(X,Y,M),X>=Y   => M = X.
max(_,Y,M)        => M = Y.

I’ve seen so many people making mistakes that are avoided using => that I’m tempted to add this to SWI-Prolog. In that case I also propose to define that, if no rule matches, it is an error (you can always add a catch-all rule). That would

Make writing (semi) det code much less error prone: always steadfast, always deterministic clause selection and an error if no rule matches.
Make writing rewrite rules for non-ground terms much easier and much more efficient because clause indexing keeps doing its work.

This has very few consequences. The additions are simple and short. The tools are involved in some places, but nothing too serious I suspect. The only other problem I see is clause/2. We could have an alternative that has the full rule instead of the body as second argument:

?- rule(max(_,_,_), C).
C = max(X,Y,M),X>=Y => M = X ;
C = max(_, Y,M) => M = Y.

Picat seems to call these rules. The main disadvantage I see is that rule is not unlikely to clash with user code.

abramobagnara · February 4, 2021, 10:32pm

It sounds fantastic and I think that the fact that if no rule matches we get an error incredibly useful (I believe that #1 enemy in Prolog is unexpected failure).
It would be wonderful if there was a variant of => that prevents also the Body to fail (i.e. failure is an error).

jan · February 5, 2021, 8:08am

Some time ago I’ve been thinking about a way to have determinism annotations and check them at runtime in the VM. This is not easy though. The main problem is (if I recall well) last-call optimization. If the det predicate P calls -as last call- Q, this call to Q must also be det. This however need not be a general property of Q, but merely needs to be a property of under the conditions it is called by P. Maybe there are way to resolve this at near zero cost in the VM, but my previous attempt failed to find a solution.

abramobagnara · February 5, 2021, 9:04am

A “depart” variant that track the expectation is not enough?

jan · February 5, 2021, 9:27am

I don’t recall exactly. It might be enough to track deterministic success. We also have to deal with possible failure though and if no choice points gets created we may discard a lot of forward computation by just discarding parts of the environment (local) stack. I’m not sure I still have my notes on this somewhere. Its been a couple of years ago when I tried and abandoned the idea as unrealistic. Maybe I was wrong …

jan · February 7, 2021, 12:56pm

Small status update. In my demo version I can now write our famous fac/2 as below.

%!  fac(+N, -Fac) is det.

fac(N, F) =>
    fac(N, 1, F).

fac(1, F0, F) =>
    F = F0.
fac(N, F0, F), integer(N), N > 1 =>
    F1 is F0*N,
    N2 is N - 1,
    fac(N2, F1, F).

Besides nicely and deterministically computing the factorial we get some errors for free as shown below.

103 ?- fac(a, X).
ERROR: No rule matches fac(a,1,_10766)
ERROR: In:
ERROR:   [11] fac(a,1,_10814)

104 ?- fac(-1, X).
ERROR: No rule matches fac(-1,1,_1666)
ERROR: In:
ERROR:   [11] fac(-1,1,_1714)

105 ?- fac(_, X).
ERROR: No rule matches fac(_3428,1,_3432)
ERROR: In:
ERROR:   [11] fac(_3476,1,_3480)

Here is another example converted from the library. This gives us a free error if the input list is unbound (while the library version succeeds with 0 and a choicepoint and raises an error on backtracking) and an error if the input is not a list (while the library version fails silently).

%!  sumlist(+List, -Sum) is det.

sumlist(List, Sum) =>
    sumlist(List, 0, Sum).

sumlist([], Sum0, Sum) =>
    Sum = Sum0.
sumlist([H|T], Sum0, Sum) =>
    Sum1 is Sum0+H,
    sumlist(T, Sum1, Sum).

grossdan · February 7, 2021, 1:44pm

Can you explain what we see – how does this work / trace?

What is the equivalent prolog code. What is more done here?

Is this like a function?

Dan

jan · February 7, 2021, 2:21pm

The first message in this topic roughly describes what this means. Yes, you can quite easily translate this into (ISO) Prolog using source-to-source translation. The implementation is in the VM though and a lot more efficient.

grossdan · February 7, 2021, 2:25pm

Ah, ok, I thought it was about variable rewrite – and not directly related – should have continued to read.

What does “one-sided unification” mean and what is steadfast code?

jan · February 7, 2021, 4:05pm

I’m definitely not going to call this Picat. I will acknowledge that it is inspired by Picat. Yes, turning absence of a matching rules into an error is not what Picat does. It is IMO the most useful part of the proposal. The proposal is concerned with making “procedural” Prolog code easier to write, less error prone and easier to debug. Note that you can easily add a final rule to make a such a predicate fail silently on any input, e.g.,

fac(_,_) => fail.

jan · February 7, 2021, 4:39pm

All this is basically syntactic sugar, yes. Given a rule/pattern/… where Ai are arbitrary terms of this form

p(A1,A2,...An), C1, C2, ... => Body

This is semantically equivalent to

p(V1,V2,...Vn) :-
    Pattern = p(A1,A2,...An),
    Args = p(V1,V2,...Vn),
    subsumes_term(Pattern, Args),
    Pattern = Args,
    C1, C2, ...,
    !,
    Body.

So yes => is a hidden cut. It also helps doing what Richard O’Keefe promotes and @grossdan rediscovered for properly handling cuts:

Cut as early as possible
Delay output unification until after the cut.

Of course you can still fool the system by having the guard unify output arguments, but it gets a lot less natural to do so. Most Prolog programmers would go for fib(0,0) :- !.. You cannot do that using =>. You can write fib(0,F), F = 0 => true., but that is certainly not natural

Possibly the set of allowed guard conditions should be restricted. In my prototype they are not.

Yes. I see little reason to support ?=>. We have normal Prolog :- if we want nondeterministic code. The ?=> does exist under the hood as a rule with a guard is translated as

Head ?=> Guard, !, Body.

grossdan · February 7, 2021, 4:46pm

Can you explain - give an example – what you mean with a linear head – i am unfamiliar with the terminology.

thanks,

Dan

grossdan · February 7, 2021, 4:59pm

And, why is this called “linear” …

Btw, with your example, i now recall that one case O’Keefe discusses explicit unification (p.97) — its the max/3 example:

max(X, Y, Z) :- X >= T, !, Z=X
max(X, Y, Y).

He calls this property steadfast – to keep the “logic” out of the unification in the head – and ensure that the head can’t be tricked into giving the wrong answer.

But, what about this is linear …

Dan

jan · February 7, 2021, 5:00pm

I doubt this claim. linear(p(a)) is true, but given the Picat program below, Picat > p(_). fails while _ clearly unifies with a.

p(a) => println("a").

That is why we need the subsumes_term/2. That is not now it works internally though. The current implementation does the normal head unification and the translation of the neck verifies no variables are trailed during the head unification. That implies the head is more general than the actual arguments.

We can further optimize this. As is though, it preserves clause indexing, is simple and comes with practically no overhead.

What makes you think that? I wrote this Picat program that does exactly what you expect while there is a non-scalar [H|T] in the head.

sumlist(List, Sum) =>
    sumlist(List, 0, Sum).

sumlist([], Sum0, Sum) =>
    Sum = Sum0.
sumlist([H|T], Sum0, Sum) =>
    Sum1 = Sum0+H,
    sumlist(T, Sum1, Sum).

jan · February 7, 2021, 5:25pm

This is not a meaningless rule:

Picat> mem(A, [A,b]).

yes

Why? Term subsumption is well defined and understood. That is the semantics. We can also consider this single sided unification, i.e., the goal term is never affected by a head unification, but the variables that appear in the head are unified to the goal term. The support for functional notation in Picat may complicate this (unsure). I’m not proposing to modify Prolog wrt. function notation handling. In fact, I’m not proposing to change anything, only to add something that makes writing code that real logic programmers do not want to write in the first place. Many applications contain vast amounts of what one may can “procedural Prolog” and a lot of that code is wrong.

The rest is implementation.

grossdan · February 7, 2021, 5:42pm

I always felt that for many predicates its a (cumulative) overhead to have choice points and backtracking and related allocations and processing by default. And, wished I could indicate that somehow – and i did – i used many once/1 to indicate in the client that its all call steps deterministic.

Right now i am reviewing my code to see which predicate I can turn into deterministic by placing in their body cuts in the end – and there will likely be many. And, remove the once/1 I placed in the client code.

I think this will make the code easier to read, but it hides what is deterministic and what not – perhaps, i can use a prefix for that – just as a name convention. And/or the rdet directive.

jan · February 7, 2021, 7:58pm

member/2 is probably not something we’d like to tackle this way as it is a multi-moded pure predicate. Starting from SWI-Prolog’s implementation it is a nice exercise though.

member(El, [H|T]) :-
    member_(T, El, H).

member_(_, El, El).
member_([H|T], El, _) :-
    member_(T, El, H).

We start with a trivial rewrite:

member(El, [H|T]) =>
    member_(T, El, H).

member_(_, El0, El) ?=>
    El0 = El.
member_([H|T], El, _) =>
    member_(T, El, H).

What does this mean? The good news is that this type of call keeps working:

?- member(X, [1,2]), X = 2.
X = 2.

I think the call below is a gain:

?- member(X, a).
ERROR: No rule matches member(_1872,a)

But,

?- member(1, X).
ERROR: No rule matches member(1,_1872)

This may or may not be what we want. We can however get the normal Prolog behavior by being explicit about this. The two-layer solution requires an extra clause for both predicates. The result is quite nice though: the predicate behaves as the normal member/2, except for raising an error if the second argument is not a list. The performance is not significantly different. I like this Maybe we need to keep ?=> after all.

member(El, [H|T]) =>
    member_(T, El, H).
member(El, L), var(L) =>
    L = [El|_].

member_(_, El0, El) ?=>
    El0 = El.
member_([H|T], El, _) =>
    member_(T, El, H).
member_(List, El, _), var(List) =>
    List = [H|T],
    member_(T, El, H).

jan · February 8, 2021, 8:13am

You get this

103 ?- member(x, Z).
Z = [x|_14878] ;
ERROR: No rule matches member_(_15976,x,_15980)

104 ?- member(x, [y|T]).
ERROR: No rule matches member_(_1996,x,y)
ERROR: In:
ERROR:   [11] member_(_2044,x,y)

That is an interesting thought. Not sure how that would work though. The main problem with the current design is that it may unify goal arguments with big terms just to make => conclude this was not a good idea and backtrack. As the head unification instructions all roughly do “If the current argument is bound, check it is bound to X; else bind it to X”. We can either add a runtime check before the second step or generate specialized versions of these instructions that do “If the current argument is bound, check it is bound to X; else fail” As the fallback strategy (checking the trail) is cheap we can limit specialization to some of the more popular head unification instructions.

That all is just implementation detail though.

jan · February 8, 2021, 1:00pm

Thanks for the remarks. I don’t really see the point of the Picat fresh/1 example. How is that different from simple Prolog X==Y failing? We know the pitfalls of ==/2 on insufficiently instantiated data.

I still think sum_list(L,S) should not succeed with L=[], S=0 as it does in many Prolog implementations and neither should sum_list(a,S) fail silently. Fixing that in traditional Prolog requires more code and involves slowdown. More code could be the right solutions for the library. The issue remains that we want nice and concise predicates that perform well and behave well, ideally throwing an exception on input for which it was not intended. This proposal fulfills these requirements. For sumlist/2 the only change we need is to replace :- with =>. It runs at the same speed as the original and raise exceptions on the two wrong calls.

The rules I was talking about are e.g.

rule((If->Then;Else), Output) =>
     ...

As a Prolog rule that may easily instantiate the (first) input argument. So we get something ugly like below. This is not only ugly, but it is also hard to get indexed (not impossible) and many programmers get it wrong

rule(IfThenElse, Output) :-
    nonvar(IfThenElse),
    IfThenElse = (IfThen ; Else),
    nonvar(IfTen),
    IfThen = (If->Then),
    ...

So yes, also with => there are still fully logical variables that allow you to mess up. It just forces code as Richard O’Keefe claims one should write semidet single-moded predicates. That is not a game changer, just a small step making programming in Prolog for a large class of problems easier. Granted, this class can barely be called logic programming.

grossdan · February 8, 2021, 3:27pm

j4n_bur53:

sumlist(List, Sum) :-
    sumlist(List, 0, Sum).

sumlist(X, _, _) :- var(X), 
    throw(error(instantiation_error,_)).
sumlist([], Sum0, Sum) :- !,
    Sum = Sum0.
sumlist([H|T], Sum0, Sum) :- !,
    Sum1 is Sum0+H,
    sumlist(T, Sum1, Sum).
sumlist(X, _, _) :-
    throw(error(type_error(list,X),_)).

I love seeing examples like these – that show how to program professionally …

Topic		Replies	Views
Single Sided Unification rules manual entry -- I am reading it but don't graps it ...why? Help! ssu	7	751	March 25, 2021
Do the Picat operators `=>` and `?=>` have a formal name in SWI-Prolog? Help!	9	1192	February 13, 2021
Is this a good pure way to deal with side-effects? Help! discussion	36	2367	April 11, 2021
Ann; SWI-Prolog 8.3.19 Releases	12	871	February 11, 2021
Series of "procedural" predicates -- removing choice points Help!	33	2382	June 4, 2019

Picat style matching

Related topics