Trying to understand disjunction in DCG body

I was messing around with some data and needed to convert some date/time strings to different formats and found the need to prefix certain sequences with a leading ‘0’: anything with a single character should get a leading zero; anything with two characters shouldn’t.

I became curious about more general cases, like what about prepending a leading zero to any string (really just a list, I’m not worried about the quotes and environment and all that) containing a single character, and leaving any string longer than 1 character as-is. I also wanted it to be deterministic.

I wrote this DCG rule:

with_leading_zero(X) -->
    ( { length(X,L), L > 1 }
    -> X
    ; [0|X] ).

I can call phrase/2 in these ways with success:

?- phrase(with_leading_zero([1]), X).
X = [0, 1].

?- phrase(with_leading_zero([1,2,3]), X).
X = [1, 2, 3].

?- phrase(with_leading_zero(X), [1,0]).
X = [1, 0].

?- phrase(with_leading_zero(X), [0,1]).
X = [0, 1].

This fails, which makes sense:

?- phrase(with_leading_zero(X), [1]).
false.

But what I don’t understand is why this fails unless I call phrase/3 with the remainder:

?- phrase(with_leading_zero(X), [1,2,3]).
false.

I think you’re committing X to a list of length 2: with X unbound, length(X,L), L > 1 first succeeds with L = 2, and -> discards the longer alternatives, so the rule then fails to match all of [1,2,3].
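To see the commitment in isolation, here is a minimal sketch. The helper name committed_length/2 is made up for illustration; it only mirrors the condition of the if-then-else in the rule, and the answers in the comments are what I would expect at the SWI-Prolog toplevel.

```prolog
% Hypothetical helper (not from the original post): isolates the
% condition used in with_leading_zero//1.
committed_length(X, L) :-
    (   length(X, L), L > 1     % with X unbound, length/2 tries L = 0, 1, 2, ...
    ->  true                    % ->/2 keeps only the first solution of the condition
    ;   fail
    ).

% ?- committed_length(X, L).
% X = [_, _],
% L = 2.                        % a single answer; longer lists are never tried
%
% ?- length(X, L), L > 1.       % without ->/2 the enumeration continues
% X = [_, _],
% L = 2 ;
% X = [_, _, _],
% L = 3 ;
% ...
```

Once X is fixed to two elements, phrase/2 has to consume the whole of [1,2,3] with it and cannot, whereas phrase/3 can hand back the unconsumed rest.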

That’s not a full definition. How does the “anything” end? What actually is it - e.g. an integer? Give it a sensible name.

Defining a zero is easy:

zero --> "0".

For determinism, there are some hints in src/lib/serialization/json.pl of mthom/scryer-prolog on GitHub (at commit 299df50066cd82acafda0bb5dd797853f20b47f0): see the “choice point” comments, which point to the “Argument Indexing in Prolog” video.

… however, can just wrap the parsing phrase call in once/1

Can use e.g. code_type/2 to help define a non-zero.

If the string is intended to be left vague, can use e.g. sequence from SWI-Prolog’s library(dcg/high_order) (high order grammar operations), or seq from the Prolog DCG Primer.
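As a rough sketch of the seq idea, assuming the "anything" really is just an arbitrary list: seq//1 is defined locally here in the way it is commonly written, and padded//1 is a made-up name, not something from the original question.

```prolog
% seq//1 as it is commonly defined: describes a list element by element.
seq([]) --> [].
seq([E|Es]) --> [E], seq(Es).

% Hypothetical name: padded(Xs) describes the zero-padded form of Xs.
padded([X])        --> [0, X].              % exactly one element: prefix a 0
padded([X1,X2|Xs]) --> [X1, X2], seq(Xs).   % two or more elements: unchanged

% ?- phrase(padded([1]), Ys).        % answer: Ys = [0, 1]
% ?- phrase(padded(Xs), [1,2,3]).    % answer: Xs = [1, 2, 3]
% ?- phrase(padded(Xs), [0,1]).      % answers: Xs = [1] and Xs = [0, 1]
```

There is no ->/2 here, so nothing is committed early and phrase(padded(Xs), [1,2,3]) succeeds. The price is that some queries may leave a choice point behind, which is where once/1 comes in.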

It looks as if the first and last of the successful queries show an inconsistency: the second argument is [0,1] for both, but the first argument is [1] in one and [0,1] in the other. Is this part of the spec or an oversight?

Generic wrapper:

once_if_ground(Var, Goal) :-
    (   ground(Var)
    ->  Goal, !
    ;   Goal
    ).
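A small usage sketch for the wrapper, with member/2 standing in for the parsing goal (the member/2 queries are only an illustration and not part of the original problem):

```prolog
:- use_module(library(lists)).   % member/2

% Same definition as above, repeated so the sketch loads on its own.
once_if_ground(Var, Goal) :-
    (   ground(Var)
    ->  Goal, !      % Var is ground: keep the first solution only
    ;   Goal         % Var is not ground: keep all solutions
    ).

% ?- once_if_ground(x, member(x, [x,y,x])).
% true.                          % succeeds once, no leftover choice point
%
% ?- once_if_ground(X, member(X, [a,b])).
% X = a ;
% X = b.                         % X is unbound, so enumeration is preserved
```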

Wow, thanks for the really good responses to such a poorly articulated question.

The “Argument Indexing in Prolog” video by Markus Triska was a really good explanation of the concepts that I was unsure about.

I was able to use what I learned about argument indexing to make my predicate almost semi-deterministic while not using ! or once and also retaining the ability to pose the most general query. I revised my code like this:

with_leading_zero_([L], [], [0,L]).
with_leading_zero_([L|Ls], [_|_], [L|Ls]).
with_leading_zero_if_singleton([L|Ls], Y) :- with_leading_zero_([L|Ls], Ls, Y).

and I was able to make these inferences in the top-level shell:

?- with_leading_zero_if_singleton([1], Y).
Y = [0, 1].

?- with_leading_zero_if_singleton([1,2], Y).
Y = [1, 2].

?- with_leading_zero_if_singleton([1], [0,1]).
true.

?- with_leading_zero_if_singleton(X, [1]).
false.

?- with_leading_zero_if_singleton(X, Y).
X = [_A],
Y = [0, _A] ;
X = Y, Y = [_, _|_].

I also found one case where there is a choice point like in the most general query:

?- with_leading_zero_if_singleton(X, [0,1]).
X = [1] ;
X = [0, 1].

That seems pretty obvious, but it made me think about the underlying first-order logic and how the disjunction is implied by the with_leading_zero_ clauses.
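One way to make that implied disjunction visible is to fold the two clauses into a single clause with an explicit ;/2. This is only a sketch under the same reading of the problem, and the predicate name with_leading_zero_disj/2 is made up for illustration:

```prolog
% Hypothetical single-clause variant: the two alternatives are spelled out.
with_leading_zero_disj(Xs, Ys) :-
    (   Xs = [X], Ys = [0, X]        % singleton: prefix a 0
    ;   Xs = [_, _|_], Ys = Xs       % two or more elements: unchanged
    ).

% ?- with_leading_zero_disj(X, [0,1]).
% X = [1] ;
% X = [0, 1].
```

Both branches are consistent with Ys = [0,1], so both answers are reported. Written this way the clause-level indexing is lost, so this form is more useful for reading the logic than for avoiding choice points.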

One thing that I was originally confused about was how lists work with argument indexing. I had thought that the length of lists was taken into account, like you get in a lot of Lisp pattern matching modules like https://srfi.schemers.org/srfi-204/srfi-204.html. However, it looks like, even with deep indexing, only empty vs. non-empty lists count for argument indexing. Is that right or am I missing something?

For this really simple example, can use:

:- use_module(library(reif)).

with_leading_zero([], 0).
with_leading_zero([H|T], LZ) :-
    if_(H = 0, LZ = [H|T], LZ = [0,H|T]).

This uses if_/3 from the reif pack for determinism.

Result:

?- with_leading_zero(L, LZ).
L = [],
LZ = 0 ;
L = LZ, LZ = [0|_] ;
L = [_A|_B],
LZ = [0, _A|_B],
dif(_A, 0).

However, in a DCG, and while learning the basics of Prolog, I strongly recommend not worrying about unwanted choicepoints: it’s far too much distracting effort for far too little gain.

Instead, just write the DCG so that the only choice (or the most desirable choice) comes first, then wrap the call in once/1 or in once_if_ground/2 from above. There is an art to this, of course :grinning: