Controlling backtracking in a DCG grammar?

DCGs are used many ways and for many problems, the problem is that most of the talk and examples found by Googling may lead someone first learning about them to believe that they are only used for deterministic parsing or NLP which does DCGs a disservice; they can do so much more.

In your original clause (dcg_item_model//0) it was used with no parameters except the two hidden variables used with DCGs and because it was used with phrase/2, the input for the first hidden parameter had an argument. Upon checking dcg_item_specifier//0 it was found to cause an infinite loop when used with the most general query.

So your code worked as expected based on what you wrote it for your need, the problem started when I used a specific clause (dcg_item_specifier//0) with the most general query. The natural thing with the clause being a DCG was to make it work as the most general query, in this case as a generator, and thus the question.

In trying to understand how to get the DCGs to both parse and generate it became apparent that there was nothing wrong with dcg_item_specifier\\0 for recognizing, (a parser returns something and a recognizer just returns true or false)

dcg_item_specifier --> 
   [Specifier], 
   { item_specifier(Specifier) }, 
   dcg_item_specifier.
dcg_item_specifier --> [].

the problem was using it for generating.

So the code needs to be modified to work as a generator.

dcg_item_specifier//0 modified into
dcg_item_specifier//1 to be used as a generator.

dcg_item_specifier([Specifier|Specifiers]) -->
    [Specifier],
    { item_specifier(Specifier) },
    dcg_item_specifier(Specifiers).
dcg_item_specifier([]) --> [].

Still works as a parser,

dcg_item_specifier_test_case([]                 , []                  ,[]).
dcg_item_specifier_test_case([youngest,son,room], [youngest,son,room] ,[]).
dcg_item_specifier_test_case([room]             , [room]              ,[]).

test(001, [nondet,forall(dcg_item_specifier_test_case(Input,Expected_specifiers,Expected_rest))]) :-
    DCG = dcg_item_specifier(Specifiers),
    phrase(DCG,Input,Rest),
    assertion( Specifiers == Expected_specifiers ),
    assertion( Rest == Expected_rest ).

:- end_tests(dcg_item_specifier).

and still fails as a generator

?- dcg_item_specifier(Specifiers,S0,S).
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 0.8Gb, global: 0.1Gb, trail: 73.5Mb
ERROR:   Stack depth: 2,548,715, last-call: 0%, Choice points: 5,097,417
ERROR:   Possible non-terminating recursion:
ERROR:     [2,548,715] user:dcg_item_specifier(_30584596, _30584598, _30584600)
ERROR:     [2,548,714] user:dcg_item_specifier([length:1|_30584628], [length:1|_30584634], _30584622)

but as hard as this may be to believe, the answer is now much closer than it seems.

While dcg_item_specifier//1 is now working as a parser and not as a generator, the rest the code needs to be converted to work as both a parser and a generator.

item_specifier(youngest).
item_specifier(son).
item_specifier(oldest).
item_specifier(room).

dcg_item_model(Atom) -->
	dcg_item(Item),
	dcg_item_connector_phrase(Connector),
    dcg_item_specifier(Specifiers),
    {
        append([[Item],Connector,Specifiers],List),
        atomic_list_concat(List, ' ', Atom)
    }.

dcg_item(key) --> [key].

% Specifiers can occur in a row.
dcg_item_specifier([Specifier|Specifiers]) -->
    [Specifier],
    { item_specifier(Specifier) },
    dcg_item_specifier(Specifiers).
dcg_item_specifier([]) --> [].

dcg_item_connector_phrase([for,the]) --> [for, the].
dcg_item_connector_phrase([to,the]) --> [to, the].
dcg_item_connector_phrase([who,is,the]) --> [who, is, the].
dcg_item_connector_phrase([belong,to]) --> [belong, to].

Still works as a parser,

:- begin_tests(dcg_item_model).

dcg_item_model_test_case([key,for,the,youngest,son,room], 'key for the youngest son room' ,[]).
dcg_item_model_test_case([key,who,is,the,oldest,son]    , 'key who is the oldest son'     ,[]).
dcg_item_model_test_case([key,for,the,room]             , 'key for the room'              ,[]).

test(001, [nondet,forall(dcg_item_model_test_case(Input,Expected_sentence,Expected_rest))]) :-
    DCG = dcg_item_model(Sentence),
    phrase(DCG,Input,Rest),
    assertion( Sentence == Expected_sentence ),
    assertion( Rest == Expected_rest ).

dcg_item_model_fail_test_case([]).
dcg_item_model_fail_test_case([one]).

test(002, [fail,forall(dcg_item_model_fail_test_case(Input))]) :-
    DCG = dcg_item_model(_Sentence),
    phrase(DCG,Input,_Rest).

test(002,[nondet]) :-
    Input = [key,for,the,room],
    DCG = dcg_item_model(Sentence),
    phrase(DCG,Input,Rest),
    assertion( Sentence == 'key for the room' ),
    assertion( Rest == [] ).

:- end_tests(dcg_item_model).
?- run_tests(dcg_item_model).
% PL-Unit: dcg_item_model ...... done
% All 6 tests passed
true.

and still fails as a generator

?- dcg_item_model(Sentence,S0,S).
ERROR: Stack limit (1.0Gb) exceeded
ERROR:   Stack sizes: local: 0.8Gb, global: 0.1Gb, trail: 73.5Mb
ERROR:   Stack depth: 2,548,715, last-call: 0%, Choice points: 5,097,416
ERROR:   Possible non-terminating recursion:
ERROR:     [2,548,715] user:dcg_item_specifier(_30584614, _30584616, _30584618)
ERROR:     [2,548,714] user:dcg_item_specifier([length:1|_30584646], [length:1|_30584652], _30584640)

% Execution Aborted

Now for the serendipity.

As noted in an earlier reply, I found How to write an efficient DCG acceptor for strings of unique tokens? to be of value. The part of value is in the response by Paulo

have separate grammars for generating and recognizing.

If you use DCGs for both parsing and generating often you will see the problem of how to do both with one clause is very common. The reference was used to give a hint without giving the answer away.

When I first stated coding with Prolog I was writing code as book answers, and not as real world code. Then having spent some time looking at the SWI-Prolog code on GitHub started to see more uses of guard statements and conditionals.

Also years ago learned about using length/2 to generate a bunch of holes that needed filling.

?- length(N,L).
N = [],
L = 0 ;
N = [_914],
L = 1 ;
N = [_914, _920],
L = 2 ;
N = [_914, _920, _926],
L = 3 ;
N = [_914, _920, _926, _932],
L = 4 ;
N = [_914, _920, _926, _932, _938],
L = 5 ;

When dcg_item_specifier//1 generates it really just needs to take values from item_specifier/1 and drop the values in the holes created by the result of length/2. The problem with using length/2 is that it will go on forever and again back to an infinite loop, but by constraining the values for N to be between 1 and the number of item_specifier/1 clauses and making sure the values placed in the holes are unique for a response, the generator should work. Notice how that is a declarative description of the problem and not a procedural description of the problem.

dcg_item_generator(Specifiers,S0,_S) :-
    aggregate_all(count, item_specifier(_Specifier), Count),
    between(1,Count,N),
    length(Specifiers,N),
    dcg_item_specifier(Specifiers,S0,[]),
    list_distinct(Specifiers).

list_distinct(List) :-
    list_to_set(List, Set),
    length(List,List_length),
    length(Set,List_length).
:- begin_tests(dcg_item_generator).

test(001, [all(Specifiers ==
    [
     [youngest],[son],[oldest],[room],
     [youngest,son],[youngest,oldest],[youngest,room],
     [son,youngest],[son,oldest],[son,room],
     [oldest,youngest],[oldest,son],[oldest,room],
     [room,youngest],[room,son],[room,oldest],
     [youngest,son,oldest],[youngest,son,room],[youngest,oldest,son],[youngest,oldest,room],[youngest,room,son],[youngest,room,oldest],
     [son,youngest,oldest],[son,youngest,room],[son,oldest,youngest],[son,oldest,room],[son,room,youngest],[son,room,oldest],
     [oldest,youngest,son],[oldest,youngest,room],[oldest,son,youngest],[oldest,son,room],[oldest,room,youngest],[oldest,room,son],
     [room,youngest,son],[room,youngest,oldest],[room,son,youngest],[room,son,oldest],[room,oldest,youngest],[room,oldest,son],
     [youngest,son,oldest,room],[youngest,son,room,oldest],[youngest,oldest,son,room],[youngest,oldest,room,son],[youngest,room,son,oldest],[youngest,room,oldest,son],
     [son,youngest,oldest,room],[son,youngest,room,oldest],[son,oldest,youngest,room],[son,oldest,room,youngest],[son,room,youngest,oldest],[son,room,oldest,youngest],
     [oldest,youngest,son,room],[oldest,youngest,room,son],[oldest,son,youngest,room],[oldest,son,room,youngest],[oldest,room,youngest,son],[oldest,room,son,youngest],
     [room,youngest,son,oldest],[room,youngest,oldest,son],[room,son,youngest,oldest],[room,son,oldest,youngest],[room,oldest,youngest,son],[room,oldest,son,youngest]
    ]
    )]) :-
    DCG = dcg_item_generator(Specifiers),
    phrase(DCG,_Input,[]).

:- end_tests(dcg_item_generator).
?- run_tests(dcg_item_generator).
% PL-Unit: dcg_item_generator . done
% test passed
true.

dcg_item_generator//1 is correctly generating the specifiers and more importantly using dcg_item_specifier//1, albeit without using comb/2 as mentioned earlier.

To use the parser version and the generator version of the clause requires a choice to be made based on the input given. If the parameter to dcg_item_model//1 is a variable then use the generator, if not then use the parser.

dcg_item_specifiers(Specifiers,S0,S) :-
    (
         var(S0)
    ->
        % Generate if var(S0)
        dcg_item_generator(Specifiers,S0,S)
    ;
        % Parse if not var(S0)
        dcg_item_specifier(Specifiers,S0,S)
    ).

and modify dcg_item_model//1 to use the new clause.

dcg_item_model(Atom) -->
	dcg_item(Item),
	dcg_item_connector_phrase(Connector),
    dcg_item_specifiers(Specifiers),
    {
        append([[Item],Connector,Specifiers],List),
        atomic_list_concat(List, ' ', Atom)
    }.

A complete test case would be to large, so here is just dcg_item_model//1 partially run as the most general query.

?- dcg_item_model(Sentence,S0,S).
Sentence = 'key for the youngest',
S0 = [key, for, the, youngest] ;
Sentence = 'key for the son',
S0 = [key, for, the, son] ;
Sentence = 'key for the oldest',
S0 = [key, for, the, oldest] ;
Sentence = 'key for the room',
S0 = [key, for, the, room] ;
Sentence = 'key for the youngest son',
S0 = [key, for, the, youngest, son] ;
Sentence = 'key for the youngest oldest',
S0 = [key, for, the, youngest, oldest] ;

and to show that dcg_item_model//1 is not an infinite loop, all the results of most general query for the clause will be counted.

?- aggregate_all(count,dcg_item_model(Sentence,S0,S),Count).
Count = 256.

Here is the final version of the code.

item_specifier(youngest).
item_specifier(son).
item_specifier(oldest).
item_specifier(room).

dcg_item_model(Atom) -->
	dcg_item(Item),
	dcg_item_connector_phrase(Connector),
    dcg_item_specifiers(Specifiers),
    {
        append([[Item],Connector,Specifiers],List),
        atomic_list_concat(List, ' ', Atom)
    }.

dcg_item(key) --> [key].

dcg_item_specifiers(Specifiers,S0,S) :-
    (
         var(S0)
    ->
        % Generate if var(S0)
        dcg_item_generator(Specifiers,S0,S)
    ;
        % Parse if not var(S0)
        dcg_item_specifier(Specifiers,S0,S)
    ).

% Specifiers can occur in a row.
dcg_item_specifier([Specifier|Specifiers]) -->
    [Specifier],
    { item_specifier(Specifier) },
    dcg_item_specifier(Specifiers).
dcg_item_specifier([]) --> [].

dcg_item_connector_phrase([for,the]) --> [for, the].
dcg_item_connector_phrase([to,the]) --> [to, the].
dcg_item_connector_phrase([who,is,the]) --> [who, is, the].
dcg_item_connector_phrase([belong,to]) --> [belong, to].

dcg_item_generator(Specifiers,S0,_S) :-
    aggregate_all(count, item_specifier(_Specifier), Count),
    between(1,Count,N),
    length(Specifiers,N),
    dcg_item_specifier(Specifiers,S0,[]),
    list_distinct(Specifiers).

list_distinct(List) :-
    list_to_set(List, Set),
    length(List,List_length),
    length(Set,List_length).

I did not take the time to tweak it, just made it to demonstrate how to do it. :smiley:

The part I don’t like is how dcg_item_generator//1 has to calculate the number of clauses using aggregate_all/3 for item_specifier/1 each time it enters the clause and why I was thinking about tabling in the earlier comment.


FYI

While the use of length/2 for generating holes to fill may be like a door opening on new ways to use Prolog, for an example that generates a structure more complicated than a list and that even changes the structure as query progresses see: Hints to understand splendid program to solve Queens

Remember the first part of generate and test is generate as in using a DCG to generate data.

1 Like