DCG is too slow with existing search strategy

I’m using: SWI-Prolog version 8.4.2

I want the code to verify an id consists of the right characters (by generating all possible ids)

But what I’m getting is:

A process that effectively never terminates. Also, it generates an empty id which isn’t ideal.

My code looks like this:

id --> [].
id --> 
    [Char], 
    id,
    { char_type(Char, csymf) }.

If I query phrase(id, Id)., I get what I want, but when if I do something like phrase(id, Id), length(Id, 5)., it takes absurdly long.

I also tried something like this:

id(Id) :- forall(member(Char, Id), char_type(Char, csymf)).

But, this doesn’t work when I query something like Id(X).

In the end, I’m hoping to parse kind expressions like kind identifier type -> type - type end for a programming language I’m working on. Id checking needs to be really fast.

Rearrange the constraint order, for sensible efficiency. And to avoid []:

% Want at least 1 char
id --> id_char, id_chars.
    
id_char --> [Char], { char_type(Char, csymf) }.
    
% Can terminate
id_chars --> [].
id_chars --> id_char, id_chars.

Results:

?- time(phrase(id, Id)).
% 11 inferences, 0.000 CPU in 0.000 seconds (86% CPU, 459578 Lips)
Id = ['A'] ;
% 6 inferences, 0.000 CPU in 0.000 seconds (85% CPU, 291149 Lips)
Id = ['A', 'A'] ;
% 6 inferences, 0.000 CPU in 0.000 seconds (82% CPU, 302770 Lips)
Id = ['A', 'A', 'A'] ;
?- length(Id, 5), time(phrase(id, Id)).
% 23 inferences, 0.000 CPU in 0.000 seconds (90% CPU, 618280 Lips)
Id = ['A', 'A', 'A', 'A', 'A'] ;
% 6 inferences, 0.000 CPU in 0.000 seconds (80% CPU, 364277 Lips)
Id = ['A', 'A', 'A', 'A', 'B'] ;

I know you got a good answer and you marked it as a solution, but this is too interesting :smiley: . There are several points of time that you might know that length(Id, 5).

  • you can know this when you write your code
  • you can know this at run time, but before your program evaluates phrase(id, Id)
  • or you only know it at run time, after your program evaluated phrase(id, Id).

Which one is it? This is important because you might have better (more efficient, easier to write) solutions if you can provide this knowledge to the Prolog compiler.

If the list length is known later, then coroutining might be useful:

list_length_freeze(Lst, Len) :-
    when(
        (nonvar(Lst) ; nonvar(Len)),
        list_length_freeze_(Lst, 0, Len)
    ).

list_length_freeze_([], Upto, Len) :-
    (   Upto == Len
    ->  !
    ;   Upto = Len
    ).
list_length_freeze_([_|T], Upto, Len) :-
    inc_nonvar_lt(Upto, Len, Upto1),
    when(
        (nonvar(T) ; nonvar(Len)),
        list_length_freeze_(T, Upto1, Len)
    ).

inc_nonvar_lt(I, V, I1) :-
    (   nonvar(V)
    ->  I < V
    ;   true
    ),
    I1 is I + 1.

Example:

?- list_length_freeze(L, Len), Len = 3.
L = [_, _, _],
Len = 3.

Efficiently incorporating a known, desired length:

% Want at least 1 char
id(Len) --> id_char, { succ(Len0, Len) }, id_chars(Len0).

id_char --> [Char], { char_type(Char, csymf) }.

% Can terminate - cut, for efficiency
id_chars(0) --> !, [].
id_chars(Len) --> id_char, { succ(Len0, Len) }, id_chars(Len0).

Result:

?- time(phrase(id(5), Id)).
% 28 inferences, 0.000 CPU in 0.000 seconds (91% CPU, 779358 Lips)
Id = ['A', 'A', 'A', 'A', 'A'] ;
% 6 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 172682 Lips)
Id = ['A', 'A', 'A', 'A', 'B'] ;
% 6 inferences, 0.000 CPU in 0.000 seconds (88% CPU, 240993 Lips)
Id = ['A', 'A', 'A', 'A', 'C'] ;