DCGs: only generate a token once in a sequence

Suppose I have the following grammar, representing types with one or more optional supertypes:

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].

It generates the following output:

?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s1, s1, t1] ;
T = P, P = [s1, s1, t2] ;
T = P, P = [s1, s1, s1, t1] .

However, I would like each supertype to appear at most once in an output list. How do I avoid generating an ever-expanding sequence of s1's?

To clarify, I’d like a set of rules that would give me the following output:

?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s2, t1] ;        %  Added as per EricGT's comment
T = P, P = [s2, t2] ;        %  Added as per EricGT's comment
T = P, P = [s1, s2, t1] ;
T = P, P = [s1, s2, t2] ;
T = P, P = [s2, s1, t1] ;
T = P, P = [s2, s1, t2] ;
false

How can I do that with a DCG?

Note that I considered using memberchk/2 to test membership of a supertype in the output list, but that doesn’t work well because the output list is a difference list.

Edit: The ordering of results is not important, but the order of tokens makes a difference, so for example [s2,s1,t1] is not the same as [s1,s2,t1]. See comments below.

Should the result be


?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s2, t1] ;        %  Missing in your version
T = P, P = [s2, t2] ;        %  Missing in your version
T = P, P = [s1, s2, t1] ;
T = P, P = [s1, s2, t2] ;
T = P, P = [s2, s1, t1] ;  % IMHO same as  [s1, s2, t1] 
T = P, P = [s2, s1, t2] ;  % IMHO same as  [s1, s2, t2] 
false

One way of doing is to use tabling.

e.g.

:- untable(types/3).
:- table types/3.

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts), { \+ memberchk(S,Ts) }.

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].
super_type(s3) --> [s3].

If you change the DCG you would need to clear the table.

Edit added Non-Tabling Version
You could also carry around the used super types in a separate argument.

types([T],_) --> type(T).
types([S|Ts],Used) --> super_type(S), {\+ memberchk(S,Used) }, types(Ts,[S|Used]).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].

and use phrase(types(T,[]), P) as the generator.

Neither of these will generate solutions in the exact order you require however.

@Ian Thanks, the tabling version is fine - I don’t need specific ordering. I’m a bit unhappy that your solution is not tail-recursive but that’s probably alright for my needs (I don’t need to generate too big sequences in practice).

I don’t understand how it works though. For example, I don’t understand why memberchk/2 suddendly magickally works, when Ts is still a cyclic tree - or is it? What does tabling change, here?

I also don’t understand why the tabling version “goes infinite” (or at least seems to take a very long time to terminate) if I don’t include memberchk/2.

I’d appreciate an explanation- but I’m grateful because your solution works.

@EricGT

T = P, P = [s2, t1] ;        %  Missing in your version
T = P, P = [s2, t2] ;        %  Missing in your version

Thank you for the correction. My bad for ommitting those.

T = P, P = [s2, s1, t1] ;  % IMHO same as  [s1, s2, t1] 
T = P, P = [s2, s1, t2] ;  % IMHO same as  [s1, s2, t2] 

Ordering changes the meaning of type sequences in my use case, so those are not the same.

From what I can tell, memberchk/2 works there because it’s after the recursive call to types(Ts), so then Ts is a proper list at that point – nothing to do with tabling.

I am far from an expert on tabling, but from what I can tell, the tabling is used to avoid recomputing all the same variants of s1, s2, s3, t1, t2 – once it’s encountered a solution once, it doesn’t need to do another recursive call, preventing the infinite regress.

Tabling and DCGs don’t necessarily play nicely together: http://xsb.sourceforge.net/shadow_site/manual1/node113.html

My solution would be to use 2 accumulators – one to track the items that have been generated/recognized and one regular DCG accumulator.

Oh, right. Thanks.

To be honest, I was trying to keep the number of arguments in the heads of productions to a minumum. But that horse has bolted so I might as well.

EDIT:

Huh. I think this explains why types//1 without the memberchk/2 at the end seems to go infinite:

:-untable(types/3).
:-table(types/3).

:-untable(super_type/3).
:-table(super_type/3).

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].


?- phrase(types(T), P).
ERROR: '$tbl_wkl_add_answer'/4: Not enough resources: private_table_space

And that page you linked to started so promising…

Using EDCGs, you can have multiple accumulators (they don’t show in the clause heads), and they aren’t restricted to lists. (I have some code for type inferencing that uses one accumulator for a symbol table (using red-black trees), one for intermediate expressions that need evaluating, and one for outputting inferred type information.)

Thanks, I had a look at edcg earlier today. I might be able to use it but the new operators might be a problem.

It’s complicated to explain why and it’d probably bore you :slight_smile:

But thanks, it’s a good suggestion.