DCGs: only generate a token once in a sequence

stassa.p · July 23, 2020, 3:02pm

Suppose I have the following grammar, representing types with one or more optional supertypes:

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].

It generates the following output:

?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s1, s1, t1] ;
T = P, P = [s1, s1, t2] ;
T = P, P = [s1, s1, s1, t1] .

However, I would like each supertype to appear at most once in an output list. How do I avoid generating an ever-expanding sequence of s1's?

To clarify, I’d like a set of rules that would give me the following output:

?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s2, t1] ;        %  Added as per EricGT's comment
T = P, P = [s2, t2] ;        %  Added as per EricGT's comment
T = P, P = [s1, s2, t1] ;
T = P, P = [s1, s2, t2] ;
T = P, P = [s2, s1, t1] ;
T = P, P = [s2, s1, t2] ;
false

How can I do that with a DCG?

Note that I considered using memberchk/2 to test membership of a supertype in the output list, but that doesn’t work well because the output list is a difference list.

Edit: The ordering of results is not important, but the order of tokens makes a difference, so for example [s2,s1,t1] is not the same as [s1,s2,t1]. See comments below.

EricGT · July 23, 2020, 3:22pm

Should the result be


?- phrase(types(T), P).
T = P, P = [t1] ;
T = P, P = [t2] ;
T = P, P = [s1, t1] ;
T = P, P = [s1, t2] ;
T = P, P = [s2, t1] ;        %  Missing in your version
T = P, P = [s2, t2] ;        %  Missing in your version
T = P, P = [s1, s2, t1] ;
T = P, P = [s1, s2, t2] ;
T = P, P = [s2, s1, t1] ;  % IMHO same as  [s1, s2, t1] 
T = P, P = [s2, s1, t2] ;  % IMHO same as  [s1, s2, t2] 
false

Ian · July 23, 2020, 3:53pm

One way of doing is to use tabling.

e.g.

:- untable(types/3).
:- table types/3.

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts), { \+ memberchk(S,Ts) }.

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].
super_type(s3) --> [s3].

If you change the DCG you would need to clear the table.

Edit added Non-Tabling Version
You could also carry around the used super types in a separate argument.

types([T],_) --> type(T).
types([S|Ts],Used) --> super_type(S), {\+ memberchk(S,Used) }, types(Ts,[S|Used]).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].

and use phrase(types(T,[]), P) as the generator.

Neither of these will generate solutions in the exact order you require however.

stassa.p · July 23, 2020, 4:54pm

@Ian Thanks, the tabling version is fine - I don’t need specific ordering. I’m a bit unhappy that your solution is not tail-recursive but that’s probably alright for my needs (I don’t need to generate too big sequences in practice).

I don’t understand how it works though. For example, I don’t understand why memberchk/2 suddendly magickally works, when Ts is still a cyclic tree - or is it? What does tabling change, here?

I also don’t understand why the tabling version “goes infinite” (or at least seems to take a very long time to terminate) if I don’t include memberchk/2.

I’d appreciate an explanation- but I’m grateful because your solution works.

@EricGT

T = P, P = [s2, t1] ;        %  Missing in your version
T = P, P = [s2, t2] ;        %  Missing in your version

Thank you for the correction. My bad for ommitting those.

T = P, P = [s2, s1, t1] ;  % IMHO same as  [s1, s2, t1] 
T = P, P = [s2, s1, t2] ;  % IMHO same as  [s1, s2, t2]

Ordering changes the meaning of type sequences in my use case, so those are not the same.

jamesnvc · July 23, 2020, 5:25pm

From what I can tell, memberchk/2 works there because it’s after the recursive call to types(Ts), so then Ts is a proper list at that point – nothing to do with tabling.

I am far from an expert on tabling, but from what I can tell, the tabling is used to avoid recomputing all the same variants of s1, s2, s3, t1, t2 – once it’s encountered a solution once, it doesn’t need to do another recursive call, preventing the infinite regress.

peter.ludemann · July 23, 2020, 5:57pm

Tabling and DCGs don’t necessarily play nicely together: http://xsb.sourceforge.net/shadow_site/manual1/node113.html

My solution would be to use 2 accumulators – one to track the items that have been generated/recognized and one regular DCG accumulator.

stassa.p · July 23, 2020, 6:14pm

Oh, right. Thanks.

To be honest, I was trying to keep the number of arguments in the heads of productions to a minumum. But that horse has bolted so I might as well.

EDIT:

Huh. I think this explains why types//1 without the memberchk/2 at the end seems to go infinite:

:-untable(types/3).
:-table(types/3).

:-untable(super_type/3).
:-table(super_type/3).

types([T]) --> type(T).
types([S|Ts]) --> super_type(S), types(Ts).

type(t1) --> [t1].
type(t2) --> [t2].

super_type(s1) --> [s1].
super_type(s2) --> [s2].


?- phrase(types(T), P).
ERROR: '$tbl_wkl_add_answer'/4: Not enough resources: private_table_space

And that page you linked to started so promising…

peter.ludemann · July 23, 2020, 6:36pm

Using EDCGs, you can have multiple accumulators (they don’t show in the clause heads), and they aren’t restricted to lists. (I have some code for type inferencing that uses one accumulator for a symbol table (using red-black trees), one for intermediate expressions that need evaluating, and one for outputting inferred type information.)

stassa.p · July 23, 2020, 6:53pm

Thanks, I had a look at edcg earlier today. I might be able to use it but the new operators might be a problem.

It’s complicated to explain why and it’d probably bore you

But thanks, it’s a good suggestion.

Topic		Replies	Views
Trivial parsing and generating DCG Help!	5	870	August 3, 2019
What's the idiomatic way of developing DCGs? Help!	8	652	December 17, 2020
Controlling backtracking in a DCG grammar? Help!	10	1176	July 11, 2019
Entry level Question: DCGs over structured alphabets Help! how-to	2	140	February 13, 2024
Most efficient DCG for text parsing? Algorithm	4	1850	July 5, 2021

DCGs: only generate a token once in a sequence

Related topics