Using DCGs for parsing a mustache-like template

Hi Everyone,

I want to write a portable library using DCGs to parse mustache-like templates. Anything inside the mustaches is a term (for example: foo {{some(term)}} bar).

I know about quasi-quotation but I want something more flexible that has these properties:

  1. Can be embedded in a single file, such as a Markdown or Org file, which can be viewed in its “raw” form as documentation
  2. Easy to port to other Prolog implementations (hence a DCG)
  3. Has two versions of the string: a raw version that shows the unprocessed terms inside the mustaches for documentation purposes, and an interpolated version that replaces callable terms with their values, like this:

Raw form:

This is an example of {{append}}:
{{append(X, Y, [1,2,3,4])}}

And you can extract the atom and compound term in the example like this:

Extracted = [append, append(X, Y, [1,2,3,4])]

And then an interpolated form where the atom is interpreted “as is” (depending on how the double_quotes directive is set), and the callable term is run and the result is stringified:

This is an example of append:
?- append(X, Y, [1,2,3,4]).
X = [],
Y = [1, 2, 3, 4] ;
X = [1],
Y = [2, 3, 4] ;
X = [1, 2],
Y = [3, 4] ;
X = [1, 2, 3],
Y = [4] ;
X = [1, 2, 3, 4],
Y = [] ;
false.

I’m a newbie at writing DCGs; this is what I have so far:

mustache(Atom) -->
    string_without("{{", _), "{{", string(Atom), "}}", string(_).

flip(Functor, X, Y) :-
    Goal =.. [Functor, Y, X],
    call(Goal).

?- once(phrase(sequence(mustache, Mustaches), `foo{{bar}}baz{{quxx(foo)}}norf`, L)), maplist(flip(atom_chars), Mustaches, M), atom_chars(L2, L). 
Mustaches = [[98, 97, 114], [113, 117, 120, 120, 40, 102, 111|...]],
L = [110, 111, 114, 102],
M = [bar, 'quxx(foo)'],
L2 = norf.

Very nice. Two comments and a question.

Comment 1: You probably want to use the predicate indicator instead of the name only, so, instead of append you’d write append/3 or maybe append/2, depending on which definition of “append” you mean. This might also help you validate the contents of your template.

Comment 2: The DCG rule string_without//2 takes, in the first argument, a list of character codes, none of which may appear in the matched string. It does not take a substring that must not appear. Do you see the difference? One way to get what you need is to match just any string with string//1 and then the delimiter:

string(X), "{{", !

What happens is that string//1 matches arbitrary strings of increasing length on backtracking; then the literal "{{" is matched and the whole thing succeeds. The cut is there so that you only match the first occurrence of the delimiter in your input.
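
To make this concrete, here is a minimal sketch of the original mustache//1 rewritten along these lines, assuming SWI-Prolog with library(dcg/basics). The names mustaches//1 and codes_term/2 are made up for this illustration, and term_to_atom/2 is a SWI-Prolog built-in that other systems may not provide, so that part would need a substitute when porting.

```prolog
:- use_module(library(dcg/basics)).   % string//1, remainder//1

% mustache(-Codes): skip text up to the first "{{", then capture
% everything up to the first "}}" that follows it.
mustache(Codes) -->
    string(_), "{{", !,
    string(Codes), "}}", !.

% mustaches(-List): collect the contents of every mustache in order,
% then consume whatever text is left over.
mustaches([Codes|Rest]) -->
    mustache(Codes), !,
    mustaches(Rest).
mustaches([]) -->
    remainder(_).

% codes_term(+Codes, -Term): hypothetical helper that reads the
% captured codes as a Prolog term (relies on SWI-Prolog's term_to_atom/2).
codes_term(Codes, Term) :-
    atom_codes(Atom, Codes),
    term_to_atom(Term, Atom).
```

With this loaded, the query

    ?- phrase(mustaches(Ms), `foo{{bar}}baz{{quxx(foo)}}norf`), maplist(codes_term, Ms, Terms).

should bind Terms to [bar, quxx(foo)], so the second element is a real compound term rather than the atom 'quxx(foo)'.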

Question: What would be the expected result/output of the following snippet:

This is an example of {{length/2}}:
{{length(X, Y)}}

If you plan on having nested mustache braces, e.g.

{{outer 1 {{inner}} outer 2 }}

then the parsing needs to be recursive. Otherwise the closing braces of the innermost mustache will be taken as the closing braces of the outermost one, and the remaining braces will just cause errors.
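
A rough sketch of what such a recursive DCG could look like is below. It is only an illustration: the node shapes text(Codes) and block(Nodes) and the names template//1, nodes//1 and plain//1 are assumptions, not something taken from an existing library. It uses \+ inside the DCG body, which SWI-Prolog supports.

```prolog
% template(-Nodes): parse the whole input into a list of nodes.
template(Nodes) --> nodes(Nodes).

% nodes(-Nodes): a sequence of block(Inner) and text(Codes) nodes.
% A block recurses into nodes//1, which is what allows nesting.
nodes([block(Inner)|Rest]) -->
    "{{", !,
    nodes(Inner),
    "}}",
    nodes(Rest).
nodes([text(Codes)|Rest]) -->
    plain(Codes), { Codes = [_|_] }, !,
    nodes(Rest).
nodes([]) --> [].

% plain(-Codes): the longest run of codes that contains no delimiter.
plain([C|Cs]) -->
    \+ "{{", \+ "}}",
    [C], !,
    plain(Cs).
plain([]) --> [].
```

Calling phrase(template(Nodes), `{{outer 1 {{inner}} outer 2 }}`) should give a single block whose children are the text "outer 1 ", a nested block holding "inner", and the text " outer 2 " (all as code lists). Input with unbalanced braces simply fails to parse in this sketch.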

One option is to first tokenize the input to separate out text, block starts, block content and block ends. The tokens would be represented by terms like block_start, block_content(Atom) etc. This would be the first set of DCGs. The other set of DCGs would work on top of the stream of these tokens to parse the (potentially nested) structure of the blocks.
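
As a small sketch of this two-stage idea (not the code of any particular library), the first DCG below turns character codes into tokens and the second one turns the token list into a nested tree. The token names block_start, block_end and text(Atom), and the predicates tokens//1, tree//1 and parse_template/2, are chosen for illustration only.

```prolog
% Stage 1: character codes -> flat list of tokens.
tokens([block_start|Ts]) --> "{{", !, tokens(Ts).
tokens([block_end|Ts])   --> "}}", !, tokens(Ts).
tokens([text(Atom)|Ts])  -->
    text_codes(Codes), { Codes = [_|_] }, !,
    { atom_codes(Atom, Codes) },
    tokens(Ts).
tokens([]) --> [].

% text_codes(-Codes): the longest run of codes without a delimiter.
text_codes([C|Cs]) -->
    \+ "{{", \+ "}}",
    [C], !,
    text_codes(Cs).
text_codes([]) --> [].

% Stage 2: token list -> nested tree. The input here is a list of
% tokens, so the terminals are tokens instead of character codes.
tree([block(Inner)|Rest]) -->
    [block_start], !,
    tree(Inner),
    [block_end],
    tree(Rest).
tree([text(Atom)|Rest]) -->
    [text(Atom)], !,
    tree(Rest).
tree([]) --> [].

% parse_template(+Codes, -Tree): run both stages in sequence.
parse_template(Codes, Tree) :-
    phrase(tokens(Tokens), Codes),
    phrase(tree(Tree), Tokens).
```

For example, parse_template(`a{{b{{c}}d}}e`, Tree) should yield Tree = [text(a), block([text(b), block([text(c)]), text(d)]), text(e)]. Keeping the stages separate means the second grammar never has to look at individual characters, which tends to make the nesting rules easier to read.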

This is how rla/simple-template on GitHub (a text templating processor for SWI-Prolog) works.
