Using DCGs for parsing a mustache-like template

Risto-Stevcev · August 11, 2021, 2:20am

Hi Everyone,

I want to write a portable library using DCGs to parse mustache-like templates. The content inside the mustaches is a term (e.g. foo {{some(term)}} bar).

I know about quasi-quotation but I want something more flexible that has these properties:

Can be embedded in a single file, like a markdown or org file, which can be viewed in its “raw” form as a form of documentation
Easy to port to other Prolog implementations (i.e. a DCG)
Have two versions of the string: a raw form that shows the unprocessed terms inside the mustaches, for documentation purposes, and an interpolated form that replaces callable terms with their values, like this:

Raw form:

This is an example of {{append}}:
{{append(X, Y, [1,2,3,4])}}

And you can extract the atom and compound term in the example like this:

Extracted = [append, append(X, Y, [1,2,3,4])]

And then an interpolated form where the atom is interpreted “as is” (depending on how the double_quotes directive is set), and the callable term is run and the result is stringified:

This is an example of append:
?- append(X, Y, [1,2,3,4]).
X = [],
Y = [1, 2, 3, 4] ;
X = [1],
Y = [2, 3, 4] ;
X = [1, 2],
Y = [3, 4] ;
X = [1, 2, 3],
Y = [4] ;
X = [1, 2, 3, 4],
Y = [] ;
false.

I’m a newbie at writing DCGs, this is what I have so far:

mustache(Atom) -->
    string_without("{{", _), "{{", string(Atom), "}}", string(_).

flip(Functor, X, Y) :-
    Goal =.. [Functor, Y, X],
    call(Goal).

?- once(phrase(sequence(mustache, Mustaches), `foo{{bar}}baz{{quxx(foo)}}norf`, L)), maplist(flip(atom_chars), Mustaches, M), atom_chars(L2, L). 
Mustaches = [[98, 97, 114], [113, 117, 120, 120, 40, 102, 111|...]],
L = [110, 111, 114, 102],
M = [bar, 'quxx(foo)'],
L2 = norf.

Boris · August 11, 2021, 6:22am

Very nice. Two comments and a question.

Comment 1: You probably want to use the predicate indicator instead of the name only, so instead of append you’d write append/3 or maybe append/2, depending on which definition of “append” you mean. This might also help with validating the contents of your template.

Comment 2: The DCG rule string_without//2 takes, in the first argument, a list of character codes, none of which may appear in the matched string. It does not take a substring that must not appear. Do you see the difference? One way to get what you need is to match just any string with string//1 and then the delimiter:

string(X), "{{", !

What happens is that string//1 matches arbitrary strings of increasing length on backtracking; then the literal "{{" is matched and the whole thing succeeds. The cut is there so that you only match the first occurrence of the delimiter in your input.
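
A minimal sketch of how that advice could be applied to the original rule, assuming SWI-Prolog with library(dcg/basics). The names mustaches//1 and codes_term/2 are made up for this illustration, and term_to_atom/2 is used to turn the text between the braces into a real term rather than an atom such as 'quxx(foo)':

```prolog
:- use_module(library(dcg/basics)).   % string//1, remainder//1

% mustaches(-Terms)// collects the term inside every {{...}} of the input.
% string//1 tries the shortest match first, so the first "{{" and the
% first "}}" after it are used; the cut commits to that choice.
mustaches([Term|Terms]) -->
    string(_), "{{", string(Codes), "}}", !,
    { codes_term(Codes, Term) },
    mustaches(Terms).
mustaches([]) -->
    remainder(_).                      % no further mustache: skip the rest

% codes_term(+Codes, -Term): parse a code list as a Prolog term.
% Raises a syntax error if the text is not a valid term.
codes_term(Codes, Term) :-
    atom_codes(Atom, Codes),
    term_to_atom(Term, Atom).
```

With the input from the first post:

?- phrase(mustaches(Ts), `foo{{bar}}baz{{quxx(foo)}}norf`).
Ts = [bar, quxx(foo)].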

Question: What would be the expected result/output of the following snippet:

This is an example of {{length/2}}:
{{length(X, Y)}}
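
The question matters because length(X, Y) with both arguments unbound has infinitely many solutions, so an interpolation step that collects all solutions would never finish. One possible way to deal with that, sketched under the assumption of SWI-Prolog, is to cap the number of solutions with limit/2 from library(solution_sequences). The predicate name solutions/3 and the idea of a cap are illustrative choices, not something proposed in the thread:

```prolog
:- use_module(library(solution_sequences)).   % limit/2

% solutions(+Text, +Max, -Solutions)
% Text is the goal as written inside the mustache. It is parsed with its
% variable names kept, run, and at most Max answers are collected. Each
% answer is a list of 'Name'=Value pairs.
% Note: this calls arbitrary goals, so only use it on trusted templates.
solutions(Text, Max, Solutions) :-
    read_term_from_atom(Text, Goal, [variable_names(Bindings)]),
    findall(Bindings, limit(Max, Goal), Solutions).
```

For example, solutions('append(X, Y, [1,2])', 10, S) collects the three answers of that goal, while solutions('length(X, Y)', 2, S) stops after the first two lists instead of running forever.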

EricGT · August 11, 2021, 8:15am

If you plan on having nested mustache braces, e.g.

{{outer 1 {{inner}} outer 2 }}

then the parsing will need to be recursive. Otherwise the closing braces of the innermost mustache will be taken as the closing braces of the outermost one, and the remaining braces will just cause issues/errors.
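
A minimal sketch of such a recursive grammar, assuming SWI-Prolog. The names parts//1 and text_codes//1 and the node shapes text(Atom) and block(Parts) are hypothetical, chosen only for this example:

```prolog
% parts(-Parts)// parses a template into a list of nodes:
%   text(Atom)    a run of plain text
%   block(Parts)  a {{...}} block, whose content is parsed recursively
parts([block(Inner)|Ps]) -->
    "{{", !, parts(Inner), "}}", parts(Ps).
parts([text(Atom)|Ps]) -->
    text_codes(Codes), { Codes = [_|_], atom_codes(Atom, Codes) }, !,
    parts(Ps).
parts([]) --> [].

% text_codes(-Codes)// takes the longest run of codes that does not
% start a "{{" or "}}" delimiter.
text_codes([C|Cs]) -->
    \+ "{{", \+ "}}", [C], !,
    text_codes(Cs).
text_codes([]) --> [].
```

On the nested example above:

?- phrase(parts(Ps), `{{outer 1 {{inner}} outer 2 }}`).
Ps = [block([text('outer 1 '), block([text(inner)]), text(' outer 2 ')])].

Unbalanced braces make phrase/2 fail rather than pairing the wrong delimiters.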

rla · August 11, 2021, 9:41am

One option is to first tokenize the input to separate out text, block starts, block content and block ends. The tokens would be represented by terms like block_start, block_content(Atom) etc. This would be the first set of DCGs. The other set of DCGs would work on top of the stream of these tokens to parse the (potentially nested) structure of the blocks.

This is how rla/simple-template (on GitHub), a text templating processor for SWI-Prolog, works.
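
A rough sketch of the two-stage idea, assuming SWI-Prolog. It is not taken from simple-template; the token names block_start, block_end and text(Atom) and all predicate names are simplified stand-ins chosen for illustration:

```prolog
% Stage 1: character codes -> flat list of tokens.
tokens([block_start|Ts]) --> "{{", !, tokens(Ts).
tokens([block_end|Ts])   --> "}}", !, tokens(Ts).
tokens([text(Atom)|Ts])  -->
    [C], !, plain(Cs), { atom_codes(Atom, [C|Cs]) },
    tokens(Ts).
tokens([])               --> [].

% plain(-Codes)// takes codes up to the next "{{" or "}}".
plain([C|Cs]) --> \+ "{{", \+ "}}", [C], !, plain(Cs).
plain([])     --> [].

% Stage 2: a second DCG that runs over the tokens, not over codes,
% and rebuilds the nesting.
nodes([block(Ns)|Rest]) -->
    [block_start], !, nodes(Ns), [block_end], nodes(Rest).
nodes([text(Atom)|Rest]) -->
    [text(Atom)], !, nodes(Rest).
nodes([]) --> [].

% parse_template(+Codes, -Tree): run both stages in sequence.
parse_template(Codes, Tree) :-
    phrase(tokens(Tokens), Codes),
    phrase(nodes(Tree), Tokens).
```

For example, parse_template(`a{{b{{c}}}}d`, T) gives T = [text(a), block([text(b), block([text(c)])]), text(d)]. Keeping the stages apart means the second grammar never has to look at individual characters.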
