 # Solving left recursion in sub-production rule

#1

Hi, logic programmers! I’m writing a C-like language parser that looks like this:

``````expression(Input) --> simple_expression(Input).
expression(Input) --> invocation(Input).
simple_expression(Input) -->
literal(Input) | symbol(Input) | function(Input).
invocation(funcall(Func, Args)) -->
expression(Func), ['('], invocation_args(Args), [')'].
function(func(Params, Body)) -->
['['], function_params(Params), [']'],
['{'], expression(Body), ['}'].
``````

For example, I’d like the following program to be valid and create the following parse tree:

``````?- string_chars("[x]{ [y]{ x * y } }(2)(3)", Cs), phrase(expression(Tree), Cs, []).
Tree = funcall(
funcall(
func([symbol(x)],
func([symbol(y)],
op('*', symbol(x), symbol(y)))),
[number(2)]),
[number(3)]).
``````

The issue is that this grammar is left-recursive on `expression -> invocation -> expression`. I’m trying to solve it from some resources online but would appreciate the input of more experienced people:

1. factoring out the left recursion, the relevant part of the grammar becomes
``````expression(Input) --> simple_expression(Input).
expression(funcall(Func, Args)) -->
simple_expression(Func), inv_args(Input).
inv_args(Args) -->
['('], invocation_args(Args), [')'].
inv_args(????) -->
['('], invocation_args(Args1), [')'], inv_args(Args2).
``````

This would change the parsing tree, and I don’t know how best to handle it.

1. keep the left recursion and include an additional difference list to `invocation`, expecting to consume two chars:
``````invocation(funcall(Func, Args), [_,_|T], T) -->
expression(Func), ['('], invocation_args(Args), [')'].
``````

In this case, how should the difference list be “propagated”? Do I need to change all production rules to include it?

1. Finally, if I were using a procedural language, I would use some local state to keep the expression read so far, and lookahead for a parenthesis to parse an invocation.
``````def parse_expression(stream):
expr = parse_simple_expression(stream)
while stream.peek() == '(':
args = parse_inv_args(stream)
expr = FunCall(expr, args)
return expr
``````

Can this be emulated using semicontext pushback?

Thanks for any pointers,

Bruno Kim.

#2

You can use tabling - indeed, parsing left recursive languages was one of the primary targets of this extension recently added to SWI-Prolog.

``````:- module(lr_parse, [expression//1]).

% deprecated ?
% :- use_module(library(tabling)).

:- use_module(library(dcg/basics)).
:- use_module(library(dcg/high_order)).

:- table expression//1.

expression(Input) --> simple_expression(Input).
expression(Input) --> invocation(Input).

simple_expression(Input) -->
literal(Input) | symbol(Input) | function(Input).

invocation(funcall(Func, Args)) -->
expression(Func), "(", invocation_args(Args), ")".

function(func(Params, Body)) -->
"[", function_params(Params), "]",
"{", expression(Body), "}".

literal(N) --> number(N).

symbol(symbol(S)) -->
code_type(csymf,First),
sequence(code_type(csym),Rest),
{atom_codes(S,[First|Rest])
,debug(lr_parse,'~w',symbol(S))
}.

code_type(Type,C) -->
[C], {code_type(C,Type)}.

function_params(Ps) -->
sequence(symbol, (blanks,",",blanks), Ps).

invocation_args(Args) -->
sequence(expression, (blanks,",",blanks), Args).
``````

I don’t understand your grammar too well, and I provided some of the missing non terminals and tokenize phase with freedom, but your question was on point for me to explore library(dcg/high_order).

While debugging tabled grammars, you could need to issue

``````?- abolish_all_tables.
``````

otherwise you could get a failure but no tracing (goals already failed aren’t retried).

My test:

``````?- phrase(expression(E), `[x]{[y]{x}}(2)(3, 4, u)`).
E = funcall(funcall(func([symbol(x)], func([symbol(y)], symbol(x))), ), [3, 4, symbol(u)]).
``````

HTH, Carlo

#3

Thanks for the tips, it worked well! I also didn’t know about `dcg/basics`, had already recreated some of its stuff.

Regarding your comment on the need to import tabling: I had to import it in my system (SWI-Prolog version 7.6.4).

#4

Sorry, but if not impossible, it will be very difficult.

If under pressure, maybe I could try to emulate (limited) tabling, ‘bracketing’ expression//1 call ports (you know, those verbs you see starting a trace line when debugging…) with a memo mechanism, storing at least the first hidden argument and the outcome of expression//1.

But surely would be buggy, slow, and painful to develop and debug…

#5

I don’t know how to retrieve when tabling has been implemented, but remember my first test was with a DCG. Have you tried to uncomment this directive ? What swipl 7.6.4 does ?

``````% deprecated ?
% :- use_module(library(tabling)).
``````

#6

Oh, I think I expressed myself badly, I was just remarking that I had to uncomment the `use_module` line to make it work on my system – but it did work. No need to fake tabling #7

Phew …

You expressed rather clearly, but I’m so bad at English, had read too much in your post…