Hi, logic programmers! I’m writing a C-like language parser that looks like this:
expression(Input) --> simple_expression(Input).
expression(Input) --> invocation(Input).
simple_expression(Input) -->
literal(Input) | symbol(Input) | function(Input).
invocation(funcall(Func, Args)) -->
expression(Func), ['('], invocation_args(Args), [')'].
function(func(Params, Body)) -->
['['], function_params(Params), [']'],
['{'], expression(Body), ['}'].
For example, I’d like the following program to be valid and create the following parse tree:
?- string_chars("[x]{ [y]{ x * y } }(2)(3)", Cs), phrase(expression(Tree), Cs, []).
Tree = funcall(
funcall(
func([symbol(x)],
func([symbol(y)],
op('*', symbol(x), symbol(y)))),
[number(2)]),
[number(3)]).
The issue is that this grammar is left-recursive on expression -> invocation -> expression
. I’m trying to solve it from some resources online but would appreciate the input of more experienced people:
- factoring out the left recursion, the relevant part of the grammar becomes
expression(Input) --> simple_expression(Input).
expression(funcall(Func, Args)) -->
simple_expression(Func), inv_args(Input).
inv_args(Args) -->
['('], invocation_args(Args), [')'].
inv_args(????) -->
['('], invocation_args(Args1), [')'], inv_args(Args2).
This would change the parsing tree, and I don’t know how best to handle it.
- keep the left recursion and include an additional difference list to
invocation
, expecting to consume two chars:
invocation(funcall(Func, Args), [_,_|T], T) -->
expression(Func), ['('], invocation_args(Args), [')'].
In this case, how should the difference list be “propagated”? Do I need to change all production rules to include it?
- Finally, if I were using a procedural language, I would use some local state to keep the expression read so far, and lookahead for a parenthesis to parse an invocation.
def parse_expression(stream):
expr = parse_simple_expression(stream)
while stream.peek() == '(':
args = parse_inv_args(stream)
expr = FunCall(expr, args)
return expr
Can this be emulated using semicontext pushback?
Thanks for any pointers,
Bruno Kim.