Comma as atom?

In addition to its other uses, SWI-Prolog parses an unquoted comma as an atom, leading to:

?- atom(,).
true.

?- (,,) =.. L.
ERROR: Syntax error: Unexpected `,' before `)'
ERROR: (,
ERROR: ** here **
ERROR: ,) =.. L . 

?- (,,,) =.. L.
L = [',', ',', ','].

This appears to be because ‘,’ is defined as a solo-character atom, like ‘!’ and ‘;’. I looked at a couple of other Prologs with formal syntax specifications (ECLiPSe and SICStus), and they define only ‘!’ and ‘;’ as solo-character atoms.

And I’m not sure this is still the case, but a draft of the ISO standard (Covington, 1994) also does not permit an unquoted ‘,’ to be an atom.

So I think this is an unnecessary source of incompatibility and perhaps not the intent, as the following error message might suggest:

?- atom(|).
ERROR: Syntax error: Operand expected, unquoted comma or bar found
ERROR: atom
ERROR: ** here **
ERROR: (|) . 

IMO, it would be better to treat an unquoted comma like an unquoted bar. (Note that this would not preclude its use as an operator in unquoted form, just like the bar.) Is there a reason for its current implementation?
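
For reference, the quoted spelling works whether or not the reader accepts a bare comma, so code that needs the comma as an atom can stay portable. A minimal sketch (the predicate name quoted_comma_examples/0 is only for illustration):

```prolog
% Quote the comma whenever it is meant as an atom rather than as a separator.
quoted_comma_examples :-
    atom(','),                  % a quoted comma is an ordinary atom
    (a, b) =.. [',', a, b],     % and it is the functor of a conjunction term
    T =.. [',', x, y],          % building a conjunction from the quoted atom
    T == (x, y).
```

Calling quoted_comma_examples should succeed on any system that follows the standard treatment of ','.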


As your example demonstrates, Prolog operator expressions seem to be an endless source of ambiguities since there’s no syntactic difference between operators and operands. I suppose in this particular case ‘,’ is an infix operator with two ‘-’ atoms as operands. But ‘-’ is also a prefix operator so the first two “tokens” could be parsed as -(','). With sufficient look-ahead (2 tokens?) I guess most, but apparently not all, parsers avoid this interpretation.

Even those Prologs with a formal grammar specification seem to have a number of separate operator-related restrictions (caveats?) which help define what happens when dealing with these ambiguities, but they don’t appear to be “standardized” in any real sense, so I guess it’s not surprising that there are differences between Prolog implementations. And, as you say, there is no “Living Standard” project that would facilitate resolving these issues.

That aside, I do think parsing ‘,’ as a solo-character atom isn’t right.

For SWI’s parser, it should be unambiguous even without lookahead, because - has a lower priority number (i.e. binds more tightly) than ,:

?- current_op(Prio, fy, (-)).
Prio = 200.
?- current_op(Prio, xfy, (',')).
Prio = 1000.

This is a bit difficult to test with , because of its special meaning in the syntax, but you can test the same thing with another infix operator. For example, : binds less tightly than -, so SWI’s parser does not allow a bare : after -:

?- X = - : .
ERROR: Syntax error: Unbalanced operator
ERROR: X = - :
ERROR: ** here **
ERROR:  . 
?- X = - (:).
X = - (:).

On the other hand, combinations where the prefix operator does not bind more tightly are allowed by SWI:

?- X = - ^ . % equal priorities
X = - (^).
?- X = (\+ ^ ). % \+ binds less tightly than ^
X =  (\+ (^)).
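
The priority comparison behind these examples can also be checked programmatically with current_op/3. This is only a sketch: the helper names prefix_priority/2, infix_priority/2 and prefix_binds_tighter/2 are made up for illustration, and the results depend on the operator table that is active in the running system.

```prolog
% Priority of Op as a prefix operator (type fy or fx).
prefix_priority(Op, P) :-
    current_op(P, Type, Op),
    ( Type == fy ; Type == fx ), !.

% Priority of Op as an infix operator (type xfx, xfy or yfx).
infix_priority(Op, P) :-
    current_op(P, Type, Op),
    ( Type == xfx ; Type == xfy ; Type == yfx ), !.

% True if the prefix operator Pre has a lower priority number
% (binds more tightly) than the infix operator In.
prefix_binds_tighter(Pre, In) :-
    prefix_priority(Pre, PPre),
    infix_priority(In, PIn),
    PPre < PIn.
```

With the standard operator table, prefix_binds_tighter((-), ',') succeeds (200 versus 1000) and prefix_binds_tighter((\+), (^)) fails (900 versus 200).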

Proving your point though, not all Prologs care about operator priorities the same way here. SICStus allows a prefix operator before any infix operator, regardless of priorities:

| ?- X = - : .
X = - (:) ? 
yes
| ?- X = - ^ .
X = - (^) ? 
yes
| ?- X = (\+ ^ ).
X = (\+ (^)) ? 
yes

Don’t ask me how ISO-compliant any of this is… My guess is that it isn’t, considering that both SICStus and SWI add clarifying parens even though their parsers don’t need them here.

But I think it’s ambiguous in the general sense in that syntactically the first ‘-’ could be either a prefix operator acting on the atom ‘,’ or an atom which is the first operand to the infix operator ‘,’. So are these “tokens” operands or operators?

Your interpretation assumes the first token is an operator with a precedence, but how does a parser make that decision? Note that the SWI Prolog parser does not interpret it as an operator, which is why it is parenthesized in the output.

Although I would argue that if it weren’t ambiguous, all parsers would behave the same. Or perhaps these are just bugs, except that would imply there’s a documented standard that defines how they should be parsed, and I haven’t seen one of those anywhere.

Errors are a perfectly good result from a parse - they just dictate where additional parentheses/quotes must be used. Different Prologs have made different decisions in implementing their parser. Just another source of incompatibility, except it’s pretty subtle and largely ignored.


I’ve pushed a patch to prevent the comma from being read as a plain atom. It seems to work fine, and I think it is indeed far more likely to catch errors than to cause problems. Let me know if this poses problems for you. Note that SWI-Prolog reads compound arguments with normal high operator precedence, stopping the term at the first toplevel comma or closing parenthesis. That is while x(,) reads as x(’,’,’,’). No more …
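
To check whether a given installation still reads a bare comma as an atom, something like the following probe might help. It is a sketch only: the predicate name reads_bare_comma_as_atom/0 is hypothetical, and it relies on SWI-Prolog’s term_to_atom/2 raising a syntax error for text it cannot parse.

```prolog
% Succeeds if the reader of the running system parses atom(,) with the
% unquoted comma as an atom; fails if that text is rejected as a syntax error.
% Assumes SWI-Prolog, where term_to_atom/2 parses the text of its second argument.
reads_bare_comma_as_atom :-
    catch(term_to_atom(T, 'atom(,)'),
          error(syntax_error(_), _),
          fail),
    T == atom(',').
```

On a system with the patch the goal is expected to fail, and on an older one to succeed.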

Operator precedence handling and reducing operators has been discussed here before. The bottom line is that many systems, including the standard-respecting SICStus, consider the ISO rules too restrictive. Unfortunately there is no de facto standard as to what to accept or what term is produced if multiple reductions are possible. Most systems probably agree on stuff like fx xfy to produce fx(xfy) and a couple of such unambiguous cases. I’m afraid that is what it is :(


Works for me. I agree that operator/operand ambiguity is a much bigger issue and unlikely to be resolved pending an unlikely resurrection of standards activities.

Can we have a flag to get the old behavior? There is a dataset I regularly work with that contains Prolog files with plain-atom commas.