Some small omissions in prolog_read_source_term/4?

I’ve been working on adding automatic code formatting support to my LSP server (thanks to @logicmoo). It currently supports some basic formatting, but I’ve run into some issues with prolog_read_source_term/4: while it gives very complete information about the location of terms in the file, terms that are defined as operators are somewhat tricky to deal with. For instance, if a source file contains the following:

:- dynamic foo/1.
:- dynamic(foo/1).

Then the only difference when comparing the two terms is that the second one ends one character later in the file. I’m currently resorting to such “hacks” to guess whether the original source file had parentheses. Quoting is similar: 'foo' and foo are both the atom foo, so I have to notice that the span of the former is two characters longer than the atom itself.

First question: Is there any better/more reliable way for me to ascertain if a term was written with parentheses?

Similarly, separators in lists aren’t indicated, so the lines

:- [1,    2].
:- [1   , 2].

aren’t distinguishable. This I’m less concerned about, since part of what the formatter does is normalize comma positions, but I’m kind of curious if there’s any way to handle this if I did care.

Good to hear. It would be great to have a reformatter. Note that there is a lot of useful stuff in JanWielemaker/reindent on GitHub (“Re-indent SWI-Prolog code”). It is a bit more limited in scope and was made merely to update the layout conventions of the SWI-Prolog libraries.

The subterm_positions option is a great building block that tells you where the various parts of the term come from, but indeed not the location of punctuation or the lexical representation of terms. I.e., 0.1 and 1.0e-1 are the same number and subterm_positions only tells you where they are. Also, quoted atoms and strings have different options for escaping.

What I normally do is read the content into a string. You can then use sub_string/5 to quickly get the lexical representation of literals as well as the space between two consecutive literals.
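A minimal sketch of that approach, assuming SWI-Prolog 7 or later (so double-quoted text is a string). The predicate names read_with_positions/3, slice/4, functional_notation/2 and element_gaps/3 are made up for this illustration and are not part of any library. The source text is passed in as a string here to keep the example self-contained; for a real file you would first load it with something like read_file_to_string/3.

```prolog
:- use_module(library(lists)).          % nextto/3

%!  read_with_positions(+Source:string, -Term, -Pos)
%   Read one term from Source and return its layout information.
%   Offsets in Pos are 0-based character offsets into Source.
read_with_positions(Source, Term, Pos) :-
    setup_call_cleanup(
        open_string(Source, In),
        read_term(In, Term, [subterm_positions(Pos)]),
        close(In)).

%!  slice(+Source, +From, +To, -Text)
%   Text is the literal source text between two offsets.
slice(Source, From, To, Text) :-
    Len is To - From,
    sub_string(Source, From, Len, _, Text).

%!  functional_notation(+Source, +Pos)
%   True if a compound was written as name(...). The functor must
%   start the term (which rules out infix and postfix operators) and
%   be followed immediately by an opening parenthesis.
functional_notation(Source, term_position(From, _, From, FTo, _)) :-
    sub_string(Source, FTo, 1, _, "(").

%!  element_gaps(+Source, +ListPos, -Gaps)
%   Gaps holds the raw text between consecutive list elements, i.e.
%   the comma together with the whitespace around it. Every position
%   term carries From and To as its first two arguments.
element_gaps(Source, list_position(_, _, Elems, _), Gaps) :-
    findall(Gap,
            (   nextto(P1, P2, Elems),
                arg(2, P1, End1),
                arg(1, P2, Start2),
                slice(Source, End1, Start2, Gap)
            ),
            Gaps).

demo :-
    forall(member(Src, [":- dynamic foo/1.", ":- dynamic(foo/1)."]),
           (   read_with_positions(Src, _, term_position(_, _, _, _, [Arg])),
               (   functional_notation(Src, Arg)
               ->  Style = functional
               ;   Style = operator
               ),
               format("~w  =>  ~w~n", [Src, Style])
           )),
    forall(member(Src, [":- [1,    2].", ":- [1   , 2]."]),
           (   read_with_positions(Src, _, term_position(_, _, _, _, [Arg])),
               element_gaps(Src, Arg, Gaps),
               format("~w  =>  ~q~n", [Src, Gaps])
           )).
```

Running demo/0 should classify the two dynamic directives differently and show the differing text between the list elements. The same slice/4 call on the From-To pair of an atom or number gives its lexical form, so 'foo' and foo (or 0.1 and 1.0e-1) can be told apart without guessing from span lengths. Note that functional_notation/2 only answers the name(...) question; a term wrapped in plain grouping parentheses would need separate handling.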
