Careful with format/2

Just saw this little discrepancy. The format library from Markus Triska:

/* Scryer Prolog v0.9.1-61-g84583da5 */
?- format("abc ~f def", [123.45]).
abc 123.45 def   true.

?- format("abc ~f def", [123.456]).
abc 123.456 def   true.

On the other hand in SWI-Prolog:

?- format("abc ~f def", [123.45]).
abc 123.450000 def

?- format("abc ~f def", [123.456]).
abc 123.456000 def

Bug or feature? I don’t know. Was this already discussed?

Edit 12.01.2023
Ok I see, SICStus Prolog explicitly defaults:

‘~Nf’
‘~NF’
N defaults to 6.

https://sicstus.sics.se/sicstus/docs/latest/html/sicstus/mpg_002dref_002dformat.html

Maybe it is a little unwise to deviate from that?

Nobody knows. format/1-3 was introduced by Quintus AFAIK. The initial version in SWI-Prolog simply implemented this spec. Later some options were added. AFAIK none of these broke full compatibility with the original. Many Prolog systems implemented format/1-3, but many of these implementations are partial. The original specs are quite involved, in particular the filling and tab sequences.

If you want to print a float with the minimum number of digits, use ~w.
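For readers who want to reproduce the two behaviours outside Prolog, here is a small sketch in Python, whose `%` operator follows the C printf conventions. Treating `repr()` as the counterpart of ~w is an analogy assumed for illustration, not a statement about how any Prolog system implements ~w.

```python
# C-style "%f" pads to six fraction digits by default, like ~f in SWI-Prolog.
fixed = "%f" % 123.45

# repr() gives the shortest digit string that reads back as the same float,
# which is comparable to what ~w prints for a float.
shortest = repr(123.45)

print(fixed)     # 123.450000
print(shortest)  # 123.45
```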

Should have been part of the standard long ago :frowning:

Interestingly ~g works as well on the Windows platform:

/* SWI-Prolog 9.1.2 */
?- format("abc ~g def", [123.45]).
abc 123.45 def
?- format("abc ~g def", [123.456]).
abc 123.456 def

But ~g is not supported by Scryer Prolog:

?- format("abc ~g def", [123.45]).
   error(domain_error(format_string,"~g def"),format_//2).

More luck with Trealla Prolog:

?- format("abc ~g def", [123.456]).
abc 123.456 def   true.

In the end, SWI-Prolog format/3 for floats simply calls snprintf() using %<arg><c> where <arg> is the numeric argument (default 6) and <c> is the float conversion specifier ([eEfgG]). Seems Quintus, SICStus and Trealla do the same.
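The mapping described above can be sketched roughly as follows. This is Python rather than C, and `format_float` is a hypothetical helper name; it only illustrates how a numeric argument (default 6) and a conversion letter combine into a printf-style specifier. It is not the SWI-Prolog source.

```python
def format_float(value, conv="f", arg=6):
    """Build a C-style spec from a numeric argument and a conversion letter.

    Hypothetical helper: `arg` plays the role of N in ~Nf (default 6) and
    `conv` is one of the letters e, E, f, g, G.
    """
    if conv not in ("e", "E", "f", "g", "G"):
        raise ValueError("unsupported conversion: %r" % (conv,))
    spec = "%%.%d%s" % (arg, conv)   # e.g. "%.6f"
    return spec % value


print(format_float(123.45))           # 123.450000
print(format_float(123.45, "e", 2))   # 1.23e+02
```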

There is a bug in the SWI-Prolog ~Ng format. The specification says:

g
Floating point in e or f notation, whichever is shorter.

You can try it yourself; something looks wrong:

?- format("abc ~2f def", [123.45]).
abc 123.45 def
?- format("abc ~2e def", [123.45]).
abc 1.23e+02 def
?- format("abc ~2g def", [123.45]).
abc 1.2e+02 def

The ~Ng result is neither the ~Ne nor the ~Nf result.

Edit 12.01.2023
But Python does the same, so maybe it’s not a bug?

/* Python 3.11.0rc1 (main, Aug  8 2022, 11:30:54) */
>>> "%.2e" % 123.45
'1.23e+02'
>>> "%.2g" % 123.45
'1.2e+02'

Possible explanation: ~Ng takes a count of significant digits, ~Ne takes a
count of fraction digits in exponential notation, and ~Nf takes a count of
fraction digits in fixed notation.
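A quick way to see the three different meanings of N side by side, again using Python's printf-style `%` operator as a stand-in (an assumption: the Prolog systems that delegate to snprintf() should behave the same way, but this was only run in Python):

```python
value = 123.45

two_f = "%.2f" % value    # two fraction digits in fixed notation
two_e = "%.2e" % value    # two fraction digits of the mantissa
two_g = "%.2g" % value    # two significant digits in total
five_g = "%.5g" % value   # five significant digits: fixed notation is chosen

print(two_f)    # 123.45
print(two_e)    # 1.23e+02
print(two_g)    # 1.2e+02
print(five_g)   # 123.45
```

So with N=2 the g conversion has only two significant digits to spend, which is why its result matches neither the e nor the f result.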

The Prolog community could standardize it voluntarily
without the need of the ISO body. Just have a common place,
where somebody puts up a kind of specification document.

Something like a format/2 Prolog Enhancement Proposal (PEP).

Now I found more discrepancies between Markus Triska’s
pure Prolog implementation and the SWI-Prolog implementation.
Of course these discrepancies are small details, but anyway:

SWI-Prolog tolerates a precision parameter for ~w and, I guess, ignores it?

?- format("abc ~w def", [123.456]).
abc 123.456 def

?- format("abc ~2w def", [123.456]).
abc 123.456 def

Markus Triska’s library doesn’t tolerate a precision parameter for ~w:

?- format("abc ~w def", [123.456]).
abc 123.456 def   true.

?- format("abc ~2w def", [123.456]).
   error(domain_error(format_string,"~2w def"),format_//2).

Edit 13.01.2023
Unfortunately Markus Triska’s pure Prolog implementation is
a little biased. It doesn’t accept an atom as a first argument:

?- format('abc ~w def', [123.456]).
   error(type_error(list,'abc ~w def'),must_be/2).

A type error that says char_list would also be better, since
Scryer Prolog’s double quotes are char lists. I didn’t check
yet: did I call format/2 with a string in the case of SWI-Prolog?

Maybe format/2 cannot be standardized?
Too much effort because of too much diversity?

A middle ground could be to standardize only float formatting,
so that there are some primitives that do float formatting,
and various string interpolations and portraying could be
bootstrapped from them. I find some rudiments here from ROK:

float_codes(Float, Codes, Format) :-
     % like number_codes/2 but only for floats

http://www.cs.otago.ac.nz/staffpriv/ok/pllib.htm

So the standardisation would tackle what the ‘%’ operator can
do in Python, when the left argument is a string and the right
argument is a float. But there is more to be demanded: what
if the right argument is an integer, especially a bigint and not
a smallint, a bigint that cannot be converted to float? So ROK’s
take is a little outdated, since it is not bigint aware.

SWI-Prolog is currently bigint aware:

?- format("abc ~2f def", [123123123123123123123]).
abc 123123123123123123123.00 def

?- format("abc ~2f def", [123123123123123123123.0]).
abc 123123123123123126272.00 def
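The difference between the two results can be reproduced in Python, shown here only as an illustration of the underlying arithmetic. Going through `decimal.Decimal` is one possible way to stay exact for integers; it is an assumption of this sketch and says nothing about how SWI-Prolog does it internally.

```python
from decimal import Decimal

big = 123123123123123123123          # needs more than 53 bits of mantissa

# "%f" converts the integer to a float first, so the low digits are lost.
via_float = "%.2f" % big

# Formatting through Decimal keeps every integer digit.
via_decimal = format(Decimal(big), ".2f")

print(via_float)    # 123123123123123126272.00
print(via_decimal)  # 123123123123123123123.00
```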

Trealla Prolog doesn’t tolerate integer:

?- format("abc ~2f def", [123123123123123123123]).
   error(type_error(float,123123123123123123123),format/2).

Scryer Prolog sometimes produces nonsense for floats:

?- format("abc ~2f def", [123123123123123123123.0]).
abc 1.23 def   true.

Edit 13.01.2023
My conclusion: to reach the level of SWI-Prolog,
a number_codes with format parameters is needed, and
not a float_codes that is restricted to floats.

With a number_codes that also accepts integers, it will go
smooth to also format integers, as SWI-Prolog does.
On my side I started defining a new built-in:

atom_number(-Atom, +Atom, +Integer, +Number)

The above built-in takes a slightly different turn: not codes
but atoms are the currency for number conversion. The
input atom is ‘f’ or ‘e’, and the input integer is the requested
precision. But it is currently too stupid for bigints; I am working on it.

That makes some sense.

Traditionally format/3 accepts atoms and classical Prolog strings. Of course it also accepts SWI-Prolog’s strings. I guess they will add atoms at some point, as there is a lot of code using atoms as the format argument. Arguably, strings are cleaner. In the old days they were also rather expensive though, notably if you implement format/3 in a low-level language. Using an atom, Prolog simply passes the atom and the low-level language gets a pointer to the text. No garbage, no translation. Using a string we push a list of character codes and then parse these back into a C string …

As is, format/3 accepts an arithmetic expression for numerical arguments. This is somewhat dubious, but was needed when rationals were terms rdiv(N,D). After that, the float format may use LibBF/GMP big floats and their formatting if necessary. So, you can do

?- format('~1000f~n', [1r3]).


Makes sense. My first intuition would go to number_codes(+Number, -Codes, +Integer, +Number), or possibly number_to_codes as it is not bi-directional. But it probably also needs a locale argument. That means we need to standardize locale handling. The Prolog standard is very outdated in the modern IT landscape :frowning:

Yes, I got something to this end in formerly Jekejeke Prolog,
there is a module library(system/locale):

format_atom(F, A, S):
format_atom(L, F, A, S):
The predicate formats the arguments A from the format F and
unifies the result with S. The quaternary predicate allows
specifying a locale L.

But it’s an easy implementation that simply delegates to the
Java formatter, which accepts format specifiers in the form “%…”
and not in the form “~…”. The above API doesn’t accept “~…”
and it is a little bit weak as far as Prolog term formatting is concerned.
Given the many specifics of the Prolog based format specifiers
“~…” and that I now deal with Python and JavaScript, I have
to reinvent the wheel. For JavaScript and/or Python there are
surely some packages around that can do more than what I am
currently trying to provide as a minimal locale-less format/[2,3].

So there would be a second PEP or a combined PEP:

format/2 with locale Prolog Enhancement Proposal (PEP).

In my format_atom/[3,4] I have only scratched the surface, Java has
also things like number shaper for more exotic regions. This rather looks
like a lot of work, that could keep 10 people busy for 10 years.

Edit 13.01.2023
PEPs are probably anyway a dead end. Best would be if Prolog
systems were designed around a Novacore. So atom_number/4
would probably fit into a Novacore, but format/[2,3] not.

format/[2,3] would have a pure Prolog implementation, that
can be shared across Prolog systems. Same for locales, a pure
Prolog implementation, or some Semantic Net from ChatGPT.

LoL

A little correction. I was fooling myself. The ~g format specifier
also seems to default to N=6. So it does not do the same as
the ~w format specifier. Here is a test case that shows the default N=6:

?- format("abc ~g def", [120.4567]).
abc 120.457 def

Edit 14.01.2023
Now I found a nice test case for format/[2,3]:

/* SWI-Prolog 9.1.2 */
?- E = sin(pi/6), R is E, format(
   'Evaluating ~w gives ~4f rounded to 4 digits', [E,R]), nl.
Evaluating sin(pi/6) gives 0.5000 rounded to 4 digits
E = sin(pi/6),
R = 0.49999999999999994.

Which Scryer Prolog gets wrong:

/* Scryer Prolog v0.9.1-61-g84583da5 */
?- E = sin(pi/6), R is E, format(
   "Evaluating ~w gives ~4f rounded to 4 digits", [E,R]), nl.
Evaluating sin(pi/6) gives 0.4999 rounded to 4 digits
   E = sin(pi/6), R = 0.49999999999999994.
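The two outputs differ in the way round-to-nearest and truncation differ. The sketch below reproduces both strings in Python with `decimal`; that Scryer Prolog actually truncates is only a guess that happens to match the 0.4999 output, not something verified against its source.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

value = 0.49999999999999994          # the R from the queries above
exact = Decimal(value)               # exact binary value, no rounding yet
step = Decimal("0.0001")             # four fraction digits

rounded = str(exact.quantize(step, rounding=ROUND_HALF_EVEN))
truncated = str(exact.quantize(step, rounding=ROUND_DOWN))

print(rounded)          # 0.5000
print(truncated)        # 0.4999
print("%.4f" % value)   # 0.5000
```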

Astonishingly there is a use-case to also have ~NF:

/* Available in SWI-Prolog 9.1.2 */
?- X is inf, format('abc ~2f def', X).
abc inf def

With a capital format specifier it could print:

/* Not available in SWI-Prolog 9.1.2 */
?- X is inf, format('abc ~2F def', X).
abc INF def

Credits Wikipedia:


https://en.wikipedia.org/wiki/Printf_format_string#Type_field

Edit 13.01.2023
For example the format specifiers ~Ng and ~NG do that already:

?- X is inf, format('abc ~2g def', X).
abc inf def

?- X is inf, format('abc ~2G def', X).
abc INF def
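For comparison, C-style formatting as exposed by Python's `%` operator makes the same lower/upper case distinction for all float conversions, including f and F. This only shows the printf convention referred to in the Wikipedia link; it is not a claim about any Prolog system.

```python
inf = float("inf")

lower_f = "%.2f" % inf   # lower-case conversion letter
upper_f = "%.2F" % inf   # upper-case conversion letter
lower_g = "%.2g" % inf
upper_g = "%.2G" % inf

print(lower_f, upper_f)  # inf INF
print(lower_g, upper_g)  # inf INF
```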

I am not a big fan of also standardizing error terms.
But just a small remark here: it seems SWI-Prolog generates a
non-standard error term:

/* SWI-Prolog 9.1.2 */
?- catch(format('abc ~w def', []), error(E,_), true).
E = format('not enough arguments').

?- catch(format('abc ~w def', [a,b]), error(E,_), true).
E = format('too many arguments').

I tried something else. Used the standard syntax_error/1 error
term, and it now gives:

/* Dogelog Player 1.0.3 */
?- catch(format('abc ~w def', []), error(E,_), true).
E = syntax_error(superflous_format).

?- catch(format('abc ~w def', [a,b]), error(E,_), true).
E = syntax_error(format_missing).

Why syntax error? This error term sees the culprit in the given
template string. A more elaborate error could also return an error
location in the template string. This would be especially useful
for the error result syntax_error(superflous_format), if it would
give some extra information, similar to what Java’s
MissingFormatArgumentException does. I have to check what the other
Prolog systems do.

Edit 17.01.2023
Trealla Prolog has chosen to locate the culprit not in the
given template but in the given varargs list,
but it doesn’t check for too many arguments:

/* Trealla Prolog 2.7.30 */
?- catch(format('abc ~w def', []), error(E,_), true).
   E = domain_error(missing_args,[]).

?- catch(format('abc ~w def', [a,b]), error(E,_), true).
abc a def   true.

That’s of course another take: use more standard
error terms, for example domain_error/2.

I do agree that the error terms from format/1-3 are not great. On the other hand, they are still error(Formal, Context) terms and I doubt anyone ever wants to catch these and do something smart with the error term rather than simply printing it. So, it doesn’t matter much.

Syntax error would (to me) suggest there is something wrong with 'abc ~w def', which is not the case. Standard ISO messages are quite inadequate in this case. We could say there is no term for ~w, so it would be an existence_error. The arguments to that are a type and a term, so that doesn’t fit either. domain_error surely comes to mind, but doesn’t fit very well either. I think the ISO set needs to be extended. There are a lot of things missing :frowning:

Also consider this:

?- format('~s', aap(noot)).
ERROR: Illegal argument to format sequence ~s: aap(noot)
ERROR: In:
ERROR:   [10] format('~s',aap(noot))

?- catch(format('~s', aap(noot)), E, true).
E = error(format_argument_type(s, aap(noot)), context(system:format/2, _)).

I guess a type_error would be appropriate here. Still, it is rather hard to get the context such that we can produce a good error message. In my view, if the error is caused by a wrong program rather than running a correct program in unexpected context (e.g., a file is missing) the main goal is to produce a message that helps the programmer as much as possible to find the cause of the exception.

You could blame 'abc ~w def', in that somebody accidentally put
in an extra format specifier, in the case of ‘not enough arguments’. This
is also what Java does: it gives some additional information
about the extra format specifier. Here is what printf from
Java gives me. It has already parsed and isolated the offending
format specifier, which expects a next argument, which is then missing:

/* Jekejeke Prolog 1.5.5 */
?- catch(printf('abc %s def', []), error(E,_), true).
Error: Unknown template: representation_error('Format specifier ''%s''')
/* java.util.MissingFormatArgumentException extends IllegalFormatException */

?- catch(printf('abc %s def', [p,q]), error(E,_), true).
abc p def

The above also shows that Java printf tolerates additional arguments
in the second test case, unlike what Prolog does. Not sure what C does;
the above is Java, maybe Java adopted it from C? In Prolog,
at least as pioneered by SWI-Prolog, the approach is more strict.
Not sure what Quintus did; was Quintus also that strict? But the
additional argument check is also adopted by Scryer Prolog:

/* Scryer Prolog v0.9.1-65-gf9e3bdb6 */
?- catch(format("abc ~w def", []), error(E,_), true).
   E = domain_error(non_empty_list,[]).

?- catch(format("abc ~w def", [a,b]), error(E,_), true).
   E = domain_error(empty_list,"b").
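As one more data point, Python's `%` operator is strict in both directions, like SWI-Prolog and Scryer Prolog. The sketch below uses a hypothetical helper `try_format` that merely turns the exception into text so both cases can be printed.

```python
def try_format(template, args):
    """Return the formatted text, or the TypeError message on failure."""
    try:
        return template % args
    except TypeError as err:
        return "error: %s" % err


too_few = try_format("abc %s def", ())
too_many = try_format("abc %s def", ("a", "b"))
just_right = try_format("abc %s def", ("a",))

print(too_few)     # error: not enough arguments for format string
print(too_many)    # error: not all arguments converted during string formatting
print(just_right)  # abc a def
```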

It’s a case where the context spans two arguments of the predicate.
A part of the first argument to format/2 is involved and a part of the
second argument to format/2 is involved. I guess it is safe to call
the second argument the varargs argument, although SWI-Prolog
doesn’t support varargs as for example in call/n and it is reified into
a list. Or maybe I am mistaken? On the other hand the first argument
is the format template. In old Prolog documentation it is called format
control. But I think the new terminology is format template. Both are
somehow correct. I find Wikipedia telling me:

The printf format string is a control parameter used
by a class of functions in the input/output libraries of C
and many other programming languages. The string is
written in a simple template language: characters are
usually copied literally into the function’s output, but
format specifiers, which start with a % character, indicate
the location and method to translate a piece of data
(such as a number) to characters.
https://en.wikipedia.org/wiki/Printf_format_string

Most errors also have a location in the format template, which one
can use in an error culprit. But type_error/2 is an example of an error
that is usually emitted without any location, wherever it happens.

I am not there yet. To make such a reference you need a character
offset to give a useful message, and not a line number. And
the format template is not some stream which would have a filename.
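To make the idea of a character offset concrete, here is a rough Python sketch of a checker for “~…” templates. Everything in it is hypothetical: the name `check_template`, the reason atoms, and the simplified grammar (a tilde, optional digits, one letter; only ~~ and ~n consume no argument). Real format/2 has more cases, such as ~* arguments and column stops.

```python
import re

# Simplified directive grammar: "~", optional digits, one directive character.
DIRECTIVE = re.compile(r"~(\d*)(.)", re.DOTALL)
# Directives assumed to consume no argument (an assumption of this sketch).
NO_ARG = {"~", "n"}


def check_template(template, args):
    """Return None if the counts match, else (reason, char_offset)."""
    remaining = len(args)
    for match in DIRECTIVE.finditer(template):
        if match.group(2) in NO_ARG:
            continue
        if remaining == 0:
            # Offset of the directive that has no argument left.
            return ("not_enough_arguments", match.start())
        remaining -= 1
    if remaining > 0:
        # Nothing in the template is to blame; point at its end.
        return ("too_many_arguments", len(template))
    return None


print(check_template("abc ~w def", []))          # ('not_enough_arguments', 4)
print(check_template("abc ~w def", ["a", "b"]))  # ('too_many_arguments', 10)
```

In the first case the offset 4 points at the ~w that has no argument; in the second case there is no single culprit in the template, which is perhaps an argument for blaming the argument list instead.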

Just chewing on the idea of making it a syntax_error/1.

Edit 17.01.2023
About the varargs question. I find that SWI-Prolog does automatic boxing:

?- format('abc ~w def', a).
abc a def

Which is a little dangerous:

?- X = [a,b,c], format('abc ~w def', X).
ERROR: Format error: too many arguments

It would be less dangerous if it were done at compile time
rather than at runtime. Varargs for format/2 cannot work in
general, i.e. in the case of more than one output item,
since there is already a conflict with format/3:

?- format('abc ~w def ~w', a, b).
ERROR: stream `'abc ~w def ~w'' does not exist

Doing boxing the right way can be really hard.
I am not sure whether Ciao Prolog presents me with
some innovation, in that boxing is controlled by the
format template. That’s an interesting behaviour. Did
Quintus have this behaviour? I need to check other
Prolog systems as well. It seems that if there is a single
format specifier in the format template, the second
argument of format/2 is always boxed? Not quite right,
see the second test case:

/* Ciao Prolog 1.22.0 */
?- format('abc ~w def', []).
abc [] def

?- format('abc ~w def', [p]).   %%% no-boxing exception [_]
abc p def

?- format('abc ~w def', p).
abc p def

?- format('abc ~w def', [p,q]).
abc [p,q] def

Because of this format template dependency Ciao Prolog
doesn’t emit the same error messages as SWI-Prolog when
there is only one item to print.

Edit 17.01.2023
I guess this makes the format/2 predicate a little brittle,
when used to print lists. One cannot distinguish [p] and
p in the output, not sure whether it is a tolerable flaw?

It is tolerable in so far as there exists a workaround. The
workaround is to not rely on automatic boxing when there is only
one item to print, and to always wrap it in a list. The same advice applies to SWI-Prolog:

/* Ciao Prolog 1.22.0 */
?- format('abc ~w def', [[p]]).
abc [p] def

?- format('abc ~w def', [p]).
abc p def
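Python's `%` operator has the same kind of automatic boxing and the same trap, which makes for a compact illustration of the advice. Only tuples are treated as the argument list there, so the parallel is loose; `show` is a hypothetical helper name.

```python
def show(item):
    # Always wrap the single item in a one-element tuple: no guessing.
    return "abc %s def" % (item,)


boxed_ok = "abc %s def" % "p"        # automatic boxing works for a non-tuple

try:
    "abc %s def" % ("a", "b", "c")   # a tuple is taken as the argument list
    surprise = None
except TypeError as err:
    surprise = str(err)

print(boxed_ok)                 # abc p def
print(surprise)                 # not all arguments converted during string formatting
print(show(("a", "b", "c")))    # abc ('a', 'b', 'c') def
```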

So when you are Pavlov’s dog and have learned to use
and enjoy Ciao Prolog’s or SWI-Prolog’s feature, because
of some positive feedback where it works, you have to
unlearn this feature nevertheless, which needs the
administration of electric shocks to the programmer.
At least for term output I guess this could be advisable.

For built-ins (i.e., C defined predicates) it does support variadic arguments. There is no way to define such things in Prolog and I don’t think it is a good idea to promote them because f/1 and f/2 are as far as Prolog is concerned different functions.

The current behavior that a single argument need not be placed in a list is inherited from Quintus. I think this should be deprecated as it is indeed very easy to make mistakes.

Currently not sure whether it is de facto deprecated now,
in that Scryer Prolog, Trealla Prolog, Tau Prolog, etc…
don’t implement it anymore. There could be always surprises!

Even if it is deprecated, they might implement something
for some backward compatibility. Maybe Logtalk has some
linter to spot such code smells? Delegating the problem

to tooling could be the solution.

Edit 17.01.2023
Markus Triska’s pure Prolog implementation doesn’t provide boxing anymore:

/* Scryer Prolog v0.9.1-65-gf9e3bdb6 */
?- format("abc ~w def", p).
   error(type_error(list,p),must_be/2).

Oki Doki

check/0 validates the number of arguments, but currently not if a variable is passed. Possibly we should warn in that case too, provided that the template is known.

Edit: It already issues a deprecation warning.