Possibly too discriminatory behaviour by atom_string/2

A tiny problem with atom_string/2

In SWI-Prolog 8.1.32 we find (run with ?- rt(_).):

:- begin_tests(atom_string).

% ---
% Type-Transforming leniently 
% ---

% Takes string or atom on the left but generates string on the right

test(stringout_right_1,true(S == "atom")) :- atom_string('atom',S).
test(stringout_right_2,true(S == "atom")) :- atom_string("atom",S).

% Takes string or atom on the right but generates atom on the left

test(atomout_left_1   ,true(S == 'atom')) :- atom_string(S,'atom'). 
test(atomout_left_2   ,true(S == 'atom')) :- atom_string(S,"atom").

% ---
% Comparing leniently
% ---

% Atom on the left is good, whatever is on the right (agree)

test(same1) :- atom_string('atom',"atom").
test(same4) :- atom_string('atom','atom').

% ---
% Comparing strictly
% ---

% String on the left is bad, whatever is on the right 

test(same2,fail) :- atom_string("atom",'atom'). % Consistent but essentially surprising
test(same3,fail) :- atom_string("atom","atom"). % Inconsistent with test stringout_right_2

% ---
% Tests are not expected to be surprising
% ---

test(notsame1,fail)       :- atom_string('mota',"atom").
test(notsame2,fail)       :- atom_string("mota",'atom').
test(notsame3,fail)       :- atom_string("mota","atom").
test(notsame4,fail)       :- atom_string('mota','atom').

:- end_tests(atom_string).

rt(atom_string) :- run_tests(atom_string).

IMHO, the “comparing strictly” tests should really succeed in both cases: having them fail is inconsistent in one case and surprising in the other.


(Is this the old problem that a predicate receiving a fresh variable and binding it to something is really not the same as a predicate receiving a bound variable to work with? Yes, it is! Predicates are really mashups of functions that handle specific cases of queries, based on a compressed representation of the actual relation (i.e. “code”), and it will stay that way.)

Agree. This violates the general idea of SWI-Prolog’s text processing that any text representation is valid as input and the predicate type is only effective for output arguments. Fixed.

After some further reflection, the twist is that … it is no longer a fully logical predicate.

All of the cases below are true:

atom_string('atom', "atom"). 
atom_string('atom', 'atom').  
atom_string("atom", "atom").
atom_string("atom", 'atom'). 

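But if all of them counted as solutions, the predicate would be nondeterministic: a call with one unbound argument would have to yield both the atom and the string. That’s not good.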

So there is a specific “argument use case” that applies:

atom_string('atom', "atom").  Used for "generate left", "generate right" and test
atom_string('atom', 'atom').  Used for "generate left" and test
atom_string("atom", "atom").  Used for "generate right" and test
atom_string("atom", 'atom').  Used for test only

i.e. the predicate behaves secretly as a “oncer”:

atom(X,Y) :- which_use_case(X,Y,UseCase), once(atom(X,Y,UseCase)).

One can live with that of course. Or can one?

It never was. If it were, atom_string(A,S), S = "world" should say A = world. Type conversion is typically not part of the logic of your application; it just synchronizes types for interfacing with other parts of the application that represent data a little differently. These predicates are logically correct provided you respect the mode restrictions and use them with the intended types. That is why SWI-Prolog tries to follow the rules “accept anything unambiguous as input” and “if the argument is a variable, bind it to the type suggested by the name”. This means atom_string/2 in mode (+,+) can be used to efficiently verify that two texts, which may have different representations, represent the same sequence of characters, thanks to your suggested improvement 🙂
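The two rules quoted above can be sketched as an explicit dispatch on instantiation. This is only an illustration, not how atom_string/2 is implemented: lenient_atom_string/2 is a hypothetical name, and the sketch assumes SWI-Prolog's text_to_string/2 to normalise atoms and strings before comparing.

```prolog
:- use_module(library(error)).

% lenient_atom_string(?Atom, ?String): hypothetical illustration only.
% Any bound argument is accepted as text (atom or string); an unbound
% argument is bound to the type suggested by its position.
lenient_atom_string(A, S) :-
    nonvar(A), !,
    text_to_string(A, SA),
    (   var(S)
    ->  S = SA                          % "generate right": always a string
    ;   text_to_string(S, SS),
        SA == SS                        % mode (+,+): compare the characters
    ).
lenient_atom_string(A, S) :-
    nonvar(S), !,
    text_to_string(S, SS),
    atom_string(A, SS).                 % "generate left": always an atom
lenient_atom_string(_, _) :-
    instantiation_error(_).             % mode (-,-) is not supported
```

With this definition all four (+,+) combinations of 'atom' and "atom" succeed, and each call has at most one answer, which is the “oncer” behaviour described earlier in the thread.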

Similar discussions have been raised for e.g., number_codes/2. And no, we do not want

?- number_codes(10, Codes).
Codes = `10` ;
Codes = `0xa` ;
Codes = ...

For floats that gets worse as there are an infinite number of lexical representations of the same float (just keep adding 0s after the dot), so programs would typically not terminate.
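The asymmetry can be made concrete with two small wrappers. The names text_number/2 and number_text/2 are hypothetical and chosen only for this sketch; it goes through atom_codes/2 so that it does not depend on how quoted text is read.

```prolog
% text_number(+Text, -N): input mode. Several lexical forms are accepted
% and map to the same number.
text_number(Text, N) :-
    atom_codes(Text, Codes),
    number_codes(N, Codes).

% number_text(+N, -Text): output mode. Exactly one lexical form is
% produced, so the call stays deterministic.
number_text(N, Text) :-
    number_codes(N, Codes),
    atom_codes(Text, Codes).
```

For example, text_number('10', X) and text_number('0xa', Y) both yield the integer ten, while number_text(10, T) has the single answer T = '10'.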


Ulrich Neumerkel sometimes had similar ideas, i.e. that certain predicates should also type check output arguments. Take this query:

SWI-Prolog behaviour:

Welcome to SWI-Prolog (threaded, 64 bits, version 8.3.0)

?- functor(f(x,y),f,a).
false.

GNU Prolog behaviour:

GNU Prolog 1.4.5 (64 bits)

?- functor(f(x,y),f,a).
uncaught exception: error(type_error(integer,a),functor/3)

Which one is right? The problem is that it might be implementation dependent in which order different modes are handled inside a built-in. If functor(-,+,+) is handled before functor(+,-,-), then you usually get the error. Otherwise not.

The error-free ordering of modes for functor/3 is more efficient, because in functor(+,-,-) you just unpack the functor and arity, whereas in functor(-,+,+) you need to build a compound. But realizing more modes for a predicate also costs more runtime, for example if there were a third, redundant mode functor(+,+,+). I interpreted Ulrich Neumerkel's stance as wanting such a mode, with a type check done there. I now read this thread as asking for an extra mode atom_string(+,+), but with extra conversion as the desideratum instead of extra errors.
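A minimal sketch of what such an extra checking step might look like, written as a wrapper instead of a change to the built-in. strict_functor/3 is a hypothetical name, and the check uses must_be/2 from SWI-Prolog's library(error); whether any system implements functor/3 this way is not claimed here.

```prolog
:- use_module(library(error)).

% strict_functor(?Term, ?Name, ?Arity): hypothetical wrapper.
% Type checks Arity whenever it is bound, even if Term is already bound
% and Arity would otherwise only be compared against the actual arity.
strict_functor(Term, Name, Arity) :-
    (   nonvar(Arity)
    ->  must_be(integer, Arity)     % throws type_error(integer, Arity)
    ;   true
    ),
    functor(Term, Name, Arity).
```

With this wrapper, strict_functor(f(x,y), f, a) raises a type error regardless of the order in which functor/3 itself handles its modes, at the price of one extra test on every call.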


Agree. As for functor/3, the additional test is just a little code and constant time. ISO has some rather horrible ones though. findall/3 should check that the third argument is a partial list. So suppose we call findall/3 with a long instantiated list and few answers: then we need to walk the entire list. Another famous one is call/1, which should validate the goal before starting execution. SWI-Prolog does so as it compiles the body term, opting for fast execution if there is a lot of internal backtracking in the body term. Other systems such as ECLiPSe execute the body term directly. So, for call((fail, 1)), ECLiPSe fails and SWI-Prolog raises an error. Both seem OK to me, but ISO demands the SWI-Prolog behavior.

Note that checking output arguments also breaks the general rule that p(In,Out) is the same as p(In,X), X = Out. I’ve always been fighting for the standard to define the valid domain for the arguments of a predicate, within which the behavior is defined, and to leave the rest undefined. The domain should include cases for which an error might be something the user anticipates, i.e., open/3 should be defined to raise an existence_error in the case the target file does not exist. functor/3 may be defined over (+compound, -, -) and (-,+atomic,+integer) (with the strange arity=0 cases), leaving the rest unspecified (compile time error, runtime error, failure or even a fatal system error).
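The rule can be probed with a small helper that runs a two-argument predicate both ways and reports what happened. The names outcome/2, direct/4 and deferred/4 are hypothetical and only serve this sketch.

```prolog
% outcome(:Goal, -Outcome): run Goal once and report true, false,
% or error(Formal) if it raised an error(Formal, _) exception.
outcome(Goal, Outcome) :-
    catch(( call(Goal) -> Outcome = true ; Outcome = false ),
          error(Formal, _),
          Outcome = error(Formal)).

% direct/4 passes the expected output straight into the call.
direct(P, In, Out, Outcome) :-
    outcome(call(P, In, Out), Outcome).

% deferred/4 calls with a fresh variable and unifies afterwards.
deferred(P, In, Out, Outcome) :-
    outcome(( call(P, In, X), X = Out ), Outcome).
```

For a predicate that respects the rule, direct(P, In, Out, O) and deferred(P, In, Out, O) report the same outcome. For example, deferred(atom_length, abc, foo, O) gives O = false; a system that type checks the output argument would report error(...) for the direct variant instead.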


That sounds like an excellent approach. Sometimes I wish there were flags that one could associate somehow to argument positions to say “check this stuff deeply” during development and unit test runs and “check this stuff shallowly (not at all)” in production.

Related, and probably old hat, but I found a simple trick to add checking to a predicate in a way that can easily be commented out. Just add a prefix clause that calls deterministic checking predicates, which raise an exception if there is a problem, and then fails at the far end.

:- use_module(library(clpfd)).  % needed for #> and #= below

% The check in the first clause of pm/2 is done *at every call* of pm/2.
% It throws if an argument is out-of-domain, but fails otherwise, leaving actual processing
% to be done by subsequent pm/2 clauses.

pm(PN,NN) :- acceptable_p(PN),acceptable_n(NN),fail.

% Actual definition of pm/2:

pm(z, 0).
pm(s(N), X) :- X #> 0, X #= Y+1, pm(N, Y). 
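The checking predicates are not shown in the post. A runnable sketch of the whole trick follows, where the definitions of acceptable_p/1 and acceptable_n/1 are assumptions made up for illustration (Peano terms on the left, non-negative integers on the right, unbound arguments always accepted):

```prolog
:- use_module(library(clpfd)).
:- use_module(library(error)).

% Hypothetical check: unbound, or a (possibly partial) Peano number.
acceptable_p(PN) :- var(PN), !.
acceptable_p(z) :- !.
acceptable_p(s(N)) :- !, acceptable_p(N).
acceptable_p(PN) :- domain_error(peano_number, PN).

% Hypothetical check: unbound (possibly constrained), or a non-negative integer.
acceptable_n(NN) :- var(NN), !.
acceptable_n(NN) :- must_be(nonneg, NN).

% Prefix clause: throws on bad input, otherwise fails over to the real clauses.
pm(PN, NN) :- acceptable_p(PN), acceptable_n(NN), fail.

% Actual definition of pm/2.
pm(z, 0).
pm(s(N), X) :- X #> 0, X #= Y+1, pm(N, Y).
```

With these definitions pm(s(s(z)), X) gives X = 2, while pm(foo, X) raises a domain error instead of failing silently. Commenting out the prefix clause removes all checking without touching the actual definition.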

See also assertion/1. Also consider the Ciao base assertion language subset with runtime checks created by @edison.
