Possibly too discriminatory behaviour by atom_string/2

dtonhofer · May 31, 2020, 11:13am

A tiny problem with atom_string/2

We find (run with ?- rt(_).)

In SWI Prolog: 8.1.32

:- begin_tests(atom_string).

% ---
% Type-Transforming leniently 
% ---

% Takes string or atom on the left but generates string on the right

test(stringout_right_1,true(S == "atom")) :- atom_string('atom',S).
test(stringout_right_2,true(S == "atom")) :- atom_string("atom",S).

% Takes string or atom on the right but generates atom on the left

test(atomout_left_1   ,true(S == 'atom')) :- atom_string(S,'atom'). 
test(atomout_left_2   ,true(S == 'atom')) :- atom_string(S,"atom").

% ---
% Comparing leniently
% ---

% Atom on the left is good, whatever is on the right (agree)

test(same1) :- atom_string('atom',"atom").
test(same4) :- atom_string('atom','atom').

% ---
% Comparing strictly
% ---

% String on the left is bad, whatever is on the right 

test(same2,fail) :- atom_string("atom",'atom'). % Consistent but essentially surprising
test(same3,fail) :- atom_string("atom","atom"). % Inconsistent with test stringout_right_2

% ---
% Tests are not expected to be surprising
% ---

test(notsame1,fail)       :- atom_string('mota',"atom").
test(notsame2,fail)       :- atom_string("mota",'atom').
test(notsame3,fail)       :- atom_string("mota","atom").
test(notsame4,fail)       :- atom_string('mota','atom').

:- end_tests(atom_string).

rt(atom_string) :- run_tests(atom_string).

IMHO, the “comparing strictly” cases should really succeed: having them fail is inconsistent in one case and surprising in the other.

P.S.

(Is this the old problem that a predicate getting a freshvar and binding it to something is really not the same as one getting a boundvar to work with? Yes, it is! Predicates are really mashups of functions that handle specific cases of queries based on a compressed representation of the actual relation (i.e. “code”), and it will stay that way.)

jan · May 31, 2020, 3:28pm

Agree. This violates the general idea of SWI-Prolog’s text processing that any text representation is valid as input and the predicate type is only effective for output arguments. Fixed.

dtonhofer · May 31, 2020, 5:07pm

After some further reflection, the twist is that … it is no longer a fully logical predicate.

All of the cases below are true:

atom_string('atom', "atom"). 
atom_string('atom', 'atom').  
atom_string("atom", "atom").
atom_string("atom", 'atom').

But that would make the predicate nondeterministic. That’s not good.

So there is a specific “argument use case” that applies:

atom_string('atom', "atom").  % Used for "generate left", "generate right" and test
atom_string('atom', 'atom').  % Used for "generate left" and test
atom_string("atom", "atom").  % Used for "generate right" and test
atom_string("atom", 'atom').  % Used for test only

i.e. the predicate behaves secretly as a “oncer”:

atom(X,Y) :- which_use_case(X,Y,UseCase), once(atom(X,Y,UseCase)).

One can live with that of course. Or can one?

jan · June 1, 2020, 7:53am

It never was. If it were, atom_string(A,S), S = "world" should say A = world. Type conversion is typically not part of the logic of your application; it just synchronizes types for interfacing with other parts of the application that represent data a little differently. These predicates are logically correct provided you respect the mode restrictions and use them with the intended types. That is why SWI-Prolog tries to follow the rules “accept anything unambiguous as input” and “if the argument is a variable, bind it to the type suggested by the name”. This means that, thanks to your suggested improvement, atom_string/2 in mode (+,+) can be used to efficiently verify that two texts, which may have different representations, represent the same sequence of characters.
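
To make the two rules concrete, here is a small plunit sketch in the style of the tests at the top of the thread. It assumes a SWI-Prolog version that includes the fix discussed here and the default double_quotes flag (double-quoted text is a string); the test unit name text_modes is made up for this illustration.

```prolog
:- use_module(library(plunit)).

:- begin_tests(text_modes).

% Input side accepts atom or string; output type follows the predicate name.
test(out_string_from_atom, true(S == "abc")) :- atom_string(abc, S).
test(out_atom_from_string, true(A == abc))   :- atom_string(A, "abc").

% Mode (+,+), after the fix: only the character sequence matters.
test(compare_any_text) :- atom_string("abc", abc).

% Mode (-,-): no text to start from, so the goal order matters.
test(unbound_both, throws(error(instantiation_error, _))) :-
    atom_string(_, S), S = "world".

% Binding the string first gives the answer a pure relation would give.
test(bound_first, true(A == world)) :- S = "world", atom_string(A, S).

:- end_tests(text_modes).
```

Running ?- run_tests(text_modes). should pass on such a version; on 8.1.32 the compare_any_text test is the one expected to fail.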

Similar discussions have been raised for e.g., number_codes/2. And no, we do not want

?- number_codes(10, Codes).
Codes = `10` ;
Codes = `0xa` ;
Codes = ...

For floats that gets worse as there are an infinite number of lexical representations of the same float (just keep adding 0s after the dot), so programs would typically not terminate.
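
A minimal sketch of why the reverse direction is many-to-one, assuming standard number syntax: several code lists parse to the same number, while mode (+,-) produces a single canonical one. The helper name parses_to/2 is hypothetical and only serves this illustration.

```prolog
% parses_to(+Atom, -Number): hypothetical helper that parses the
% characters of Atom as a number, using number_codes/2 in mode (-,+).
parses_to(Atom, Number) :-
    atom_codes(Atom, Codes),
    number_codes(Number, Codes).

% Different lexical forms, same number:
%   ?- parses_to('10', X), parses_to('0xa', Y), X =:= Y.
%   ?- parses_to('1.0', X), parses_to('1.00', Y), X =:= Y.
%
% Mode (+,-) yields exactly one representation and no alternatives:
%   ?- number_codes(10, Codes), atom_codes(A, Codes).
```

Enumerating all such forms on backtracking is what the query above would have to do if number_codes/2 were treated as a pure relation.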

jan · June 2, 2020, 7:29am

Agree. As for functor/3, the additional test is just a little code and constant time. ISO has some rather horrible ones though. findall/3 should check that the 3rd argument is a partial list. So if we call findall/3 with a long instantiated list and few answers, we need to walk the entire list. Another famous one is call/1, which should validate the call before starting execution. SWI-Prolog does so as it compiles the body term, opting for fast execution if there is a lot of internal backtracking in the body term. Other systems such as ECLiPSe execute the body term directly. So, using (fail, 1), ECLiPSe fails and SWI-Prolog raises an error. Both seem OK to me, but ISO demands the SWI-Prolog behavior.
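
The call/1 difference can be observed with a small probe. This is only a sketch: goal_outcome/2 is a hypothetical helper name, and the goal is passed in as data so that nothing inspects it before call/1 does.

```prolog
% goal_outcome(+Goal, -Outcome): hypothetical probe that reports whether
% calling Goal succeeded, failed, or raised an error(Formal, _) exception.
goal_outcome(Goal, Outcome) :-
    catch(( call(Goal) -> Outcome = succeeded ; Outcome = failed ),
          error(Formal, _),
          Outcome = raised(Formal)).

% ?- goal_outcome((fail, 1), Outcome).
```

Going by the description above, SWI-Prolog should answer with a raised(...) term because the whole body is validated first, while a system that executes the body directly would answer failed.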

Note that checking output arguments also breaks the general rule that p(In,Out) is the same as p(In,X), X = Out. I’ve always been fighting for the standard to define the valid domain for the arguments of a predicate, within which the behavior is defined, and to leave the rest undefined. The domain should include cases for which an error might be something the user anticipates, i.e., open/3 should be defined to raise an existence_error in the case the target file does not exist. functor/3 may be defined over (+compound, -, -) and (-,+atomic,+integer) (with the strange arity=0 cases), leaving the rest unspecified (compile time error, runtime error, failure or even a fatal system error).
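
As an illustration of "defined inside the domain", the sketch below exercises the two functor/3 modes named above and an anticipated error from open/3. The helper name open_outcome/2 and the file path used in the example query are made up for this example.

```prolog
% Mode (+compound, -, -): inspect an existing term.
%   ?- functor(foo(a, b), Name, Arity).
% Mode (-, +atomic, +integer): build a fresh term with unbound arguments.
%   ?- functor(Term, foo, 2).
% The arity=0 case: the "term" is the atomic value itself.
%   ?- functor(Term, 42, 0).

% open_outcome(+File, -Outcome): hypothetical helper that treats a
% missing file as an anticipated result instead of an uncaught error.
open_outcome(File, Outcome) :-
    catch(( open(File, read, Stream),
            close(Stream),
            Outcome = opened ),
          error(existence_error(source_sink, _), _),
          Outcome = missing).

% ?- open_outcome('/no/such/dir/no_such_file.txt', Outcome).
```

Only the existence_error is caught here on purpose; anything else (for example an unbound File) stays outside the handled domain and propagates.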

dtonhofer · June 5, 2020, 9:49am

That sounds like an excellent approach. Sometimes I wish there were flags that one could associate somehow to argument positions to say “check this stuff deeply” during development and unit test runs and “check this stuff shallowly (not at all)” in production.

Related, and probably old hat, but I found a simple trick to add checking to a predicate that can be easily commented out. Just add a prefix clause that calls deterministic checking predicates, which raise an exception if there is a problem, and then fails at the far end.

% pm/2 below uses CLP(FD) constraints.
:- use_module(library(clpfd)).

% The check in the first clause of pm/2 is done *at every call* of pm/2.
% It throws if out-of-domain, but fails otherwise, leaving actual processing
% to be done by subsequent pm/2 clauses.

pm(PN,NN) :- acceptable_p(PN),acceptable_n(NN),fail.

% Actual definition of pm/2:

pm(z, 0).
pm(s(N), X) :- X #> 0, X #= Y+1, pm(N, Y).
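
For completeness, one way the two checks might look. The post does not show acceptable_p/1 and acceptable_n/1, so the definitions below are assumptions: they let unbound arguments through (so pm/2 still works in both directions) and raise a type error otherwise. The type name peano_number is invented for the error term.

```prolog
:- use_module(library(clpfd)).
:- use_module(library(error)).

% Assumed check: unbound, z, or s(...) of something acceptable.
acceptable_p(P) :- var(P), !.
acceptable_p(z) :- !.
acceptable_p(s(P)) :- !, acceptable_p(P).
acceptable_p(P) :- type_error(peano_number, P).

% Assumed check: unbound (possibly constrained) or a non-negative integer.
acceptable_n(N) :- var(N), !.
acceptable_n(N) :- must_be(nonneg, N).

% Guard clause: throws if out-of-domain, otherwise fails and moves on.
pm(PN, NN) :- acceptable_p(PN), acceptable_n(NN), fail.

% Actual definition of pm/2:
pm(z, 0).
pm(s(N), X) :- X #> 0, X #= Y+1, pm(N, Y).
```

With these definitions ?- pm(s(s(z)), X). gives X = 2, ?- pm(P, 2). gives P = s(s(z)), and ?- pm(foo, X). throws instead of failing silently. Because the guard runs on every recursive call, the whole Peano term is re-walked at each level, which is one reason to comment the guard out in production.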

jan · June 5, 2020, 10:15am

See also assertion/1. Also consider the Ciao base assertion language subset with runtime checks created by @edison.
