What's the rationale for `must_be/2` to throw on type mismatch?

In Type tests in Prolog (YouTube presentation of the series “The Power of Prolog”), the type-testing predicates atom_si/1, integer_si/1, list_si/1 (si standing for “sufficiently instantiated”) are presented.

These predicates throw an instantiation error if their parameter is not “sufficiently instantiated” to make a decision, and otherwise succeed or fail depending on whether the test passes or not. Let’s say they behave in a succeed/fail/throw manner.

The standard type tests like atom/1, integer/1 fail without further ado if they receive a fresh variable, there is no “unknown for now” for them. They behave in a succeed/fail/fail manner. Not nice.

Markus Triska writes on quora

First, the type-testing predicates like atom/1, integer/1 and compound/1 would (and should) throw instantiation errors if their arguments are not sufficiently instantiated.

This is also what the original versions of Prolog did. However, DEC 10 Prolog chose to replace instantiation errors by silent failures, and this has been perpetuated in the Edinburgh tradition for type tests including the ISO standard.

The fact that the original behaviour was not kept is very unfortunate and easily leads to programs that are incomplete.

There is a related discussion on Stack Overflow: Safer type tests in Prolog. Quote:

An answer false to integer(X) means that there is no integer. The reason integer(X) fails is that DEC10 did not have errors, whereas its predecessor (Marseille) Prolog I did have errors, but nobody read the (French) papers.

So, I thought predicate must_be/2 from library(error) would be more like it, throwing if unsure, succeeding if the check passes, and failing otherwise (being succeed/fail/throw). But actually not! must_be/2 doesn’t fail, it throws instead (i.e. its behaviour is succeed/throw/throw).

On the other hand, is_of_type/2 uses the old-school succeed/fail/fail behaviour.

What’s the design rationale behind must_be/2, and is there a predicate that does succeed/fail/throw?

(Also I don’t understand current_type/3 … what does it do?)

Here is some test code (update: fixed because the catcher in test(_,throws(Catcher)) cannot be a free variable that is set to a catcher term in the test body. (But why))

:- begin_tests(must_be).

% ---
% Old school testing for atom-icity
% Silently fails on fresh variable (instead of throwing).
% ---

test(oldschool_yes)           :- atom(foo).
test(oldschool_no, fail)      :- atom(1/137).
test(oldschool_unknown, fail) :- atom(_).

% ---
% must_be/2: Likes to throw too much.
% Test "must_be_no" had better fail instead of throwing?
% ---

test(must_be_yes) :-
   must_be(atom,foo).
   
test(must_be_no, error(type_error(_,_),_)) :-
   must_be(atom,1/137).
  
test(must_be_unknown, error(instantiation_error)) :-
   must_be(atom,_).

% doesn't run backwards

test(must_be_fwd_only, error(instantiation_error)) :-
   bagof(X, must_be(X,foo), _Types).

% ---
% is_of_type/2: Behaves old-school-ish
% ---

test(is_of_type_yes)           :- is_of_type(atom,foo).
test(is_of_type_no, fail)      :- is_of_type(atom,1/137).
test(is_of_type_unknown, fail) :- is_of_type(atom,_).

% doesn't run backwards

test(is_of_type_fwd_only, error(instantiation_error)) :-
   bagof(X, is_of_type(X,foo), _Types).

:- end_tests(must_be).

rt :- run_tests(must_be).
1 Like

Throwing in the event of incorrect instantiation allows us to see where we have logical errors in the domain. Silent failure is not always a nice way to proceed when the domain has been accidentally expanded to a class we didn’t expect. If you mean for a predicate to choose amongst options, or silently fail on a given type then you don’t want must_be/2, but if you think the contract of a predicate should ensure that it is never called without passing a given type, then you can ensure the buck stops there.

I’ve found silent failure to be very irritating and confusing during debugging and so I often either insert assertions, write smaller chunks or code defensively. Ultimately it would be nice to have a very seamless system that allows a combination of static and dynamic reasoning to avoid these problems. Maven gets us part of the way there - another useful piece of the puzzle is Schriver’s “type-check” - but neither of them are not quite ergonomic enough to really be used in anger (yet).

A perhaps less important point is that it allows you to reason about predicates in such a way that we can use flow analysis to determine whether type checks are required. This could (in principle) be used to do things like unboxing or other optimisations while being less precise (subsequent disjunction/choice-points do not need to be treated specially) than would be required otherwise.

For instance:


p(X) :- 
  must_be(integer, X).

r(X,Y) :- 
  Y is X + 1.

q(X,1) :- 
  p(X).
q(X,Y) :-
  r(X,Y).

Once we have encountered p/1 we know that X is going to be an integer in r/2.

1 Like

You know the virtual machine of Prolog is WAM. Now some decades
ago there was an attempt to add types to the WAM. Maybe should do that?

C. Beierle and E. Börger.
Refinement of a typed WAM extension by polymorphic order-sorted types .
Formal Aspects of Computing, 8(5):539-564, 1996.
https://www.fernuni-hagen.de/wbs/research/papers/res/BeierleBoerger96_Journal_FAC_PartII.abstract.html

Disclaimer: Many Prolog systems use derivations from the WAM,
so it not really became a practical byte code, but still good for discussions.

That’s exactly the point. In fact, I have come to this conclusion:

  1. Traditional Prolog type tests atom/1, integer/1 etc. are faulty because they “silently fail”:
  • Type match: Success
  • Type mismatch: Failure
  • Cannot decide because a fresh variable was passed: Silent failure
  1. The intent of must_be/2 is to do contract checks on predicate entry (or exit). That’s why it is in library(error). So its behaviour is as expected:
  • Type match: Success
  • Type mismatch: Throw
  • Cannot decide: Throw
  1. The intent of is_of_type/2 is to mirror the behaviour of traditional type-checking in a more flexible way. Thus:
  • Type match: Success
  • Type mismatch: Failure
  • Cannot decide because a fresh variable was passed: Silent failure
  1. We still need the atom_si/1 , integer_si/1 , list_si/1 etc. which are nowhere to be found currently, so that we can have:
  • Type match: Success
  • Type mismatch: Failure
  • Cannot decide: Throw

It seems that typing has been on on-and-off-affair since the 80s (there is even the old-ish paper collection, June 1992: Types in Logic Programming - I have a copy). It would probably not hurt to have at least “sorts” one day. (And now the functional programming languages are starting to use Dependent Types … what does that imply?)

For SWI-Prolog there is this (among others):

Clojure people seem to avoid typing, even though Typed Clojure exists (someone said "we stopped using it, spending too much type on getting the type definitions right), instead, “specs” (runtime checks, precondition, postcondition, invariants) are preferred: clojure.spec. Another case of “worse is better” (or at least more flexible)?

Prolog has a its own spec library plspec, a work in progress. Provides the possibility to perform runtime checks on precondition, postconditions and invariants:

Here is Covington et al. giving a bit of dubious advice:

Covington et al. says on page 30:

Develop your own ad hoc run-time type and mode checking system. Many problems during development (especially if the program is large and/or there are several developers involved) are caused by passing incorrect arguments. Even if the documentation is there to explain, for each predicate, which arguments are expected on entry and on successful exit, they can be, and all too often they are, overlooked or ignored. Moreover, when a “wrong” argument is passed, erratic behavior can manifest itself far from where the mistake was made (and of course, following Murphy’s laws, at the most inconvenient time).

In order to significantly mitigate such problems, do take the time to write your own predicates for checking the legality of arguments on entry to and on exit from your procedures. In the production version, the goals you added for these checks can be compiled away using goal_expansion/2.

The above is fun, but is not a scalable approach.

I find it easier to mark predicates that must succeed because my errors tend to be not matching complex structures rather than having int/atom style mismatches. library(rdet) is a good start, but needs some work (e.g., it’s also useful to check that a deterministic predicate is truly deterministic but library(rdet) doesn’t do that). [library(rdet) throws an error, of course]

1 Like

Its a difficult topic as we have both types and instantiation. The Ciao team probably has the most mature view on these things. They (hope I rephrase correctly) claim indeed that atom/1, etc are wrong. What they want is is_type/1 for every type and type/1 for every type. type/1 then means the argument is compatible with the type. We should read that the argument either is instantiated to something that satisfies the type or can still be instantiated to something that satisfy the type. So, a variable passes all type/1 tests.

Finally we have modes that determine det/semidet/nondet/… for inputs satisfying some type and instantiation. If you want to describe all that such that the programming style that is permitted and commonly used in dynamically typed Prolog you can a rather complex annotation that is computationally not tractable with acceptable performance.

That (in my understanding) is why we have not yet seem a type system that has been widely accepted in the Prolog community. My vision out there in the long run is to have a rather simple type and mode annotation system and try to find obvious type violations rather than proving correctness. This gets closer to the linting approach with which we do have some success and that Paulo took a step further in Logtalk recently.

4 Likes

If must_be/2 would not translate type names, and if there were predicates for every type name. Then the shortest implementation of must_be/2 would be:

must_be(_, X) :- var(X), !, throw(error(instantiation_error,_)).
must_be(T, X) :- call(T, X), !.
must_be(T, X) :- throw(error(type_error(T,X),_)).

But problem is that both name space of type tests and also error convention diverge. Like you have a type test acylic, but the predicate acylic_term/1 is used. And you throw sometimes a domain_error/2 instead of a type_error/2.

The above implementation assumes that call(T, X) implies nonvar(X), or formulated as its contraposition that var(X) implies \+ call(T, X), which is not true for all types. But the above must_be/2 can be easily unfolded. Assume we have this code:

square(X, Y) :- 
   must_be(number, X), 
   Y is X*X.

The must_be/2 is redundant since (is)/2 does also a test. But just for demonstration purpose. If we unfold must_be/2 we get the following. Warning this is a sloppy unfolding using cut (!), just an illustration, to make a point about clause indexing:

square(X, _) :- 
   var(X), !, throw(error(instantiation_error,_)).
square(X, Y) :- 
   number(X), !, 
   Y is X*X.
square(X, _) :- 
   throw(error(type_error(number,X),_)).

In some Prolog systems the above code is more efficient. For example a WAM based Prolog system can use a type switch on the first argument, a form of clause indexing. Does SWI-Prolog unfold must_be/2 or are there never performance issues?

1 Like

must_be/2 is not optimized in any way. It is no more and no less than a cheap and explicit way to add type checks intended primarily for public APIs of libraries. Some people tend to stick it in everywhere. That may seriously slow down the code. I almost never use it in cases where built-in predicates do a reasonable job anyway.

I think the most pragmatic solutions is what @edison is using (AFAIK): a simplified version of the Ciao assertion language for type declarations that is used to insert runtime checks, I guess typically use only during development.

1 Like

Ciao Prolog’s assertion system: REGTYPE

This gives us a relatively compact way to represent a lot of information, the mode, the compatible types, pre-conditions which must be satisfied on calls, post-conditions which must be satisfied on exit, and global conditions. And even a comment which can provide the warning/error text in the event of failure.

I think this is very useful documentation about what a predicate is supposed to achieve in terms of interface, even if what you write is completely ignored by the host system.

However, we could do as Jan suggested, and take a linting approach with three phases.

  • Static: In cases that we can determine a conflict with the conditions, we throw a warning or error.
  • Dynamic: In test mode we insert dynamic checks to ensure that the properties hold or throw an error.
  • Production: In production mode we leave the code as written.

In terms of dependent types, representing these in prolog is not terribly difficult. You just need to parameterise your types with concrete terms. However, calls like must_be would need to be able to take a parametric type argument. A Linting approach could attempt to check some reasonable fragment - and at runtime / dynamically this is really quite simple - just run the code and see if it fails.

3 Likes

Sooner or later you need such an assertion language. Maybe it could cooperate with must_be/2. Currently must_be/2 is kind of closed world. The type and error mapping is even not multifile, you cannot make a Prolog project which extends must_be/2.

If you have such an assertion language you can also use it for offline batch type checkers. You might even call this checker a lint tool, even if it only checks types and not other assertions. This would give a little different architecture than the Logtalk lint,

which seems to be rather closed world as well. 30 years ago was working with an open world type checker, that I wrote on my own although inspired by a talk of Thom W. Frühwirth(*) about type checking back then. Feel free to sniff the code:

Type Checker
http://github.com/jburse/ethz-proquel/blob/8f59be71cdb7ea90a79b43b151a83e7770617ed4/pro_quel/pro_flat_incore/lint.p

Types for ISO Prolog (simplified)
http://github.com/jburse/ethz-proquel/blob/8f59be71cdb7ea90a79b43b151a83e7770617ed4/pro_quel/pro_flat_incore/predef.p

The type checker was two pass. The first pass was reading the types, and the second pass was doing the checking. This allowed mutual recursive types in different files. The main predicate reads as follows:

lint(F) :-
  (retract(l_rec(_,_,_,_)), fail; true),
  (retract(l_err_count(_)), fail; true),
  l_consult1('predef.p'),
  assertz(l_err_count(0)),
  l_consult1(F),
  l_consult2(F).

Think was routinely running this type checker lint/1 over the code, which help a lot tracking changes for example. Often when doing a change one might forget about some dependent code doing the change there. Today I would add a refactoring tool to the

lint checker, so that transitions in the type system can be automatically applied to the code. Like adding a parameter or some such. A most beautiful solution would integrate the type checker and refactoring tool into some IDE.

The above type system is not really what counts nowadays as dependent types. But it allows a form of type disjunction and a form of parameterized types. The ISO predef arithmetic types does not support automatic conversion of int to float and there is no modelling of strict versus non-strict.

(*)
Polymorphic type checking with subtypes in Prolog - Thom W. Frühwirth
DISCO 1990: Design and Implementation of Symbolic Computation Systems pp 121-130
https://pdfs.semanticscholar.org/79c1/e7438c6889ed188bf097deed9fb2ec216e69.pdf

1 Like

This sounds like the thread’s “solution” (if it needs to have any). Lots of good pointers, thank you everyone.

As final post, here is a simple plunit block to illustrate the various “approaches” (can one call them that) to type testing:

Any reason that must/1 doesn’t show up in the Seearch Documentation drop down?
I find it via Google but not from within the SWI-Prolog web site:

must (:Goal) is nondet
https://www.swi-prolog.org/pldoc/doc_for?object=must_sanity%3Amust/1

The drop-down only shows the manual. This comes from a pack. The documentation does include the documentation of predicates in the packs but does not include it in the search. Probably that is a bad idea and should be removed.

Ok, was only looking at the web site, which say “Predicate XXX” and not “Predicate XXX from Pack YYY”, so I thought it belongs to SWI-Prolog and not some package.

The URL also gives me only a module “must_sanity”. What if two different packages contain the same module and the same predicate name? But I guess its the package “logicmoo_utils”.

LogicMoo made a video recently (08.06.2020)
https://www.youtube.com/watch?v=Sy32UXCV7dE